Loading Data into Local Systems

After retrieving CAS application data from Liaison's platforms, your next step is to load it into your local systems. Setting aside any transformations you may need to perform on the data (see Working with Data Transformations), you have two options for loading data into local systems:

  • Use the local system's standard tool for loading data.
  • With direct access to the local system's database, write directly to the relevant tables.

[Figure: data-integration-transfer.png]

Option 1: Standard Loading Tool in Local System

Student Information Systems (SIS)/Enterprise Resource Planning (ERP) systems, Customer Relationship Management (CRM) systems, and Document Management Systems (DMS) virtually always provide tools for loading new data. These standard loading tools are a convenient way to get CAS application data into your local system.

Is this approach possible for you?

Find the on-campus expert on your target system and ask the following questions:

  • Does the local system have a standard tool for loading data/documents?
  • Is the standard tool already used to load data/documents from other sources?
  • Are there people on campus who can help configure the standard tool?
  • If so, will they be able to help in time to meet my project timeline?

If you get a "No" in response to any of these questions, that doesn't mean you can't pursue this approach. If using the standard loading tool for your target system is new on campus, the responsibility for figuring out how to use it will fall to you and your team. Your local system's vendor will have resources to help you do that.

Why should I choose this approach?

Think carefully before choosing an approach for loading data to your local system. Consider both short-term goals (e.g., getting your CAS application live and accepting applications) and long-term goals (e.g., maintaining your data integration for years to come) when choosing an approach.

Pros

  • Accessible: the standard tool can often be configured and used on a day-to-day basis by admissions staff.
    • That means you and your team control the loading of data/documents into your target system. Should your business process change, you can accommodate it quickly in your data integration because you control the loading yourself.
    • Furthermore, you can likely get the standard tool configured faster than if you required assistance from another department (such as IT).
  • Standard Process: the standard tool, designed and built by the creator of the target system, will load data/documents so that they fit seamlessly into the workflows you've already established.
    • You can count on the standard loading tool to produce records that behave as expected, just as if they had been created by hand in the local system itself.

Cons

  • Not Flexible: the standard tool will likely have well-defined limitations on the format/type of data/documents it can successfully load.
    • You'll likely have to do some work to make the extracted data/documents match the required format of the standard tool.

Configure the Standard Tool

Your target system's standard tool will fall into one of these two categories:

  • Requires a fixed format for incoming data/documents and has predefined destinations.
  • Can accommodate a variable format for incoming data/documents and allows the user to define destinations.

Requires a fixed format for incoming data/documents and has predefined destinations.

If your local system's standard load tool requires a fixed format and has predefined destinations, you'll need to manipulate the extract to meet those requirements. You can accomplish some of the necessary transformations through configuration: defining certain aspects of your programs in the CAS application can make the data easier to load into local systems. For example, you can assign export codes to custom question response options that line up with your local system's codes (see the resources on configuring programs compiled in the Collecting Your Data section). For any required transformations that you can't handle in configuration, you'll need to transform the extract before loading it with your fixed-format standard tool. See the Working with Data Transformations section for further guidance.

Can accommodate a variable format for incoming data/documents and allows the user to define destinations.

If your local system's standard load tool can accommodate variable formats and allows you to define destinations, you'll be able to load the extract as-is. The standard tool will have a way to define where and how to load the incoming data/documents. For data, you'll define the following:

  • Field Mapping: define which local system fields the incoming fields should load to (e.g., "firstName" should load to "First Name").
  • Value Mapping: for defined-option fields, specify what values in the local system should be assigned to incoming values (e.g., "AK" should be loaded as "Alaska").
  • Static Mapping: define any local system variables that should be set the same way for all records arriving from this source (e.g., "EngineeringCAS" set as the source for all records from EngineeringCAS).

For documents, you'll define the following (a brief configuration sketch follows these lists):

  • Record Matching: define which metadata fields will allow your local system to match the document to the appropriate person and application records.
  • Document Type: define which metadata field(s) indicate what type of document you are loading.
  • Document Details: define which metadata field(s) provide relevant details about the document you are loading.
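
To make these mappings concrete, here's a minimal sketch of how such a configuration might be represented if you were scripting around a standard tool. Every field name, code, and value below is an illustrative assumption, not an actual CAS export field or local system identifier.

```python
# Illustrative mapping configuration; every field name, code, and value is a
# hypothetical example, not an actual CAS export or local system identifier.

# Field mapping: extract column -> local system field.
FIELD_MAP = {
    "firstName": "First Name",
    "lastName": "Last Name",
    "stateProvince": "State",
}

# Value mapping: for defined-option fields, translate incoming codes.
VALUE_MAP = {
    "State": {"AK": "Alaska", "AL": "Alabama"},
}

# Static mapping: constants applied to every record from this source.
STATIC_MAP = {
    "Source": "EngineeringCAS",
}

def map_record(extract_row: dict) -> dict:
    """Apply field, value, and static mappings to one extract row."""
    record = {}
    for src_field, dest_field in FIELD_MAP.items():
        value = extract_row.get(src_field)
        # Translate coded values where a value map is defined for the field.
        value = VALUE_MAP.get(dest_field, {}).get(value, value)
        record[dest_field] = value
    record.update(STATIC_MAP)
    return record

print(map_record({"firstName": "Ada", "lastName": "Lovelace", "stateProvince": "AK"}))
```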

Deliver Extract to the Standard Tool

Depending on your local system's standard tool, you'll either deliver the extract as-is or manipulate it prior to delivery (see above). No matter which type of standard load tool your system provides, you'll have to figure out how to deliver data/documents for loading. Common on-campus systems provide a variety of delivery channels: user interface (manual), SFTP, and API. You'll almost always be able to manually load a file through your local system's user interface (think of an Upload File button). On top of the out-of-the-box delivery channels, your institution may have established its own standard delivery channels, such as a shared directory or an intermediary database. Understanding how to deliver data/documents to the standard loading tool is essential to building a successful data integration.
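
As one illustration of an automated delivery channel, the sketch below pushes an extract file to an SFTP drop directory using the Python paramiko library. The host, account, and paths are placeholder assumptions; your local system's actual delivery requirements may differ.

```python
# Minimal sketch: deliver an extract file to an SFTP drop directory.
# Host, credentials, and paths are placeholder assumptions; substitute the
# values your local system's standard loading tool actually expects.
import paramiko

HOST = "sftp.example.edu"           # hypothetical SFTP endpoint
USERNAME = "loader"                 # hypothetical service account
KEY_PATH = "/path/to/private_key"   # key-based auth is typical for automation
LOCAL_FILE = "cas_extract.csv"
REMOTE_FILE = "/incoming/cas_extract.csv"

client = paramiko.SSHClient()
# For production use, load known host keys instead of auto-adding them.
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(HOST, username=USERNAME, key_filename=KEY_PATH)
try:
    sftp = client.open_sftp()
    sftp.put(LOCAL_FILE, REMOTE_FILE)  # upload the extract for loading
    sftp.close()
finally:
    client.close()
```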

If your local system's standard loading tool accommodates flexible formats, all you'll need to do is deliver the extract to the standard tool for loading. If, on the other hand, you have to transform the extract before delivery, you have two options for doing so:

  • Transform the extract programmatically in memory.
  • Use an intermediary database.

Transform the extract programmatically.

For this approach, you'll likely use a single script to extract, transform, and deliver the data/documents to the standard loading tool. In brief, you'll represent the original extract as an object in memory in your code and manipulate that object to produce a file that meets the requirements of your local system's standard loading tool. You may want to retain the original extracts for troubleshooting.
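
Here's a minimal sketch of that pattern, assuming a JSON extract and a fixed-format CSV target; the input field names and the output column layout are invented for illustration.

```python
# Minimal sketch: extract -> in-memory transform -> fixed-format file.
# The input field names and output column layout are illustrative
# assumptions; match them to your extract and your tool's required format.
import csv
import json
import os
import shutil

EXTRACT_FILE = "cas_extract.json"   # assumed to hold a list of application objects
OUTPUT_FILE = "load_ready.csv"

# Represent the original extract as an object in memory.
with open(EXTRACT_FILE) as f:
    applications = json.load(f)

# Manipulate that object into rows matching the standard tool's fixed format.
rows = [
    {
        "LAST_NAME": app["lastName"].upper(),
        "FIRST_NAME": app["firstName"].upper(),
        "PROGRAM_CODE": app["designation"]["programCode"],
    }
    for app in applications
]

with open(OUTPUT_FILE, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["LAST_NAME", "FIRST_NAME", "PROGRAM_CODE"])
    writer.writeheader()
    writer.writerows(rows)

# Retain the original extract for troubleshooting.
os.makedirs("archive", exist_ok=True)
shutil.copy(EXTRACT_FILE, os.path.join("archive", EXTRACT_FILE))
```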

Use an intermediary database.

For this approach, you'll load the raw extract to a database (or set of tables) separate from the final destination for the data. Once you've loaded the raw data to this intermediary database, you can write queries to extract it from there in exactly the format required by your local system's standard loading tool.
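
Here's a minimal sketch of the intermediary-database pattern, using SQLite for illustration; the table and column names (and the CSV headers) are assumptions, and in practice you'd use your institution's standard database platform.

```python
# Minimal sketch: load the raw extract into an intermediary table, then
# query it back out in the exact shape the standard tool requires.
# Table and column names (and the CSV headers) are illustrative assumptions.
import csv
import sqlite3

conn = sqlite3.connect("intermediary.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS raw_extract (
        cas_id TEXT, first_name TEXT, last_name TEXT, state_code TEXT
    )
""")

# Load the raw extract as-is; the CSV headers are assumed to match the
# named placeholders below.
with open("cas_extract.csv", newline="") as f:
    conn.executemany(
        "INSERT INTO raw_extract VALUES (:cas_id, :first_name, :last_name, :state_code)",
        csv.DictReader(f),
    )
conn.commit()

# Query it back out in exactly the format the standard tool requires,
# handling value mapping and formatting in SQL.
load_ready = conn.execute("""
    SELECT cas_id,
           UPPER(last_name) || ', ' || UPPER(first_name) AS full_name,
           CASE state_code WHEN 'AK' THEN 'Alaska' ELSE state_code END AS state
    FROM raw_extract
""").fetchall()
conn.close()
```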

Option 2: Write Directly to the Target System's Database Tables

Local systems (SIS/ERP, CRM, DMS) usually provide direct access to the databases behind the scenes. This access allows institutional users to make changes directly in the database instead of through the user interface. The department most likely to have this kind of access is the IT unit responsible for supporting and maintaining the system. Additionally, many third-party integration middleware companies operate at this level of access. This level of access presents an alternative method for loading CAS application data into your local system.

Is this approach possible for you?

Find the IT unit responsible for supporting your target system and ask the following questions:

  • Do you have write-access to the target system's database?
  • Do you load data from other sources to the target system by writing directly to the database?
  • Are there people on campus who can write code to do that?
  • If so, will they be able to help in time to meet my project timeline?

If you get a "No" in response to any of these questions, you will not be able to proceed with this approach.

Why should I choose this approach?

Think carefully before choosing an approach for loading data to your local system. Consider both short-term goals (e.g., getting your CAS application live and accepting applications) and long-term goals (e.g., maintaining your data integration for years to come) when choosing an approach.

Pros

  • Versatile: with this approach, you can load anything in any way to any location in your target system.
    • All of the necessary transformations can be built into the load. With direct database access, data can be written to any location in the local system, so you're not restricted in which data points you can load.

Cons

  • Requires Software Developer: you can't pursue this approach without dedicated software engineering resources.
    • In addition to the upfront requirement, any changes or regular maintenance will require involving a software developer.
    • Involving other departments often extends timelines. Adding to this potential delay is the fact that at many higher education institutions, turnaround time on software development projects is very slow.
  • Unexpected Behavior in Local System Possible: bypassing the target system's user interface and standard loading tool can result in records that behave in unexpected ways.
    • The user interface and standard loading tool are designed to create records that fit seamlessly into the local system; they often perform many small side operations when creating a new record. Replicating all of those side operations is hard, and missing just one can result in problematic behavior that interrupts your admissions business process.

Configure Load to Target System's Database

Your first step in loading directly to the target system's database is to identify the relevant tables. You'll want to identify the base tables where the most important data will be loaded (e.g., the person table or application table). Beyond that, you'll also want to identify any subsidiary tables that are automatically updated when similar actions are performed in the user interface or with the standard loading tool. For example, creating a new person record in the user interface might automatically record that transaction, with essential details like the user who performed the operation and when it occurred, in a separate transaction table; for consistency, you'd likely want to replicate that record when writing directly to the database.
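
As a hypothetical illustration of that pattern, the sketch below writes a new person record and the companion transaction row inside a single database transaction, mirroring what the user interface would do behind the scenes. The schema is invented, and SQLite stands in for the target database.

```python
# Minimal sketch: insert a person record plus the companion transaction row
# in a single database transaction, mirroring what the UI would do behind
# the scenes. All table and column names are illustrative assumptions, not
# an actual SIS schema.
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE person_transaction (
        person_id INTEGER, operation TEXT, performed_by TEXT, performed_at TEXT
    );
""")

with conn:  # commits both inserts together, or rolls back both on error
    cur = conn.execute(
        "INSERT INTO person (first_name, last_name) VALUES (?, ?)",
        ("Ada", "Lovelace"),
    )
    # Replicate the audit row the user interface would have written automatically.
    conn.execute(
        "INSERT INTO person_transaction VALUES (?, ?, ?, ?)",
        (cur.lastrowid, "INSERT", "cas_loader",
         datetime.now(timezone.utc).isoformat()),
    )
conn.close()
```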

Once you've identified the relevant tables in the target system's database, you'll want to match the data fields in the extract to fields in the local database. While much of this matching will be straightforward (e.g., first name to first name), some data elements are more complex, involving nesting and different record scopes. It's important to take the time to thoroughly understand the structure of both the extract data and the target system's database in order to load data properly. For example, data about an applicant's previously attended colleges may be delivered in the extract as separate columns in a single row for the application, but stored in the local database as multiple rows (i.e., one row per college attended) in a table separate from the application table. As another example, the extract could be delivered as nested JSON; understanding the parent-child node relationships will be critical to loading the data successfully.
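
Here's a minimal sketch of the colleges-attended example, turning repeated columns in one extract row into one row per college; the column names are invented for illustration.

```python
# Minimal sketch: the extract carries colleges as repeated columns on one
# application row; the target schema wants one row per college attended.
# Column names are illustrative assumptions.
extract_row = {
    "applicationId": "12345",
    "college1Name": "State University", "college1Degree": "BS",
    "college2Name": "City College",     "college2Degree": None,
}

college_rows = []
for i in (1, 2):  # however many repeated column groups the extract defines
    name = extract_row.get(f"college{i}Name")
    if name:  # skip empty column groups
        college_rows.append({
            "application_id": extract_row["applicationId"],
            "college_name": name,
            "degree": extract_row.get(f"college{i}Degree"),
        })

# college_rows now maps to the separate colleges table, one row per college.
print(college_rows)
```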

With the relevant tables identified and the fields cross-walked, you'll next want to understand what sort of database operation to perform: insert, update, or delete. To make this decision, you'll need information from both the extract and the target database. The extract may tell you what kind of operation is required (e.g., data is sent upon update to an application or deletion of an uploaded document from an application). Beyond that, comparing what is present in the target database with the incoming extract will allow you to determine whether to insert or update. Modern databases also support an "upsert" operation that recognizes when a matching record exists, updating it if it does and inserting it if it doesn't.
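
For example, here's a minimal upsert sketch in SQLite syntax with invented table and column names; most modern databases offer an equivalent, such as INSERT ... ON CONFLICT in PostgreSQL or MERGE in Oracle and SQL Server.

```python
# Minimal upsert sketch (SQLite syntax; invented table/column names).
# If a row with this cas_id already exists, update it; otherwise insert it.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE application (
        cas_id TEXT PRIMARY KEY,
        status TEXT,
        updated_at TEXT
    )
""")

def upsert_application(cas_id, status, updated_at):
    conn.execute(
        """
        INSERT INTO application (cas_id, status, updated_at)
        VALUES (?, ?, ?)
        ON CONFLICT (cas_id) DO UPDATE SET
            status = excluded.status,
            updated_at = excluded.updated_at
        """,
        (cas_id, status, updated_at),
    )

upsert_application("12345", "Received", "2024-01-01")   # inserts a new row
upsert_application("12345", "Complete", "2024-02-01")   # updates the same row
print(conn.execute("SELECT * FROM application").fetchall())
```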

Now that you've figured out how to get the data into your local system, you'll have to work out how to extract, transform, and deliver the data for loading. Your understanding of the extract's data structure and your target database's structure will be of great use in determining how the extracted data should be transformed. You can handle transformation in one of two ways:

  • Transform the extract programmatically.
    For this approach, you'll likely use a single script to extract, transform, and load the data/documents to your target database. In brief, you'll represent the original extract as an object in memory in your code and manipulate that object to pull out elements to load to each table in the target database. You may want to retain the original extracts for troubleshooting.
     
  • Use an intermediary database.
    For this approach, you'll load the raw extract to a database (or set of tables) separate from the final destination for the data. Once you've loaded the raw data to this intermediary database, you can write queries to move it to the final destination tables. You'll either use queries to operate directly within a single target database (with the raw data loaded to staging tables), or use queries to retrieve data from a separate intermediary database and store the results as objects in your program to be loaded subsequently to the target database.

With the data transformed, you'll then load it to the relevant tables. You can perform the loading either with code that interacts with the database or with queries that act within the database itself, depending on whether you load the raw data to staging tables in the target database.

Once you've done the planning work—found the relevant target database tables, cross-walked the fields, understood the structure of the extract data, identified the appropriate database operations, and planned out the extract, transform, and delivery steps—your last step is to capture that logic in a programming language. The choice of language is at your discretion, though most IT shops have standards that developers adhere to, and some SIS/ERPs have a standard language in which most development is done.

Standard Loading Tools for Common On-Campus Systems

| Name       | Type    | Load Tool     | Accepted Formats for Data | Documents? | Accepted Document Formats             |
|------------|---------|---------------|---------------------------|------------|---------------------------------------|
| Banner     | SIS/ERP | SRTLOAD       | Flat File                 | —          | —                                     |
| PeopleSoft | SIS/ERP | File Parser   | Flat File                 | —          | —                                     |
| Slate      | CRM     | Source Format | Flat File                 | Yes        | .zip archive with PDFs and index file |
