Once you've retrieved CAS application data and documents, you'll want to load them to your target local system so that you can continue your business process. Before loading, you may need to transform the data and documents so that your local system can load it properly. A transformation is a change to the format or contents of data or documents being transferred from one system to another. When thinking about transformations, it's necessary to consider both how data and documents are delivered from Liaison's platform and how your local systems load data and documents.
Do I need to do any transformations?
To determine if transformations are required for your data integration, you'll need to:
- Understand how to load data/documents to your target local system.
- Understand how data/documents are delivered from your chosen Liaison tool.
Start by understanding your target local system's requirements for uploading data or documents. With those requirements in mind, review the Liaison extraction tools available to you. The ideal situation involves no transformations: you're able to extract the data or documents exactly as your local system requires for upload. Read on for more details on what to look for when researching load and extract tools.
What kinds of transformations are there?
Broadly, there are two categories of transformation, dealing with two aspects of the data or documents being transferred:
Format transformations deal with the container that the data or documents are shipped in. The first possible transformation point is the transfer method: will the data/documents be sent to the local system as files or as digital objects? For example, data could be delivered as a flat file (tab-delimited text, CSV, etc.) or as a digital object (JSON, XML, etc.). Liaison's tool can deliver data and documents in any of these ways; the operative question is: what transfer method works best for your local system?
Once the transfer method has been determined, the next transformation point is to determine the format of the files or objects used to transfer the data/documents. If going with files, what file format is required by the target local system? Common file formats for transferring data include CSV and delimited text files (delimited by tabs, pipes, or other character). Common file formats for transferring documents include PDF and TIFF. Additionally, documents can be delivered in batches in a ZIP archive. Some systems can ingest ZIP archives, while others require a specifically formatted ZIP archive including a manifest file (called DIP format), while still others can only ingest individual document files. Consider what file format is delivered by Liaison's tools, and what file format is required by the target local system.
If you choose to transfer digital objects instead of files, what type of object is required by the target local system? Common digital objects for transferring data include JSON and XML objects. Standard object notation languages are well understand by major coding languages; it will be straightforward to transform one type of object to another using a major coding language. Again, consider how data is delivered by Liaison's tools, and what type of object is required by the target local system.
Since Liaison's extract tools are flexible, format transformations can often be avoided. For example, if your target local system can only import CSV files, it's possible to configure an extract from Liaison's platform that is delivered as a CSV file. Plan your approach carefully to minimize friction during transfer.
Content transformations deal with what the data looks like. This type of transformation isn't relevant to document transfer. One significant transformation point is the layout of the data, especially if the local system requires a flat file upload. Some major SISs require that imported files match very specific file layouts, with certain columns in specified positions within the file. Furthermore, some local systems require that the headers on imported files match specified names. Liaison's tools are versatile, so you will likely be able to generate a file that meets these file layout requirements. Recognize that if you retrieve data as a digital object but must load a flat file, you will have to create a flat file matching your local system's file layout requirements in between the extract and load steps.
A second, and more subtle, kind of file layout transformation involves the structure of the data. Generally, databases store major genres of information separately, joining them together when presenting them to the user. To give a higher education admissions example, prior education information is usually stored separately from contact information, and those pieces of information are tied to the relevant person and application records, also stored separately. The way that these major genres of information are delivered by Liaison's tools may not match the way your local system expects to receive them. For example, if Liaison's tool delivers previously attended college information as separate columns with a single row for each applicant, but your local system requires that previously attended college information be delivered with one row per college, you'll have to pivot that data from columns to rows.
The next major kind of content transformation to consider concerns the actual data itself. Your local system may require that values for a certain field match a specified list of possible values. For example, your local system might require that "male" and "female" be uploaded as "M" and "F". Liaison's tools offer some ability to transform values at the point of extraction, but if you're unable to do so with your chosen Liaison tool, you'll have to perform this transformation in between the extract and load steps.
Continuing with data content transformations, your local system likely has data storage restrictions on its fields. Fields in your local system may be limited to containing values that conform to specified data types. For example, a field containing GRE Quantitative scores may only be able to store integer-type values, in which case, loading a string type value would cause some sort of problem (either the load will fail or the data loaded will be incomplete). Additionally, fields may be limited to the amount of data they can store. For example, a street address field may be limited to containing 50 characters, in which case, loading values with more than 50 characters will cause some sort of problem (either the load will fail or the data loaded will be incomplete).
Finally, you'll want to determine whether your local system requires that certain data types be formatted in specific ways. Formatting requirements of this type often appear in fields containing dates and times. For example, your local system may require that dates conform to the format "yyyy-mm-dd" (e.g. "2019-10-01"), while the format of delivered dates is "m/d/yy" (e.g. "10/1/19"). Again, Liaison's tools offer some ability to transform values at the point of extraction, but if you're unable to do so with your chosen Liaison tool, you'll have to perform this transformation in between the extract and load steps.
What transformations does my local system require?
Armed with information about what kinds of transformations you may have to perform and with a solid understanding of both your local system's loading capabilities and your chosen Liaison tool's extraction capabilities, you can determine what kinds of transformations you'll need. Again, the ideal situation requires no transformation at all: the data or documents are extracted in exactly the way your local system requires for upload. The key here is to start by thoroughly reviewing how to load data/documents to your target system. Those requirements will inform your choice of Liaison extract tool and determine the transformations necessary.
How do I perform the necessary transformations?
Not considering transforming the data/documents within Liaison's tool at the point of extraction, there are two basic approaches for transforming data/documents:
- Transform the extract programmatically in memory.
- Use an intermediary database.
Transform the extract programmatically
For this approach, you'll likely use a single script to extract, transform, and deliver the data/documents to your local system for upload. For more guidance on loading data, see the "Loading Your Data" section of this documentation.
Use an intermediary database
For this approach, you'll load the raw extract to a database (or set of tables) separate from the final destination for the data. Once you've loaded the raw data to this intermediary database, you can write queries to extract it from there in exactly the format required to load to your local system. For more guidance on loading data, see the "Loading Your Data" section of this documentation.