Extract, Transform and Load using Ingress Connector

Supercharging data migration and transformation at CASA

Situation

The Civil Aviation Safety Authority (CASA) faced significant records management challenges. Legacy systems housed hundreds of thousands of images and records that were poorly managed, non-compliant with government regulations, and plagued by poor data quality. Manual metadata transformation consumed months of staff time and involved updating hundreds of separate import files. CASA required a streamlined solution to automate data transformation and integration into their records management system, OpenText Content Manager (CM). To address these issues, CASA engaged iCognition, experts in data migration, system integration, and transformation.

Task

CASA’s requirements for iCognition included:

  1. Deliver a solution to process massive volumes, including five-terabyte batches and high-resolution files up to a gigabyte each.
  2. Transform poor-quality metadata into a CM-importable format.
  3. Validate data prior to import.
  4. Match and update existing records, and create new records where no match is found.
  5. Import files with hierarchical metadata, resulting in structured records in CM.

The overarching goal was to simplify bulk migration, enhance efficiency, and ensure compliance with governance standards.

Action

iCognition expanded their Ingress Connector into a comprehensive ETL tool, tailored to CASA’s needs:

  1. System assessment and solution design
    iCognition assessed CASA’s data quality and transformation needs and designed a solution to augment Ingress Connector. The aim was to integrate capabilities for transforming metadata and enhancing CM import functionality.
  2. Enhancing import capabilities
    The Ingress Connector was expanded to support:

    • Matching records using barcodes, foreign barcodes, and external IDs.
    • Validating imports to identify and address issues before production.
    • Archiving processed import files to document errors and allow reprocessing.
  3. Developing a transformation tool
    A transformation component was developed with key features:

    • Template-based transformations to standardise data manipulation across multiple CSV files.
    • Mapping CSV fields to multiple container and document records.
    • Splitting CSV field contents for multiple uses and deriving fields from combined data.
    • Replacing outdated metadata values with updated ones to prevent import errors.
  4. User Acceptance Testing (UAT) and training
    A UAT program ensured the solution met CASA’s operational needs. Feedback was used to refine the tool for optimal usability and compliance.
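
The transformation features listed under step 3 (splitting a CSV field, deriving a field from combined data, and replacing outdated values) can be illustrated with a minimal Python sketch. This is not Ingress Connector's actual template syntax, which the case study does not describe; the TEMPLATE layout, the function names, and the CSV column names (Series, ItemNo, Location, Status) are all hypothetical and chosen only for illustration.

```python
import csv
import io

# Hypothetical template: layout and field names are illustrative assumptions,
# not Ingress Connector syntax.
TEMPLATE = {
    # source field -> (first target, second target, separator)
    "split": {"Location": ("Site", "Shelf", "/")},
    # outdated value -> current value, per field
    "replace": {"Status": {"ACTV": "Active", "CLSD": "Closed"}},
    # target field -> pattern built from other fields
    "derive": {"ExternalId": "{Series}-{ItemNo}"},
}


def transform_row(row, template):
    """Apply split, replace, and derive rules to one CSV row."""
    out = dict(row)
    for source, (first, second, sep) in template["split"].items():
        head, _, tail = out.get(source, "").partition(sep)
        out[first], out[second] = head.strip(), tail.strip()
    for field, mapping in template["replace"].items():
        value = out.get(field, "")
        out[field] = mapping.get(value, value)  # unknown values pass through
    for target, pattern in template["derive"].items():
        out[target] = pattern.format(**out)
    return out


def transform_csv(text, template):
    """Apply the same template to every row of a CSV document."""
    reader = csv.DictReader(io.StringIO(text))
    return [transform_row(row, template) for row in reader]
```

Because the rules live in one template rather than in each import file, the same definition could in principle be reused across many CSV files, which is the benefit the case study attributes to template-based transformation.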

ETL use cases

CASA successfully implemented the solution for various projects:

  1. Data census
    A national census updated CM metadata based on local stocktakes. Paper records were assessed, metadata matched or created, and hierarchical record structures were established. The transformation component resolved data quality issues and enabled multi-step destruction processes.
  2. A0 map digitisation
    A stocktake of aerodrome maps required the upload of huge electronic records into CM. The solution handled data transformation, created hierarchical structures, and applied revisions with attached TIFF and PDF files.
  3. Registration certificates
    Over 300,000 airport registration certificates were imported. The tool created hierarchical records using unique keys derived from multiple CSV fields and resolved data quality issues.
  4. Learning Management System (LMS) data migration
    Historical LMS data was backed up and imported into CM. Parent-child relationships between people and documents were established, with transformation capabilities resolving data quality challenges.
  5. Resentencing
    Exported records were updated with new sentencing information using the transformation component and re-imported into CM, enhancing metadata quality.
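
Several of these use cases rely on a unique key derived from multiple CSV fields, pre-import validation, and a match-or-create decision per record. The following Python sketch shows one way such logic could look. It is an assumption-laden illustration, not the product's implementation: the function names, the key format, and the example field names (Aerodrome, CertNo) are hypothetical.

```python
from collections import defaultdict


def composite_key(row, fields):
    """Join several CSV fields into one normalised key (hypothetical format)."""
    return "|".join(row[f].strip().upper() for f in fields)


def validate(rows, required):
    """Collect (row_number, message) pairs so problems surface before import."""
    errors = []
    for number, row in enumerate(rows, start=1):
        for field in required:
            if not row.get(field, "").strip():
                errors.append((number, f"missing {field}"))
    return errors


def plan_import(rows, existing_keys, key_fields):
    """Decide per row whether to update a matched record or create a new one."""
    plan = defaultdict(list)
    for row in rows:
        key = composite_key(row, key_fields)
        action = "update" if key in existing_keys else "create"
        plan[action].append(key)
    return dict(plan)
```

In this sketch, validate would run first, and plan_import only once it returns no errors, mirroring the validate-then-import order described in the case study.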

Result

The expanded Ingress Connector delivered transformative results for CASA:

  1. Improved compliance: Streamlined data migration ensured quicker, error-free imports and higher metadata quality.
  2. Enhanced metadata: External identifiers minimised duplicates, improved search results, and supported hierarchical imports.
  3. Pre-import validation: Early error detection increased efficiency and reduced production environment issues.
  4. Automated templates: Template-based transformations ensured consistent, high-quality outputs with fewer errors.
  5. Metadata reuse: Reusing CSV fields across records improved metadata richness and search capabilities.
  6. Reduced storage costs: Effective record matching enabled paper resource destruction, lowering storage expenses.
  7. Increased efficiency: Automation freed staff to focus on core responsibilities, while retiring legacy systems reduced management overhead.

Testimonial

“The Ingress Connector supercharged our data management, enabling us to process massive volumes, including five-terabyte batches and high-resolution files up to a gigabyte each, with ease. It streamlined processes that once took months, maintaining metadata integrity while optimising system performance. By efficiently validating and organising data, we cleared up to eight terabytes of storage post-validation, reducing strain and improving agility. Its flexibility even supported off-peak uploads, transforming complex, large-scale data handling into a smooth, productive process. All this was done with a small team that achieved remarkable efficiency using the Ingress Connector.”
~ CASA representative

Conclusion

iCognition’s expansion of the Ingress Connector into a robust ETL tool revolutionised CASA’s data management. The solution streamlined legacy imports, enhanced compliance, and improved operational efficiency. CASA’s collaboration with iCognition demonstrated the value of strategic data transformation, delivering a scalable, efficient, and compliant records management solution.