Legacy Decommissioning (2/2) - Data Archival and Retrieval
In continuation of the earlier blog on decommissioning legacy systems, where I explained the overall decommissioning process, this blog addresses the data archival and retrieval approach used in the design and execution phases of legacy decommissioning.
As discussed in Blog 1, many organizations feel forced to keep aging legacy applications running well beyond their useful life because they contain critical historical data that must stay accessible. This information may be needed for customer service or other operational reasons, or to comply with industry regulations. Yet keeping obsolete systems alive just to view their data puts a real strain on resources. These applications steadily consume IT budget through maintenance charges, staffing and data center costs, and in many cases this accounts for over 50% of the overall IT budget.
Data archival during application decommissioning is the simplest and most cost-effective way to keep legacy data accessible for business continuity. Archiving the complete legacy data set at once is one of the best practices recommended during application decommissioning. The archived data can then be accessed online quickly, presented in different views for analytics, or exported into different formats when needed. During archiving, data is extracted from the legacy system and stored in a secure, centralized online archive. End users can access the data easily, either through screens that mimic the original application or in a new format chosen by the business. The new infrastructure built for legacy data archival should be capable of moving all legacy data components, whether structured, unstructured, semi-structured, or even raw metadata, into a comprehensive, cloud-based, self-managed, centralized repository. Infosys has a proven methodology for archiving data in this central repository.
Application decommissioning and data archival requirements are unique to every organization. With its proven framework and tools, Infosys can help customers with the following:
1. Building a strong understanding of the current application data model
2. Building data retention policies and retention requirements through strong domain knowledge
3. Building an innovative archival data model
The above approach is most commonly adopted for mainframe legacy applications that have accumulated large volumes of data over many years in the form of documents and images. Typical examples are:
1. Billing records
2. Financial transactions
3. Customer history
The generic framework for a data archival process consists of the following steps:
* Data Extraction - Collect and extract the data from the source database into an interim storage (staging) area, and perform data transformation where required to map the data to the target-state format, e.g. EBCDIC to ASCII, alphanumeric to numeric, date format changes, etc.
* Validation and Cleansing - Validate the schema of the target database: confirm that tables with all their constraints, indexes and views, and users with their roles and privileges, are migrated as defined in the business rules. Validate the contents of the migrated data to confirm that referential integrity is maintained in the target definition. If required, data cleansing is also performed to generate golden records, remove duplicate records, and cleanse special characters, spaces and email addresses.
* Transformation - Transform the data from the source to the target as defined in data mapping rules and lookup tables.
* Data Migration - Load the data into the target database using data loader utilities, scripts and programs generated for loading incremental data, handling multilingual data, and recovering failed loads.
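As an illustration, the extraction, transformation and migration steps above can be sketched in Python. This is a minimal hypothetical example, not Infosys tooling: it uses Python's built-in `cp500` codec for the EBCDIC-to-ASCII conversion and an in-memory SQLite database as a stand-in for the target archive; the record layout and date rule are invented, and the validation step is elided.

```python
import sqlite3

def extract(ebcdic_records):
    """Extraction: decode EBCDIC source records into strings for staging."""
    return [rec.decode("cp500") for rec in ebcdic_records]

def transform(record):
    """Transformation: apply mapping rules, e.g. DDMMYYYY -> ISO dates."""
    name, raw_date = record.split("|")
    iso = f"{raw_date[4:8]}-{raw_date[2:4]}-{raw_date[0:2]}"
    return (name.strip(), iso)

def load(rows, conn):
    """Migration: bulk-load the transformed rows into the target archive."""
    conn.execute("CREATE TABLE IF NOT EXISTS archive (name TEXT, rec_date TEXT)")
    conn.executemany("INSERT INTO archive VALUES (?, ?)", rows)

# Invented sample record: fixed-width name field, pipe delimiter, DDMMYYYY date.
source = ["SMITH   |31121999".encode("cp500")]

conn = sqlite3.connect(":memory:")
rows = [transform(r) for r in extract(source)]
load(rows, conn)
print(conn.execute("SELECT * FROM archive").fetchall())
# [('SMITH', '1999-12-31')]
```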
Designing the data archival solution for extracting and migrating the identified data sources is usually done collaboratively with customer DBAs and SMEs. A complete analysis of the application is done to identify the following:
* Understanding the current data model in the legacy application
* Identifying unknown data relationships in the current model
* Creating retention policies for the data identified for archival
* Extracting the data with the required transformations in the application context
* Identifying the key fields for indexing, so the required data can be searched efficiently
* Validating the target database schema and the data contents of the archived data
* Creating interfaces to access archived data and reports independently of the application
* Applying application-, entity- and record-level retention policies based on organizational requirements
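To make the retention-policy step concrete, here is a small hypothetical sketch of a record-level retention check. The record types and retention periods are invented examples for illustration, not regulatory guidance; real values come from the organization's compliance and business requirements.

```python
from datetime import date

# Hypothetical retention periods per record type, in days.
RETENTION_DAYS = {
    "billing": 7 * 365,            # e.g. keep billing records for 7 years
    "customer_history": 10 * 365,  # e.g. keep customer history for 10 years
}

def is_retained(record_type, record_date, today):
    """Return True if the record is still within its retention window."""
    return (today - record_date).days <= RETENTION_DAYS[record_type]

today = date(2024, 1, 1)
print(is_retained("billing", date(2020, 6, 1), today))  # True
print(is_retained("billing", date(2010, 6, 1), today))  # False
```

Records falling outside their window become candidates for purging rather than archival, which directly reduces the volume to be migrated.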
Based on the above analysis, the target data model is generated; each target entity is identified as a business object and holds the data for that object. The business object is defined with regulatory and business needs in mind.
Based on the organization's requirements and the details collected during impact analysis, Infosys will suggest the best approach to archiving the data by evaluating factors such as the quantity of data to be archived and the actual requirements for archival. The two options are:
1. Complete data archival at once
2. Partial data archival through multiple releases
Data retrieval ensures that the archived data is accessible at any time for audits, analytics and normal business operations, and that the data is secured and accessed according to user roles and privileges. The two most common strategies are hot retrieval and cold retrieval. In hot retrieval, data is accessed immediately based on a few keywords. In cold retrieval, data is accessed via reports and service requests when a specific view of the data is needed.
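Hot retrieval by keyword can be sketched with a simple inverted index. This is an illustrative toy with invented document IDs and text; a real archive would use a search engine or indexed database columns.

```python
from collections import defaultdict

# Toy corpus: archived document IDs mapped to their searchable text.
documents = {
    101: "billing record march 2001",
    102: "customer history account closure",
    103: "billing dispute resolution",
}

# Build an inverted index: keyword -> set of document IDs containing it.
index = defaultdict(set)
for doc_id, text in documents.items():
    for word in text.split():
        index[word].add(doc_id)

def hot_retrieve(keyword):
    """Immediate keyword lookup against the pre-built index."""
    return sorted(index.get(keyword, ()))

print(hot_retrieve("billing"))  # [101, 103]
```

Cold retrieval, by contrast, would run as a batch report or service request rather than an interactive lookup.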
The following are a few ways to retrieve archived data:
- Data retrieval through keyword search on business objects, giving full application context
- Custom-generated reports using data integration capabilities
- Data retrieval using standard database interfaces such as ODBC/JDBC and SQL
- Enterprise reporting tools such as Crystal Reports
- Data archival and retrieval using third-party products such as IBM Optim, Informatica, etc.
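As an illustration of the standard-interface option, the snippet below issues a parameterized SQL query through Python's DB-API, with SQLite standing in for an ODBC/JDBC connection to the archive; the table layout and sample rows are hypothetical.

```python
import sqlite3

# SQLite stands in for an ODBC/JDBC connection; the archive schema is invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE archive (id INTEGER, rec_type TEXT, rec_date TEXT)")
conn.executemany(
    "INSERT INTO archive VALUES (?, ?, ?)",
    [(1, "billing", "1999-03-01"), (2, "customer", "2001-07-15")],
)

# Parameterized query: the same pattern applies over real ODBC/JDBC drivers.
rows = conn.execute(
    "SELECT id, rec_date FROM archive WHERE rec_type = ? AND rec_date >= ?",
    ("billing", "1999-01-01"),
).fetchall()
print(rows)  # [(1, '1999-03-01')]
```

Because the archive exposes a standard SQL interface, existing reporting tools can query it without depending on the decommissioned application.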