Testing Services provides a platform for QA professionals to discuss and gain insights into the business value delivered by testing, the best practices and processes that drive it, and the emergence of new technologies that will shape the future of this profession.


Allure of Cloud may fade away without proper data validation

Naju D. Mohan, Delivery Manager, Data Services, Infosys Validation Solutions

The need to validate the integrity, correctness and completeness of data residing in the cloud grows every second with the spread of mobile devices and the interconnection of computing devices over the internet. The cloud is establishing itself as the best available alternative for meeting these data storage and processing demands. The data management capabilities of traditional data stores are being revamped to handle the huge volume and variety of data held in cloud storage. This calls for new testing techniques and tools for ensuring 100% data validation.

The rush to capture data generated by customers, the opportunity to improve business decisions through data-driven insights and rising data storage costs are some of the factors driving companies towards the cloud. Companies have stopped asking whether to move to the cloud and started asking when and how to migrate. Adoption depends on multiple factors. Having a proper QA strategy for data validation during cloud migration is a primary deciding factor, and it helps companies retain a sustainable competitive edge.

Triggers for cloud adoption and big data validation needs

·         Legacy modernization


Digital transformation is pushing companies to move away from legacy applications, most of which lack the agility to support modern consumer demands. Many of these legacy applications require daily firefighting just to keep the business functioning. Once a company decides to make the transition to the cloud, it might adopt a hybrid strategy, retaining some existing functionality in the legacy application and migrating the rest to the cloud. It may migrate only the functionality that requires interoperability with other cloud applications, or the functionality that requires a total overhaul. This split makes testing a legacy-to-cloud migration tricky, and the primary focus of data validation should be data integrity testing.


o    Data integrity testing should verify the compatibility of existing data with the new hardware, the new operating system and the new interfaces implemented for cloud integration

o    Data integrity should be protected by validating that unauthorized data access is blocked

o    All data and data files should be tested for integrity as a subsystem within the old application functionality
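One way to picture the integrity checks above is a record-by-record comparison between the legacy copy of a table and its migrated copy. The sketch below is only an illustration under stated assumptions: rows are already loaded as Python dictionaries, each table has a single primary-key column, and the function names `row_fingerprint` and `compare_datasets` are hypothetical, not part of any particular tool.

```python
import hashlib


def row_fingerprint(row):
    """Return a stable SHA-256 digest of one record (a dict of column -> value)."""
    # Sort the columns so that column order does not affect the digest.
    canonical = "|".join(f"{key}={row[key]!r}" for key in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare_datasets(source_rows, target_rows, key):
    """Compare legacy (source) and cloud (target) rows keyed on a primary-key column."""
    source = {row[key]: row_fingerprint(row) for row in source_rows}
    target = {row[key]: row_fingerprint(row) for row in target_rows}
    shared = source.keys() & target.keys()
    return {
        # Records that were lost during migration.
        "missing_in_target": sorted(source.keys() - target.keys()),
        # Records that appear in the cloud copy but not in the legacy copy.
        "unexpected_in_target": sorted(target.keys() - source.keys()),
        # Records present on both sides whose content differs.
        "mismatched": sorted(k for k in shared if source[k] != target[k]),
    }
```

A real migration would stream rows from both stores rather than hold them in memory, and would normalize data types that change between platforms before hashing. The comparison logic stays the same.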


·         Data Warehouse

Because of security and data migration concerns, it takes a complete shift in mindset for companies to move data stored within the walls of their organization in a traditional data warehouse to a data warehouse on the cloud. Today most of the leading data management companies offer a cloud data warehouse, such as Amazon's Redshift, Microsoft's Azure SQL Data Warehouse, Teradata's Teradata Cloud and IBM's dashDB. A data warehouse hosted on the cloud reduces setup and maintenance effort compared to an on-premise data warehouse. The ease of distributing data to geographically dispersed departments within the organization and the ability to derive quick analytical insights also prompt companies to migrate their on-premise data warehouse to the cloud. Any enterprise that has adopted a cloud infrastructure to host its data warehouse needs a testing strategy that validates data movement to and from the cloud.

o    Data migration testing during the move to a cloud data warehouse should identify the business logic implemented in the stored procedures of the legacy data stores. This logic should be converted into business rules in test cases and used to validate the complex data transformations.

o    Data analytics and visualization testing on the cloud needs to account for the details of data integration between on-premise and cloud data stores

o    Data ingestion testing becomes critical because it must validate the merging of unstructured incoming data with structured data to derive valuable insights
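To make the first point above concrete, here is a minimal sketch of a stored-procedure calculation re-expressed as a test rule. Everything specific in it is an assumption for illustration: the discount rule, the column names (`order_id`, `quantity`, `unit_price`, `discount_pct`, `net_amount`) and the function names are hypothetical and stand in for whatever logic the legacy procedures actually contain.

```python
from decimal import Decimal


def expected_net_amount(quantity, unit_price, discount_pct):
    """Hypothetical business rule lifted from a legacy stored procedure:
    net = quantity * unit_price, reduced by a percentage discount, to 2 decimals."""
    gross = Decimal(quantity) * Decimal(unit_price)
    net = gross * (Decimal(100) - Decimal(discount_pct)) / Decimal(100)
    return net.quantize(Decimal("0.01"))


def find_rule_violations(source_rows, target_rows):
    """Return order ids whose migrated net_amount is missing or breaks the rule."""
    target_by_id = {row["order_id"]: row for row in target_rows}
    violations = []
    for src in source_rows:
        tgt = target_by_id.get(src["order_id"])
        expected = expected_net_amount(
            src["quantity"], src["unit_price"], src["discount_pct"]
        )
        # A missing target row counts as a violation, as does a wrong amount.
        if tgt is None or Decimal(tgt["net_amount"]) != expected:
            violations.append(src["order_id"])
    return violations
```

Keeping the rule in its own function means the same expectation can be reused across test cases, independently of how the cloud warehouse implements the transformation.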


·         Machine Learning and Analytics


The use of machine learning to analyze data, find patterns and make predictions is fueling the race to store data that is generated every millisecond. This increases the demand for data stores that can hold a huge variety and volume of data. Enterprises move transactional data to the cloud to overcome the challenges of collecting, analyzing and storing big data. The big cloud players, such as Google, AWS and Microsoft, provide cloud-based machine learning solutions. Data integration between cloud and on-premise stores becomes essential for getting a complete picture of data patterns and making use of these machine learning solutions. Innovative test strategies have to be devised to meet the needs of machine learning and analytics.

o    Proper data quality testing has to be done to ensure the completeness and correctness of data before determining patterns

o    Artificial Intelligence based validation techniques have to be deployed to validate predictive models
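The data quality point above can be illustrated with a small completeness and correctness report computed before any pattern detection runs. This is a sketch under assumptions: records arrive as dictionaries, "complete" means neither `None` nor an empty string, and correctness is judged by caller-supplied validator functions. The name `data_quality_report` is hypothetical.

```python
def data_quality_report(rows, required_fields, validators):
    """Compute per-field completeness and correctness ratios.

    rows            -- list of dicts (one per record)
    required_fields -- field names that must be populated
    validators      -- dict mapping a field name to a function value -> bool
    """
    total = len(rows)
    report = {}

    # Completeness: share of records in which the field is populated.
    for field in required_fields:
        present = sum(1 for row in rows if row.get(field) not in (None, ""))
        report[field] = {"completeness": present / total if total else 0.0}

    # Correctness: share of populated values that pass the field's validator.
    for field, is_valid in validators.items():
        values = [row[field] for row in rows if row.get(field) not in (None, "")]
        valid = sum(1 for value in values if is_valid(value))
        ratio = valid / len(values) if values else 0.0
        report.setdefault(field, {})["correctness"] = ratio

    return report
```

A team would typically set a threshold per field and hold back model training when a ratio falls below it. The thresholds themselves are a business decision and are not implied by this sketch.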



Cloud adoption is more than just a technology upgrade. It requires detailed planning and a phased approach to prioritize the business use cases, and the associated functionality, for migration to the cloud. A few important points to keep in mind when handling cloud data validation are listed below:

·         Applications with stricter regulatory and data privacy needs require multiple iterations of testing, and this additional testing time needs to be factored in during the planning phase.

·         Automated testing utilities need appropriate connectors for validating the special data file formats specific to the cloud.

·         An appropriate validation strategy is needed for integrated business processes that span on-premise and cloud systems.
