Testing Services provides a platform for QA professionals to discuss and gain insights into the business value delivered by testing, the best practices and processes that drive it, and the emergence of new technologies that will shape the future of the profession.


July 12, 2017

Allure of Cloud may fade away without proper data validation

Naju D. Mohan, Delivery Manager, Data Services, Infosys Validation Solutions

The need to validate the integrity, correctness, and completeness of data residing in the cloud is increasing every second with the penetration of mobile devices and the interconnection of computing devices through the internet. The cloud seems to be establishing itself as the best possible alternative to meet these data storage and processing demands. The data management capabilities of traditional data stores are being revamped to meet the demand for the huge volume and variety of data in cloud storage. This calls for new testing techniques and tools to ensure 100% data validation.

The mad rush to capture data generated by customers, the opportunity to improve business decisions through data-driven insights, and the spurt in data storage costs are some of the factors driving companies towards the cloud. Companies have now started thinking about when and how to migrate to the cloud rather than whether to move at all. The adoption of cloud by a company depends on multiple factors. The presence of a proper QA strategy for data validation during cloud migration is a primary deciding factor, one which helps companies retain a sustainable competitive edge.

Triggers for cloud adoption and big data validation needs

- Legacy modernization


Digital transformation is pushing companies to move away from legacy applications, most of which lack the agility to support modern-day consumer demands. Many of these legacy applications require daily firefighting just to keep the business functioning. Once companies decide to make a cloud transition, they might go for a hybrid strategy: retain some of the existing functionality in the legacy application and migrate the rest to the cloud. They may migrate only the functionality that requires interoperability with other cloud applications, or functionality that requires a total overhaul. This makes testing legacy-to-cloud migration very tricky, and the primary focus of data validation should be on data integrity testing.


  - Data integrity testing should verify the compatibility of existing data with the new hardware, operating system, and the new interfaces implemented for cloud integration

  - Data integrity should also be verified through unauthorized-data-access checks

  - The entire data set and data files should be tested for integrity, as a subsystem within the old application functionality
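As an illustration, a minimal data integrity check of this kind can be sketched in Python. The row format and the checksum approach below are assumptions for illustration, not a prescribed tool:

```python
import hashlib

def table_checksum(rows):
    """Order-insensitive checksum over a table's rows.

    `rows` is an iterable of tuples, e.g. as fetched by a DB cursor.
    """
    digest = hashlib.sha256()
    # Sort the serialized rows so row order in either store does not matter.
    for row in sorted("|".join(map(str, r)) for r in rows):
        digest.update(row.encode("utf-8"))
    return digest.hexdigest()

def integrity_check(legacy_rows, cloud_rows):
    """Compare a legacy table against its cloud copy; return a list of issues."""
    legacy_rows, cloud_rows = list(legacy_rows), list(cloud_rows)
    issues = []
    if len(legacy_rows) != len(cloud_rows):
        issues.append(f"row count mismatch: {len(legacy_rows)} vs {len(cloud_rows)}")
    if table_checksum(legacy_rows) != table_checksum(cloud_rows):
        issues.append("content checksum mismatch")
    return issues
```

An empty result means the two stores agree on both row count and content; in practice such a check would run per table after each migration batch.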


- Data Warehouse

It requires a total mindset shift for companies to move data stored within the walls of the organization in a traditional data warehouse to a data warehouse on the cloud, due to security and data migration concerns. Today, most of the leading data management companies provide options for a data warehouse on the cloud, such as Amazon's Redshift, Microsoft's Azure SQL Data Warehouse, Teradata's Teradata Cloud, and IBM's dashDB. A data warehouse hosted on the cloud helps companies reduce setup and maintenance effort compared to an on-premise data warehouse. The ease of distributing data to geographically widespread departments within the organization and the ability to derive quick analytical insights also prompt companies to migrate their on-premise data warehouses to the cloud. Any enterprise that adopts a cloud infrastructure for hosting its data warehouse will require a testing strategy that takes care of validating the data movements to and from the cloud.

  - Data migration testing during movement to a cloud data warehouse should identify the business logic implemented in the stored procedures of the legacy data stores. This logic should be converted into business rules in test cases and used to validate the complex data transformations.

  - Data analytics and visualization testing on the cloud needs to consider the data integration nuances between on-premise and cloud data stores

  - Data ingestion testing becomes critical, as it must validate the merging of unstructured incoming data with structured data to derive valuable insights
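For instance, a business rule recovered from a legacy stored procedure can be re-expressed as a test-case check against the migrated data. The rule below (gross amount derived from net amount and tax rate) and the row layout are hypothetical examples, not taken from any specific migration:

```python
def expected_gross(net_amount, tax_rate):
    # Hypothetical business rule recovered from a legacy stored procedure:
    # gross_amount = net_amount * (1 + tax_rate), rounded to 2 decimals.
    return round(net_amount * (1 + tax_rate), 2)

def validate_migration(source_rows, target_rows):
    """Compare each target row against the rule applied to its source row.

    Rows are dicts sharing a primary key `id`; returns (id, reason) failures.
    """
    target_by_id = {r["id"]: r for r in target_rows}
    failures = []
    for src in source_rows:
        tgt = target_by_id.get(src["id"])
        if tgt is None:
            failures.append((src["id"], "missing in target"))
        elif tgt["gross_amount"] != expected_gross(src["net_amount"], src["tax_rate"]):
            failures.append((src["id"], "transformation mismatch"))
    return failures
```

The same pattern scales to any transformation: extract the rule once, then apply it row by row to flag records the migration handled incorrectly.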


- Machine Learning and Analytics


The use of Machine Learning to analyze data, find patterns, and make predictions is fueling the race to store the data that is generated every millisecond. This increases the demand for data stores that can hold this huge variety and volume of data. Enterprises move transactional data to the cloud to overcome the challenges associated with collecting, analyzing, and storing big data. The big players on the cloud, such as Google, AWS, and Microsoft, provide cloud-based Machine Learning solutions. The need for data integration between cloud and on-premise becomes acute in order to get a complete picture of data patterns and utilize machine learning solutions. Innovative test strategies have to be devised to meet the needs of machine learning and analytics.

  - Proper data quality testing has to be done to ensure the completeness and correctness of data before patterns are determined

  - Artificial Intelligence-based validation techniques have to be deployed to validate predictive models
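A minimal sketch of such a data quality gate, run before records feed a model, might look like the following; the field names and valid ranges are illustrative assumptions:

```python
def quality_report(records, required_fields, valid_ranges):
    """Flag incomplete or out-of-range records before they feed a model.

    `valid_ranges` maps a field name to an inclusive (low, high) pair.
    Returns (record index, field name) pairs for each problem found.
    """
    report = {"missing": [], "out_of_range": []}
    for i, rec in enumerate(records):
        # Completeness: every required field must be present and non-null.
        for field in required_fields:
            if rec.get(field) is None:
                report["missing"].append((i, field))
        # Correctness: values must fall inside their plausible range.
        for field, (low, high) in valid_ranges.items():
            value = rec.get(field)
            if value is not None and not (low <= value <= high):
                report["out_of_range"].append((i, field))
    return report
```

Records flagged here would be quarantined or repaired before pattern detection, so the model never trains on incomplete or implausible data.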



Cloud adoption is more than just a technology upgrade. It requires detailed planning and a phased approach to prioritize the business use cases and the associated functionality for migration to the cloud. A few important points that come in handy while handling cloud data validations are listed below:

- Applications with higher regulatory and data privacy needs will need multiple iterations of testing, and this additional testing time needs to be factored in during the planning phase.
- Automated testing utilities with appropriate connectors should be built to validate the special data file formats specific to the cloud.
- An appropriate validation strategy should cover integrated business processes spread across on-premise and cloud.

July 6, 2017

Thinking the Unthinkable: An AI approach to scenario prediction

Every now and then, QAs are confronted with the uncomfortable situation where a defect is overlooked and makes its way to the higher environments. This happens despite the QA team having a supreme understanding of the system under test. Due to the tremendous complexity of real-world applications and the sheer lack of resources (especially time), many flows and behaviors inherent in the application may not be tested, and the authenticity of such behaviors remains dubious. Also, curiously, novice users are able to find defects that expert users cannot. Expert users of an application suffer from a syndrome that can at best be described as hindsight bias, wherein they tend to be blind to possibilities in the application that they are not used to.

One way to deal with this scenario is to hire more and more QAs, so that more pairs of eyes look at the application and more hidden behaviors that could potentially be defects are found. Given the costs involved, this approach is not very practical. Another approach is to simulate users with different thinking patterns, which puts us in the realm of Artificial Intelligence. We have developed a Genetic Algorithm (GA) for black-box software testing that can do just that.

Genetic Algorithms are heuristic optimization methods that simulate the survival mechanics of genes over the course of evolution. They operate on linear yet randomized string structures that exchange information to form a search algorithm. An initial group of random individuals (the population) evolves according to a fitness function that determines the survival of individuals. The algorithm searches for individuals that lead to better values of the fitness function through the genetic operations of selection, mutation, and crossover.

Brief overview of System Under Test:

Our system under test was an incident logging system. A few prerequisites needed to be fulfilled before a user could successfully create an incident. Some of them were:

1) User rights and associations: Users with the correct rights and associations should be able to create incidents only in their associated domains and should not infringe outside them.

2) Routing mechanism: Unauthorized users should not be able to access the create-incident screen, either through user actions on menu items or directly through a URL.

3) Presence/absence of data: Mandatory fields must contain data before a user can create an incident; there should be no such checks for optional fields.

4) Dependent fields: Dependent fields should not accept data before their parent fields are populated.

5) Authenticity of data: Users should not be able to create an incident without correct data in the fields that have look-up value checks.

Synopsis of the Solution:

Our algorithm encoded the factors outlined above as bits on a string, much like DNA. Each factor was identified by the position it holds on the string structure.

The objective of the GA was to highlight any false positives, i.e., the flows in the application that led to the successful creation of an incident when it should not have been created. For this purpose, we gave higher weights to the factors that were less likely to create an incident. For example, a user (A) with CREATE rights is more likely to create an incident successfully than a user (B) that has just READ rights. So, in this case, user B gets a significantly higher weight than user A, as he/she is less likely to create an incident. The same kind of weight distribution was done for the other factors as well.

Finally, an equation was created to calculate the fitness value of each individual string structure. The equation was designed to give higher fitness values to individuals that resulted in the successful creation of an incident even when the odds of that happening were low. The fitness values of the first generation of 10 strings were calculated, and then mutation and crossover events were performed to create an offspring generation of 10 strings. If the parent had a higher fitness value, it was retained; otherwise the child was selected. Once this selection process was over, the same steps were performed again to generate the next generation. This process was repeated until the n-th generation, after which a human analyst can look at the resulting strings and verify whether these flows really are defects.
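The post does not include the implementation, but the evolutionary loop described above can be sketched roughly as follows. The factor names, weights, and the stand-in "incident created" oracle are illustrative assumptions, not the actual system under test:

```python
import random

# Illustrative factors from the post; each bit says whether the "unlikely"
# option is chosen (e.g. bit 0 = 1 means the user has only READ rights).
FACTORS = ["user_rights", "routing", "mandatory_data", "dependent_fields", "lookup_values"]
# Higher weight = less likely to yield a successful incident creation.
WEIGHTS = [5, 4, 3, 2, 2]  # assumed values for illustration

def incident_created(individual):
    """Stand-in oracle for the real system under test.

    Here we pretend the system has a defect: it ignores the routing
    check (bit 1), so some flows that should fail still succeed.
    """
    return all(bit == 0 for i, bit in enumerate(individual) if i != 1)

def fitness(individual):
    # Reward flows that succeed despite choosing "unlikely" options:
    # these are potential false positives worth human review.
    if not incident_created(individual):
        return 0
    return sum(w for w, bit in zip(WEIGHTS, individual) if bit == 1)

def mutate(individual, rate=0.1):
    # Flip each bit with a small probability.
    return [bit ^ 1 if random.random() < rate else bit for bit in individual]

def crossover(a, b):
    # Single-point crossover between two parent strings.
    point = random.randrange(1, len(a))
    return a[:point] + b[point:]

def evolve(generations=20, pop_size=10, seed=42):
    random.seed(seed)
    population = [[random.randint(0, 1) for _ in FACTORS] for _ in range(pop_size)]
    for _ in range(generations):
        next_gen = []
        for parent in population:
            child = mutate(crossover(parent, random.choice(population)))
            # Keep the parent if it is fitter, otherwise take the child.
            next_gen.append(parent if fitness(parent) >= fitness(child) else child)
        population = next_gen
    return population

# Strings with positive fitness are flows that succeeded when they
# should not have -- candidates for a human analyst to review.
suspects = {tuple(ind) for ind in evolve() if fitness(ind) > 0}
```

In this toy setup the only positive-fitness string is the one exercising the planted routing defect, which is exactly the kind of "unthinkable" flow the approach is meant to surface for human review.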

Advantages of GA based software testing over traditional testing:

  1. Ability to explore various scenarios, hence improved coverage of the system; many of these scenarios may not even cross the mind of human QAs.
  2. Increased Defect Detection Efficiency, leading to improved quality of the system.
  3. Easy to change the direction of the search to find new defects by changing the weights associated with the factors.
  4. A substitute for human testers, with the advantage of working during non-business hours, including weekends.
  5. High-risk areas in the application can be covered better by increasing the population size and the number of generations.

Associated Challenges:

  1. Technical expertise required to build the framework.
  2. Clear understanding of factors affecting the application is required. It is still not Subject Matter Expert (SME) agnostic. The bias of the SMEs may affect the outcome of the runs.
  3. Hardware intensive. Procuring the advanced computation resources may not be easy in the client environment.

This is a humble attempt to increase software quality and give end users less grief from missed defects, without escalating project costs. In the last few years there has been significant involvement of Artificial Intelligence-related technologies in most aspects of software, yet software testing seems to have missed out on the riches they provide. I sincerely hope that software testing has a lot to gain from advances in AI, making software solutions and products more reliable and efficient over time.