Testing Services provides a platform for QA professionals to discuss and gain insights into the business value delivered by testing, the best practices and processes that drive it, and the emergence of new technologies that will shape the future of this profession.


Trends impacting the test data management strategy

Author: Vipin Sagi, Principal Consultant

Test data management (TDM) as a practice is not new. It has been around for years. But only in the last decade has it evolved at a rapid pace, with mature IT organizations ensuring that it is integrated into the application development and testing lifecycles. The key driver has been technology disruption, which compels IT organizations to deliver software faster, with high quality, and at low cost.

TDM as a practice is being impacted by three major trends - the digital revolution, the increasing adoption of data analytics, and pressure from the business to deliver software faster. Test data architects and consultants should take the measure of these changes while creating and implementing enterprise test data strategies. Key trends and solution patterns to consider while creating an effective test data strategy are agile, DevOps, big data, cloud, service virtualization with TDM, domain-specific solutions and accelerators, and automation to reduce time-to-market and improve efficiency. These trends are clearly visible in product vendors' roadmaps, and IT organizations expect them to be addressed during their TDM journey. Many product vendors are investing heavily or acquiring niche players to offer capabilities and solutions around these trends, while service providers are drawing on their experience to build domain solutions and accelerators that deliver services faster and cheaper.


Agile

In agile delivery methods, software must be delivered in short cycles (sprints) and with high quality. A typical sprint is planned for about 3-4 weeks, of which testing gets approximately two weeks. In each sprint, the test teams create, run, and automate tests, for which the test data management team may need to refresh data and provide the test data needed to execute both manual and automated tests. Test data architects have to ensure optimal data sets with maximum test coverage, along with an accelerated solution to refresh and provision test data in the environment so that testers can complete testing on time. Situations frequently arise where the test data strategy must support multiple data refreshes and provisioning requests in parallel.
Computer Associates (CA) offers a wide range of solutions that can be integrated with CA Test Data Manager to support agile capabilities.
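
One way to reason about "optimal data sets with maximum test coverage" is as a set-cover problem. The sketch below is only an illustration, not a method described by any vendor: it assumes each candidate data set can be tagged with the test scenarios it supports, and the names `select_data_sets`, `coverage`, and `required` are hypothetical. It uses a simple greedy heuristic, which gives a small selection but not necessarily the smallest possible one.

```python
def select_data_sets(coverage, required):
    """Greedy set cover: pick data sets until every required scenario is covered.

    coverage: dict mapping a data set name to the set of scenarios it supports.
    required: iterable of scenarios the sprint's tests need.
    Returns (chosen data set names in pick order, scenarios left uncovered).
    """
    remaining = set(required)
    chosen = []
    while remaining:
        # Pick the data set that covers the most still-uncovered scenarios.
        # Sorting the names makes tie-breaking deterministic.
        best = max(sorted(coverage), key=lambda name: len(coverage[name] & remaining))
        gained = coverage[best] & remaining
        if not gained:
            break  # no data set covers what is left
        chosen.append(best)
        remaining -= gained
    return chosen, remaining


# Hypothetical catalog of data sets and the scenarios each one supports.
example_coverage = {
    "retail_small": {"login", "checkout"},
    "retail_full": {"login", "checkout", "refund"},
    "b2b": {"invoice", "login"},
}
example_required = {"login", "checkout", "refund", "invoice"}
```

Calling `select_data_sets(example_coverage, example_required)` picks `retail_full` first and then `b2b`, so only two of the three data sets need to be refreshed for the sprint.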


DevOps

The adoption of DevOps, which builds on agile practice, mandates tighter collaboration among the teams involved in the software delivery lifecycle, including development, testing, operations, and release management. Program or product managers have to deliver software continuously, faster, and with high quality through continuous integration.

Test data architects need to focus on the following factors to enable TDM with DevOps:

  • Database virtualization in TDM for faster provisioning of virtual, non-production copies of databases
  • Self-service capability that lets testers refresh and provision test data from test data warehouses as needed, thereby increasing their productivity
  • Solutions that integrate test automation tools with test data management, so that test cases are associated with the right test data and regression tests run seamlessly after every build in a continuous integration model
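
The self-service and test-case association ideas above can be sketched in a few lines. This is a minimal in-memory illustration, assuming a catalog of reusable records that testers or automated tests reserve by attribute; the `DataCatalog` class and its methods are hypothetical and do not represent any TDM product's API.

```python
from dataclasses import dataclass, field


@dataclass
class DataCatalog:
    """Hypothetical in-memory catalog linking test cases to test data records."""
    records: dict = field(default_factory=dict)   # record_id -> attributes
    reserved: dict = field(default_factory=dict)  # record_id -> test_case_id

    def add(self, record_id, **attributes):
        """Register a record and the attributes tests can search on."""
        self.records[record_id] = attributes

    def reserve(self, test_case_id, **criteria):
        """Reserve the first free record matching all criteria, or return None."""
        for record_id, attrs in self.records.items():
            if record_id in self.reserved:
                continue  # already in use by another test case
            if all(attrs.get(key) == value for key, value in criteria.items()):
                self.reserved[record_id] = test_case_id
                return record_id
        return None

    def release(self, test_case_id):
        """Free every record held by a test case, e.g. after a build finishes."""
        freed = [rid for rid, owner in self.reserved.items() if owner == test_case_id]
        for rid in freed:
            del self.reserved[rid]
        return freed
```

In a continuous integration run, each automated test would call `reserve` in its setup and `release` in its teardown, so parallel tests do not consume the same record.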

Test data management solutions from Delphix and Actifio offer creation of virtual databases with masking and self-service capability to enable continuous integration.

Big data

Big data is a visible, technology-driven movement, and its strategic relevance is increasing day by day. As the technology supporting big data evolves rapidly and adoption increases, test data architects and consultants will need data privacy and security solutions, which will be a challenge because these platforms deal with high-volume, heterogeneous data sets.

In fact, IT organizations have broadened their data privacy and regulatory compliance initiatives to include non-production environments (test and development), due to the growth in breach cases and the complexity that big data applications introduce with their data variety and volume. At present, testing a big data application is itself a challenge due to the lack of proven test frameworks and tool sets. Hence, architects must address both test data management and regulatory compliance in their test data strategy. Most organizations are creating small Hadoop clusters for development and testing, and the test data is managed using data integration platforms and custom solutions built using Sqoop, HiveQL, and Pig Latin scripts.

IBM and Informatica offer test data management solutions for Hive.


Cloud

The typical test data management objectives remain the same for applications hosted on cloud and on-premise. However, cloud may introduce additional challenges that test data architects must consider while creating or enhancing their strategy. These include:

  • Lack of TDM tool support for SaaS platforms and solutions
  • Network latency that needs to be accounted for during data provisioning for cloud applications

Test data architects need to take care of two aspects: one, in-house applications hosted on cloud, and two, third-party SaaS services. SaaS providers restrict direct access to the database layer, so TDM tools may not be of much use here. The TDM team has to rely on the SaaS vendor to extract and import data, and this dependency needs to be accounted for while creating the test data strategy. In the case of an in-house application hosted on cloud, the TDM team needs to provision large volumes of data to perform load and stress testing. This introduces network traffic and latency challenges, which test data architects need to address through custom solutions that run on cloud.

Test data management product vendors such as Actifio, Delphix, Informatica, Solix, IBM, and CA offer product versions that run on cloud as SaaS solutions. Cloud platform providers also offer on-demand solutions for test data refresh and provisioning, which can be considered as part of the test data strategy.

Test data management with service virtualization

Testing today's complex applications needs realistic, reliable, and complete test environments that can be provisioned quickly on demand. In the world of agile and DevOps, this need is greater and also harder to meet. Testers have to perform integration tests on the application with different code versions and test data combinations, and with other applications (in-house or vendor) that may not be available at the time. Hence, test data architects may need solutions that virtualize application behavior with predefined test data combinations (input and output) and orchestrate it with the application under test. Testers will have to create automated tests and run them as part of their sprint execution or continuous integration. Many organizations are leveraging service virtualization with proper test data configuration as an approach to minimize costs in the non-production environment. CA leads this category by offering an integrated platform of CA Service Virtualization with CA Test Data Manager.
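
The idea of virtualizing application behavior with predefined input and output combinations can be illustrated with a very small stub. The sketch below is an assumption-laden simplification: the `VirtualService` class, the `credit_check` operation, and the payload shapes are all hypothetical, and a real service virtualization product would also handle protocols, state, and response timing.

```python
import json


class VirtualService:
    """Hypothetical stand-in for an unavailable dependency.

    Each (operation, request payload) pair is mapped to a canned response,
    which is the predefined test data combination for that interaction.
    """

    def __init__(self):
        self._responses = {}

    @staticmethod
    def _key(operation, payload):
        # Serialize with sorted keys so equal payloads always produce the same key.
        return operation, json.dumps(payload, sort_keys=True)

    def stub(self, operation, payload, response):
        """Register the response to return for a given request."""
        self._responses[self._key(operation, payload)] = response

    def call(self, operation, payload):
        """Return the canned response, or fail loudly if none was configured."""
        try:
            return self._responses[self._key(operation, payload)]
        except KeyError:
            raise LookupError(f"no virtual response for {operation}: {payload}") from None


# Example configuration for a credit-check dependency that is not yet available.
credit_service = VirtualService()
credit_service.stub("credit_check", {"customer_id": 42}, {"status": "approved"})
credit_service.stub("credit_check", {"customer_id": 43}, {"status": "declined"})
```

Failing on an unconfigured request, rather than returning a default, makes gaps in the test data configuration visible as soon as an integration test hits them.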

Domain solutions and accelerators

Test data architects generally focus on test data services such as data masking, data subsetting, data creation, and data provisioning, along with tools and process aids that support the creation of the enterprise test data strategy. However, domain knowledge is key to determining efficient approaches for identifying data subset criteria, masking methods, and data creation techniques that ensure data integrity. Test data consultants should focus on building domain-specific solutions such as:

  • Pre-defined patterns to identify sensitive fields based on the data privacy and regulatory compliance requirements of a specific industry (for example, HIPAA regulations in healthcare)
  • Pre-delivered data packages for master data domains in a specific industry (patient, prescriber, pharmacy, etc. in healthcare) for masking and synthetic data generation, which can be integrated with TDM tools for faster and more efficient test data delivery
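
As a rough illustration of pre-defined patterns for sensitive fields, the sketch below classifies columns by name and masks values deterministically. Everything in it is an assumption made for the example: the pattern list is illustrative rather than a compliance-grade catalog, and the function names are hypothetical. Masking uses a keyed hash (HMAC-SHA-256) so the same input always maps to the same masked value, which keeps joins across tables consistent.

```python
import hashlib
import hmac
import re

# Illustrative column-name patterns only; a real catalog would be far larger
# and would also inspect the data values, not just the names.
SENSITIVE_NAME_PATTERNS = {
    "ssn": re.compile(r"ssn|social_security", re.IGNORECASE),
    "date_of_birth": re.compile(r"dob|birth", re.IGNORECASE),
    "email": re.compile(r"email", re.IGNORECASE),
    "phone": re.compile(r"phone|mobile", re.IGNORECASE),
}


def classify_columns(columns):
    """Return {column name: sensitivity label} for columns matching a pattern."""
    found = {}
    for column in columns:
        for label, pattern in SENSITIVE_NAME_PATTERNS.items():
            if pattern.search(column):
                found[column] = label
                break  # first matching label wins
    return found


def mask_value(value, secret):
    """Deterministically pseudonymize a value with a keyed hash.

    secret must be bytes; the same value and secret always give the same token.
    """
    digest = hmac.new(secret, str(value).encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:12]
```

For a table with columns `patient_id`, `patient_email`, `DOB`, `home_phone`, and `ssn`, `classify_columns` flags the last four and leaves `patient_id` alone. The output of `mask_value` is an opaque token, so format-preserving masking (for example, a value that still looks like a phone number) would need a different technique.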

In fact, many service providers offer joint TDM solutions along with tool vendors. Test data architects should focus on building out-of-the-box TDM solutions for packaged products such as Oracle ERP and SAP, and for products and platforms such as Gin insurance, and Nasco and RxClaim in healthcare.

A few product vendors such as Oracle offer tools and packages which support and accelerate enterprise test data management strategy for Oracle products. Similarly, service providers - Infosys, Accenture, Cognizant, TCS, Wipro, and more - offer data masking solutions and accelerators as part of their implementation services.

In a nutshell, TDM solution architects and consultants must be aware of the impact of these trends - agile, DevOps, big data, and cloud - on their existing test data strategy. Armed with this knowledge, they must fine-tune that strategy with effective solutions such as service virtualization and domain-centric solutions and accelerators for products and platforms, building a forward-looking vision into it. The underlying objective is to deliver value to clients by increasing automation through efficient solution patterns and innovative methods across the TDM lifecycle.


Vipin, Nice blog and very informative blog on TDM strategy. The way it’s categorized by different technologies and solutions is quite impressive. Thanks for sharing and looking for more from you on your Enterprise wide TDM implementation experience.

Good One


Good One Vipin. Very well articulated
