

October 25, 2017

Testing Traditional Databases vs NoSQL Databases

Author: Surya Prakash G., Delivery Manager, Infosys Validation Solutions

When was the last time you heard the words "Big Data", "traditional database", and "NoSQL database"? These terms come up constantly now and are widely used in every IT organization and across industry verticals. They reflect the technology changes being driven by the huge volumes of data generated through various means (IoT, social media, sensors, etc.).

Are traditional databases being replaced? Increasingly, yes. The amount of data has grown with advances in technology (social media, web logs), and so has the need to analyze unstructured data for better decision making. Storing this data is essential, but traditional relational databases struggle when data grows multi-fold within milliseconds, and scaling them adds to maintenance costs. This has paved the way for a newer class of databases: NoSQL databases.

Why are NoSQL databases entering the technology stack? With unstructured data coming to the fore, NoSQL databases offer more flexibility: they scale out across hundreds of nodes and store data in flexible, document- or folder-like structures rather than rigid tables. The trade-off is that storing data in bulk this way can require extra processing effort and more storage than highly organized SQL data. This has led to a new set of NoSQL databases such as MongoDB, CouchDB, etc.
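To make the storage difference concrete, here is a minimal sketch (assuming a local MongoDB instance, the pymongo driver, and a hypothetical `customers` collection) of how a record that would be a fixed-column row in a relational table can be stored as a schema-flexible document:

```python
# Sketch only: assumes MongoDB is running locally and pymongo is installed.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["testdb"]  # hypothetical database name

# In a relational table this would need a fixed schema, e.g.
#   INSERT INTO customers (id, name, city) VALUES (101, 'Asha', 'Pune');
# In a document store, each record is a self-describing document and
# fields can vary from one document to the next.
db.customers.insert_one({
    "_id": 101,
    "name": "Asha",
    "city": "Pune",
    "orders": [{"order_id": "A-1", "amount": 250.0}],  # nested data, no join table needed
})

print(db.customers.find_one({"_id": 101}))
```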

For validation, some activities are common to both SQL and NoSQL and have to be covered in either case: metadata validation, data conversions, data comparisons, data quality checks, and data consistency checks.
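As an illustration of the data comparison part, below is a minimal sketch (the `customers` table, the collection name, and the connection details are assumptions for the example) that compares record counts and a sample of key fields between a relational source and a MongoDB target:

```python
# Sketch only: compares a relational source with a MongoDB target.
# Table/collection names and connection details are hypothetical.
import sqlite3
from pymongo import MongoClient

src = sqlite3.connect("source.db")
tgt = MongoClient("mongodb://localhost:27017")["testdb"]["customers"]

# 1. Record-count comparison (basic completeness check).
src_count = src.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
tgt_count = tgt.count_documents({})
assert src_count == tgt_count, f"Count mismatch: {src_count} vs {tgt_count}"

# 2. Field-level comparison on a sample of keys (basic correctness check).
for cust_id, name in src.execute("SELECT id, name FROM customers LIMIT 100"):
    doc = tgt.find_one({"_id": cust_id})
    assert doc is not None, f"Missing document for id {cust_id}"
    assert doc["name"] == name, f"Name mismatch for id {cust_id}"

print("Sample source-to-target checks passed")
```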

As part of the validation strategy for applications built on NoSQL, traditional testing approaches rely on sample data record sets, which is fine for unit testing activities. The challenge comes in determining how to validate an entire data set consisting of millions, or even billions, of records. Per the CAP theorem, a distributed NoSQL store can guarantee only two of three attributes at a time, so testers need to focus on validating the two the system is designed to provide: Consistency (checking whether all users get the same data when they query), Availability (checking whether all users can read/write the data), and Partition Tolerance (checking whether the system keeps working across the network when accessing data).
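For full-data-set validation at that scale, record-by-record comparison is rarely practical; one common approach is to compare aggregates (counts and row-level checksums) computed on each side. A minimal PySpark sketch of that idea, with the file paths and column layout assumed for illustration, might look like this:

```python
# Sketch only: aggregate-level comparison of a full data set instead of
# record-by-record checks. Paths and schemas are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("full-dataset-validation").getOrCreate()

source = spark.read.parquet("hdfs:///staging/customers/")    # data as extracted
target = spark.read.json("hdfs:///nosql_export/customers/")  # data exported from the NoSQL store

def summarize(df):
    """Reduce a (potentially billion-row) data set to a count and a checksum."""
    row_hash = F.crc32(F.concat_ws("|", *[F.col(c).cast("string") for c in sorted(df.columns)]))
    return df.agg(
        F.count(F.lit(1)).alias("row_count"),
        F.sum(row_hash).alias("checksum"),
    ).first()

src_summary, tgt_summary = summarize(source), summarize(target)
assert src_summary == tgt_summary, f"Mismatch: {src_summary} vs {tgt_summary}"
print("Full data set matches on count and checksum")
```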

For NoSQL testing, a few activities (not limited to the following) can be considered as part of the strategy; a sketch of the first activity follows the list:

  • Pre-processing validation: Checking the format and consistency of the source data
  • Data Ingestion & Data Extraction testing: Data conversion and comparison testing
  • Data Quality Analysis: Understanding the state of the system and consistency checks
  • Data Visualization testing with the NoSQL database: Output reports compared with database values
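As an example of pre-processing validation, the sketch below (the file name, required fields, and expected types are assumptions for illustration) checks that each record in a JSON-lines source file parses cleanly and carries the expected fields before ingestion:

```python
# Sketch only: pre-processing validation of a JSON-lines source file.
# File name, required fields, and expected types are hypothetical.
import json

REQUIRED_FIELDS = {"id": int, "name": str, "city": str}

def validate_source(path):
    errors = []
    with open(path, encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                errors.append(f"line {line_no}: not valid JSON ({exc})")
                continue
            for field, expected_type in REQUIRED_FIELDS.items():
                if field not in record:
                    errors.append(f"line {line_no}: missing field '{field}'")
                elif not isinstance(record[field], expected_type):
                    errors.append(f"line {line_no}: '{field}' has unexpected type")
    return errors

issues = validate_source("customers.jsonl")
print("Source data looks consistent" if not issues else "\n".join(issues))
```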

To conclude, although there are similarities between NoSQL testing and relational database testing, the execution mindset is different, and below are four points to remember (they spell FAST):

F - Understand the different File formats that will be used (JSON, Parquet, Avro, etc.), as illustrated in the sketch after this list
A - Know how the data can be Accessed (as it will be distributed across nodes/clusters)
S - Have a clear test Strategy for data conversions and comparisons using the right tools
T - Build NoSQL Technology knowledge of how the data will be stored (columnar, document, graph-based, key-value pair, etc.)
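To illustrate the file-format point, here is a minimal sketch (file names are placeholders; reading Parquet with pandas assumes the pyarrow engine is installed, and Avro is read with the fastavro library) showing that the same logical data set may arrive in different serialization formats and still needs to be loaded into a common shape for comparison:

```python
# Sketch only: loading the same logical data from different file formats
# so they can be compared in a common shape. File names are placeholders.
import pandas as pd
from fastavro import reader  # third-party Avro reader

# JSON Lines: one self-describing record per line.
df_json = pd.read_json("customers.jsonl", lines=True)

# Parquet: columnar, schema embedded in the file (needs pyarrow or fastparquet).
df_parquet = pd.read_parquet("customers.parquet")

# Avro: row-oriented with an embedded schema.
with open("customers.avro", "rb") as fh:
    df_avro = pd.DataFrame(list(reader(fh)))

# Once loaded into a common shape, the usual comparison checks apply.
for name, df in [("json", df_json), ("parquet", df_parquet), ("avro", df_avro)]:
    print(name, df.shape, sorted(df.columns))
```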

I would love to discuss this topic, and big data testing in general, further. Looking forward to your comments.