Testing Services provides a platform for QA professionals to discuss and gain insights into the business value delivered by testing, the best practices and processes that drive it, and the emergence of new technologies that will shape the future of this profession.


September 18, 2015

Assuring quality in Self Service BI

Author: Joye R, Group Project Manager

Why self-service BI?

Organizations need access to accurate, integrated and real-time data to make faster and smarter decisions. But in many organizations, decisions are still not based on BI, simply because IT systems cannot keep up with business demands for information and analytics.

Self-service BI provides an environment where business users can create and access a set of customized BI reports and analytics without any IT team involvement.


In self-service BI, IT team involvement is limited to creating the semantic layer (useful report data); business users can then dynamically slice and dice data and look at multiple views of the same data to make better-informed decisions.

The key objectives of self-service BI are:

  • Fast-to-deploy and easy-to-manage data warehouse.
  • Simpler and customizable end-user interfaces.
  • Reduced IT costs.
  • Easy access to the source data needed for analysis and reporting.
  • An easy-to-use BI tool that supports data analysis.

Given these advantages, self-service BI is becoming popular in many organizations, disrupting traditional BI implementations that need major IT involvement. Quality assurance plays a key role in the successful implementation of self-service BI.

Self-service BI - QA challenges:

Quality assurance needs a customized approach for self-service BI as compared to traditional BI validation. The reasons are:

| Traditional BI testing | Self-service BI testing |
|---|---|
| Fixed number of reports, fields and metrics, with a limited set of attribute and metric combinations that are easy to identify for validation | Dynamic report creation and access by business users, with a large number of possible combinations that are difficult to identify for validation |
| Report layout is mostly fixed, so layout validation is easy | Report layout is dynamic due to the variety of reports created |
| Reports are created for end users, and anyone without data knowledge can use them | Reports must be created by the users themselves to extract useful information, so usability of attributes and metrics becomes a key consideration |


Due to the above factors, self-service BI creates additional QA challenges, as given below:

  • Ensure that the attribute and metric data provided is usable and easy to understand for business users, and that users can generate correct reports as per their custom reporting needs.
  • Ensure that the metric and attribute data provided to business users is complete and will meet the needs of the target business users.
  • Ensure accuracy of attribute and metric data. There are too many combinations of metrics and attributes to verify exhaustively, so candidate combinations have to be enumerated and sampled (see the sketch after this list).
  • Assess reporting performance, which is difficult because the possible report combinations are so numerous. Testing needs to replicate business user behaviour in the test systems.
  • Ensure that the system is user-friendly and intuitive enough for business users to create and analyze the correct reports and data.
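Since exhaustive validation of every attribute and metric combination is rarely feasible, a practical tactic is to enumerate the candidate report definitions and validate a repeatable sample. Below is a minimal sketch of that idea in Python; the attribute and metric names are hypothetical stand-ins for whatever your semantic layer exposes.

```python
import itertools
import random

# Hypothetical attributes and metrics; substitute those exposed by
# your own semantic layer.
attributes = ["region", "product", "channel", "customer_segment"]
metrics = ["revenue", "units_sold", "discount_pct"]

def candidate_reports(attrs, mets, max_attrs=2):
    """Enumerate report definitions: every metric crossed with every
    combination of up to max_attrs grouping attributes."""
    for r in range(1, max_attrs + 1):
        for attr_combo in itertools.combinations(attrs, r):
            for metric in mets:
                yield {"group_by": attr_combo, "metric": metric}

all_reports = list(candidate_reports(attributes, metrics))
print(f"{len(all_reports)} candidate report definitions")

# Exhaustive validation rarely scales, so validate a fixed-size,
# seeded sample for repeatable test runs.
random.seed(42)
for report in random.sample(all_reports, k=min(10, len(all_reports))):
    print(report)
```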

Quality assurance solution:

The quality assurance solution has to address the specifics of self-service BI: ad-hoc data modelling, the variety and volume of reports created by different users, and real-time reporting needs.

QA has to focus on the multiple dimensions given below.

| Validation | Detail | Benefit |
|---|---|---|
| Structure validation | Validate the correct folder location for different metrics and attributes; validate unrelated metric and attribute combinations to ensure they are reported correctly | Better usability; better quality |
| Data validation | Validate correct attribute values; validate metric calculations; validate dashboard data | Better report data accuracy |
| Functional validation | Validate commonly used attribute and metric combinations to ensure data accuracy; validate metrics across all levels of aggregation to ensure correct aggregation; export, print and email scheduled reports; validate drill-downs | Better report data accuracy; better quality system |
| Data security | Ensure the right access for the right roles; validate that user roles without access to certain data cannot access it | Better data security |
| Performance | Ensure reports generate within the expected time, considering production usage of the self-service platform | Report performance as expected |
| Test automation | Automated data validation for attributes and metrics; automated business rule validation | Improved quality; reduced regression effort and timeline |
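To illustrate the test automation row above, here is a minimal sketch of automated metric validation: the metric is recomputed independently from the source data and compared against the value the report shows. The table, column and metric names are hypothetical, and the "reported" values are stubbed in place of a real BI tool export.

```python
import sqlite3

# Hypothetical source data standing in for the warehouse tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES ('EU', 100.0), ('EU', 250.0), ('US', 75.0);
""")

def expected_metric(region):
    # Independent recomputation of the 'total revenue' metric.
    return conn.execute(
        "SELECT SUM(amount) FROM sales WHERE region = ?", (region,)
    ).fetchone()[0]

def reported_metric(region):
    # Stand-in for the value fetched from the self-service report;
    # in practice this would come from the BI tool's API or an export.
    return {"EU": 350.0, "US": 75.0}[region]

for region in ("EU", "US"):
    exp, got = expected_metric(region), reported_metric(region)
    assert abs(exp - got) < 1e-6, f"{region}: expected {exp}, report shows {got}"
print("Metric validation passed")
```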


Summary:

The QA challenges and solutions for self-service BI differ from those of traditional BI, and successful implementation needs a self-service-specific test strategy, planning, preparation and execution. The usability, structure, data, security and functional validation approaches therefore have to be designed specifically for self-service BI.


September 7, 2015

Role of Open Source Testing Tools

Author: Vasudeva Muralidhar Naidu, Senior Delivery Manager

Quality organizations are maturing from Quality Control (testing the code) to Quality Assurance (building quality into the product) to Quality Management. In addition to bringing quality upfront and building it into the product, Quality Management also covers the introduction of DevOps principles in testing and the optimization of testing infrastructure (test environments and tools).

Even though Quality organizations have been the primary consumers of test environments, test data and tools, they have had limited control over their optimization and rationalization. The use of open source testing tools was restricted not only by the lack of skilled resources, but also by enterprise architects' reluctance to approve the tools for installation on the network. Now, the need for higher-quality products in an increasingly complex software topology, together with the rising cost of quality, is driving test managers to evaluate and use open source tools for their projects.

Organizations continue to struggle to find practical ways to improve test effectiveness at reduced cost and with improved quality. Before considering open source tools as one of the potential means, I wanted to first understand whether "testing tools" is a qualified parameter for reducing the cost of quality.

How much do we really spend on tools compared to the returns? How many tools do we really need to operate efficiently? Once you start listing them, the list looks big. You would need:

  • Test management tool (test case repository and defect management) - an enterprise-level, globally used tool with more than 1,000 users
  • GUI test automation tool - at least two tools in many organizations
  • Requirements management tool (not many organizations use one)
  • Performance testing tools
  • Data testing tools
  • Data provisioning tools
  • Mobile testing tools
  • Middleware testing tools (SOA)
  • Custom tools

Many organizations have all of the above tools. But have we utilized them well? Can they talk to each other seamlessly? What percentage of your tool licenses is utilized in a given year? What percentage of tool features do we really use compared to what the vendors provide?

Now we might qualify "testing tools" as having good potential to reduce the cost of quality. The next obvious question is "buy vs. build". The old argument ran: my focus should be on my application's quality rather than on building a framework from open source software; my investment in vendor-licensed tools is justified because it keeps my primary focus intact, and I need not worry about finding highly skilled tool specialists. This thought process has become old school now, and the reasons are obvious:

  • It is more than 10 years since the open source community became popular; the community has grown and made significant contributions to maturing these tools
  • The tools are widely tested and validated by multiple user communities
  • Cost pressures have forced many organizations to adopt open source frameworks, contribute to improving them and build skilled resources
  • Quality organizations have focused heavily on the capability uplift of their resources, and many programmers have moved into the testing profession
  • Many enthusiastic sponsors within organizations promote open source testing tools due to their engineering backgrounds

The popularity of some open source frameworks has grown significantly over the years, and the advent of Agile and its increased adoption have pushed open source further. Open source test automation tools also force automation engineers to develop scripting skills and understand the underlying technology better, which improves the success factors of your automation projects. Also read my blog 'Need for effective use of technology in testing projects' (http://www.infosysblogs.com/testing-services/2010/08/need_for_effective_use_of_tech.html).

I have listed in the table below a few frameworks/tools that have already been used extensively and accepted as enterprise-standard tools. Several such frameworks are readily available:

| Tool | Features and strengths |
|---|---|
| Selenium | A powerful web test automation tool, available since 2004. Many new libraries and utilities have been added over the years, including a database utility. The framework is highly extensible and suits today's extreme automation needs |
| Jenkins | A Java-based continuous integration tool that works across platforms. It is mainly used for monitoring jobs, offers many tester-friendly features and helps achieve seamless integration for end-to-end automation |
| Cucumber | A popular and powerful automation framework for BDD. Over the years it has extended its support to many languages and platforms, including Ruby, Java and .NET |
| Watir | "Web Application Testing in Ruby" is a popular automation framework built as a collection of Ruby libraries. Using Ruby as the scripting language makes it a powerful browser automation tool. It is widely used in Agile projects as it supports behavior-driven development |
| SoapUI | A free web services testing tool with a rich graphical interface, supporting functional testing of services, service simulation, security testing, load testing and more. The tool has a huge community and good support. If your need is to test only web services, this is the tool |
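As a taste of the scripting these frameworks involve, here is a minimal sketch using Selenium's Python bindings (one of several language bindings Selenium offers); the URL and element locators are placeholders for your own application under test.

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

# Placeholder URL and locators; replace with your application's.
driver = webdriver.Chrome()
try:
    driver.get("https://example.com/login")
    driver.find_element(By.NAME, "username").send_keys("qa_user")
    driver.find_element(By.NAME, "password").send_keys("secret")
    driver.find_element(By.ID, "submit").click()
    # A simple functional check on the post-login page title.
    assert "Dashboard" in driver.title, "login did not reach the dashboard"
finally:
    driver.quit()
```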

This is just a sample of the many automation frameworks available, each built for a specific use. As you may have noticed, the common characteristic of these open source frameworks is a limited IDE and a good amount of scripting. This lets you customize the framework to your needs and integrate it with multiple technologies seamlessly - the need of the hour whether you are automating a web page or a database. The two basic limitations for commercial usage are the availability of skilled resources and of dedicated technical support for enterprise use. As the user base increases, both challenges will die a natural death.

Conclusion:

The popularity and capabilities of open source frameworks are increasing with every passing day. Software services organizations are creating dedicated user groups to support these frameworks and increase their commercial usage. With extreme automation seen as the only way to keep pace with technology adoption, open source frameworks will play a major role in achieving continuous integration and hence end-to-end automation. They will also reduce the cost of quality in a noticeable way. If you have not already started looking at open source test automation tools for your organization, now is the time - you are already a late starter in open source adoption.

September 1, 2015

Extreme Automation in ETL testing

Author: Sudaresa Subramanian Gomathi Vallabhan, Group Project Manager

End-to-end data testing can be time-consuming, given the various stages, technologies and huge volumes of data involved. Each stage of ETL testing requires a different strategy and type of testing - one-to-one comparison, validation of migrated data, validation of transformation rules, reconciliation, data quality checks and front-end testing of BI reports.

With the advent of Big Data across organizations, there is an increased need to automate ETL testing as well as the testing of reports and business intelligence tools. Testing teams use various accelerators, Excel macros and open source automation to accelerate testing at various stages. While these get the job done during individual phases of testing, they do not deliver end-to-end automation, for the following reasons:
  • Integrating all the tool sets and executing the tests end-to-end is a challenge due to technical and infrastructure limitations.
  • The effort spent developing and maintaining automation utilities is high, given the vast technology landscape of ETL.
  • Huge data volumes cause delays - smaller utilities work with limited data sets but struggle at scale.
  • Integrating with a test management tool to provide end-to-end traceability is difficult.


Figure 1: Various stages in ETL and testing involved


| Stage | Challenges in automation | Parameters for achieving extreme automation |
|---|---|---|
| Data source | Heterogeneous data sources - a typical ETL warehouse has at least 5 different combinations of source systems feeding data. Utilities such as Excel-based macros have limitations - they occupy memory, need maintenance and cannot compare beyond about a million rows | Automated comparison of huge volumes of data; ability to compare data across heterogeneous systems (file to file, database to file, database to database); ability to create a temporary table for intermediate execution results and delete it once execution completes |
| ETL (Extract, Transform, Load) | Determining the validation strategy - exhaustive vs. sampling validation changes the automation framework, and one framework does not fit all needs. Automating transformation rules is complex - one form of automation may fail if the database changes (Teradata, SQL, Netezza, Oracle, etc.) | Ability to validate movement of bulk data; automated validation of transformation rules; a tool-agnostic transformation rule builder for deskilling users; automated determination of the validation strategy (exhaustive, sampling, etc.) suggested to the user based on execution type |
| Data warehouse and data mart | Understanding data quality rules and automating them is complex - there can be 300+ data quality rule sets in compliance data warehouse programs. Automated verification of referential integrity has multiple layers and is complex to implement. Validating end-to-end business rules requires a tool that maintains traceability from requirements to downstream systems | Automated metadata and referential integrity verification; a data quality analysis engine to validate the correctness, completeness and appropriateness of the data; ability of the automation tool to work from the database schema and metadata to maintain referential integrity checks; ability to maintain end-to-end traceability, from requirements to reporting |
| BI reports verification | Reconciliation of report data with the backend is difficult to automate due to the complex data transformations involved | Automated report layout verification; automated validation of report functionality and format; automated reconciliation with backend data |




Table 1: Data testing and automation needs
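As a concrete illustration of the data source row above, the sketch below compares a file extract with a database table by normalizing both sides to comparable row tuples. The sample data is hypothetical, and a production utility would stream sorted row digests rather than materialize full sets in memory.

```python
import csv
import io
import sqlite3

# Hypothetical source extract (file side) and target table (database side).
source_csv = io.StringIO("id,region,amount\n1,EU,100.0\n2,US,75.0\n")
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_orders (id INTEGER, region TEXT, amount REAL);
    INSERT INTO stg_orders VALUES (1, 'EU', 100.0), (2, 'US', 75.0);
""")

def normalize(row):
    # Trimmed string form makes file rows and database rows comparable.
    return tuple(str(v).strip() for v in row)

reader = csv.reader(source_csv)
next(reader)  # skip the header row
file_side = {normalize(r) for r in reader}
db_side = {normalize(r) for r in conn.execute("SELECT * FROM stg_orders")}

# Set differences flag rows missing from either side.
print("Missing in target:", file_side - db_side)
print("Unexpected in target:", db_side - file_side)
```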

Can extreme automation be achieved in data warehouse testing?

An integrated automation platform that combines all stages of data warehouse testing is the ideal way to achieve extreme automation. The following diagram illustrates a few of the components of this automation framework:

Figure 2: Integrated Automation Platform for ETL Testing

Robust Data Handling: Handling data movement and the associated validation is the backbone of extreme automation. The platform should:

  • Have its own database space for temporary execution, so that tables can be built and collapsed quickly.
  • Handle huge volumes of data.
  • Test all aspects of the data and its movement.
  • Maintain traceability of data across stages.
  • Give the user options to select the validation strategy (see the sketch after this list).
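A minimal sketch of the last point, letting data volume drive the choice between exhaustive and sampling validation; the threshold and sample size are arbitrary illustrations, not recommended values.

```python
import random

def choose_strategy(row_count, threshold=1_000_000):
    # Exhaustive comparison below the threshold, sampling above it.
    return "exhaustive" if row_count <= threshold else "sampling"

def rows_to_validate(all_ids, strategy, sample_size=10_000, seed=7):
    if strategy == "exhaustive":
        return list(all_ids)
    rng = random.Random(seed)  # seeded so reruns validate the same rows
    return rng.sample(list(all_ids), k=min(sample_size, len(all_ids)))

ids = range(2_500_000)  # hypothetical row identifiers
strategy = choose_strategy(len(ids))
print(strategy, len(rows_to_validate(ids, strategy)))
```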

End-to-end Automation: Since ETL testing traverses multiple stages, an extreme automation solution should integrate data testing and reports validation as follows:

  • A data testing platform.
  • An open source framework for reports validation and for reconciling report data with the backend - such as Selenium with the Eclipse IDE.
  • A wrapper script that communicates between the data workbench and the reports.

Unattended execution: The platform should be able to execute unattended, triggered by data being loaded to a specific environment. Unattended execution can save more than 40% of the overall execution effort if the platform can detect a code drop or build and start automatically. This can be implemented using Jenkins, which monitors for any code drop or build and triggers the unattended execution, as sketched below.
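In place of a full Jenkins setup, the sketch below shows the underlying idea in Python: poll for a marker file that the load process writes, then launch the validation suite unattended. The paths and the pytest command are hypothetical.

```python
import subprocess
import time
from pathlib import Path

# Hypothetical marker written by the data load process on completion.
MARKER = Path("/data/loads/LOAD_COMPLETE")

def wait_for_load(poll_seconds=60):
    # Poll until the load process signals completion.
    while not MARKER.exists():
        time.sleep(poll_seconds)

def run_suite():
    # Launch the regression suite unattended; the exit code tells the
    # wrapper whether to report pass or fail downstream.
    result = subprocess.run(["pytest", "tests/etl"])
    return result.returncode == 0

if __name__ == "__main__":
    wait_for_load()
    passed = run_suite()
    MARKER.unlink()  # consume the marker so the next load re-triggers
    print("Suite passed" if passed else "Suite failed")
```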

Robust Test Reports: Test reports should be sent directly to the user's mailbox after execution, with the ability to drill down automatically to finer levels of detail on data defects and comparison results; a minimal mailer is sketched below.
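A minimal sketch of such a mailer using Python's standard email and smtplib modules; the SMTP host, addresses and report content are placeholders.

```python
import smtplib
from email.message import EmailMessage

def mail_report(summary, failures):
    # Build a plain-text execution summary and send it after the run.
    msg = EmailMessage()
    msg["Subject"] = f"ETL validation run: {len(failures)} failure(s)"
    msg["From"] = "qa-automation@example.com"   # placeholder sender
    msg["To"] = "team@example.com"              # placeholder recipient
    msg.set_content(summary + "\n\nFailures:\n" + "\n".join(failures or ["none"]))
    with smtplib.SMTP("smtp.example.com") as server:  # placeholder host
        server.send_message(msg)

mail_report("42 comparisons executed, 40 passed.",
            ["orders row count", "revenue aggregation"])
```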

Conclusion

Achieving extreme automation in ETL testing is critical for freeing up testers' bandwidth so they can upskill on futuristic technologies such as Big Data and analytics testing. Thankfully, ETL is a great candidate for end-to-end automation across stages, with tangible business benefits and effort savings:

  • As high as 50% effort saved in the individual stages of execution.
  • High quality and reliability of migrated data.
  • Fully automated data processing and anomaly reporting.