

Data Quality Cockpit using SAP Information Steward and Data Services - Part 1

 

Setting up the Data Quality Cockpit - understanding the data quality

In our engagements with customers, we often come across data quality in master data as a major issue, and they are keen to tackle it by implementing master data solutions. It is frequently believed that MDM/MDG solves all problems related to master data and that implementing it is sufficient to manage data quality. While it solves the problem to some extent in terms of managing and governing master data, data quality issues are not completely resolved, because tools meant for managing master data are not good at cleansing and enriching the data. SAP has introduced two other tools, SAP Information Steward and SAP Data Services, primarily to tackle this issue of data quality. These two tools can complement a master data solution or work independently in providing a comprehensive solution for data quality management.

Below is an example of a customer situation and how we helped them solve their problem. Their master data was unclean, redundant and full of duplicates, and they did not have an industry-standard master data package, only a homegrown application to manage their master data processes.

Business users complained of

  1. Rules insufficient to identify bad data

  2. No holistic view of data quality in the productive database

  3. Data in the productive database has degraded over time

  4. Periodic DQ checks on the productive database not possible due to tool limitations

  5. Duplicates exist despite the rules

Technical Support complained of

  1. Lack of scalability and extensibility

  2. A lot of effort goes into building custom tools, with no funding

  3. No periodic upgrades, with the risk of the technology becoming outdated

  4. No out-of-the-box features; everything has to be coded as needed

A standard MDM package would have done only half the job, as MDM is meant to consolidate, centralize and harmonize master data across the various connected systems, but lacks in-depth data cleansing, transformation, enrichment and ongoing data quality assessment capabilities. The crux of the problem was enabling the business to control, monitor and maintain the quality of data on a continuous basis while at the same time leveraging the client's existing master data setup.

The solution implemented was a combination of SAP Information Steward and SAP Data Services, which could be leveraged in the client's existing landscape with minimal disruption to the established business processes. SAP Information Steward provided the right tools to understand the fundamental areas of the problem and to know where to focus the solution. This, along with the ETL capabilities of SAP Data Services, provided the complete solution.
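To make the ETL side of this pairing concrete, here is a minimal Python sketch of the kind of standardization and de-duplication a Data Services job might perform on customer master records. The field names, cleansing rules and the `clean_records` helper are hypothetical illustrations only; in the actual solution this logic is built with Data Services transforms (e.g. cleanse and match steps) rather than hand-written code.

```python
# Illustrative sketch only: stands in for Data Services cleansing transforms.
# Field names (name, city, postal_code) and rules are hypothetical examples.

def standardize(record: dict) -> dict:
    """Trim whitespace, normalize case and strip non-digits from postal codes."""
    out = dict(record)
    out["name"] = " ".join(record.get("name", "").split()).title()
    out["city"] = record.get("city", "").strip().title()
    out["postal_code"] = "".join(ch for ch in record.get("postal_code", "") if ch.isdigit())
    return out

def clean_records(records: list[dict]) -> list[dict]:
    """Standardize each record and drop exact duplicates on a simple match key."""
    seen, cleaned = set(), []
    for rec in records:
        rec = standardize(rec)
        key = (rec["name"].lower(), rec["postal_code"])   # naive match key
        if key not in seen:
            seen.add(key)
            cleaned.append(rec)
    return cleaned

if __name__ == "__main__":
    sample = [
        {"name": "  acme  corp ", "city": "berlin ", "postal_code": "10115 "},
        {"name": "ACME CORP", "city": "Berlin", "postal_code": "10115"},
    ]
    print(clean_records(sample))   # the two variants collapse into one record
```

In the real landscape, duplicate detection would rely on fuzzy matching rather than an exact key, but the shape of the flow is the same: standardize first, then match and consolidate.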

The profiling features, together with the DQ scorecard, were the first step to solving the DQ problem. The out-of-the-box profiling features were used to understand the quality of data in terms of fill rate, dependencies, address validity, uniqueness and redundancies in the source data files. Rules gathered from existing processes, together with additional business rules, provided the definition of what qualified as good-quality data. These rules were configured into the SAP IS rule engine using its rule-definition and binding features. Weights were assigned to the different quality dimensions such as completeness, uniqueness and conformity. By connecting SAP IS to the data-staging environment and running the rules on the source data, we arrived at a scorecard for the as-is quality of the data in staging. The score was low, with many failed records that did not meet the criteria defined for standardization and cleanliness, and it also showed which dimensions of the data had the most issues.
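The scorecard mechanics can be pictured with a small Python sketch: each rule is bound to a quality dimension, staged records are checked against the rules, and the per-dimension pass rates are combined using the assigned weights into an overall score. The rules, dimensions and weights below are hypothetical placeholders; in the project they were configured through Information Steward's rule definition and binding features, not in code.

```python
# Illustrative sketch of a weighted data-quality scorecard.
# Rules, dimensions and weights are hypothetical; SAP Information Steward
# configures these through its rule-binding UI rather than in code.

# Each rule: (quality dimension, predicate the record must satisfy)
RULES = {
    "name_filled":        ("completeness", lambda r: bool(r.get("name", "").strip())),
    "postal_code_format": ("conformity",   lambda r: r.get("postal_code", "").isdigit()),
    "country_filled":     ("completeness", lambda r: bool(r.get("country", "").strip())),
}

# Relative importance of each dimension in the overall score
WEIGHTS = {"completeness": 0.6, "conformity": 0.4}

def scorecard(records: list[dict]) -> dict:
    """Return the pass rate per dimension plus a weighted overall score (0-100)."""
    totals = {dim: [0, 0] for dim in WEIGHTS}          # dimension -> [passed, checked]
    for rec in records:
        for dim, predicate in RULES.values():
            totals[dim][1] += 1
            if predicate(rec):
                totals[dim][0] += 1
    rates = {dim: (100.0 * p / c if c else 100.0) for dim, (p, c) in totals.items()}
    overall = sum(rates[dim] * w for dim, w in WEIGHTS.items())
    return {"dimensions": rates, "overall": round(overall, 1)}

if __name__ == "__main__":
    staged = [
        {"name": "Acme Corp", "postal_code": "10115", "country": "DE"},
        {"name": "",          "postal_code": "1011A", "country": ""},
    ]
    print(scorecard(staged))   # low score: the second record fails every rule
```

A uniqueness dimension based on duplicate counts would be computed across the whole data set rather than record by record, but it feeds into the weighted sum in the same way.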

This gave data stewards, analysts and information governance experts a clear understanding of where their data quality stood and which fixes would yield the maximum benefit.

Continuation in Part 2: in the next blog we will discuss how to fix the issues identified with the data and achieve a better ROI.

This blog is posted on behalf of Arvind H. Shenoy, Lead Consultant, Retail, CPG & Logistics, Infosys.
