Data profiling is all about CCC-AID? (Part 1)

By Jairaj Asok Kumar

Catch phrase or some gimmick? Well, the easiest way to pen your thoughts on concepts around Master Data Management is to look around, see what problem you are facing, and evaluate whether that problem is repeatable in nature. In every data management project I have been involved in, the key problem that persists is bad data quality. Most clients assume that the data in their existing systems is correct, right and true. So how do you define correct, right and true? This is where the term CCC-AID comes in. Read on.

CCC-AID is a quick acronym for the six criteria that need to be applied to data to assess its quality: Complete, Conformance, Consistent, Accurate, Integral and Duplicate. Each of these refers to a block of checks performed during a data profiling exercise.
[Figure: Jairaj01a.jpg]
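To make the six blocks a little more concrete, here is a minimal Python sketch of what one check per criterion might look like. This part of the post does not define the criteria in detail, so the interpretation of each one below is my assumption based on common data quality usage, not the author's definition. The sample records, the field names, the reference sets and the `profile` function are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical sample records; every field name and value is illustrative only.
customers = [
    {"id": 1, "name": "Asha Rao", "email": "asha@example.com",
     "country": "IN", "dial_code": "+91", "account_id": 10},
    {"id": 2, "name": "", "email": "not-an-email",
     "country": "XX", "dial_code": "+1", "account_id": 11},
    {"id": 3, "name": "asha rao", "email": "ASHA@example.com",
     "country": "US", "dial_code": "+91", "account_id": 99},
]

# Assumed reference data that the checks compare against.
ACCOUNT_IDS = {10, 11}                     # keys of a parent "accounts" table
DIAL_CODES = {"IN": "+91", "US": "+1"}     # trusted country -> dial code map
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def profile(rows):
    """Return the number of rows failing each assumed CCC-AID check."""
    # Duplicate: rows sharing a normalized (name, email) key.
    keys = Counter((r["name"].strip().lower(), r["email"].lower()) for r in rows)
    return {
        # Complete: a mandatory field is populated.
        "complete": sum(1 for r in rows if not r["name"].strip()),
        # Conformance: the value matches the expected format.
        "conformance": sum(1 for r in rows if not EMAIL_RE.match(r["email"])),
        # Consistent: related fields agree with each other.
        "consistent": sum(
            1 for r in rows if DIAL_CODES.get(r["country"]) != r["dial_code"]
        ),
        # Accurate: the value exists in a trusted reference list.
        "accurate": sum(1 for r in rows if r["country"] not in DIAL_CODES),
        # Integral: the foreign key resolves to a parent record.
        "integral": sum(1 for r in rows if r["account_id"] not in ACCOUNT_IDS),
        # Duplicate: surplus rows beyond the first for each repeated key.
        "duplicate": sum(count - 1 for count in keys.values() if count > 1),
    }


report = profile(customers)
for criterion, failures in report.items():
    print(f"{criterion}: {failures} failing row(s)")
```

A dedicated profiling tool would run far richer versions of these checks across whole tables; the point of the sketch is only that each letter of the acronym maps to a measurable rule.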

When undertaking a data profiling exercise on the source systems, we can use either manual or automated tools. Typically, in a large business transformation or MDM enablement program, the data profiling toolset is procured along with the MDM toolset. For example, if a client is predominantly on the Oracle stack, the data profiling tool could be Oracle Data Quality - Profiling Server. If the client uses an IBM stack, then the InfoSphere Information Server Information Analyzer toolset is ideal. However, when data profiling has to be done before an enterprise-grade MDM tool has been identified or procured, one has to opt for a manual process. This is when it caught my fancy: why not take the open source route? I have always been inspired by open source tools, and one that bears mentioning is the open source Talend™ Open Profiler.

So, in the next part, we will try applying the above data quality principles to a sample data set using the Talend™ Open Profiler tool.
