Infosys’ blog on industry solutions, trends, business process transformation and global implementation in Oracle.


Data Quality Overview

  • In this blog we are going to discuss data quality and address issues such as data redundancy, data duplication, inconsistent data, and junk data.

  • Every organisation strives for total data quality: producing and maintaining information in top condition, fit for its intended business purpose.

The diagram below gives an idea of the data quality flow.

     

Principally, EDI comprises architectural components such as data sources, an Operational Data Store (ODS), a rejection area, and a target database. The target database should be compliant with the industry-standard SID model.

Data Profiling

 

Data profiling reports on the condition of the data; based on that condition, we create rules and apply them to the data.

The table below shows some discrepancies in the data; the values highlighted in blue, red, and green portray standard examples of data anomalies.

| Name  | Age | DOB     | Gender | Height | Anomalies                      |
|-------|-----|---------|--------|--------|--------------------------------|
| Bob   | 35  | 13-1-81 | M      | 6      | Nil                            |
| Rob   | 34  | 27-8-82 |        | 5.5    | Lexical Error                  |
| Madan | 34  | 15-1-82 | M      | 5-9-2  | Domain Format Error            |
| Diana | 33  |         |        |        | Duplicates                     |
| Jim   | 0   | 12-7-88 | M      | 5.1    | Integrity Constraint Violation |
| ~@$   | ^^  | @       | #      |        | Missing Tuple                  |

The data profiler supports detectors that discover facts about the data.
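As a concrete illustration, the anomalies highlighted in the sample table above can be caught with a handful of profiling rules. The sketch below is a minimal, hypothetical Python example; the field names, patterns, and rules are assumptions for illustration, not Oracle EDQ functionality.

```python
import re

# Sample records mirroring the table above; None marks a missing field.
rows = [
    {"name": "Bob",   "age": 35, "dob": "13-1-81", "gender": "M",  "height": "6"},
    {"name": "Rob",   "age": 34, "dob": "27-8-82", "gender": None, "height": "5.5"},
    {"name": "Madan", "age": 34, "dob": "15-1-82", "gender": "M",  "height": "5-9-2"},
    {"name": "Jim",   "age": 0,  "dob": "12-7-88", "gender": "M",  "height": "5.1"},
]

DOB_PATTERN = re.compile(r"^\d{1,2}-\d{1,2}-\d{2}$")   # expected d-m-yy format
HEIGHT_PATTERN = re.compile(r"^\d+(\.\d+)?$")          # plain decimal height

def profile(row):
    """Return a list of anomaly labels found in one record."""
    anomalies = []
    if any(v is None for v in row.values()):
        anomalies.append("missing value")
    if row["dob"] and not DOB_PATTERN.match(row["dob"]):
        anomalies.append("date format error")
    if row["height"] and not HEIGHT_PATTERN.match(row["height"]):
        anomalies.append("domain format error")
    if row["age"] is not None and not (0 < row["age"] < 120):
        anomalies.append("integrity constraint violation")
    return anomalies

for row in rows:
    print(row["name"], "->", profile(row) or "clean")
```

In a real tool these checks would be configured as profiling rules rather than hand-written code, but the logic is the same: each detector examines one aspect of a record and labels the anomaly it finds.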

 

Data Auditing

Poor-quality, incomplete, and erroneous data can be remedied through data auditing.

A thorough data audit detects key quality metrics: missing data, improper values, redundant records, and contradictions. When used in combination with Oracle Enterprise Data Quality Parsing and Standardization, it can deliver an exceptional understanding of your data.
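As a rough sketch of what an audit measures, the example below computes a completeness ratio and lists duplicate records over a toy data set. The records and the metric definitions are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical audit input: (name, dob) pairs; None marks a missing value.
records = [
    ("Bob", "13-1-81"),
    ("Rob", "27-8-82"),
    ("Diana", None),
    ("Diana", None),      # duplicate record with a missing DOB
]

# Completeness: share of fields that are actually filled in.
total_fields = sum(len(r) for r in records)
filled = sum(1 for r in records for v in r if v not in (None, ""))
completeness = filled / total_fields

# Redundancy: records occurring more than once.
dupes = [rec for rec, n in Counter(records).items() if n > 1]

print(f"completeness: {completeness:.0%}")   # 75%
print("duplicate records:", dupes)
```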

 

Data Cleansing

In this section we discuss how to cleanse the data. The data cleansing task removes duplicates, alterations, and inaccurate data from the source.

Data scrubbing is the process of pinpointing and fixing data anomalies by comparing the domain of values within the database. Problems are fixed automatically by ETL processing.
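A minimal cleansing pass along these lines might trim whitespace, standardise a domain of values, and drop exact duplicates. The rules and field names below are illustrative assumptions, not the behaviour of any specific ETL tool.

```python
# Assumed cleansing rules: trim whitespace, normalise the gender domain,
# and keep only the first copy of each record.
raw = [
    {"name": " Bob ", "gender": "male"},
    {"name": "Rob",   "gender": "M"},
    {"name": "Rob",   "gender": "M"},   # exact duplicate
]

# Domain standardisation: map free-text values onto the canonical codes.
GENDER_MAP = {"male": "M", "female": "F", "m": "M", "f": "F"}

def cleanse(rows):
    """Standardise each record, then remove exact duplicates."""
    seen, out = set(), []
    for row in rows:
        fixed = {
            "name": row["name"].strip(),
            "gender": GENDER_MAP.get(row["gender"].lower(), row["gender"]),
        }
        key = tuple(sorted(fixed.items()))
        if key not in seen:             # drop records already emitted
            seen.add(key)
            out.append(fixed)
    return out

print(cleanse(raw))
```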

 

If the Data Quality engine finds that a data item "stands out" (shows a statistically significant deviation from the population mean), it flags the item as an exception and stores it in the exception schema. Depending on the category, exceptions are communicated to:

1.  Data stewards, to fix data anomalies at the source database.

2.  Quality experts and business users, to frame new quality rules and remedial measures.
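The "stands out" test above can be sketched as a z-score check: values whose deviation from the population mean exceeds a threshold (in standard deviations) are flagged as exceptions. The threshold of 1.5 is an assumption for illustration.

```python
import statistics

def flag_exceptions(values, threshold=1.5):
    """Flag values deviating from the mean by more than `threshold` stdevs."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    # Guard against a zero stdev (all values identical: nothing stands out).
    return [v for v in values if stdev and abs(v - mean) / stdev > threshold]

ages = [35, 34, 34, 33, 0]   # Jim's age of 0 from the sample table
print(flag_exceptions(ages))
```

In practice the flagged values would be written to the exception schema rather than printed, and the threshold would be tuned per attribute.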

Quality Dimension

The quality dimensions assessed are Accuracy, Uniqueness, Integrity, Consistency, Density, Completeness, Validity, Schema Conformance, and Uniformity. The anomaly types mapped against them are Vocabulary Errors, Format Errors, Irregularities, and Missing Values.

In the original mapping, two kinds of relationship link an anomaly to a quality dimension:

  • A direct link designates a direct decrease of the quality dimension.

  • An indirect link designates that the occurrence of this anomaly hampers the detection of further anomalies, reducing the quality dimension.

 

 
