Data Federation – A potent substitute for the Data Warehouse?
From my past experience, I have observed that we often build a data warehouse to integrate multiple sources of data and gain effective business intelligence. This consumes both time and resources, and it can disrupt the IT roadmap of an organization if not handled with the utmost maturity.
In this rapidly changing world of technology, new paradigms have evolved that could simplify the aggregation of multiple sources. One such technology that I find very exciting is Data Federation, also known as Information-as-a-Service, Data Virtualization or EII (Enterprise Information Integration).
The heart of this technology is a “virtual database”, or federated database, as defined by McLeod and Heimbigner back in 1985. Simply put, a virtual database stores data definitions and not the data itself; it holds information about where the data is located. When a single call is made to a virtual database, the technology issues multiple calls to the underlying databases and is also responsible for meaningfully aggregating the returned result sets.
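To make the idea concrete, here is a minimal sketch in Python of a single call fanning out to several sources and merging the results. It is only an illustration: the class `VirtualDatabase`, the helper `make_source`, the virtual table name `all_orders` and the sample rows are all hypothetical, and two in-memory SQLite databases stand in for real source systems. Commercial federation products do far more (query planning, pushdown, caching), so treat this as a toy model of the concept rather than how any vendor implements it.

```python
import sqlite3


def make_source(rows):
    """Create an in-memory SQLite database that plays the role of one source system."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    conn.commit()
    return conn


class VirtualDatabase:
    """Hypothetical virtual database: it stores definitions and locations, never rows."""

    def __init__(self):
        # virtual table name -> list of (connection to a source, SQL to run there)
        self.catalog = {}

    def register(self, virtual_table, conn, sql):
        self.catalog.setdefault(virtual_table, []).append((conn, sql))

    def total_by_customer(self, virtual_table):
        """One call from the user becomes one call per registered source."""
        totals = {}
        for conn, sql in self.catalog[virtual_table]:
            for customer, amount in conn.execute(sql):
                # Aggregate the partial result sets into a single answer.
                totals[customer] = totals.get(customer, 0.0) + amount
        return totals


# Two assumed source systems with overlapping customers.
crm = make_source([("alice", 100.0), ("bob", 50.0)])
billing = make_source([("alice", 25.0), ("carol", 70.0)])

vdb = VirtualDatabase()
per_source_sql = "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
vdb.register("all_orders", crm, per_source_sql)
vdb.register("all_orders", billing, per_source_sql)

totals = vdb.total_by_customer("all_orders")
print(totals)
```

Note that the rows never leave the two source databases until query time; the virtual database only remembers where to look and how to ask.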
The primary benefit of this approach is that data need not be moved from the source systems for analysis. It also saves the cost of building and maintaining a permanent warehouse. Since the data is not copied into a separate store first, it can be delivered quickly and in real time.
The biggest challenge such a system must handle to deliver what it promises is the heterogeneity of the component DBMSs, which gives rise to naming, schema, domain and model conflicts. These can typically be handled by designing multiple stacked-up schemas which accurately translate the data model, as visible to the user, to the actual data models of the component DBMSs.
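As a rough illustration of such a translation layer, the Python sketch below resolves a naming conflict (different column names for the same attribute) and a domain conflict (dollars versus cents) with one mapping per source. Everything here is assumed for the example: the source names `hr` and `payroll`, the field names, and the `to_unified` helper are hypothetical, and a real federated system would usually stack several schema layers rather than the single one shown.

```python
# Hypothetical rows from two sources that describe the same kind of entity
# but disagree on field names (naming conflict) and units (domain conflict).
hr_rows = [{"emp_name": "Alice", "salary_usd": 5000}]
payroll_rows = [{"full_name": "Bob", "pay_cents": 450000}]

# One mapping per source: user-visible field -> (source field, converter).
MAPPINGS = {
    "hr": {
        "name": ("emp_name", str),
        "salary": ("salary_usd", float),
    },
    "payroll": {
        "name": ("full_name", str),
        "salary": ("pay_cents", lambda cents: cents / 100),  # cents -> dollars
    },
}


def to_unified(source, row):
    """Translate one source row into the data model visible to the user."""
    mapping = MAPPINGS[source]
    return {
        field: convert(row[source_field])
        for field, (source_field, convert) in mapping.items()
    }


unified = [to_unified("hr", r) for r in hr_rows]
unified += [to_unified("payroll", r) for r in payroll_rows]
print(unified)
```

The user only ever sees `name` and `salary`; the knowledge of how each source spells and scales those fields lives in the mapping, which is the part that takes real design effort in practice.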
Areas where this technology will find ready acceptance are divisions of an organization (such as fraud detection units) that rely heavily on real-time intelligence from disparate systems to drive business. Data architects may also find this approach very efficient for maintaining master dimensions, which are typically time-consuming to maintain at an enterprise level.
Vendors like Oracle, SAS, Informatica etc. have already lined up comprehensive solutions in the market. Now it is up to the consultant and architect community to go out there and propose solutions which are truly ‘out of the box’!
Last but not least, what is the experts’ take on this?
Talking to some data warehousing connoisseurs, I felt that people are divided on whether replacing a data warehouse with a federated architecture is appropriate. Weighing the pros and cons, I can safely conclude that at this point of industry maturity, this approach definitely merits a place as an augmentation to a traditional data warehouse, but we need to wait and watch how it evolves before it becomes mainstream.