The Infosys Labs research blog tracks trends in technology with a focus on applied research in Information and Communication Technology (ICT)

BI on ECM - Who says it's not possible?

Business Intelligence (BI) aids business decision making by providing different kinds of analysis on business data. Until recently, the assumption was that BI is possible only on databases, because the analysis needs structured data, but the scenario has changed rapidly in the last few years. Companies are now using BI on top of Enterprise Content Management (ECM) systems to perform different kinds of analytics.

This blog post is from my colleague, Sumit Sahota (

Databases are normally used by applications built for a specific purpose or requirement, such as CRM, ERP and payroll, but there is an ocean of data in the documents we create, the mails we send, the images we use, the web pages we build and so on. An estimated 80% of all data resides in this unstructured format, and it needs an ECM tool for proper storage and management. But how does this storage happen, and how is the content retrieved? Do ECM systems give some structure to unstructured content?

Look closely and you will see that the content, which can be a document, a presentation or an image file, is stored by the ECM tool either in a database as a binary object or in a file system. A file in the ECM is not stored as just one more object somewhere. It is placed in some sort of structure, which can be created by the user in the form of a taxonomy, or built up by the system from the tags that users provide, which is known as a folksonomy. This is the first step in "structuring" the "unstructured" data. There are content classification tools to aid this process.

Secondly, the detailed tags provided along with the content are one more way to add structure: the content itself may be stored anywhere, but the metatags are stored in a database. This metadata can be enriched with automated metadata extraction tools, which scan through the document and offer the user a list of tags to choose from. Search engines use these metatags for indexing, which makes content retrieval faster.
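To make the metadata idea concrete, here is a minimal sketch in Python of what a very naive tag suggester and tag index could look like. It is an illustration under stated assumptions, not how any particular ECM product or extraction tool works: the function names, the stopword list and the sample documents are all hypothetical, and real tools use far richer linguistic analysis than simple word counts.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stopword list for the illustration.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "is", "on", "by", "with"}

def suggest_tags(text, limit=3):
    """Suggest the most frequent non-stopword terms as candidate metatags."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    # Sort by descending frequency, then alphabetically, so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:limit]]

def build_tag_index(documents):
    """Map each suggested tag to the sorted list of document ids that carry it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for tag in suggest_tags(text):
            index[tag].add(doc_id)
    return {tag: sorted(ids) for tag, ids in index.items()}

# Hypothetical sample content standing in for files held by an ECM tool.
documents = {
    "doc1": "Payroll report: payroll totals and payroll tax for the quarter",
    "doc2": "Quarter sales report, sales by region and sales by product",
}
tag_index = build_tag_index(documents)
print(tag_index)
```

Once the tags sit in a lookup structure like this (or in database rows), a BI tool can count, filter and join on them just as it would on any other structured column.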

The third way to provide structure is to keep track of user activity. ECM tools can generally produce audit trail reports, which are generated from the log kept for each file. These logs can be very detailed: they can list every activity along with the time a user spends on a page, a document and so on. Web analytics tools use the same kind of logs.
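As a rough illustration of how an audit trail becomes something a BI tool can report on, the sketch below totals the time spent per document from a list of open and close events. The log layout, the field order and the sample entries are assumptions made up for this example; actual ECM audit logs differ from product to product.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical audit trail entries: (user, document, action, ISO timestamp).
audit_log = [
    ("alice", "policy.pdf", "open", "2010-03-01T09:00:00"),
    ("alice", "policy.pdf", "close", "2010-03-01T09:05:00"),
    ("bob", "policy.pdf", "open", "2010-03-01T10:00:00"),
    ("bob", "policy.pdf", "close", "2010-03-01T10:02:30"),
    ("alice", "budget.xls", "open", "2010-03-01T11:00:00"),
    ("alice", "budget.xls", "close", "2010-03-01T11:01:00"),
]

def seconds_per_document(entries):
    """Total viewing time per document, pairing each open with the next close."""
    open_times = {}
    totals = defaultdict(float)
    for user, doc, action, stamp in entries:
        when = datetime.fromisoformat(stamp)
        key = (user, doc)
        if action == "open":
            open_times[key] = when
        elif action == "close" and key in open_times:
            # Add the elapsed time since this user opened this document.
            totals[doc] += (when - open_times.pop(key)).total_seconds()
    return dict(totals)

usage = seconds_per_document(audit_log)
print(usage)
```

The result is an ordinary table of documents and durations, which is exactly the kind of structured input that reporting and web analytics tools expect.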

The fourth way to add structure in this social-networking and collaborative world is to capture user feedback in the form of ratings, tags, labels, scores, opinions and so on. This feedback can then feed a variety of applications and different kinds of analysis.
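User feedback is perhaps the easiest of the four to analyse, because a rating is already a number attached to a piece of content. The short sketch below averages ratings per document; the record layout and the sample values are hypothetical and only meant to show the shape of the data.

```python
from collections import defaultdict

# Hypothetical feedback records: (document, rating on a 1-5 scale).
feedback = [
    ("policy.pdf", 4),
    ("policy.pdf", 5),
    ("budget.xls", 2),
]

def average_ratings(records):
    """Return the mean rating for each document."""
    sums = defaultdict(lambda: [0, 0])  # document -> [rating total, rating count]
    for doc, rating in records:
        sums[doc][0] += rating
        sums[doc][1] += 1
    return {doc: total / count for doc, (total, count) in sums.items()}

ratings = average_ratings(feedback)
print(ratings)
```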

Seen this way, unstructured content is not fully unstructured, and there are many ways a BI tool can work on it. Text analytics is a simple example, and web analytics is another. Voice of the Customer analytics is gaining a lot of popularity. Many reports can be generated from this content, and this kind of analytics can and will aid decision making.


Hey Sumit... good to see your blog. Back in 2008, in my article titled "Convergence of BI and Search", I discussed a few architectural options for implementing this. Please refer to it for more.
