Text mining: Digging for gold within the syntax
This post is co-authored with Chantrelle Nielsen.
Data mining is so yesterday. Today, we're engulfed in text, and text mining - the use of computer processing to understand patterns and trends in text - is becoming a more valuable business tool every day. Text mining technologies have developed at a rapid pace in the past decade and have become relatively sophisticated, but there is still no "out-of-the-box" text mining solution that plugs into existing business intelligence (BI) systems. That's because every business has unique discovery needs, and every business has its own internal unstructured data to consider. Most companies simply rely on software alone. The right solution is a mixture of human intelligence and effort along with software: art and science. Thankfully, text mining can be applied to any available data source, provided it is made of or can be translated into text.
So where do you start? Analysis of customer support transcripts can not only help companies understand the details of issues that their customers face, but can also help them standardize and streamline their customer support processes. For example, an online retailer might mine transcripts to identify the phrases used by the agents who are best at resolving returns calls, and add these phrases to the standard scripts to replicate success.
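To make the retailer example concrete, here is a minimal sketch of one way to surface phrases that show up more often in top agents' transcripts than in everyone else's. It is an illustration only, not a recommended product or method: the function names (`ngrams`, `distinctive_phrases`), the two-word phrase length and the simple "share of transcripts" score are all our assumptions, and a real project would need human review of the results.

```python
import re
from collections import Counter


def ngrams(text, n=2):
    """Return the lowercase word n-grams found in one transcript."""
    words = re.findall(r"[a-z']+", text.lower())
    return zip(*(words[i:] for i in range(n)))


def distinctive_phrases(top_transcripts, other_transcripts, n=2, min_count=2):
    """List phrases found in a larger share of top-agent transcripts
    than of other transcripts, most distinctive first.

    Hypothetical helper: the scoring rule is an assumption chosen
    for simplicity, not an established standard.
    """
    top, other = Counter(), Counter()
    for transcript in top_transcripts:
        top.update(set(ngrams(transcript, n)))    # count once per transcript
    for transcript in other_transcripts:
        other.update(set(ngrams(transcript, n)))

    scored = []
    for phrase, count in top.items():
        if count < min_count:
            continue  # ignore phrases that only one top transcript used
        top_share = count / len(top_transcripts)
        other_share = other[phrase] / max(len(other_transcripts), 1)
        if top_share > other_share:
            scored.append((top_share - other_share, " ".join(phrase)))

    scored.sort(key=lambda item: (-item[0], item[1]))
    return [phrase for _, phrase in scored]


if __name__ == "__main__":
    # Made-up transcripts, for illustration only.
    best = [
        "I am happy to help with your return today",
        "Happy to help, let me email a prepaid label",
        "I can email a prepaid label right away",
    ]
    rest = [
        "What is your order number",
        "Your return was denied",
        "Please hold",
    ]
    print(distinctive_phrases(best, rest))
```

The output is only a candidate list. Someone who knows the support process still has to decide which phrases actually belong in the standard script.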
We look at social media information as anything that's being said about a company outside the control of the marketing department. The most prominent venues include Facebook, Twitter, forums, blogs and media sharing sites such as Flickr and YouTube. Consumers are creating an explosion of data there, and smart companies have active "Social Media Listening" teams or technologies to tap into this data stream and act on it.
Many companies, especially those with intensive sales, service or support processes, are launching social networks and forums on their own sites. These forums help company personnel structure their interactions with customers, enable reuse of content, and bring down costs by enabling customers to actually support each other. Most companies start out with comprehensive human monitoring of their forums, but as the activity levels grow, this often becomes too labor-intensive to scale. While social CRM platforms typically include some data structure, such as discussion categories and contributor ratings, mining the "meat" of the discussions can be rewarding. For example, a semiconductor manufacturer might choose to monitor its engineer forums for mention of competitor products. This could help it learn about features that its customers are interested in that the company doesn't currently offer, or even about customer accounts that might be in danger of being lost to competition.
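The forum-monitoring idea can be sketched in a few lines. The example below is a rough illustration under stated assumptions: the competitor names and aliases in `COMPETITOR_TERMS` are invented, the post format (a dict with `id` and `text`) is hypothetical, and simple keyword matching stands in for whatever a commercial tool would actually do.

```python
import re
from collections import defaultdict

# Hypothetical competitor names and the aliases engineers might use for them.
COMPETITOR_TERMS = {
    "AcmeChip": ["acmechip", "acme x200"],
    "Silitron": ["silitron"],
}


def find_mentions(posts, terms=COMPETITOR_TERMS):
    """Map each competitor to the ids of forum posts that mention it.

    `posts` is assumed to be a list of dicts with "id" and "text" keys.
    """
    patterns = {
        name: re.compile(
            r"\b(?:" + "|".join(re.escape(alias) for alias in aliases) + r")\b",
            re.IGNORECASE,
        )
        for name, aliases in terms.items()
    }
    mentions = defaultdict(list)
    for post in posts:
        for name, pattern in patterns.items():
            if pattern.search(post["text"]):
                mentions[name].append(post["id"])
    return dict(mentions)


if __name__ == "__main__":
    # Made-up forum posts, for illustration only.
    sample_posts = [
        {"id": 1, "text": "We switched to the Acme X200 for its lower power draw."},
        {"id": 2, "text": "Any plans to support a wider voltage range?"},
        {"id": 3, "text": "Silitron's eval board ships with better docs."},
    ]
    print(find_mentions(sample_posts))
```

Keyword matching like this will miss misspellings and nicknames, so the flagged posts are best treated as a reading list for a person rather than a finished analysis.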
There are many other scenarios where insights can be shared across business units to reduce costs or provide better service. For example, insights from customer support can be used to drive product lifecycle decisions, or research from competitive intelligence can drive R&D investment. Human Resources can learn what recruits and prospective employees say about the company in social media, and it can learn why employee turnover is on the rise. Marketing can test reactions to various price increases. Product development can learn which features customers are responding to most.
With this much opportunity, there should be a clear market leader in unstructured data analysis. Instead we have a host of competing tools and vendors, making tool and service selection difficult. In general, the tools and services fall into three main buckets - Business Intelligence, Online Analytics and Predictive Analytics. The BI market is well established and dominated by Oracle, IBM and SAP. The online analytics space (especially social media) has been attracting the most attention, but the rate of innovation in this area is slowing.
What about the Predictive Analytics space, crucial to many large businesses? Unfortunately, there is no "ERP for analytics." So how do we approach this issue? First, companies must try to articulate the business value that unstructured data would add; most attempts at a justification will be illustrative and not grounded in facts, so beware the tyranny of numbers here. Second, companies should home in on the functional domains to be covered and define the business scenarios that would be impacted by such analytics. Only in this phase, after the business scenarios have been articulated, do we recommend choosing technology vendors. The market for service providers and vendors is confusing. Most of the investment dollars in this area go to the safe areas (BI) or to the areas that have the most hype (Social Media analytics). Don't be like everyone else; look at some of the emerging opportunities as a way to differentiate your data-mining program. To make this all work, you may need to assign someone with cross-functional reporting relationships to own unstructured data for the entire company, or at least for groups of related functional areas.
Companies most likely to succeed in this new age of "social business" will use analytics to drive competitive differentiation and tune up HR, marketing, sales and support. They'll start small, figure out where the business value really is, and make analyzing unstructured data someone's full-time responsibility.