May 6, 2013

Social Media & Big Data - Declared preferences vs. Discovered preferences

The other day, I logged into a local eCommerce site to buy a book. In India, it's a leader in eCommerce customer experience with very high Net Promoter Scores, a well-established metric that correlates directly with customer experience. I have not met anyone in India who has used this site and not recommended it to others. Like any forward-looking website, it allowed me to log in using my Facebook/Gmail credentials, rather than asking me to create yet another username and password, which I would obviously forget. So my experience started on a positive note.

Out of curiosity, I started to browse the "Recommendations For You" section to try and decode their algorithms. Next I tried the same process with Amazon, which did not allow me a Facebook login (or did I miss it?). And it dawned upon me that there are things that these e-tailers know about me because I told them (the declared preferences) and there are things that they will infer about me based on my transactions with them (the discovered preferences).

There would be two primary sources of declared preferences - my social media profile and any additions I make to my profile on their website, like a phone number or an explicit addition to my "wish list". And there would be another two primary sources of discovered preferences - my past transactions with them and my interactions with my social media website (including associated clickstreams). This is the perfect marriage of Social Media and Big Data.
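To make the four sources above concrete, here is a minimal sketch of how a retailer might keep them apart in a customer record. The class name, field names and sample values are all hypothetical, chosen only for illustration; they do not describe how any real site stores its data.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerPreferences:
    """Hypothetical record separating declared from discovered preferences."""

    # Declared: things the customer told the retailer explicitly.
    declared_social_profile: dict = field(default_factory=dict)
    declared_site_profile: dict = field(default_factory=dict)
    # Discovered: things the retailer infers from behaviour.
    discovered_from_transactions: dict = field(default_factory=dict)
    discovered_from_social_activity: dict = field(default_factory=dict)

    def declared(self) -> dict:
        """Merge both declared sources (site profile wins on key clashes)."""
        return {**self.declared_social_profile, **self.declared_site_profile}

    def discovered(self) -> dict:
        """Merge both discovered sources (social activity wins on key clashes)."""
        return {
            **self.discovered_from_transactions,
            **self.discovered_from_social_activity,
        }


# Sample values are invented for the example.
customer = CustomerPreferences(
    declared_social_profile={"music": "jazz"},
    declared_site_profile={"wish_list": ["travel guide"]},
    discovered_from_transactions={"buys_often": "cookbooks"},
    discovered_from_social_activity={"recent_check_in": "airport"},
)
print(customer.declared())
print(customer.discovered())
```

Keeping the two groups separate is a design choice rather than a requirement: it simply makes it easy to tell later whether a recommendation came from something the customer said or something the retailer inferred.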

Social Media

There are things that I do on my social media profile - let's take Facebook as an example - where I declare my preferences in music, my date of birth, my relationship status, my photos, etc. Some of this data is available to businesses, if I give a merchant access to my profile. These are explicitly declared by me, and companies can use this data to substantially improve their interactions with me since they know that much more about me. So having separate logins - in my personal opinion - is useless. Whether you are a local eCommerce site or a business as complex as a bank, if you are not exploring ways to use social media logins (Facebook, Twitter, Gmail, LinkedIn etc.) on your site, you are really not sincere about knowing your customers and serving them optimally, no matter how much you harp upon "We live to serve" in your print or TV ads. Social media sites are called that because they help customers be social. And so should businesses - all of them, not just for lip service but for transaction-enablement.

Big Data

Next come all the status updates, 'Likes', comments and check-ins that I do on Facebook, activities which reveal a little bit about myself every time I interact with these sites. Facebook Graph Search and Facebook Home, of course, have now opened an even bigger Pandora's Box in my opinion. Add these to the customer's transactions with your company and the customer's clickstreams on your website, install the processing power to do statistical analysis on the combination of all this structured and unstructured data, and you are well on your way to your big data analytics strategy. But how much Big Data is useful and how is it useful? What can companies do better with Big Data Analytics that they could not before?


Part of the answer is in the problem itself - how intrinsically predictable is the thing you are trying to predict?

(And as far as human predictability is concerned, just think about your spouses, kids and parents before answering. That should give you an idea of how predictable your customer is going to be.)

So what's the point?

How can you create value for me - your customer? Big Data, Social Media, Predictive Analytics: whatever technology a company invests money in must create value both for the company and for its customers. And ideally it must do so in a way that improves the quality of the overall customer experience and reduces the cost of operations for the business. So how can you create value for your customer and yourselves?

Firstly, businesses must have end objectives in mind - not something as broad as "revenue increase by 2%" but something more well-defined, such as "revenue increase by 2% from existing customers through existing products". The key word here is "existing". If you are talking about customer acquisition or new product launches, you might need different approaches from the one I am about to talk about next. The intrinsic predictability of your problem is vital to finding a solution for that problem.

When you have defined the problem as clearly as possible, with potential for predictability, you start looking for points of commonality between your product line and your existing customers. Now all of a sudden it begins to make sense to learn more about your existing customers and how they consume your existing products. To study that, you start mining the Big Data of your company's enterprise data warehouses, transactional systems and of course the social media profiles of your customers. And you can start the journey of discovering the preferences of your customers at a level of granularity that makes it meaningful to relate customer segments to product segments with a higher degree of correlation. The result is an improved matching of product profile with customer profile, since you now have more data points about both the customer and the product. So both the declared preferences and the discovered preferences are valuable.
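As a rough illustration of matching a customer profile against product profiles, the sketch below scores the overlap between a customer's preference tags (declared and discovered combined) and each product's tags. It uses Jaccard similarity, which is a deliberately simple stand-in I have chosen for the example, not the statistical method a real retailer would necessarily use. All tags, product names and function names are made up.

```python
def jaccard(a: set, b: set) -> float:
    """Share of tags the two sets have in common (0.0 = none, 1.0 = identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_products(customer_tags: set, products: dict) -> list:
    """Return (product, score) pairs, best match first."""
    scored = [(name, jaccard(customer_tags, tags)) for name, tags in products.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: declared and discovered preferences merged into one tag set.
declared = {"jazz"}
discovered = {"travel", "cooking"}
customer_tags = declared | discovered

products = {
    "Jazz history book": {"jazz", "music", "history"},
    "Travel cookbook": {"travel", "cooking"},
    "Cricket bat": {"cricket", "sports"},
}

ranking = rank_products(customer_tags, products)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

The more data points each tag set holds, the more finely the scores separate one product from another, which is the practical sense in which extra declared and discovered preferences improve the match.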

Discovering these preferences and patterns of your customers and products is meaningful only if you are confident that you are currently leaving that 2% of revenue on the table. Timing matters too. A product like a book is unlikely to be bought by the same customer again, but a perishable or fast-moving consumer good (FMCG) has a typical consumption period after which it can be recommended yet again. So if you know when the customer last bought it, you can recommend it again once that period has elapsed.
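The repeat-recommendation idea above can be sketched in a few lines. The consumption periods below are invented numbers for illustration only, and the function and table names are hypothetical; a real system would estimate the periods from its own transaction history.

```python
from datetime import date, timedelta
from typing import Optional

# Assumed typical consumption period in days per category.
# None means the item is not normally bought twice (e.g. a book).
CONSUMPTION_DAYS = {"toothpaste": 45, "coffee": 30, "book": None}


def next_recommendation_date(category: str, last_purchase: date) -> Optional[date]:
    """Date from which the item can sensibly be recommended again, if ever."""
    period = CONSUMPTION_DAYS.get(category)
    if period is None:
        return None  # unknown category or one-off purchase
    return last_purchase + timedelta(days=period)


def due_for_recommendation(category: str, last_purchase: date, today: date) -> bool:
    """True once the typical consumption period has elapsed."""
    due = next_recommendation_date(category, last_purchase)
    return due is not None and today >= due


last_bought = date(2013, 5, 6)
print(next_recommendation_date("coffee", last_bought))
print(due_for_recommendation("coffee", last_bought, date(2013, 6, 10)))
print(due_for_recommendation("book", last_bought, date(2013, 6, 10)))
```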

Finally, as usual I want to state that business is always about ROI, and if you have an opportunity to invest your money in something that can give substantially higher returns for the same investment, then forget big data, forget social media... But till then, happy analysing social media and happy discovering customer preferences... and to start with, let your customers become your friends on Facebook and log in to your website with their Facebook credentials...

February 20, 2013

Big Data, Multimedia, Sentiment Analysis & Monetization

As I mentioned in my previous post, Big Data must create value - real and tangible economic value - for it to be meaningful. And it must do more than just what a traditional data warehouse could achieve. Taking structured transactional data, putting it in a data warehouse and mining it for statistical analysis to obtain customer insights is nothing so new or revolutionary that all of us should spend so much time talking about it. Most of the data being generated nowadays (and the part with the largest growth rate) is unstructured, and with cameras in every phone, a lot of this data consists of multimedia like video and pictures. Tagging these videos with metadata and annotations tries to put some structure around these large files. For example, when you watch a video on YouTube, how does it know what else to "recommend", what to "feature" and what to "suggest"? YouTube's search ranking algorithm tries to constantly stay ahead of the game by delivering the most relevant and engaging content, so as to optimise the return on investment for its advertisers. Similarly, TripAdvisor's ability to put structure around the large volume of its unstructured data (namely hotel reviews, pictures, ratings, etc.) is proving to be financially successful for both itself and its partners. The question raised by all this rapidly created data is: how do you stay ahead of the game and continuously innovate to make more money than your competitors?

For purposes of illustration, we can look at the hotel industry, its adoption of Sentiment Analysis and how it has used the technology to positively influence its room pricing abilities. The hotel industry is abuzz with something called Online Reputation Management. Companies like Radian6 (Salesforce subsidiary in Canada), ReviewPro (Spain), Brand Karma (US), Hootsuite (US), ViralHeat (US), SocialNuggets (US), ClaraBridge, SentiRate and TrustYou (with a US-based subsidiary called Review Analyst), along with the field's largest player TripAdvisor, are all trying to tame the large volume of data consisting of hotel reviews in a meaningful manner. This effort is also creating substantial all-round economic value. For hotels, it is creating the ability to charge higher premiums than their competitors. For customers, it is giving them the ability to get maximum value for their money. For technology providers, it is creating the ability to offer a host of new-age services both to businesses like hotels and directly to customers. Larger technology players like TripAdvisor are minting money in multiple ways. Having become the default site for hotel value comparisons (both price and reputation comparisons) by offering value-added services to customers, they are able to sell digital real estate to the highest bidder for ad space as well as charge commissions to hotels for bookings generated from their website. In the end both the customers and the hotels get exactly what they are looking for - increased value for their money....

Monetization of Big Data is one of the biggest challenges that the technology players are constantly working to solve, and when done right, it unlocks tremendous economic potential for everyone...

Next, imagine a world of video reviews (like YouTube) and picture reviews (hint: Pinterest) all getting organised and sorted to provide meaningful analysis for customers shopping the market for their ideal honeymoon destination or their dream vacation or just a weekend getaway... Tremendous opportunity with lots of money for everyone to make...

February 14, 2013

What actually is Big Data? - The different definitions

When I first heard the term, it resonated strongly with me. It was probably because Database Management, Relational Databases, Database schemas, Data Warehousing and Data Mining had always been my field of interest, right from the days when I first started my career as a Business Analyst at Singapore Airlines (SIA) several years ago. Even back in the day (late 90s, early 2000s), SIA used to get competitor fare data from MIDT, among other sources, to try and optimise the fares that it could charge on the various sectors (and combinations of sectors) that it flew. In airlines, this practice of using historical data (both your own and your competitors') to optimise future fares is known as Yield Management and/or Revenue Management. The practice has been officially in existence since the mid-80s, when American Airlines' then CEO, Robert Lloyd Crandall, invented and named it. It later spread to other related sectors like hotels and hospitality. Revenue management in hotels is an equally sophisticated field today. Top organisations like IATA (International Air Transport Association) have started offering courses on "Airline Revenue Management", and Cornell University's School of Hotel Administration offers similar programs for hotels' revenue management professionals. It has spawned an entire sub-industry within the IT products sector catering to revenue management for airlines, with products like Sabre's Revenue Manager and Amadeus's Altea Revenue Management among others. Revenue management for hotels gave rise to IT product companies like IDeaS and EasyRMS. So with such sophisticated IT products and business analytics having existed in these sectors almost since the advent of the Internet, what is changing now, how is Big Data affecting it and what are the opportunities that Big Data is creating for airlines and hotels?

But before we get into those details, I wanted to establish a baseline for what exactly Big Data is. So I scoured the Internet to find the multiple viewpoints that exist on its definition and tried to reconcile them into a single coherent and useful definition. The definitions I encountered from reliable sources are reproduced verbatim below (intentionally leaving out Wikipedia, as Wikipedia is an information aggregator and not a creator of original content):

Gartner: Big Data in general is defined as high volume, velocity and variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.

Forrester: Big Data is the frontier of a firm's ability to store, process, and access (SPA) all the data it needs to operate effectively, make decisions, reduce risks, and serve customers

IDC: Big data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high-velocity capture, discovery, and/or analysis.

McKinsey Global Institute: "Big data" refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyse. This definition is intentionally subjective and incorporates a moving definition of how big a dataset needs to be in order to be considered big data--i.e., we don't define big data in terms of being larger than a certain number of terabytes (thousands of gigabytes). We assume that, as technology advances over time, the size of datasets that qualify as big data will also increase. Also note that the definition can vary by sector, depending on what kinds of software tools are commonly available and what sizes of datasets are common in a particular industry. With those caveats, big data in many sectors today will range from a few dozen terabytes to multiple petabytes (thousands of terabytes).

O'Reilly: Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn't fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it.

Microsoft: The increasingly large and complex data that is now challenging traditional database systems

Oracle: Big data is the data characterized by 4 key attributes: volume, variety, velocity and value

IBM: Big data is the data characterized by 3 attributes: volume, variety and velocity

If I look at the myriad definitions above and try to create something that is relevant and meaningful to my cause, I would define Big Data as follows:

Big Data is high-volume, high-velocity and high-variety information assets which require reasonable levels of veracity and which, in turn, create substantial economic value and help in effective operations, revenue enhancement, decision-making, risk management and customer service.

Agree? Disagree? Suggestions for improvement? 

I look forward to hearing from you, as I further pen my thoughts on Big Data and its application in the ever so dynamic and exciting field of revenue management, specifically in the Travel & Hospitality sectors... Stay tuned...
