Infosys Information Services Industry, Leadership Meet 2010
The meet, organized by Infosys Technologies for select clients from the Information Services practice, provided a comprehensive roundup of the emerging opportunities in the Information Industry from the perspective of an analyst [IDC], an industry player [Elsevier] and a technology provider [Infosys]. I was one of the key speakers, presenting Infosys' viewpoint and our insights in this space, along with Susan Feldman from IDC, who needs no introduction, and Mirko Minnich, SVP, Product Technology Strategy at Elsevier, representing the publishing industry.
I. Susan Feldman, Research Vice President at IDC, was the keynote speaker at the event. Susan directs the Content Technologies Group at IDC, specializing in search and discovery software and digital marketplace technologies and dynamics. She spoke on the market opportunities within the digital marketplace and how semantic tools fit into the challenges and opportunities in this space.
According to IDC, the Digital Marketplace has reached a tipping point and will see double-digit growth rates until at least the end of this decade. The opportunity lies in the kinds of information, products and search results needed for individualized content. There's uncertainty about revenue streams, but publishers are experimenting. Since more clicks to content mean more revenue, relevancy ranking in search results is gaining more attention from publishers. This requires information systems that help people work in groups to find answers using these tools. Metadata management is booming as a result, as are taxonomy tool development, discovery applications for researchers, information verification, and sales prospect and lead generation. Semantic tools pay for themselves many times over if they're deployed for ecommerce and used to ease the requirement for hiring. But information vendors are slow to adopt these tools. They need to stop thinking of content as the center of their products and start thinking about the information surrounding it.
Basic technologies include inference engines, text analytics, reporting tools, BI, data mining, moving up into image search, sentiment extraction, fact/event extraction, relationship extraction, geo-tagging, concept extraction, entity extraction, multilingual support, categorization and browsing, speech tagging, speech to text, search and relevance ranking.
Success in the digital marketplace is based on
· Acquire content - possible either by aggregating from other sources or by relying on tools to obtain user-generated content from a community focused on domain knowledge.
· Provide tools that help to find relevant information at the right granularity.
Information-finding tools, via filtered search, browsing and analysis tools, are also key for content success. For example, e-discovery took off quickly because lawyers needed to find information for litigation; people can package up this information easily for use and sharing.
· Getting the right context for answers is very important.
· The "too much information" challenge means that organizations can no longer afford to hire people to understand information at the rate that it comes in, creating an opportunity for the publishing industry.
· According to Susan, extracting leads targeted at a specific industry is also an under-utilized opportunity.
· Corporate users especially are interested in setting up special views of their content themselves instead of queuing up at their IT departments. Publishers have content that can fit into this landscape, especially with tools that provide more visually-oriented solutions.
· Trend-spotting is key to hedge funds, as they look for emerging trends that they can analyze and monetize through securities investments. Analytics are important, but have to include unstructured streams such as emails. Businesses need trusted information and more competitive intelligence, and they need findability tools to understand vast amounts of information as end users push beyond analysts and researchers to use search tools themselves.
· The need to understand who the user is, what the task is, what the location is, and what the user will do with the search results. Users need real-time information, and providers need to re-evaluate the value that is being provided. Is it the taxonomy that's more valuable to sell, or the targeted content? Is the information about personal relationships in a social network more important?
Points that Susan mentioned that technology vendors need to consider
· Search and discovery technologies - what can they do beyond the search box?
· "Fuzzy matching," which enables larger collections of information to be more usable by surfacing good results quickly, even if they're not entirely inclusive answers. She mentioned that we're moving towards surfacing information about better matches, related matches and so on.
· Other techniques such as understanding sentence, word and paragraph structure, relevance ranking, and supporting ad hoc information access.
· Tools that can help them get contextual answers - which related actions they'd like to provide, and whether requests are geo-specific, device-specific, or for content by category/entity, are all important considerations.
· Multi-lingual extraction is also an important feature for any large information provider.
· Manual versus automated tagging trade-offs - a manual approach to tagging provides high precision at low volumes, while automated tagging handles high volumes at lower precision. There is a certain "golden mean" in which a combination of automated and manual tagging support can manage high-volume accuracy (comment: think of Dow Jones' new Consultant product, which uses search experts to fine-tune queries of content from Factiva databases for specific topic domains, problems and opportunities).
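To make the "golden mean" of tagging a little more concrete, here is a minimal sketch in Python of one way automated and manual tagging could be combined: an automated tagger labels every document with a confidence score, and only the low-confidence ones are queued for a human. This is my own illustration, not something Susan presented. The keyword lists, the `auto_tag` and `route` helpers and the 0.5 threshold are all hypothetical stand-ins for a real text-analytics engine.

```python
from dataclasses import dataclass


@dataclass
class TagResult:
    doc_id: str
    tag: str
    confidence: float


# Hypothetical keyword lists standing in for a real automated classifier.
KEYWORDS = {
    "finance": {"hedge", "fund", "securities", "investment"},
    "legal": {"litigation", "lawyer", "ediscovery", "court"},
}


def auto_tag(doc_id: str, text: str) -> TagResult:
    """Pick the tag whose keyword list overlaps most with the text.

    Confidence is the fraction of that tag's keywords found in the text.
    """
    words = set(text.lower().split())
    best_tag, best_score = "unknown", 0.0
    for tag, keywords in KEYWORDS.items():
        score = len(words & keywords) / len(keywords)
        if score > best_score:
            best_tag, best_score = tag, score
    return TagResult(doc_id, best_tag, best_score)


def route(results, threshold=0.5):
    """Split results into auto-accepted tags and a manual review queue."""
    accepted = [r for r in results if r.confidence >= threshold]
    review = [r for r in results if r.confidence < threshold]
    return accepted, review


if __name__ == "__main__":
    docs = {
        "d1": "hedge fund securities investment trends",
        "d2": "the lawyer called",
        "d3": "weather report",
    }
    results = [auto_tag(doc_id, text) for doc_id, text in docs.items()]
    accepted, review = route(results)
    print("auto-accepted:", [(r.doc_id, r.tag) for r in accepted])
    print("needs manual review:", [r.doc_id for r in review])
```

Raising the threshold sends more documents to the human taggers (higher precision, lower volume), while lowering it does the opposite, which is the trade-off described in the last bullet above.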
Solutions of interest mentioned by Susan
· Illumin8, from Elsevier, is an example Susan mentioned; it allows innovation professionals to extract information about opportunities for innovation and insert it into spreadsheets.
· Information management companies like Iron Mountain are looking at extracting the "atoms" of information, mixing them around using advanced search indexes and making them more reusable. Tagging them thoroughly up front is key to this process, so that they can "talk" across and within applications. For example, information about a single customer sits in many repositories and needs to be combined and atomized.
· Bing's Powerset search enrichment tools help to provide personally contextual information, such as restaurants in the area or the weather, when they understand that a query refers to a location (Google also, obviously). They may also list flights to and from that location, and so on.
· Temis entity extraction enables this in publications like Nature Publishing Group's online Chemistry publication, providing not only document references but also embedded content such as the chemical structure of referenced compounds.
· Attensity360 provides sentiment/opinion extraction to see what people are thinking about products and services. It is also important for monitoring traditional media to see the fluctuation of both mentions and sentiment.
· Autonomy Explore shows clusters of data in a graphical map and tag clouds.