The Power Of Speech
The competition to be your personal digital assistant [Source: https://www.youtube.com/watch?v=KSQx0AA3vek]
Lee Iacocca's legendary turnaround of the Chrysler Corporation in the late 1970s and early 1980s is regarded as one of the greatest management stories of all time. Among the many quick and decisive actions Iacocca took at the then near-bankrupt company was bringing technology to Chrysler automobiles that would make consumers wake up and take notice - and make competitors squirm. One such gadget was the Electronic Voice Alert System (EVA). Mind you, most consumers of that era didn't know what a personal computer was - the Macintosh had not yet been released! Yet a consumer was sold on a Chrysler that would tell her if she had left her keys in the ignition or warn her that it was time for an oil change.
Many years have passed since then, but progress has been surprisingly slow. Perhaps that was because applications and systems did not demand that kind of interaction, and the technology was still evolving. In today's era of Cloud first, Mobile first and sensors everywhere, speech and gestures are in (with a bang)! We have an interesting confluence of technologies and applications. Hands-free operation and the need to access information at the point of action have led to an amazing array of speech integrations.
Amazon's digital assistant Alexa is expected to increase affinity for Amazon's products and services. Alexa has a neutral accent and, more importantly, the ability to understand my Indian accent. Gone are the days when I would get into the car and say, "Go Home", only to see "Golf Course Icons" on my in-car display! Today I can say whatever I like to digital assistants from different companies: "Play some Boney M on Pandora" to start the Pandora app and point me to the right song, "Hey Siri, wake me up at 5am" to set the alarm, or "Okay Google, will it rain on Sunday?" to get the weather forecast for that Sunday.
There are apps that read out credit card statements, or show a personalized video that explains my telephone bill charges for the month. And indeed, with rapidly evolving speech technologies, phones, tablets and other devices are increasingly able to support speech interfaces.
We have moved on from the monotone voice that told drivers 35 years ago that their door was ajar, or that they needed to fasten their seat belts, to Siri and Alexa, which at the very least have soothing, lifelike voices. They seem to have a sense of humor, and Siri, at times, outright ridicules you! In short, these digital assistants have a personality.
Vocabulary support and the recognition of varied voices and accents have improved dramatically. From the 11 spoken messages of the Chrysler car, we have moved to context-aware speech systems, fuelled by A.I., that can remember the last couple of interactions and interpret a new request in their context. The ability to work with context will be paramount as we integrate sensor information, sounds, location and gestures into these technologies. At Infosys, some of our experiments point to excellent results in creating a knowledge base that stores video and spoken expert discourses and helps newbies learn those topics through spoken, textual and image queries. More on that later. But for now, let's revel in what we have achieved so far!