When William Orton, president of Western Union Telegraph, set his eyes (and ears) on a newfangled invention called the telephone back in 1876, he declared, “That invention is practically worthless. It will never amount to anything.”
Roughly a century and a half later, the world is not only smartphone-obsessed but also embracing a slew of new, artificially intelligent voice technologies. Deloitte reports that global sales of connected smart speakers saw an astonishing one-year growth rate of 187 percent in 2018, with a worldwide installed base of 250 million units projected by the end of 2019.
With these numbers, it’s worth wondering what this mass adoption of smart voice technology will look and sound like within the next few years.
Here’s a glimpse of that future courtesy of Al Lindsay, a pioneer of “talking tech” who literally helped Alexa find her voice. Now Amazon’s VP of Alexa Engine Software, he built and managed the Amazon team that created Alexa, Echo, and Dot.
He predicts voice technology will become more…
“Voice [technology] is still a little bit command-and-control. Where I see it growing is being a lot more conversational,” Lindsay told the audience at a recent RBC Disruptors event.
Within the next two years, Lindsay expects users to “be surprised – delighted, really – by how conversational (devices) will be. I’m gonna have a bit more of a conversation with my (digital) assistant, the way I would with a human assistant.”
That means voice tech that understands long, multi-layered sentences containing various ideas, meanings and requests instead of one limited question or command.
Gartner Research VP Annette Zimmermann believes virtual assistants will even be able to “read” human emotions.
In a 2018 report, she defined “emotional A.I.” as “emotion-sensing capabilities [that] will enable VPAs (virtual personal assistants) to analyze data points from facial expressions, voice intonation and behavioral patterns.”
Zimmermann predicts that “by 2022, your personal device will know more about your emotional state than your own family … This technology can be used to create more personalized user experiences, such as a smart fridge that interprets how you feel and then suggests food to match those feelings.”
Gartner also reports that personal assistant robots (PARs) from Qihan and SoftBank Robotics are being trained to “distinguish between, and react to, humans’ varying emotional states. If, for example, a PAR detects disappointment in an interaction, it will respond apologetically.”
By responding appropriately to the user’s emotions – and proactively making recommendations based on those feelings – the device shifts from a smart command center to more of a connected concierge.
Lindsay predicted voice tech will be virtually everywhere: an omnipresent, integrated, always-on tool for managing parts of our public environment. He also suggested “there will emerge a large number of helpful voice agents you interact with on a daily basis,” from your doctor’s office to your dry cleaner.
“Which industries should be thinking about voice? All of them,” Lindsay said.
Although he predicted every sector will eventually deploy voice technology, Lindsay said healthcare will be one of the earliest adopters. He sees VPAs replacing patient call buttons in hospitals, or allowing patients to self-manage more of their care at home.
He said voice will also gain early traction in the hospitality sector, with diners making requests like “bring me another Coke” to the VPA installed at their restaurant table.
Select hotels in the Marriott chain are already offering a virtual voice concierge. Instead of calling the front desk, guests can request things like room service, housekeeping and dinner recommendations via an in-room Amazon Echo device.
And listen up, businesses: Lindsay said every company, in every sector, will have to create its own distinct voice personality.
“You’re going to take your desired brand attributes and try to communicate that [through its virtual voice]. Do you want it to be serious or sort of fun? You can experiment with your customers, test and learn.”
As with any technology, voice will inevitably suffer some hiccups. For example, while Lindsay pointed out users can review and delete anything Echo records in the cloud, all voice-based service providers are grappling with ongoing issues around data collection, usage and privacy.
Voice’s biggest future challenge, in Lindsay’s expert opinion, is “to get to the point where talking to a computer feels like talking to a human.”
To get there, machines must move beyond learning what each word means to decoding the intention, context and spirit of a conversation.
“You have to figure out the meaning,” Lindsay explained. “Knowing the words is never enough.”
As rock band Extreme so presciently sang in their 1990 ballad:
More than words
Is all you have to do
To make it real.
Don’t know that song? Don’t fret. Just turn to your virtual assistant and say, “Play More Than Words by Extreme.” You’re welcome.