A Brief History of Voice Recognition

If you are too busy to make that call to book your salon appointment or restaurant table, Google’s newly updated virtual assistant just might be able to help you out. This sophisticated technology can make complicated calls, complete with hesitant ‘mmhmms’ and ‘errs’.

Google made the announcement at Google I/O, its yearly developer event, held since 2008 to share new tools and strategies with creators of products that work with Google software and hardware.

Looking into the history of voice recognition takes us back to 1952, when Bell Laboratories created Audrey, which could understand digits, but only from a single voice. In 1962, IBM came out with Shoebox, which, apart from the digits 0 through 9, could decipher 16 spoken words.

In the 1970s, Carnegie Mellon University created the Harpy speech-understanding system. Harpy could understand 1,011 words, roughly the vocabulary of an average three-year-old. It was also designed with a more efficient search approach, an early form of beam search, which scored entire sentences rather than isolated words; in other words, it recognized the importance of context.
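Harpy’s search idea lives on in modern recognizers as beam search: rather than locking in the likeliest word at each step, the decoder keeps several candidate sentences alive and scores them as wholes. Here is a minimal Python sketch of that idea; the word lists and the toy scoring function are invented for illustration, and this is not Harpy’s actual implementation.

```python
# A minimal beam-search sketch (illustrative only, not Harpy's actual code).
# Instead of committing to the single best word at each step, keep the
# beam_width most promising partial sentences and extend each one.
import heapq

def beam_search(candidates_per_step, score_fn, beam_width=3):
    """candidates_per_step: a list of word lists, one per time step.
    score_fn: scores a partial sentence (higher is better)."""
    beams = [((), 0.0)]  # (partial sentence, score)
    for candidates in candidates_per_step:
        expanded = [
            (seq + (word,), score_fn(seq + (word,)))
            for seq, _ in beams
            for word in candidates
        ]
        # Prune: keep only the beam_width best partial sentences.
        beams = heapq.nlargest(beam_width, expanded, key=lambda b: b[1])
    return beams[0][0]  # the best complete sentence

# Toy scorer that rewards a contextually sensible word pair.
def score(seq):
    return sum(1 for a, b in zip(seq, seq[1:]) if (a, b) == ("recognize", "speech"))

steps = [["wreck", "recognize"], ["speech", "beach"]]
print(beam_search(steps, score))  # ('recognize', 'speech')
```

Because whole sentences are compared, the implausible but locally tempting candidates (“wreck a nice beach”) lose out to the contextually coherent one.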


The 1980s saw some advances, but the limited processing power of computers still held back any major breakthrough. Bell, for example, increased the number of voices that could be recognized, but that was about it; only words spoken slowly, one at a time, were understood.

In the 1990s, Dragon Systems came out with DragonDictate, which carried a steep price of $9,000. It used ‘discrete speech’, requiring the user to pause between each spoken word. An improved version, Dragon NaturallySpeaking, arrived seven years later; it could recognize continuous speech at 100 words per minute, though it still needed some training, and the price, now $695, was still steep.

We can thank the 2000s, because that is when Google made voice recognition a research priority. And when a Stanford Research Institute spin-off led by Dag Kittlaus, Adam Cheyer, and Tom Gruber began to show results, Apple bought it in 2010, and Siri was born. As the voice recognition built into Apple products, Siri has propelled our generation into voice-based web search.

Kittlaus, Cheyer, and Gruber, in their turn, founded Viv Labs and designed a product called ‘the Global Brain’, which works towards something similar to what Google has now delivered.

According to Grand View Research, a US consultancy, the global market for voice recognition will reach $127.58 billion by 2024. As the tech giants Google, Apple, and Microsoft continue introducing voice-based products, keyboards, switches, and buttons might soon disappear, making interaction with machines seamless. Google Home’s Assistant helps control home appliances, while Amazon Echo’s Alexa helps users shop from home.

Listening to, processing, and decoding language are among the highest functions performed by the human brain, and getting AI to do all three has been a long-drawn challenge as well as a learning process. Every language comes with its own dialects, accents, and pronunciations. Background noise, context, and multiple overlapping voices complicate things further.

Though voice recognition technology is advancing into spaces such as smart devices, cars, and homes, it is not yet prevalent in everyday use; only about a third of smartphone users, for example, rely on speech to get around. With the advent of Google’s new product, however, this could change. When it comes to securing online identification in particular, voice-based biometric measures might one day rival passwords as the safest option. Yet, with user data breaches still in the news, voice recognition software raises questions about user privacy, big data, and misinterpretation.

On the home front, India, with its myriad languages and dialects, might be much slower on the uptake when it comes to adopting voice recognition software into daily life. However, efforts are being made towards this end. Four years ago, IITians Subodh Kumar, Kishore Mundra, and Sanjeev Kumar started a speech-recognition software company called Liv.ai. The company supports 10 regional languages and is working on adding more. Its speech-recognition application programming interface (API) is being used by 500 business-to-business (B2B) and business-to-consumer (B2C) developers.
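To picture what developers do with such an API, here is a minimal sketch using the open-source Python SpeechRecognition library as a generic stand-in; this is not Liv.ai’s actual API, and the audio file name and the Hindi language code are assumptions made for illustration.

```python
# Illustrative only: transcribing a short audio clip with the open-source
# SpeechRecognition library (a generic stand-in, not Liv.ai's API).
import speech_recognition as sr

recognizer = sr.Recognizer()

# "sample.wav" is a hypothetical recording; "hi-IN" requests Hindi.
with sr.AudioFile("sample.wav") as source:
    audio = recognizer.record(source)  # read the entire clip into memory

try:
    text = recognizer.recognize_google(audio, language="hi-IN")
    print("Transcript:", text)
except sr.UnknownValueError:
    print("Speech was unintelligible")
except sr.RequestError as e:
    print("Recognition service error:", e)
```

Swapping the language code is all it takes to target another regional language, which is why broad language support of the kind Liv.ai is building matters so much for developers.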

Navanwita Bora Sachdev

Navanwita is the editor of The Tech Panda who also frequently publishes stories in news outlets such as The Indian Express, Entrepreneur India, and The Business Standard.
