One of the most fundamental problems with phone-based voice assistants is how awkward it can be to use them in public. As much as I use my Harman/Kardon Invoke at home to control smart lights, get calendar info, and create Cortana reminders, I pretty much never use her on my Android phone for one pretty simple reason: it’s just a bit … weird to do so, at least in public. Apparently, Microsoft agrees, as the company has patented a module that can detect “silent” voice commands.
As described by the company’s patent filing, the “silent” input method picks up whisper-like speech produced while breathing in rather than out, captured by a microphone held very close to the mouth. The module can be placed in a range of devices, including smart watches, phones, a smart “ring,” regular headset microphones, and even a TV remote.
Although performance of voice input has been greatly improved, the voice input is still rarely used in public spaces, such as office or even homes. This is mainly because the voice leakage could disturb and even annoy surrounding people in quiet environment. On the other hand, there is still a risk of scattering private information to unintended audiences. These are not technical issues but social issues. Hence there is no easy fix even if voice recognition system performance is greatly improved.
Implementations of the subject matter described herein provide a silent voice input solution without being noticed by surroundings. Compared with conventional voice input solutions which are based on normal speech or whispering that use egressive (breathing-out) airflow while speaking, the proposed “silent” voice input method is performed by using opposite (ingressive or breathing-in) airflow while speaking. By placing the apparatus (e.g. microphone) of the apparatus very close to the user’s mouth with an small gap formed between the mouth and the apparatus, the proposed silent voice input solution can capture stable utterance signal with a very small voice leakage, and thereby allowing the user to use ultra-low volume speech input in public and mobile situations, without disturbing surrounding people. Besides of air flow direction (ingressive and egressive), all other utterance manners are same as our whispering, so that proposed method can be used without special practice.
As usual, note that patents don’t necessarily translate into products, but there have been a few rumors floating around recently that Microsoft isn’t done thinking about Cortana-focused hardware. We’ll just have to wait and see.