In the voice-assistant wars, Apple’s Siri has something that Google’s Assistant, Amazon Alexa, and Microsoft Cortana do not. It can speak 21 languages localized for 36 countries. It’s a “very important capability in a smartphone market where most sales are outside the United States,” according to Reuters.
Despite being the oldest voice assistant on the market, Siri has often been criticized for slipping behind the competition. Oren Etzioni, chief executive officer of the Allen Institute for Artificial Intelligence, for example, says Apple has squandered its lead when it comes to understanding speech and answering questions.
When it comes to language learning, however, Siri is still in the lead. While Siri can speak 21 languages, Cortana knows eight languages tailored for 13 countries. By contrast, Google’s Assistant can only speak four languages, while Amazon Alexa users are stuck with only two, English and German.
What's Apple's Secret?
Alex Acero, the head of the speech team at Apple, told Reuters about some of the secrets behind Siri’s success. For one, the company brings in “humans to read passages in a range of accents and dialects, which are then transcribed by hand so the computer has an exact representation of the spoken text to learn from.” The company also captures a range of sounds in a variety of voices. It then uses a language model that tries to predict word sequences.
Then Apple deploys “dictation mode,” its speech-to-text feature, in the new language, Acero said. When customers use dictation mode, Apple captures a small percentage of the audio recordings and anonymizes them. The recordings, complete with background noise and mumbled words, are transcribed by humans, a process that helps cut the speech recognition error rate in half.
After enough data has been gathered and a voice actor has been recorded to play Siri in a new language, Siri is released with answers to what Apple estimates will be the most common questions, Acero said. Once released, Siri learns more about what real-world users ask and is updated every two weeks with more tweaks.
I find Apple’s process of adding a new language to Siri’s repertoire fascinating, and the company’s commitment to localized languages is what interests me most.
While a company like Amazon struggles to add French to its voice assistant, Apple is working on introducing Shanghainese to Siri. This dialect of Wu Chinese is spoken only around Shanghai.
Now, if only Siri could do a better job at understanding speech and answering questions …
What types of features would you like added to Siri? Let us know in the comments below.