Amazon Alexa is set to become much smarter, holding far more natural and fluid conversations while performing ever more complex tasks. In essence, it will be as if you were talking with ChatGPT.
An assistant smarter than ever, one that lets you chat out loud as you would with ChatGPT. The shortcut is a bit crude, of course, but it describes quite well what Amazon Alexa is set to become on connected speakers. In any case, that is the future we are heading toward, if we are to believe the promising presentation by David Limp, Amazon's senior vice president in charge of the company's devices and services.
The executive announced a new large language model (LLM) on which the Alexa voice assistant will now rely, an LLM "specifically optimized for voice". So not only will you be dealing with a more intelligent voice assistant, but it will also sound much more natural, almost human, in the way it expresses itself and interacts with you.
In the demonstration video below, we even see that the AI can be interrupted mid-sentence without losing the thread: it keeps following the conversation and continues it fluidly.
According to Amazon, you will be able to have longer conversations (with longer sentences) and request relatively complex tasks, similar to what you can already do with powerful conversational agents like ChatGPT or Google Bard.
“Alexa, let’s chat”: a more natural discussion
Another demo video illustrates the future capabilities of Amazon Alexa quite well. In one passage, a woman addresses the voice assistant with a casual "Alexa, let's chat". She then asks the artificial intelligence to draft an invitation to a dinner party built around a detective-mystery theme, specifying, with fairly natural phrasing, that she wants a rather mysterious text.
Alexa complies and reads out the first draft it has prepared. Satisfied, the woman follows up directly by asking it to send the text to her phone. Without pausing, she keeps talking to her voice assistant, asking it to lower the curtains on the living room windows. Finally, the conversation ends with her asking Alexa to name a famous jazz piece. The AI answers, and the woman simply responds: "cool, play it".
This demonstration comes from an obviously promotional video, with no room for surprises. To Amazon's credit, however, colleagues in attendance hailed an even more impressive on-stage demonstration, "with a real conversation worthy of two humans". To achieve this, Amazon notably claims to have studied at length all the subtleties of spoken conversation.
"In any conversation, we process tons of additional information, such as body language, knowing the person you're speaking with, and eye contact. To enable this with Alexa, we merged data from an Echo's sensors — the camera, voice data, its ability to detect presence — with AI models that can understand these nonverbal signals. We also focused on reducing latency so that conversations flow naturally, without pauses, and responses are of a voice-appropriate length, not the equivalent of a paragraph read aloud."
The capabilities unlocked by the new large language model will roll out starting in 2024. In the United States, some users will get the chance to try it in preview. In France, we will no doubt have to wait a little longer before the new LLM masters the official language of the country.