Launched in beta last January, the voice generation tool from ElevenLabs sketches a future where any voice can be created from scratch or even cloned in a few clicks. Inevitably, such an innovation is not viewed favorably by creative and entertainment professionals.
Generative AI is slowly creeping into the world of audio. The start-up ElevenLabs, which launched its tool at the beginning of the year, has developed its own audio AI models as part of a suite of tools with dizzying possibilities. The company enables anyone to create voices from scratch and turn any text into speech in more than 30 languages. Think text-to-speech, but with the power of machine learning.
After raising 19 million dollars last June, ElevenLabs has officially launched the tool to the general public. The company is, however, in the crosshairs of several industries tied to entertainment, and also to education. Founded by Mati Staniszewski, who worked at Palantir, and Piotr Dabkowski, a former engineer at Google, the start-up has drawn heavy criticism over its flagship feature: voice cloning.
AI narrates The Great Gatsby.
Listen to a fragment from the classic by F. Scott Fitzgerald. Narrated by a fully AI-generated voice. No corrections were made. pic.twitter.com/vQdorBjQK6
— ElevenLabs (@elevenlabsio) January 29, 2023
While you can train the model on your own voice and hear yourself speak perfect Portuguese, other users have put it to more unfortunate uses.
Voice cloning, a controversial feature
Who would have thought that cloning anyone's voice would lead to abuse? In reality, everyone. It took less than a month for malicious users of the 4chan platform to misuse the tool. Several of them were able to clone the voices of celebrities such as Emma Watson, Joe Rogan or even the much-criticized Ben Shapiro, and make them utter racist remarks.
ElevenLabs then stepped up to the plate with several measures to prevent this type of hijacking: reserving voice cloning for paid subscriptions (which currently start at 1 dollar per month as part of a launch promotion), offering tools to detect AI-generated audio, and tightening moderation on its own platform.
The entertainment industry holds its breath
Music is not spared, as we have seen recently with the many users who have made Squeezie, Freddie Mercury or even Frank Sinatra sing covers of Dua Lipa, Michael Jackson and even… Snow Queen. On this subject, Asia is one step ahead: HYBE, the company behind the K-pop phenomenon BTS, bought the AI company Supertone last October in order to clone the voices of the group's various singers for "digital content that expresses comfort and emotion to fans".
Overall, the entire creative and entertainment industry is holding its breath. A tool this powerful could make it possible to produce new dubs or even audiobooks without the work of voice actors. According to the website Motherboard, more and more actors are being pressured to give up the rights to their voices, an issue at the center of the profession's current strike in Hollywood. Their concern is understandable: the time spent localizing voices for dozens of countries could be drastically reduced in video games and cinema, even though this work is the craft of an entire profession.