When you are in the lead, it is easy to be overtaken, because you cannot see those behind you or gauge their pace. OpenAI, however, is pulling further ahead with the latest features added to its flagship chatbot, ChatGPT. Until now, most conversational assistants have been largely limited to text, but OpenAI is adding voice and image capabilities to its conversational AI. You will now be able to hold a voice conversation with ChatGPT, show it objects or images, and even get real-time advice for cooking or for helping your child with homework.
Imagine you are at home at the end of the day, staring into your open refrigerator and wondering what you could cook with what is there. Instead of searching online for recipes and hoping they match what you have, you can simply take a photo of the inside of your refrigerator and another of your pantry, then send them to ChatGPT.
This upgrade follows other recent additions, including the launch of DALL-E 3 and its integration with ChatGPT. The chatbot will now be able to see, hear, and speak, a first in the world of conversational AI. The goal is simple: make the user interface as intuitive as possible, so that you can communicate with the machine as naturally as possible.
One of the major improvements in this update is the introduction of voice conversations. OpenAI has made the feature easy to enable from the mobile app settings, where users will be able to choose from five different voices. Once that’s done, all you have to do is press the headphone button in the upper right corner of the home screen to start a voice chat. It is not yet known whether French will be supported at launch.
This step forward is more than a technological feat; it also has practical uses. You can talk with ChatGPT on the go, request a bedtime story for your child, or even settle a debate at the dinner table. OpenAI promises voices that sound surprisingly realistic. To achieve this, the company worked directly with voice actors to create the voices, an approach also meant to limit the abuse that overly convincing synthetic voices could enable.
Interacting with photos and images
If voice chat were the only update, that would already be impressive, but OpenAI doesn’t stop there. From now on, ChatGPT will also be able to work with images. Take the dinner scenario: you are at home wondering what to cook, so you photograph your fridge and pantry and ask ChatGPT for recipe suggestions.
The model, capable of “seeing” the images you have sent, identifies the available ingredients and suggests one or more recipes that you could make with them. For example, if ChatGPT identifies eggs, cheese, and vegetables, it might suggest a vegetable and cheese omelette. And if you have follow-up questions, like the best way to beat eggs or the optimal cooking time, the chatbot can provide you with step-by-step instructions, making the dinner preparation process more seamless and intuitive.
This ability to interact with multimedia content opens up many possibilities, especially in education. You can help your child solve a math problem by taking a picture of it, circling the part that is causing trouble, and asking ChatGPT for hints.
In short, OpenAI is not resting on its laurels. With these improvements, the American company seeks to maintain its lead. Plus and Enterprise users will get their first look at these new features in the coming weeks, and other user groups, including developers, will follow soon after. Beyond the technical aspects, these new features aim to create a richer and more intuitive user experience, thus bringing artificial intelligence closer to real human interaction.