November 25, 2024

OpenAI rolls out voice, image capabilities for ChatGPT: All the details

We are approaching the first anniversary of ChatGPT. In the 10 months or so since its debut, OpenAI has regularly rolled out new features to its AI chatbot. Now, OpenAI has announced a couple of new features that will make ChatGPT smarter. In a blog post, OpenAI announced that voice and image capabilities are coming to ChatGPT. “We are beginning to roll out new voice and image capabilities in ChatGPT. They offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about,” said the company. For example, ChatGPT users can snap pictures of their fridge and pantry to figure out what’s for dinner (and ask follow-up questions for a step-by-step recipe).
“We’re rolling out voice and images in ChatGPT to Plus and Enterprise users over the next two weeks. Voice is coming on iOS and Android (opt-in in your settings) and images will be available on all platforms,” said OpenAI.


Get ChatGPT to talk to you

Users can simply activate ChatGPT with voice prompts and engage in a back-and-forth conversation with the assistant. The new voice capability is powered by a new text-to-speech model, capable of generating human-like audio from just text and a few seconds of sample speech. “We collaborated with professional voice actors to create each of the voices. We also use Whisper, our open-source speech recognition system, to transcribe your spoken words into text,” said OpenAI in the blog post.

Show images and get ChatGPT to answer

You can now show ChatGPT one or more images. To focus on a specific part of the image, users can use the drawing tool in the mobile app. Image understanding is powered by multimodal GPT-3.5 and GPT-4. According to OpenAI, these models apply their language reasoning skills to a wide range of images, such as photographs, screenshots, and documents containing both text and images.


