OpenAI Expands ChatGPT with Voice and Image-Based Abilities

The development marks a notable evolution in the generative AI field, as OpenAI integrates voice-based assistant features with its powerful large language models.

By Sahil Pawar

September 27, 2023

OpenAI has announced significant enhancements to its popular generative AI assistant, ChatGPT, expanding its capabilities beyond text-based interactions. ChatGPT, known for generating essays, poems, and summaries from text prompts, is now set to support voice conversations and image-based searches.

ChatGPT can now see, hear, and speak. Rolling out over next two weeks, Plus users will be able to have voice conversations with ChatGPT (iOS & Android) and to include images in conversations (all platforms). https://t.co/uNZjgbR5Bm pic.twitter.com/paG0hMshXb
— OpenAI (@OpenAI) September 25, 2023

This development marks a notable evolution in the generative AI field, as OpenAI integrates voice-based assistant features with its powerful large language models (LLMs). Users can now hold voice conversations with ChatGPT, asking it questions aloud or making spontaneous spoken requests such as a bedtime story.

Use your voice to engage in a back-and-forth conversation with ChatGPT. Speak with it on the go, request a bedtime story, or settle a dinner table debate.

Sound on 🔊 pic.twitter.com/3tuWzX0wtS
— OpenAI (@OpenAI) September 25, 2023

The voice functionality is powered by a new text-to-speech model capable of producing human-like voices from text inputs. OpenAI collaborated with established voice actors to create five distinct voices and utilized the open-source Whisper speech recognition system to transcribe spoken words into text.
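The flow described above (speech transcribed to text, a text reply generated, the reply converted back to speech) can be sketched roughly as follows. This is only an illustrative outline, not OpenAI's implementation: the function names, the `VoiceTurn` type, the `voice-1` label, and the assumption that the language model sits between the two audio stages are all hypothetical, and the three stages are replaced by simple stand-in functions so the example runs without any external service.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    """One round of a voice conversation (hypothetical structure)."""
    transcript: str     # text produced by the speech recognizer
    reply_text: str     # text produced by the language model
    reply_audio: bytes  # synthesized speech for the reply


def run_voice_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    generate_reply: Callable[[str], str],
    synthesize: Callable[[str, str], bytes],
    voice: str = "voice-1",  # hypothetical label for one of the five voices
) -> VoiceTurn:
    """Chain the three stages: speech -> text -> reply text -> speech."""
    transcript = transcribe(audio_in)
    reply_text = generate_reply(transcript)
    reply_audio = synthesize(reply_text, voice)
    return VoiceTurn(transcript, reply_text, reply_audio)


# Stand-in stages (assumptions for illustration only). A real system would
# call a speech recognizer, a language model, and a text-to-speech model.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_generate_reply(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_synthesize(text: str, voice: str) -> bytes:
    return f"[{voice}] {text}".encode("utf-8")


turn = run_voice_turn(
    b"tell me a bedtime story",
    fake_transcribe,
    fake_generate_reply,
    fake_synthesize,
)
print(turn.reply_text)  # prints "You said: tell me a bedtime story"
```

Passing the stages in as functions keeps the sketch independent of any particular speech or language model; each stand-in could be swapped for a real component without changing the chaining logic.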

In addition to voice capabilities, ChatGPT users can utilize image-based queries. For example, they can upload an image and ask ChatGPT to provide explanations or instructions related to the image.

These new features will roll out to paying Plus and Enterprise subscribers over the next two weeks. To activate voice features, users must navigate to the app’s “settings” menu, select “new features,” and opt in to voice conversations. They can then choose their preferred voice by tapping the headphone button in the top-right corner.

Initially, voice capabilities will be available in the ChatGPT Android and iOS apps on an opt-in beta basis, while image search will be accessible by default on all platforms. This expansion signifies OpenAI’s commitment to enhancing user interactions with ChatGPT and making it a more versatile and interactive AI assistant.
