ChatGPT Introduces Voice and Image Capabilities for Enhanced AI Interaction

September 26, 2023: OpenAI, led by Sam Altman, has added voice and image capabilities to ChatGPT. Users can now hold spoken conversations with the AI chatbot and share images to give it visual context, making for a more intuitive and interactive interface.

The new capabilities are part of OpenAI’s ongoing effort to expand what the chatbot can do. According to a statement from the company, they will roll out to Plus and Enterprise users over the next two weeks.

“Voice mode and vision for ChatGPT! Really worth a try,” Altman shared on X. “Voice is coming on iOS and Android (opt-in in your settings), and images will be available on all platforms,” the company confirmed.

The new voice capability is powered by a text-to-speech model that generates human-like audio from text input and a brief sample of speech. OpenAI worked with professional voice actors to create a range of distinct voices for the feature. Whisper, OpenAI’s open-source speech recognition system, transcribes the user’s spoken words into text.

For image understanding, ChatGPT uses multimodal GPT-3.5 and GPT-4 models, which let it analyze various types of images, including photographs, screenshots, and documents containing both text and visuals. These capabilities open the door to a range of creative and accessibility-focused applications.

However, OpenAI acknowledges the new risks that these capabilities introduce, such as the potential for malicious actors to impersonate public figures or commit fraud. To mitigate these risks, OpenAI has chosen to deploy this technology specifically for voice chat applications, collaborating closely with voice actors to ensure its responsible use.

One notable application of this technology is Spotify’s Voice Translation feature pilot, which assists podcasters in expanding their storytelling reach by translating podcasts into multiple languages, all while retaining the podcasters’ unique voices.

OpenAI has also implemented technical safeguards that limit ChatGPT’s ability to make direct statements about individuals, citing the system’s occasional inaccuracies and the importance of respecting individuals’ privacy.

As ChatGPT continues to evolve and adapt, it promises to offer users an even more immersive and versatile AI interaction experience.
