How to voice chat with ChatGPT: a guide to using new AI audio feature

TLDR

  • ChatGPT now has a voice chat feature, allowing users to communicate via human-like audio.
  • Voice chat is rolling out to ChatGPT Plus users, featuring four preset voices created with actors.
  • OpenAI addressed concerns by ensuring the voices won't mimic celebrities or public figures.

Science fiction has taught us that anything is possible, from flame-throwing robot dogs to conversations with humanoids and AI. OpenAI’s ChatGPT is attempting to make the latter a reality with its new voice function, blurring the line between human and machine interaction.

Does ChatGPT have voice chat?

ChatGPT now has a voice chat mode so that users can speak with its assistant. Whether the request is a bedtime story or help settling a dinner table debate, the feature can generate human-like audio from just text and a few seconds of sample speech.

At the end of July, the new advanced voice mode, which was showcased at the GPT-4o launch event on May 13, started rolling out to a select number of users subscribed to the premium membership, ChatGPT Plus. The rollout had been delayed after it appeared that the voice sounded similar to that of actress Scarlett Johansson.

In Her, Theodore, played by Joaquin Phoenix, falls in love with his phone’s operating assistant, voiced by the Hollywood star.

OpenAI then published a blog post stating that “AI voices should not deliberately mimic a celebrity’s distinctive voice” and denying that its ‘Sky’ voice was an imitation of Johansson, the highest-grossing female box office star of all time. Instead, the AI company claimed that the voice “belongs to a different professional actress using her own natural speaking voice.”

CEO Sam Altman stated on May 20 that “the voice of Sky is not Scarlett Johansson’s, and it was never intended to resemble hers.

“We cast the voice actor behind Sky’s voice before any outreach to Ms. Johansson. Out of respect for Ms. Johansson, we have paused using Sky’s voice in our products. We are sorry to Ms. Johansson that we didn’t communicate better.”

Despite OpenAI’s attempts to distance its creation from Johansson’s character in Her, Altman referenced the name of the movie when the company unveiled the new model, raising suspicion.

According to The Verge, OpenAI spokesperson Taya Christianson said that ChatGPT’s new mode will only use four preset voices it made with voice actors, adding, “We’ve made it so that ChatGPT cannot impersonate other people’s voices, both individuals and public figures, and will block outputs that differ from one of these preset voices.”

The new mode is expected to be available for all ChatGPT Plus users in the fall, according to Christianson.

Can ChatGPT generate voices?

One of the key new features of ChatGPT is that it can understand and respond to context. This means that it can generate voice-over content that is geared to specific video genres, styles, and even specific people.

The model uses deep learning techniques to analyze and produce natural language text. In essence, it is trained on a massive amount of text data and uses that information to generate new text that is relevant to the query.

In terms of creating voice-over content, it can be fed a script or a general idea of what is needed, and it produces a voice-over that sounds similar to a human voice. It can also be fine-tuned on a specific voice to generate a more human-like voice-over.

According to OpenAI, the firm collaborated with professional voice actors to create each of the voices.

How to activate ChatGPT voice

To get started with voice, go to Settings > New Features on the mobile app and choose voice conversations. Then, tap the headphone button located in the top-right corner of the home screen and choose your preferred voice out of five different voices.

The names of the voices are Sky, Juniper, Cove, Ember, and Breeze, all of which feature variations of an American accent.

The AI company revealed that in early 2023, it had partnered with independent, well-known, award-winning casting directors and producers. “We worked with them to create a set of criteria for ChatGPT’s voices, carefully considering the unique personality of each voice and their appeal to global audiences,” OpenAI said.

Examples of voices can be heard narrating a story about a cat and her kittens.

Image: an excerpt from the story used to demonstrate ChatGPT's voices, in which a mama cat named Lila tells her kitten, Milo, that he will soon have a baby sister. Caption: The five voices may be down to four soon

Sky

Juniper

Cove

Ember

Breeze

OpenAI reports in its FAQs that it is working to pause the use of Sky due to the ongoing issues.

How to create a voiceover with ChatGPT

The new conversational feature is currently available only in the ChatGPT app for ChatGPT Plus subscribers on iOS and Android. The first step is to download and install the app on a phone. After installing, a new chat can be started by tapping the ‘New Chat’ button. If the button isn’t visible, tap the three horizontal lines, which can be referred to as a “hamburger” button, to access the app’s main menu.

In a new chat thread, the user must provide ChatGPT with the text to be read. The text can be self-written, sourced from existing materials, or generated by ChatGPT itself.

If they choose to use external text, it should be pasted into the chat with an instruction for ChatGPT to hold the text without processing it for the time being.

To record the audio, use the built-in screen recorder on devices such as a Samsung phone. The specific screen recorder may vary by brand, and other recording apps are available in the Google Play Store if the default app isn’t adequate.

Once recording has started, activate ChatGPT’s conversational mode by tapping the headphone icon at the top right of the app. The user should then instruct ChatGPT to repeat the provided text verbatim. It is important to phrase this instruction correctly because, in some instances, ChatGPT may get confused. Asking it to “read the text I provided out loud” might produce different results, as it does not recognize that its text output is being converted into sound.

After saving the recording, the user has various options for using it. The video file can be imported into video editing software, where the video component can be removed, and the audio part can be kept.
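For readers who prefer a command-line route over a video editor, the audio-only step can be sketched in Python. This is a minimal sketch, not part of OpenAI's instructions: it assumes the free ffmpeg tool is installed and on the system PATH, and the file names and the helper names `build_extract_audio_command` and `extract_audio` are hypothetical.

```python
import shutil
import subprocess
from typing import List


def build_extract_audio_command(video_path: str, audio_path: str) -> List[str]:
    """Build an ffmpeg command that keeps the audio and drops the video.

    Both paths are placeholders chosen for illustration.
    """
    return [
        "ffmpeg",
        "-y",              # overwrite the output file if it already exists
        "-i", video_path,  # input: the saved screen recording
        "-vn",             # "no video": discard the video stream
        audio_path,        # output: ffmpeg picks the format from the extension
    ]


def extract_audio(video_path: str, audio_path: str) -> None:
    """Run ffmpeg to save only the audio track of a screen recording."""
    # Assumption: ffmpeg is installed separately; it does not ship with Python.
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    command = build_extract_audio_command(video_path, audio_path)
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(command, check=True)
```

Calling `extract_audio("recording.mp4", "voiceover.m4a")` would then write an audio-only file next to the recording, assuming a file named recording.mp4 exists. A video editor achieves the same result; the script is simply quicker when there are several recordings to process.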

What additional voice features does ChatGPT have?

GPT-4o real-time voice and vision will be rolling out to a limited Alpha for ChatGPT Plus users in a few weeks. Credit: OpenAI

There are some extra handy options with voice chat. To pause the conversation, tap the pause icon. If you need to interrupt the conversation while ChatGPT is speaking, you have two options: tap to interrupt or tap the stop icon.

To resume the conversation, tap the resume icon and begin speaking again.

If the conversation is muted, you can unmute it by tapping the corresponding icon.

When you’re ready to leave the voice conversation, tap the X icon. This will end the voice mode and return you to a text-based conversation with ChatGPT.

In terms of duration, a voice conversation can be paused, and there is no time limit imposed. However, you can only engage in one voice conversation at a time. You will remain in your current conversation until you either start a new one or switch to another existing conversation.

ChatGPT has no volume setting for voice conversations, as volume is controlled on the device itself.

All users having voice conversations will see a banner after their voice conversation has ended. This feedback survey collects information on the experience of the voice call, not about the conversation or its contents.

Only users on Plus will see the options to rate with the thumbs up/down included in that banner.

Once you enter a voice conversation it is hands-free until you exit the voice conversation. There are manual controls allowing you to pause, resume, and exit the voice conversation.

Is ChatGPT voice free?

All ChatGPT users already have free access to voice chats through the mobile app. GPT-4o and GPT-4 are available for use in voice conversations; however, GPT-4 has message limits for Plus and Team plans.

Meanwhile, GPT-4o real-time voice and vision is expected to be rolled out to a limited Alpha for ChatGPT Plus users in a few weeks. The company states that it will be widely available for ChatGPT Plus users over the coming months.

Featured image: Canva


Suswati Basu
Tech journalist

Suswati Basu is a multilingual, award-winning editor and the founder of the intersectional literature channel, How To Be Books. She was shortlisted for the Guardian Mary Stott Prize and longlisted for the Guardian International Development Journalism Award. With 18 years of experience in the media industry, Suswati has held significant roles such as head of audience and deputy editor for NationalWorld news, digital editor for Channel 4 News and ITV News. She has also contributed to the Guardian and received training at the BBC. As an audience, trends, and SEO specialist, she has participated in panel events alongside Google. Her…
