ChatGPT’s new voice mode is giving ‘Her’ vibes

The OpenAI logo appears on a mobile phone in front of a computer screen with random binary data.
(Michael Dwyer / Associated Press)

The days of an interactive, almost-human virtual assistant could be coming sooner than you think.

Tech company OpenAI has unveiled the latest update to ChatGPT, which now includes a voice mode that allows users to communicate more conversationally with the AI system. In a video posted Monday on X by OpenAI Chief Executive Sam Altman, company officials ask ChatGPT to tell them a bedtime story involving robots and romance.

“Ooh, a bedtime story about robots and love?” ChatGPT responds in a cheerful female voice. “I’ve got you covered!”

The system proceeds to tell a story about a curious robot “in a world not too different from ours,” and then pivots to different voices when company officials periodically interrupt to ask it to speak more dramatically, in a robot-like voice or in a sing-song way.

The new update, known as GPT-4o, quickly received comparisons to the 2013 Spike Jonze movie “Her,” starring Joaquin Phoenix, in which a lonely man falls in love with his virtual assistant Samantha, voiced by Scarlett Johansson. Even Altman appeared to refer back to the film, saying in a blog post that it “feels like AI from the movies; and it’s still a bit surprising to me that it’s real.”

(But that movie isn’t entirely rosy about AI taking on the role of a human companion, cautioned Wired Executive Editor Brian Barrett in a column titled “I Am Once Again Asking Our Tech Overlords to Watch the Whole Movie.” In the column, Barrett notes that at least one OpenAI employee heeded that advice, quoting a tweet in which the employee said that re-watching “Her” “felt a lot like rewatching Contagion in Feb 2020.”)

“Getting to human-level response times and expressiveness turns out to be a big change,” Altman wrote.

Previous versions of ChatGPT were text-based, with users typing questions to the system and receiving written responses instantly. Past attempts to make the system give more human-like responses, beyond simple fact regurgitation or rudimentary stories, were largely rebuffed by ChatGPT.

Though bedtime tales about robots and love seem benign, AI and its potential effect on jobs is a pressure point in Hollywood and played a major role in last summer’s dual strikes led by the Writers Guild of America and the Screen Actors Guild-American Federation of Television and Radio Artists.

OpenAI, in particular, has not been shy about courting the entertainment industry and has met with studio and talent agency executives to discuss another of its products, Sora, an AI tool that uses text-based prompts and turns them into visuals that can be cinematic in quality.

Recently, indie pop artist Washed Out used Sora, which is not yet publicly available, to create a four-minute music video for the song “The Hardest Part.” The music video zooms through scenes from a couple’s life that are completely AI-generated.

Beyond Hollywood, other industries, such as fast food, are also flirting with AI. Fast food operators are now looking to AI to handle drive-through orders or run walk-up self-service kiosks to reduce the financial effect of California’s new $20 minimum wage for restaurant workers in certain establishments.
