What to know

  • GPT-4o’s System Card report outlines OpenAI’s fears about people becoming emotionally dependent on ChatGPT’s Voice Mode.
  • The voices on ChatGPT’s voice mode are highly lifelike and give the illusion that the user is talking to a person.
  • The report highlights that voice mode could help some individuals while negatively affecting others who depend on it extensively or over-rely on its answers.

ChatGPT’s advanced Voice Mode created ripples when it was first announced. But according to a recent safety report, OpenAI already fears that its advanced Voice Mode, which sounds eerily like a human voice, could lead to users getting emotionally hooked on the AI chatbot.

The advanced voice mode has been one of the most eagerly anticipated ChatGPT features. It can hold real-time conversations, make human sounds like ‘Umm’, ‘Ah’, and other interjections while talking, mimic sounds, sing songs, and even determine your emotional state based on your tone of voice. Simply put, it is the most lifelike AI assistant there is. But because it is such a powerful and capable feature, ChatGPT’s voice mode also has a high potential for misuse.

OpenAI mentioned in its report:

“During early testing… we observed users using language that might indicate forming connections with the model. For example, this includes language expressing shared bonds, such as ‘This is our last day together.’”

Voice Mode also opens up new ways for people to ‘jailbreak’ ChatGPT into doing all kinds of things, including impersonating people. Its humanlike conversational style could also make people more trusting of AI chatbots, even when their output is laden with errors and hallucinations.

Interaction with ChatGPT’s Voice Mode could benefit “lonely individuals” as well as help some people overcome their fear of social interactions. But extensive use could also affect healthy relationships, and this warrants further study, OpenAI notes in its report.

Although the report signals growing transparency on OpenAI’s part, the company still isn’t as open about its training data. Granted, OpenAI’s fears about what extensive use of Voice Mode could lead to are not unfounded. But over-reliance and dependence are risks built into most technology. And because Voice Mode seems straight out of a movie, it’s not hard to see what the future may hold for it and for how users interact with it.

Even at the demo event, voice mode was considered a little too flirtatious. Because of the resemblance of its Sky voice to that of Scarlett Johansson and the controversy that ensued, it’s hard not to invoke the movie Her, and the risks that come with the anthropomorphization of AI models.