What to know

  • OpenAI already has a text watermarking method, though its release is being ‘considered’ and debated internally.
  • Recent reports suggest text watermarking doesn’t affect ChatGPT’s output quality, and the method remains robust against localized tampering and paraphrasing.
  • But OpenAI worries text watermarking would disproportionately impact non-English speakers, stigmatize the chatbot’s use, and discourage people from using ChatGPT.

Students using ChatGPT to write assignments are a nightmare for teachers and professors. But it appears OpenAI might have a way to tell whether a given text was generated by ChatGPT.

According to The Wall Street Journal, OpenAI already has a system to watermark text generated by ChatGPT. But it’s still deciding whether to release it, citing several issues and complexities involved, as well as the risk of putting off users who don’t want to be found out using AI-generated text.

In an updated blog post about its research on AI text detection, OpenAI states: “Our teams have developed a text watermarking method that we continue to consider as we research alternatives.”

Text watermarking methods are not without their issues and complexities, and they can negatively impact AI-generated content. But the company claims its text watermarking method, developed exclusively for ChatGPT-generated content, is highly accurate with a very low false positive rate.

According to the WSJ, the “technology… can detect text written by artificial intelligence with 99% certainty,” and the watermarking method reportedly doesn’t affect ChatGPT’s output quality.

The text watermarking method works by slightly adjusting how the AI selects words, creating a predictable pattern in how words and phrases appear.
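OpenAI hasn’t published the details of its method, but a well-known approach from academic research (a “green list” scheme) illustrates the idea: the previous word deterministically marks half the vocabulary as preferred, generation leans toward those preferred words, and a detector measures how often that bias shows up. The minimal sketch below is purely illustrative; the tiny vocabulary and names like `green_list` are assumptions, not OpenAI’s design.

```python
import hashlib
import random

# Toy vocabulary standing in for a real model's token set (illustrative only).
VOCAB = ["quick", "fast", "rapid", "swift", "speedy", "brisk",
         "happy", "glad", "joyful", "cheerful", "merry", "content"]

def green_list(prev_word: str) -> set:
    """Deterministically mark half the vocabulary 'green', keyed on the previous word."""
    seed = int.from_bytes(hashlib.sha256(prev_word.encode()).digest()[:8], "big")
    return set(random.Random(seed).sample(VOCAB, len(VOCAB) // 2))

def watermarked_choice(prev_word: str, candidates: list) -> str:
    """Generation step: prefer a candidate from the green list when one exists."""
    greens = [w for w in candidates if w in green_list(prev_word)]
    return greens[0] if greens else candidates[0]

def green_fraction(words: list) -> float:
    """Detection step: fraction of words drawn from their predecessor's green list.
    Watermarked text scores near 1.0; ordinary text hovers near 0.5."""
    hits = sum(1 for prev, w in zip(words, words[1:]) if w in green_list(prev))
    return hits / max(len(words) - 1, 1)
```

Because the word-level bias is statistical, a detector needs a reasonably long passage to be confident, which is consistent with why short or heavily edited texts are harder to flag.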

Although the method proves effective against paraphrasing and localized tampering, OpenAI notes “it is less robust against globalized tampering; like using translation systems, rewording with another generative model, or asking the model to insert a special character in between every word and then deleting that character – making it trivial to circumvention by bad actors.”

To address these issues, OpenAI has started work on an alternative watermarking method that uses embedded metadata and produces no false positives, whereas text watermarking can produce more false positives when applied to large volumes of text.
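OpenAI hasn’t described how its metadata method would work, but cryptographic signing illustrates why a metadata approach can avoid false positives by construction: verification either matches exactly or it doesn’t, so honest text is never mistakenly flagged. The sketch below is a hypothetical illustration using an HMAC; the key handling and function names are assumptions, not OpenAI’s scheme.

```python
import hashlib
import hmac

# Hypothetical signing key, assumed to be held privately by the AI provider.
SECRET_KEY = b"provider-held signing key"

def sign_text(text: str) -> str:
    """Produce a metadata tag (an HMAC) shipped alongside generated text."""
    return hmac.new(SECRET_KEY, text.encode(), hashlib.sha256).hexdigest()

def verify(text: str, tag: str) -> bool:
    """A match proves the provider signed exactly this text; any edit breaks it."""
    return hmac.compare_digest(sign_text(text), tag)
```

The trade-off is the mirror image of statistical watermarking: there are no false positives, but stripping the metadata or editing a single character removes the provenance signal entirely.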

As per its research, OpenAI also feels that the text watermarking method could disproportionately impact some groups, for instance by stigmatizing the use of AI chatbots as a writing tool among non-native speakers.

But more than that, the company fears many current users would use ChatGPT less if the watermarking method were implemented. For now, the internal debate on whether to release the text watermarking method continues. As for the alternative metadata watermarking method, OpenAI is still in the early stages of exploration, so it’s too early to tell whether that approach will be effective.