OpenAI has reportedly had a system for watermarking ChatGPT-created text, along with a tool to detect that watermark, ready for about a year, according to The Wall Street Journal. But the company is divided internally over whether to release it. On the one hand, it seems like the responsible thing to do; on the other, it could hurt its bottom line.
OpenAI’s watermarking is described as subtly adjusting how the model picks the most likely words and phrases to follow the ones that came before them, creating a detectable pattern. (That’s a simplification, but you can check out Google’s more in-depth explanation of Gemini’s text watermarking for more information.)
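OpenAI hasn’t published the details, but statistical watermarks of this kind described in the research literature generally bias token selection toward a pseudo-randomly chosen “green list” keyed to the preceding token, so a detector that knows the key can count how often the bias shows up. A minimal sketch of that general idea — every name, the vocabulary, and the threshold below are illustrative assumptions, not OpenAI’s actual method:

```python
import hashlib
import random

# Hypothetical sketch of a "green list" statistical watermark, an approach
# from the research literature -- NOT OpenAI's actual scheme.

def green_list(prev_token: str, vocab: list, fraction: float = 0.5) -> set:
    """Pseudo-randomly partition the vocabulary, keyed to the previous token."""
    seed = int.from_bytes(hashlib.sha256(prev_token.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    return set(rng.sample(vocab, int(len(vocab) * fraction)))

def watermarked_choice(prev_token: str, candidates: list, vocab: list) -> str:
    """Bias generation: prefer candidate tokens that fall on the green list."""
    green = green_list(prev_token, vocab)
    preferred = [c for c in candidates if c in green]
    return (preferred or candidates)[0]

def detect(tokens: list, vocab: list, threshold: float = 0.75) -> bool:
    """Flag text whose tokens land on the green list far more often than the
    ~50% expected by chance."""
    hits = sum(cur in green_list(prev, vocab) for prev, cur in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1) >= threshold
```

Note that the detector only needs the keying function, not the model itself — and paraphrasing defeats it precisely because rewording replaces the biased token choices.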
Providing a way to detect AI-written material is a potential boon for teachers trying to deter students from handing writing assignments off to AI. The Journal reports that the company found the watermark did not affect the quality of its chatbot’s text output. In a survey the company commissioned, “people around the world supported the idea of an AI detection tool by a margin of four to one,” the Journal writes.
After the Journal published its story, OpenAI confirmed it has been working on text watermarking in a blog post update today, spotted by TechCrunch. In it, the company claims its method is highly accurate (“99.9% effective,” according to documents the Journal viewed) and resistant to “tampering, such as paraphrasing.” But it says techniques such as rewording the output with another model make it “trivial to circumvention by bad actors.” The company also says it’s concerned the method could stigmatize use of AI writing tools by non-native speakers.
But OpenAI also appears concerned that watermarking could alienate ChatGPT users: nearly 30 percent of users it surveyed reportedly told the company they would use the software less if watermarking were implemented.
Despite this, some employees still believe the watermark is effective. Given that user sentiment, however, the Journal says some have suggested trying methods that are “potentially less controversial among users but unproven.” In today’s blog post update, the company said it is “in the early stages” of exploring embedding metadata instead. It says it’s still “too early” to know how well that will work, but that because the metadata is cryptographically signed, there would be no false positives.
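The blog post doesn’t say how signed metadata would be embedded, but the no-false-positives property follows from the cryptography itself: a verifier only accepts metadata whose signature checks out, so ordinary human-written text can never be misattributed. A rough sketch using an HMAC — a symmetric construction chosen here for brevity; a real deployment would more likely use public-key signatures so third parties can verify, and every name and format below is a hypothetical:

```python
import base64
import hashlib
import hmac
import json

# Hypothetical illustration of signed provenance metadata -- not OpenAI's format.
SECRET_KEY = b"provider-held signing key"  # assumed to stay with the AI provider

def attach_metadata(text: str, metadata: dict) -> str:
    """Append metadata plus a signature over (metadata, text)."""
    payload = json.dumps(metadata, sort_keys=True).encode()
    tag = hmac.new(SECRET_KEY, payload + text.encode(), hashlib.sha256).hexdigest()
    blob = base64.b64encode(payload).decode()
    return f"{text}\n<!--ai-meta:{blob}:{tag}-->"

def verify(stamped: str):
    """Return the metadata if the signature is valid, else None.

    Unsigned or tampered text always returns None -- hence no false positives.
    """
    text, sep, trailer = stamped.rpartition("\n<!--ai-meta:")
    if not sep or not trailer.endswith("-->"):
        return None
    blob, _, tag = trailer[:-3].partition(":")
    payload = base64.b64decode(blob)
    expected = hmac.new(SECRET_KEY, payload + text.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected):
        return None
    return json.loads(payload)
```

The trade-off, versus a statistical watermark, is that the metadata lives alongside the text rather than inside the word choices, so simply deleting the trailer removes the provenance entirely.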