OpenAI has created a tool that could catch students who cheat by asking ChatGPT to write their assignments, but according to The Wall Street Journal, the company is debating whether to actually release it.
In a statement to TechCrunch, an OpenAI spokesperson confirmed that the company is studying the text watermarking method described in the Journal article, but said it is taking a “deliberate approach” due to “the complexities involved and its likely impact on the broader ecosystem beyond OpenAI.”
“The text watermarking method we are developing is technically promising, but it presents significant risks that we are weighing as we research alternatives, including susceptibility to circumvention by malicious actors and the potential for disproportionate impact on groups such as non-English speakers,” the spokesperson said.
This would be a different approach from most previous attempts to detect AI-generated text, which have been largely ineffective. Even OpenAI itself shut down its previous AI text detector last year due to its “low accuracy rate.”
With text watermarking, OpenAI would focus exclusively on detecting writing from ChatGPT, not from other companies’ models. It would do this by making small changes to the way ChatGPT selects words, essentially embedding an invisible watermark in the writing that could be detected later by a separate tool.
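OpenAI has not published the details of its scheme, but a well-known idea from the research literature illustrates how word-selection watermarking can work: a secret key and the previous word deterministically mark part of the vocabulary as “green,” generation prefers green words, and a detector holding the same key flags text whose green-word rate is far above chance. The sketch below is purely illustrative; the toy vocabulary, key, and threshold are invented for this example and bear no relation to OpenAI’s actual method.

```python
import hashlib
import random

# Toy vocabulary standing in for a real model's token set (illustrative only).
VOCAB = ["the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog",
         "a", "swift", "clever", "hound", "leaps", "above", "sleepy", "cat"]

def green_list(prev_token: str, fraction: float = 0.5) -> set:
    """Derive a keyed 'green' subset of the vocabulary from the previous token.

    The secret key (a fixed string here) is what lets only the key holder
    detect the watermark later.
    """
    seed = int(hashlib.sha256(f"secret-key|{prev_token}".encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return set(rng.sample(VOCAB, int(len(VOCAB) * fraction)))

def generate_watermarked(first_token: str, length: int) -> list:
    """Generate text that always picks a 'green' next word (a stand-in for
    gently biasing a real model's word choices)."""
    tokens = [first_token]
    for _ in range(length):
        tokens.append(sorted(green_list(tokens[-1]))[0])
    return tokens

def detect(tokens: list, threshold: float = 0.9) -> bool:
    """Flag text whose green-word rate far exceeds the ~50% chance rate
    expected of unwatermarked text."""
    hits = sum(1 for prev, tok in zip(tokens, tokens[1:]) if tok in green_list(prev))
    return hits / max(len(tokens) - 1, 1) >= threshold
```

Because the green list is recomputed from each preceding word, rewording even part of the text breaks the chain locally, which is why the approach survives light paraphrasing better than wholesale translation or regeneration.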
After the Journal article was published, OpenAI also updated a May blog post about its AI-generated content detection research. The update states that text watermarking has proven to be “highly accurate and even effective against localized tampering, such as paraphrasing,” but “less robust against globalized tampering, like using translation systems, rephrasing with another generative model, or asking the model to insert a special character in between every word and then deleting that character.”
As a result, OpenAI writes that this method is “trivial for bad actors to circumvent.” OpenAI’s update also echoes the spokesperson’s point about non-English speakers, writing that the text watermark could “stigmatize the use of AI as a useful writing tool for non-English speakers.”