Update 08/05/24 at 11:10 a.m. ET: This post was updated to include a statement from an OpenAI spokesperson and more information from a Sunday blog post.
ChatGPT maker OpenAI has new search and voice features on the way, but it also has a tool at its disposal that’s reportedly pretty good at catching all those AI-generated fake articles you see on the internet nowadays. The company has been sitting on it for nearly two years, and all it would have to do is turn it on. Even so, the Sam Altman-led company is still weighing whether to release it, since doing so might anger some of OpenAI’s biggest fans.
This isn’t that defunct AI detection algorithm the company released in 2023, but something much more accurate. OpenAI is hesitant to release this AI-detection tool, according to a Wall Street Journal report published Sunday, based on anonymous sources inside the company. The program is effectively an AI watermarking system that imprints AI-generated text with subtle patterns a companion detector can pick out. Like other AI detectors, OpenAI’s system would assign a document a percentage score indicating how likely it is to have been created with ChatGPT.
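OpenAI hasn’t disclosed how its watermark actually works, but the description matches published statistical watermarking schemes, in which a secret key subtly biases which words the model picks and a detector later checks whether that bias shows up more often than chance. Here’s a minimal, purely hypothetical Python sketch of the idea; the key, the hash-based split, and the function names are all illustrative assumptions, not OpenAI’s method:

```python
import hashlib

# Toy illustration of a keyed statistical watermark, NOT OpenAI's
# actual (undisclosed) technique. A watermarking generator would be
# nudged toward "green" next words; ordinary human text lands on
# green only about half the time.

def is_green(prev_word: str, word: str, key: str = "secret-key") -> bool:
    """Keyed hash that splits word pairs into two halves."""
    digest = hashlib.sha256(f"{key}|{prev_word}|{word}".encode()).digest()
    return digest[0] % 2 == 0

def green_fraction(text: str, key: str = "secret-key") -> float:
    """Fraction of words in the keyed 'green' half. On a long
    document, a score well above 0.5 suggests watermarked text; a
    real detector would turn this into the percentage-style score
    described above."""
    words = text.split()
    if len(words) < 2:
        return 0.5  # too short to measure
    hits = sum(is_green(prev, word, key) for prev, word in zip(words, words[1:]))
    return hits / (len(words) - 1)
```

The statistics get stronger as the text gets longer, which is one reason a detector like this can be far more confident about a full essay than a one-line answer.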
OpenAI confirmed this tool exists in a Sunday update to a May blog post. The program is 99.9% effective, according to internal documents cited by the WSJ. That would be far better than the stated effectiveness of other AI detection software developed over the past two years. The company claimed that while the watermark holds up against localized tampering, such as paraphrasing, it can be circumvented by translating the text and translating it back with something like Google Translate, or by rewording it with another AI generator. OpenAI also said someone could get around the tool by asking the model to insert a special character in between every word and then deleting that character.
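That special-character trick works because the watermark lives in the statistical pattern of the words the model actually emitted: if you ask the model to interleave a marker between every word and then strip the marker out, the remaining word sequence is one the model never generated. The stripping step is trivial; in terms of the toy sketch above (with an assumed zero-width-space marker), it’s just:

```python
def strip_marker(text: str, marker: str = "\u200b") -> str:
    """Delete a marker the model was asked to insert between every
    word. The zero-width space here is an assumed example; removing
    it changes the word adjacencies the detector's statistics rely on."""
    return text.replace(marker, "")
```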
Internal proponents of the program say it will do a lot to help teachers figure out when their students have handed in AI-generated homework. The company reportedly sat on this program for years over concerns that close to a third of its user base wouldn’t like it. In an email statement, an OpenAI spokesperson said:
“The text watermarking method we’re developing is technically promising, but has important risks we’re weighing while we research alternatives, including susceptibility to circumvention by bad actors and the potential to disproportionately impact groups like non-English speakers. We believe the deliberate approach we’ve taken is necessary given the complexities involved and its likely impact on the broader ecosystem beyond OpenAI.”
The other problem for OpenAI is the concern that if the tool is released broadly enough, somebody could reverse-engineer the watermarking technique. There is also the risk that it might be biased against non-native English speakers, as we’ve seen with other AI detectors.
Google has also developed a similar watermarking technique for AI-generated images and text, called SynthID. That program isn’t available to most consumers, but at the very least, the company is open about its existence.
As fast as big tech is developing new ways to spit out AI-generated text and images onto the internet, the tools to detect fakes aren’t nearly as capable. Teachers and professors are especially hard-pressed to figure out if their students are handing in ChatGPT-written assignments. Current AI detection tools from companies like Turnitin have a failure rate as high as 15%, and that company says it accepts those misses to avoid falsely flagging students’ genuine work.
And it’s not just teachers feeling the sting of AI text generation. Gizmodo previously reported on a number of writing professionals who were falsely accused of using AI to finish their work and were subsequently fired. Researchers said the third-party AI detectors used in those cases are often far less reliable than advertised.