Turnitin’s AI writing–detection tool has a higher false positive rate than the company originally asserted, according to Annie Chechitelli, the company’s chief product officer. When the product was released in April, Turnitin promoted it as having a false positive rate of less than 1 percent. The company has not disclosed the new document-level false positive rate.
“We remain steadfast in our strategy to be transparent with the education community about our findings and progress,” Chechitelli wrote. Turnitin attributes the discrepancy to the difference between the company’s lab testing and users’ real-world experiences.
When Turnitin’s AI-detection tool reports that less than 20 percent of a document was written by a machine, it has a higher incidence of false positives, according to the statement. The company will now add an asterisk to such results with a message casting some doubt on them.
Turnitin’s AI-detection tool reports two statistics—one at the document level and one at the sentence level. The sentence-level false positive rate is approximately 4 percent, according to Chechitelli.
But the tool appears to have particular trouble with text that mixes AI-generated and human-written prose. More than half (54 percent) of human-written sentences falsely flagged as AI-generated sit immediately next to AI-written sentences, according to the statement, and more than one-quarter (26 percent) sit two sentences away from an AI-written sentence.
The company plans to continue experimenting and testing in the coming months.
“As part of our continual improvement as [large language models] and AI writing continue to evolve, our metrics may change,” Chechitelli wrote. “We understand that as an education community, we are in uncharted territory.”