AI Detection Tools Don't Work and Are Dangerous
High False Positives of AI Detectors Can Cause Serious Harm
Bloomberg has a story, “AI Detectors Falsely Accuse Students of Cheating—With Big Consequences”. Two-thirds of teachers use AI detection tools to check whether their students’ submissions were written by AI. And when a submission is flagged as AI-generated, the consequences can be serious: if you’re lucky, you only get a zero on that assignment, a bad grade in the course, and a permanently damaged relationship with that teacher. If you’re unlucky, things get much worse: probation for academic misconduct, or more.
Considering how bad the consequences of false positives can be, this should be one of those areas where the only acceptable false positive rate is 0%. A student should be innocent until proven guilty. However, the experiences of many students and experiments conducted by researchers show that the false positive rates are unacceptably high. Even if the false positive rates were as low as a few per cent (as claimed by the companies selling these tools, though those claims are likely overstated), that would still mean these tools have the potential to destroy the lives of a few students in every class in every college, as the back-of-the-envelope calculation below shows.
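To make that concrete, here is a minimal back-of-the-envelope sketch in Python. The false positive rate, class size, and number of assignments are illustrative assumptions, not measured values; the point is simply that even a “low” per-submission error rate compounds across many honest submissions.

```python
# Illustrative numbers only: a vendor-claimed "low" false positive
# rate, applied to one class over one course.
fpr = 0.02            # assumed 2% false positive rate per submission
students = 30         # assumed class size
assignments = 10      # assumed submissions per student per course

# Probability an honest student gets falsely flagged at least once.
p_flagged = 1 - (1 - fpr) ** assignments
expected_accused = students * p_flagged

print(f"{p_flagged:.0%} chance per honest student of at least one false flag")
print(f"~{expected_accused:.1f} falsely accused students per class of {students}")
```

With these assumptions, roughly 18% of honest students get flagged at least once, i.e. five or six students per class of thirty.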
And it is not just teachers who are using these tools. I have run into content creators whose work was rejected by customers because the customer ran it through an AI detection tool and it falsely flagged their human-written content as AI-generated.
Foolproof detection of AI-generated work is not currently possible, and I don’t believe it ever will be. Google has recently released SynthID, a tool that embeds watermarks in AI-generated content, which is supposed to make detecting such content foolproof. However, given how this technique works, it can be defeated rather easily by passing the content through another Gen AI tool to rewrite or rephrase it a bit. Watermarking would only work if all the Gen AI companies in the world agreed to embed such watermarks in all their output. A similar technique, called “printer tracking dots”, is used in many colour laser printers and copiers available today and can be used to identify which specific device produced a given printout. However, doing this in hardware like a printer is much easier than doing it in software: given the open-source movement, it is highly likely that someone somewhere will release tools to strip the watermark from Gen AI output if such watermarks ever become widespread.
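To see why a rephrasing pass defeats statistical watermarks, here is a toy sketch in Python of a generic “green list” watermark (in the style of the scheme published by Kirchenbauer et al.; SynthID’s actual mechanism differs, and the vocabulary, bias, and numbers here are all made up for illustration).

```python
import hashlib
import random

VOCAB = [f"w{i}" for i in range(1000)]  # toy vocabulary

def green_list(prev_token: str, frac: float = 0.5) -> set:
    # Deterministically pick a "green" half of the vocabulary,
    # seeded by the previous token.
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    return set(random.Random(seed).sample(VOCAB, int(len(VOCAB) * frac)))

def generate_watermarked(n: int, bias: float = 0.9) -> list:
    # Sample tokens, preferring the green list with probability `bias`.
    # A real model would apply a soft logit boost instead of this hard choice.
    rng = random.Random(42)
    tokens = [rng.choice(VOCAB)]
    for _ in range(n - 1):
        greens = green_list(tokens[-1])
        pool = list(greens) if rng.random() < bias else [w for w in VOCAB if w not in greens]
        tokens.append(rng.choice(pool))
    return tokens

def green_fraction(tokens: list) -> float:
    # Detector: what fraction of tokens fall in the green list chosen
    # by their predecessor? Unwatermarked text hovers around 0.5.
    hits = sum(tok in green_list(prev) for prev, tok in zip(tokens, tokens[1:]))
    return hits / (len(tokens) - 1)

text = generate_watermarked(500)
print(f"watermarked text:  green fraction = {green_fraction(text):.2f}")  # ~0.9

# "Paraphrasing" destroys the predecessor -> green-list coupling; here we
# crudely simulate it by replacing every token with a random one.
rng = random.Random(0)
paraphrased = [rng.choice(VOCAB) for _ in text]
print(f"paraphrased text:  green fraction = {green_fraction(paraphrased):.2f}")  # ~0.5
```

The detector only sees a statistical skew in token choices; rewrite the tokens and the skew vanishes. That is why a single pass through another rewriting tool is enough to launder the watermark away.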
Unfortunately, most people who use AI detectors are not aware of the problem of false positives. It doesn’t occur to most people that a tool like this can make mistakes. This is an example of a more general bias that most humans have: we assume that results produced by machines and algorithms are always accurate. This is called automation bias, and it is common enough to have its own Wikipedia page. For example, we all used pulse oximeters during the Covid pandemic. How many of us stopped to wonder whether the pulse oximeter reading was accurate? Did you know that its accuracy decreases depending on the colour of your skin? I bet this thought hadn’t occurred to most of you until just now—this is automation bias.
In fact, a year and a half ago I pointed out that ChatGPT is making our programmers dumb. That, too, was automation bias, and I see no sign of it slowing down since then. Our programmers continue to use ChatGPT to write programs and assume that the code the AI generates is bug-free (which it often is not).
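As a hypothetical illustration (not a real ChatGPT transcript), this is exactly the kind of plausible-looking generated code that sails through review: it passes the obvious tests and still harbours a bug.

```python
# Hypothetical AI-generated helper: looks correct and passes casual
# spot checks (2024 is a leap year, 2023 is not)...
def is_leap_year(year: int) -> bool:
    return year % 4 == 0

# ...but it is wrong: century years must also be divisible by 400.
assert is_leap_year(2024) and not is_leap_year(2023)
print(is_leap_year(1900))  # prints True, yet 1900 was NOT a leap year
```

A reviewer who trusts the machine waves this through; a reviewer trained to be suspicious writes the 1900 test case.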
The reason we have automation bias is that our older tools, like thermometers, BP machines, and weighing machines, had very high accuracy. Over hundreds of years of machine use we have developed instincts that tell us machines are more accurate and unbiased than humans. Unfortunately, with the rise of AI, this is no longer true. More and more of our tools and machines have AI inside them, and their biases and inaccuracies will increase. We need to start training our brains to be suspicious of the output of all machines.
Start by discontinuing the use of AI detector tools. A student’s or service provider’s work should not be flagged as cheating or rejected for being AI-generated. You can, of course, reject it for being bland and boring and soulless (which you should do regardless of whether it was AI-generated or human-written).