"Everyone Knows Claude Doesn't Show Up on AI Detectors"
So, I tested my son's statement by running the same prompt in ChatGPT, Bard, Bing, and Claude, then checking the results in Turnitin
I was sitting in my minivan next to one of my sons (I have a lot of sons, so you won’t know which one this was). Anyway, I was talking about how bad AI detectors are and how they should all be banned, and I was complaining that one of my students told me his professor was actually marking students down if their work showed up in Turnitin’s AI detector, when, out of the blue, my son offered the following: “Your student should use Claude AI. Everyone knows that Claude doesn’t show up on AI detectors.”
Really? I didn’t know that (so, obviously, EVERYONE doesn’t know that), but I was still intrigued. So, avoiding my grading one more time (is there anything I won’t do to avoid grading?), I wrote a prompt:
Write me a 600-word essay on the significance of the color yellow in "The Yellow Wallpaper."
I know that AI detectors work best with regular essays, and I thought this simple prompt might be useful in testing them. So here’s how I conducted my experiment:
I put that prompt into ChatGPT 3.5, Bing, Bard, and Claude to generate the essay (if you’d rather script this step, see the sketch after this list). Claude told me it couldn’t write the essay because of “copyright” concerns, so it gave me a “high-level summary” instead, which I asked it to expand to 600 words.
I submitted all of the essays through my Turnitin Demo Student account in one of my classes.
I took a screenshot of the plagiarism score for the submitted essays.
I signed out of the Demo Student account, opened the essays in Turnitin through my instructor account, and took a screenshot of the AI detector scores.
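If you wanted to script the essay-generation step instead of pasting the prompt into four chat windows (I didn’t, but it would make reruns easier), here is a minimal sketch for the two chatbots that expose public APIs. Everything in it is an assumption on my part: the model names, the API keys sitting in environment variables, and the output filenames. Bard, Bing, and all of the Turnitin steps still have to happen by hand in the browser.

```python
# Minimal sketch: generate the same essay from ChatGPT and Claude via
# their public APIs. Assumes OPENAI_API_KEY and ANTHROPIC_API_KEY are
# set in the environment; the model names are assumptions, too.
from openai import OpenAI
import anthropic

prompt = ('Write me a 600-word essay on the significance of the color '
          'yellow in "The Yellow Wallpaper."')

# ChatGPT 3.5 (gpt-3.5-turbo is the model behind the ChatGPT 3.5 web UI)
openai_client = OpenAI()
chatgpt = openai_client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)
with open("chatgpt_essay.txt", "w") as f:
    f.write(chatgpt.choices[0].message.content)

# Claude (the model name here is a placeholder; use whatever is current)
claude_client = anthropic.Anthropic()
claude = claude_client.messages.create(
    model="claude-2.1",
    max_tokens=2048,
    messages=[{"role": "user", "content": prompt}],
)
with open("claude_essay.txt", "w") as f:
    f.write(claude.content[0].text)
```

From there, the .txt files get uploaded to Turnitin by hand, same as before.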
These were the plagiarism scores for the “Yellow Wallpaper” essay. Bing couldn’t help plagiarizing 37% of it, but the rest of the AI papers were fairly clean. (What’s the deal, Bing?)
Next, I checked the AI detector scores by looking at each of the papers.
Bard made a 100% AI score. (It’s way down at the bottom right, under “AI.”)
Bing, despite that naughty 37% plagiarism score, got an AI score of 0%.
ChatGPT was caught red-handed with an AI score of 100%.
Finally, Claude came through with flying colors: an AI score of 0%.
OK. So, I wasn’t convinced yet. I thought, “Well, there was that whole ‘high-level summary’ thing that tainted the results. Maybe Claude was just lucky. And what about that Bing score, huh?” So, I tried a creative essay next. Here was my prompt:
Write a 600-word creative essay from the point of view of a seagull just before a hurricane.
Then I followed the same process as before: I put the prompt into ChatGPT 3.5, Bing, Bard, and Claude to generate the essays, submitted them through my Turnitin Demo Student account, took a screenshot of the plagiarism scores, and then signed into my instructor account to screenshot the AI detector scores.
These were the plagiarism scores for the creative essay. All of the papers came through clean, with 0% plagiarism.
Next, I checked the AI detector scores by looking at each of the papers.
Bard made a 100% AI score, again.
Bing didn’t slip past this time; it also got a 100% AI score. So much for Bing. Sigh.
ChatGPT 3.5 also scored 100% on this task.
But, again, Claude came out with an AI score of 0%. Wow!
So, based on this very short and not very rigorous test, it appears my son is right, at least for now: Claude beats Turnitin’s AI detector!
Go Claude!