AI Plagiarism
Update 2023-03-12 7:37 AM:
The AI Content Detector at Copyleaks says it’s human text.
But note: when I take my original text and ask Bing AI to rewrite it (“to be more authoritative”, I said), the detector also says it’s “human-generated”. In other words, the Bing AI LLM appears to rewrite human-authored text in a way that fools the detector.
Incidentally, an unrelated post of mine received a comment that the Content Detector says is AI-written, and I have flagged it.
Update 2023-01-05: I ran the text through Edward Tian’s GPTZero plagiarism detector.
When I submitted just the two paragraphs below, it concluded that the first (real) one was written by a human (me!). The second was inconclusive, so it asked me to try again with more text. When I submitted a longer version, the plagiarism detector said it’s “likely human”:
See full results below
Somebody at hashtag3.medium took that #DeSci piece I wrote for NEO.LIFE, rewrote it slightly, and posted it to his LinkedIn account, which has 10,000 followers.
It got more than 400 reactions.
Every paragraph is just a rewrite. I wrote this:
In a DeSci world, the indelible nature of the blockchain closes off many sources of outright fraud. Smart contracts, by eliminating humans from the loop, can’t be bribed or intimidated, for example.
He writes this:
The indelible nature of the blockchain eliminates several sources of blatant #fraud in a #DeSci society. Smart contracts, by removing humans from the loop, cannot be bribed or intimidated.
The whole piece is like this!
Although at first I was shocked and angry, I was also flattered that somebody thought my piece was worth plagiarising in the first place. I mean, it’s one thing if he just ripped it off wholesale – which is evil of course but also easy to detect. But in this case, he went to the trouble of rewriting my piece, leaving the gist in place but with the words reorganized enough to avoid a simple search-for-these-exact-words plagiarism detector. Going to that much trouble would indicate that somebody cared enough about my ideas that he took the considerable time required to rewrite them in his own words. That’s kinda flattering, right?
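To make the point about exact-word search concrete, here is a minimal sketch in Python using the two paragraphs quoted above. It compares a naive exact-substring check with a simple word-overlap score (Jaccard similarity on word sets). The function names and the choice of Jaccard similarity are my own illustration, not how any particular plagiarism detector actually works.

```python
import re

def word_set(text):
    """Lowercase the text and return the set of distinct words in it."""
    return set(re.findall(r"\w+", text.lower()))

def jaccard(a, b):
    """Share of distinct words the two texts have in common (0.0 to 1.0)."""
    sa, sb = word_set(a), word_set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

original = (
    "In a DeSci world, the indelible nature of the blockchain closes off "
    "many sources of outright fraud. Smart contracts, by eliminating humans "
    "from the loop, can’t be bribed or intimidated, for example."
)
rewrite = (
    "The indelible nature of the blockchain eliminates several sources of "
    "blatant #fraud in a #DeSci society. Smart contracts, by removing humans "
    "from the loop, cannot be bribed or intimidated."
)

# A search-for-these-exact-words check finds nothing...
exact_match = original in rewrite or rewrite in original

# ...but the two paragraphs still share most of their vocabulary.
word_overlap = jaccard(original, rewrite)

print("exact match:", exact_match)
print("word overlap:", round(word_overlap, 2))
```

The exact-match check comes back empty, while the overlap score shows that more than half of the distinct words are shared. A score like this is only a rough signal: two unrelated pieces on the same topic can also share a lot of vocabulary.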
But now with the advent of GPT and the large language models being applied to automated text generation, I realize my plagiarist didn’t have to take any time at all. These tools let him point to a text article and ask for a slightly different version; the computer generates one at light speed.
In fact, why stop there? Send a web crawler through popular-ish sites (avoid the Big Media places that are likely to have lawyers on staff), scooping up content as you go. Rewrite it, and repost.
And those LinkedIn likes? Why can’t those be GPT-generated as well? Make a few hundred fake profiles, have them interlinked and connected with one another so it looks like they’re legit. Then have each profile like and comment on each other’s posts.
I have seen the future. Easy for the scammers. But not for the rest of us.
Plagiarism Detector
Running the following text through the plagiarism detector at GPTZero:
generates this (Perplexity = 219):
Your GPTZero score corresponds to the likelihood of the text being AI generated: 59.95210209370932
Your text is likely human generated!
Here I try a longer version of my original (unplagiarized) text:
Result (Perplexity = 368):
Your GPTZero score corresponds to the likelihood of the text being AI generated: 92.67014843117718
Your text is likely human generated!
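For context on the perplexity figures in these results: perplexity is a standard measure of how "surprised" a language model is by a piece of text, computed as the exponential of the average negative log-probability the model assigns to each token. Below is a minimal sketch of that general formula. The probabilities are made up for illustration, and I don't know which model or thresholds GPTZero uses internally, so this is not a reproduction of its scoring.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(average negative log-probability per token)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)

# Hypothetical per-token probabilities from some language model (made up).
predictable_text = [0.5, 0.5, 0.5, 0.5]      # model finds each word easy to guess
surprising_text = [0.01, 0.01, 0.01, 0.01]   # model finds each word unexpected

print(perplexity(predictable_text))
print(perplexity(surprising_text))
```

Higher perplexity means the text was harder for the model to predict. The usual intuition behind detectors of this kind is that human writing tends to be less predictable than machine-generated text.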