How to Find the Story Thieves

Site scrapers, plagiarists, “article spinners”… none of them are entitled to your hard work unless you sell or license your words to them.

After I wrote my previous story, I realized that I had glossed over an important point, and the questions started coming in: “Yes, but how do we find these thieves? How can we even know when our content has been stolen?”

I’d addressed only the “what can I do to stop them?” part of the equation, not the “how do I even know they’re out there?” part. I was responding to the frustration of writers who already knew who was ripping them off.

So, today’s story deals with how to find the copyright infringers in the first place.

First, bookmark this site — it’s not mine, but it’s run by a friend I met over a decade ago while frustratedly fighting the good fight, all by myself, and losing:

In addition to the resources here, Jonathan Bailey posts about the current state of copyright law; interesting and challenging cases of infringement and how courts have decided them; and news you’ll find intriguing and useful. I can’t address this better or in more depth for you than he already has.

To find infringers, Jonathan has the following suggestions:

But one additional suggestion, from me: Set up some text-based alerts using Talkwalker’s free tools. But before I show you how easy it is to do this, let’s talk about ways to ensure success.

Bury an “Easter Egg” to Hunt For

A search-based alert can work as a plagiarism detector using only a short snippet of text, with no need to enter all your work into someone else’s database. For it to work well, though, you may need to be a bit clever about how you set up your text.

First, include a unique, short phrase in a concise bio blurb that you add to your text.

You don’t want this bio blurb to be ugly boilerplate that’s easy to search for and strip out. That will only detract from your posts. I suggest you have several different versions that you can change up slightly from time to time. Make sure that they are informative and effective in promoting you.
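If you’d rather not rotate blurbs by hand, a few lines of code can pick one for you. This is only a rough sketch: the blurb texts and the function name `pick_blurb` are invented for the example, so swap in your own wording and marker phrases.

```python
import hashlib

# Hypothetical blurb variants. Each one carries a short, unusual marker
# phrase that is unlikely to turn up in anyone else's writing.
BLURBS = [
    "Jane Smith writes about words, wit, and wayward semicolons.",
    "Jane Smith is a recovering comma hoarder who writes about the writing life.",
    "When she is not wrangling runaway metaphors, Jane Smith writes essays.",
]


def pick_blurb(post_title: str, blurbs: list[str] = BLURBS) -> str:
    """Pick a blurb for a post so that it varies from one post to the next.

    Hashing the title makes the choice repeatable for a given post while
    spreading different posts across the available variants.
    """
    digest = hashlib.sha256(post_title.encode("utf-8")).digest()
    return blurbs[digest[0] % len(blurbs)]


if __name__ == "__main__":
    for title in ["How to Find the Story Thieves", "Another Post"]:
        print(title, "->", pick_blurb(title))
```

Keep a note of which marker phrases you have used; those are the phrases you will search for later.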

Don’t rely solely on your name or copyright statement. You don’t even have to include an explicit copyright statement — legally, your work is copyrighted the minute you record it, whether on paper or pixels. But if you choose to do so, be sure to use the following form:

  • The copyright symbol © (or for phonorecords, the symbol ℗ ); the word “copyright”; or the abbreviation “copr.”;
  • The year of first publication of the work; and
  • The name of the copyright owner.

Example: © 2017 John Doe


Other forms may be preferred in other countries; however, most countries do have reciprocal copyright laws, and the exact form of notice is less important than the fact that you give notice. Again, in the U.S., this notice is not required. But it does make it much harder for an infringer to argue that the infringement was “accidental” rather than willful.

A programmer can strip repetitive, “boilerplate” text from a document fairly easily. Most scrapers won’t bother — at least at first. Give them 100 examples and they might.

What they will do is strip out the author metadata that is automatically added to your posts, whether that’s on Medium or elsewhere. Programmatically, it’s a simple search and replace operation and works equally well if your name is John Doe or Jane Smith.

It’s harder to replace text that changes from one post to another.
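To see why, here is roughly what that search and replace could look like. It is a simplified illustration, not any real scraper’s code: the byline format “By Firstname Lastname” and the sample posts are assumptions made up for the demo.

```python
import re

# One fixed pattern catches a platform-style byline, whoever the author is.
# The "By Firstname Lastname" format is an assumption for this demo.
BYLINE = re.compile(r"^By [A-Z][a-z]+ [A-Z][a-z]+[ \t]*$", re.MULTILINE)


def strip_byline(text: str) -> str:
    """Remove any line that looks like 'By Firstname Lastname'."""
    return BYLINE.sub("", text)


post_one = "By John Doe\nMy essay.\nJohn Doe collects typewriters and grudges."
post_two = "By Jane Smith\nHer essay.\nComma-obsessed and proud of it, Jane Smith writes daily."

for post in (post_one, post_two):
    # The byline disappears, but each differently worded blurb survives.
    print(strip_byline(post).strip())
    print("---")
```

The same pattern handles both authors, yet it has no idea that the last line of each post is a bio blurb, because the two blurbs share no fixed wording to match on.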

Next, use more vivid and colorful language in the body of your text. There are free programs on the Internet designed to do one thing and one thing only: To cheat and to escape plagiarism detection. Check this out:

A clever writer can (mostly) thwart these nasty little bits of code, as they rely on synonyms and antonyms and basic substitution. They’re not really very smart, but if you write short, declarative sentences using easy, little words, they occasionally produce a passable facsimile of human writing.

Using the paragraph above as an example, here’s what our free article spinner churns out:

A smart essayist can (generally) impede these frightful small amounts of code, as they depend on equivalents and antonyms and fundamental replacement. They’re not actually quite savvy, but rather in the event that you compose short, definitive sentences utilizing simple, little words, they every so often produce an acceptable copy of human composition.

Could be worse. And that’s disturbing, because odds are reasonably good that it will also pass plagiarism checkers.

But what if I stick a hashtag in the middle? If I add #thwartthebots to the end of that paragraph, it keeps it. And you thought #nonsensicalhashtagphrases had no legitimate purpose, didn’t you?

Here’s that same paragraph after a trip through the spinner, hashtags intact:

Be that as it may, imagine a scenario where I stick a hashtag in the center. In the event that I add #thwartthebots to the furthest limit of that passage, it keeps it. What’s more, you thought #nonsensicalhashtagphrases had no real reason, isn’t that right?
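Why would a hashtag survive? I can’t see inside any particular spinner, but a toy model of dictionary-based substitution suggests an answer: a run-together hashtag is not a word in anybody’s synonym table. The tiny table and the function name `spin` below are invented for illustration.

```python
import re

# A tiny, made-up synonym table; real spinners presumably use far larger ones.
SYNONYMS = {
    "clever": "smart",
    "writer": "essayist",
    "thwart": "impede",
    "nasty": "frightful",
    "basic": "fundamental",
}


def spin(text: str) -> str:
    """Swap each known word for its listed synonym; leave everything else alone."""

    def swap(match: re.Match) -> str:
        word = match.group(0)
        replacement = SYNONYMS.get(word.lower())
        if replacement is None:
            # Unknown words, including run-together hashtags, pass through.
            return word
        return replacement.capitalize() if word[0].isupper() else replacement

    # Each maximal run of letters is looked up as one word.
    return re.sub(r"[A-Za-z]+", swap, text)


if __name__ == "__main__":
    print(spin("A clever writer can thwart these nasty bots. #thwartthebots"))
```

The word “thwart” on its own gets replaced, but “thwartthebots” is looked up as a single unknown word and comes through untouched.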

It’s not foolproof, and the bots get better all the time. But now you have a phrase that survives the spinner, and you can use it as the search text in your alert.
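When an alert does turn up a suspicious page, you can check its text against your list of planted phrases. The sketch below assumes you have already copied the page text into a string. The function names are my own, and the quoted-phrase-plus-OR query format is a common search convention that I am assuming here, so confirm it against your alert tool’s supported syntax before relying on it.

```python
def find_canaries(page_text: str, canaries: list[str]) -> list[str]:
    """Return the planted phrases that appear in page_text, ignoring case."""
    haystack = page_text.casefold()
    return [phrase for phrase in canaries if phrase.casefold() in haystack]


def quoted_query(canaries: list[str]) -> str:
    """Join phrases as quoted terms separated by OR (assumed syntax)."""
    return " OR ".join(f'"{phrase}"' for phrase in canaries)


if __name__ == "__main__":
    my_canaries = ["#thwartthebots", "recovering comma hoarder"]
    suspect = "Some spun article text that still ends with #ThwartTheBots"
    print(find_canaries(suspect, my_canaries))
    print(quoted_query(my_canaries))
```

A match on a phrase you invented is strong evidence that the page started life as your post, even if the rest of the text has been spun.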

Now, go set up your text alerts. Create a free Talkwalker account, log in, and fill in the blanks. For more complex searches, you can refer to the supported syntax:

Try their free Social Search, as well. Don’t panic over “positive sentiment” and “negative sentiment.” Automated tools are quite hit-and-miss in analyzing humans. This, for example, they marked as “negative” for a search on my own name:

Well, it’s “negative” for Facebook, but not for me!
What’s “negative” about any of this? “Vanity”?

Anyway, take sentiment analysis — all of it, anywhere — with a grain of salt. Even the pricy tools struggle to analyze real emotions, sarcasm, and phrases taken out of context. Bits of code don’t know you.

Finally, set up your Talkwalker alerts and choose how often you want to receive them, then sit back and relax while Talkwalker sends you your “vanity surfing” results automatically.

UPDATE: After posting this story and setting up my little demo alert with Talkwalker, I got a new hit:

Example of a Rip-Off Artist, Screen Capture by Author

Go check it out and see if your Medium stories show up (badly machine translated) on this site. NO, I’m not giving them a backlink. But yes, go tell all your friends to see if their work shows up over there and, if so, make sure they file a DMCA takedown notice!

Then, when you find your words stolen, go back to this story:

And go after the jerks.

