SEO circles put a lot of emphasis on avoiding duplicate content, which is often treated as a death knell for a good ranking. However, not all duplicate content is bad. For example, sites that are made up of a list of articles or press releases will not be penalized for duplicate content in the same way as, say, two different blogs containing identical content. Unfortunately, the latter is more common, and it can put a real dent in your ranking. You might spend hours upon hours writing and posting unique content, only to find it stolen by some spammer. What can you do about this?
First, if you hire a copywriting firm to write the content of your website, you should use plagiarism-detection software to make sure you get what you pay for. Even after the content has been delivered and published, it could be stolen, and it is up to you, not the copywriting firm, to check that it has not been. In my experience, plagiarism-detection software is far from perfect, so it should not be your only recourse. Copying a section of the article, putting quotes around it, and searching for it in Google is an additional way to see if the content has been duplicated. If a duplicate exists but doesn’t show up in Google, you’re partly in the clear, because a copy that Google hasn’t indexed is unlikely to count against you.
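The quoted-search trick above is easy to script if you check passages often. Here is a minimal sketch in Python that only builds the search URL for you to open in a browser; it does not fetch or parse any results. The function name and the 12-word cutoff are my own assumptions, chosen just to keep the query short.

```python
from urllib.parse import quote_plus


def exact_phrase_search_url(passage: str, max_words: int = 12) -> str:
    """Build a Google search URL that looks for an exact-phrase match.

    Hypothetical helper: only the first `max_words` words are used,
    since a short excerpt is usually enough to find a copy.
    """
    words = passage.split()[:max_words]
    # Drop any straight double quotes so they don't break the quoted phrase.
    phrase = " ".join(words).replace('"', "")
    # Wrapping the phrase in double quotes asks for an exact match.
    return "https://www.google.com/search?q=" + quote_plus(f'"{phrase}"')
```

Calling `exact_phrase_search_url("hello world")` returns `https://www.google.com/search?q=%22hello+world%22`. Picking a sentence from the middle of an article tends to work better than the opening line, which scrapers sometimes alter.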
Google Alerts can scan the web every day for duplicates of your content. Think of it as your own personal spider. You could potentially do this for pages upon pages of content. Copyscape is another tool, which scans the web looking for duplicates of a given web page. If you’ve got a large site of hundreds or thousands of pages, this can be time consuming, but it’s worth it, because larger sites often get content stolen precisely because they are so difficult to monitor. But don’t assume that writing an unknown blog protects you from theft either. Remember, this is done via automated autoblogging software and other underhanded tricks, so the bot is not necessarily separating large sites from small ones, just different topics.
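If you already have the text of a suspected copy in hand, a rough overlap score can tell you how much of your page it reuses. The sketch below is not how Copyscape or Google Alerts work internally (I don't know their methods); it is a simple word-shingle comparison, and the function names and the five-word window are assumptions of mine.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping `size`-word sequences in `text`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap(original: str, suspect: str, size: int = 5) -> float:
    """Fraction of the original's shingles that also appear in the suspect.

    1.0 means every word sequence in the original shows up in the suspect;
    0.0 means none do (or the original is shorter than `size` words).
    """
    ours = shingles(original, size)
    theirs = shingles(suspect, size)
    if not ours:
        return 0.0
    return len(ours & theirs) / len(ours)
```

A score near 1.0 suggests a wholesale copy, while a low but nonzero score may just mean a quoted sentence or a common phrase, so treat it as a prompt to look closer rather than as proof.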
OK, you’ve spotted plagiarism online. If it’s been done by a copywriting company, that’s easy: just fire them and get your money back. Most often, though, the culprit is some unknown site that has taken your content. There are some important steps to take in either case:
- Preserve the evidence. Take a screenshot of the site. Google cache is also your friend. Though people lament the fact that Google cache preserves some embarrassing content from years past, it is also very useful, especially in this circumstance.
- There are other cache services out there as well, such as Furl, which creates a cached archive of web content. Archive.org is another place to find an archive of the web. The moral is: just because it’s not in Google’s cache doesn’t mean you can’t find it elsewhere.
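For the Archive.org route, the Wayback Machine offers a public availability endpoint that reports the closest archived snapshot of a URL. The following is a minimal sketch in Python, assuming the JSON layout that endpoint returns (`archived_snapshots` containing a `closest` entry); the function names are hypothetical, and the network call is kept in its own function so you can test the parsing without going online.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

WAYBACK_API = "https://archive.org/wayback/available"


def availability_url(page_url: str) -> str:
    """Build the availability query for a page you want evidence of."""
    return WAYBACK_API + "?" + urlencode({"url": page_url})


def closest_snapshot(response: dict):
    """Pull the snapshot URL out of a decoded API response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_snapshot(page_url: str, timeout: float = 10.0):
    """Ask the Wayback Machine for the closest snapshot (needs network)."""
    with urlopen(availability_url(page_url), timeout=timeout) as reply:
        return closest_snapshot(json.load(reply))
```

If `find_snapshot` returns `None`, the page may simply never have been archived, so a screenshot is still worth taking.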
Contact the Thief
Whether you can contact a plagiarist depends on where the duplicate content lives. Autobloggers and web scrapers are designed specifically to avoid this type of confrontation. If it’s on someone’s personal website alongside other unique content, or on a major site that hosts other content (like Ezine Articles or Hubpages), then you can contact the webmaster. A tool like Domaintools.com is useful for finding the contact information of whoever owns a domain. This is obviously only useful for a smaller domain and not for an article site.
For a site like Hubpages, you can create an account and contact the plagiarist that way, in addition to contacting the main webmaster. The Digital Millennium Copyright Act requires hosts to remove the content once they are notified. You must provide proof of what has happened, though. They’re not just going to take your word for it that the content originated with you and that you didn’t steal it yourself. In some respects, the law covering the Internet is lagging behind other free speech and copyright legislation, but the DMCA is a good safeguard for webmasters.
Contact Search Engines
Finally, contacting search engines is a good recourse for having the page removed from search results. This goes back to why Google Alerts is a good tool for any webmaster. There will already be a record that you set up an alert before the duplicate content was spotted, so you have a good case against the plagiarist. Lodging a complaint with Google requires a handwritten signature and takes time to process, but it should be your first step before contacting other search engines.
To sum it up, most webmasters concentrate on two things:
1. Creating new content
2. Seeing how that content is performing
Strangely, checking for plagiarism regularly does not rank as high on the list, even though it can have a dramatic effect on point number two. Stolen content can drag down the ranking of that particular page and of the site overall, so checking for it is a very important part of optimization. Given that spammers are multiplying like wildfire and automatic content generators are becoming more easily accessible, this problem is only going to get worse, not better.