Duplicate content is a big topic in the SEO space. When we hear about it, it’s mostly in the context of Google penalties. But that risk is overstated (Google hardly ever penalizes sites for duplicate content per se), and it’s hardly the gravest consequence of the issue. These 3 SEO problems are far more likely to be caused by duplicate content:
- Wasted crawl budget. If content duplication appears internally on your site, it’s guaranteed to waste some of your crawl budget (aka the number of your pages that search engines crawl per unit of time). This means that the important pages on your site are going to be crawled less frequently.
- Link juice dilution. For both external and internal content duplication, this is one of the biggest SEO downsides. Over time, both URLs may build up backlinks, and unless the duplicate has a canonical link (or a 301 redirect) pointing to the original piece, the valuable links that would have helped the original page rank higher get split between the two URLs.
- Only one of the pages ranking for target keywords. When Google finds duplicate content instances, it will typically show only one of them in response to search queries — and there’s no guarantee it’s going to be the one you want to rank.
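To make the canonical link mentioned above a bit more concrete, here is a minimal Python sketch that checks whether a page declares one. It uses only the standard library's `html.parser`. The names `CanonicalFinder` and `find_canonical`, and the example.com markup, are my own illustrations rather than part of any SEO tool, and a real audit would fetch live pages instead of a hard-coded string.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only until the first match is found.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated values, or be missing entirely.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical duplicate page (a print version) pointing back to the original.
duplicate_page = """
<html><head>
<title>Blue widgets (print version)</title>
<link rel="canonical" href="https://example.com/blue-widgets">
</head><body>...</body></html>
"""

print(find_canonical(duplicate_page))
```

If `find_canonical` returns `None` for a URL you know is a duplicate, that page is a candidate for a canonical link or a 301 redirect.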
But all of these scenarios are preventable if you know where duplicate content may hide, how to detect it, and how to deal with it. In this article, I’m going to outline the 7 types of content duplication — and how to tackle each.
1. Scraped content
Scraped content is an unoriginal piece of content on a site that has been copied from another website without permission. As I said earlier, Google may not always be able to tell the original from the copy, so it’s often the site owner’s task to be on the lookout for scrapers and to know what to do if their content gets stolen.
Alas, this isn’t always easy or straightforward. But here’s a little trick that I personally use.