Duplicate content is content that matches content on other web pages, including other pages on your own website.
Having a large number of similar pages on a website may negatively impact search engine rankings.
For example, word-for-word identical content on two different pages is duplicate content.
But similar content on two pages (even if it’s slightly rewritten) is also duplicate content.
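To make the difference between “word-for-word” and “slightly rewritten” concrete, here is a minimal sketch that scores how much two pieces of text overlap. It compares three-word sequences (shingles) and computes a Jaccard score. This is only an illustration: the function names and the shingle size are assumptions, and it is not how Google measures duplication.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word sequences found in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(a, b, k=3):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    original = "These red running shoes are light and comfortable for daily use"
    rewritten = "These red running shoes are light and comfy for daily use"
    print(similarity(original, original))
    print(similarity(original, rewritten))
```

Identical text scores 1.0, unrelated text scores 0.0, and a lightly rewritten page lands somewhere in between. Where you draw the “duplicate” line is a judgment call.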
Does Google penalize Duplicate Content?
Google says duplicate content may lead to the whole site being removed from its index or to a ranking penalty.
But that happens rarely, and only to sites that intentionally copy content from other sites to manipulate rankings.
If you have a few pages on your site with duplicate content, you need not worry about a penalty.
However, duplicate content can still cause some major issues for your site.
Less Organic Traffic: Google wants to rank pages with original content. It doesn’t want to show pages that are copied from other pages in its index, including your own.
For example, say there are four pages (A, B, C, D) with duplicate content on them. Pages A and B are on your website, and C and D are on other sites.
Since Google can’t tell which one of them is the original, all of them can suffer in rankings.
Fewer Indexed Pages: Sometimes Google refuses to index pages with duplicate content – especially in the case of e-commerce sites.
Usually, e-commerce websites have a large number of pages, and Google assigns a limited crawl budget to each site. So if your crawl budget is wasted on crawling duplicate content, it can get difficult to get all of your pages crawled and indexed.
How to find Duplicate Content
Find pages with similar content on your site
You run a shoe-selling website. The same shoe can be available in different sizes and colors.
If you have designed your website correctly, then all the sizes and colors of the same product will be available on the same page.
Otherwise, there’ll be duplicate pages of the same product for different colors and sizes.
If your site has a site search function, the pages it generates can also get indexed by Google. As a result, it can add countless duplicate pages.
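One rough way to spot variant pages like the shoe example is to group your URLs after removing the parameters that only select a color or size. The sketch below assumes variants are expressed as query parameters named `color` and `size`. Those names and the example URLs are hypothetical, so adjust them to however your store builds its URLs.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that only pick a product variant.
VARIANT_PARAMS = {"color", "size"}


def strip_variants(url):
    """Return the URL without variant-only query parameters."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in VARIANT_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def group_variants(urls):
    """Map each base URL to its variant URLs, keeping only groups of two or more."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_variants(url)].append(url)
    return {base: members for base, members in groups.items() if len(members) > 1}


if __name__ == "__main__":
    pages = [
        "https://example.com/shoes/runner?color=red",
        "https://example.com/shoes/runner?color=blue&size=9",
        "https://example.com/shoes/boot",
    ]
    for base, members in group_variants(pages).items():
        print(base, "has", len(members), "variant URLs")
```

Any group with more than one member is a candidate for merging into a single product page.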
Check Indexed Pages
This is the simplest way to find duplicate pages on your site.
Figure out the total number of indexed pages and compare it with the number of pages you actually created.
Let me show you how to do that:
Perform a Google search using the site: operator with your domain (for example, site:example.com).
It will show you the pages that Google has indexed.
Head over to “Search Console” and click on “Coverage” under “Index”.
It will show you the number of pages that Google has indexed for your site.
For example, it shows 30 pages for adschoolmaster.com that Google has successfully included in its index.
That number should match the number of pages that you have created manually on your site.
If it shows more indexed pages than you created, your site likely has duplicate pages.
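If you have both lists as URLs rather than just counts, the comparison can be done mechanically. Here is a minimal sketch, assuming you have a list of indexed URLs (for instance, copied from a report) and a list of the pages you created. The function name and the sample URLs are made up for illustration.

```python
def unexpected_indexed(indexed_urls, created_urls):
    """Return indexed URLs that are not among the pages you created.

    Trailing slashes are ignored so that /shoes and /shoes/ count as one page.
    """
    created = {url.rstrip("/") for url in created_urls}
    indexed = {url.rstrip("/") for url in indexed_urls}
    return sorted(url for url in indexed if url not in created)


if __name__ == "__main__":
    indexed = [
        "https://example.com/",
        "https://example.com/shoes",
        "https://example.com/?s=shoes",  # e.g. a site search result page
    ]
    created = ["https://example.com", "https://example.com/shoes/"]
    for url in unexpected_indexed(indexed, created):
        print("Indexed but not created by you:", url)
```

Whatever is left over is worth a look, since those are the likely duplicates.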
How to deal with Duplicate Content
Do Proper Site Redirects
Sometimes you have more than one version of your site.
The home page of your site can have several URL versions:
www.example.com (www)
example.com (non www)
The problem crops up when users type either of the above URLs to get to your website, but the two don’t end up at the same URL.
In other words, your home page traffic is split between two different URLs.
Make sure you prefer one of the versions of the URLs. Let’s say you want to prefer the WWW version. In that case, people who are accessing your website from the non-WWW version should be redirected to the WWW version.
The same problem happens when you switch your site from HTTP to HTTPS. Make sure you redirect all the HTTP URLs to the HTTPS URLs.
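Here is a small sketch of that rule: take any incoming URL and work out the single preferred version (HTTPS plus WWW in this case) it should be redirected to. The function names are hypothetical, and the sketch ignores custom ports. In practice this logic usually lives in your web server or hosting configuration rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit


def preferred_url(url, use_www=True):
    """Return the HTTPS version of the URL on the preferred host (www or non-www)."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # hostname is lowercased; any port is dropped
    bare = host[4:] if host.startswith("www.") else host
    host = "www." + bare if use_www else bare
    return urlunsplit(("https", host, parts.path or "/", parts.query, parts.fragment))


def redirect_for(url):
    """Return (301, target) if the URL is not the preferred version, else None."""
    target = preferred_url(url)
    return None if target == url else (301, target)


if __name__ == "__main__":
    for url in ("http://example.com/shoes", "https://www.example.com/"):
        print(url, "->", redirect_for(url))
```

The point is that every version of a URL maps to exactly one destination, so traffic and links are not split.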
Deleting the duplicate pages is the easiest fix.
But if the duplicate pages have earned good page authority or they’re there for a purpose, then deleting them doesn’t make much sense.
Use 301 Redirects
When a page has moved permanently from an old URL to a new URL, use a 301 redirect to send users as well as Googlebot to the new page.
Using 301 redirects makes sense when you want to pass SEO equity from the old URL to the new URL.
That’s probably the easiest fix for the duplicate content.
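To show what a 301 looks like on the wire, here is a minimal sketch of a tiny WSGI app that answers requests for an old path with “301 Moved Permanently” and a Location header. The paths in the `REDIRECTS` table are hypothetical. Most sites set this up in the web server, CMS, or a plugin instead of writing code.

```python
# Hypothetical mapping of old paths to their new locations.
REDIRECTS = {"/old-shoes": "/shoes"}


def app(environ, start_response):
    """Send a 301 for moved paths; serve a placeholder page otherwise."""
    path = environ.get("PATH_INFO", "/")
    target = REDIRECTS.get(path)
    if target is not None:
        # 301 tells browsers and crawlers the move is permanent.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"page content"]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server

    # Local test server only; not meant for production use.
    make_server("", 8000, app).serve_forever()
```

Running the file starts a local test server on port 8000, where a request for /old-shoes gets redirected to /shoes.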
Use Canonical Tag
When you have two or more versions of the same page and want to keep them all, use the canonical tag on the duplicate versions. It tells Google, “OK, we have two versions, but this is the original version”.
Say there are two URLs, and the second is the print version of the first, so the content on both URLs is the same.
You can implement the canonical tag on the second URL to tell Google that the URL “https://example.com” is the original version.
Google has said that a canonical tag is better than blocking pages with duplicate content using robots.txt or the noindex tag.
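The canonical tag itself is a link element in the page’s head, such as `<link rel="canonical" href="https://example.com">`. If you want to check which canonical URL a page declares, a minimal sketch with Python’s built-in HTML parser could look like this. The class and function names are made up for illustration, and the sample HTML is a stand-in for a real page.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonical = attributes["href"]


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    print_page = (
        "<html><head>"
        '<link rel="canonical" href="https://example.com">'
        "</head><body>Printable version</body></html>"
    )
    print(find_canonical(print_page))
```

If the function returns None for a duplicate page, that page is not telling Google which version is the original.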
Siteliner is a free tool you can use to find pages with duplicate content.
It scans all the pages on your site for duplicates.
Merge or Combine Pages
As mentioned above, if you have two or more pages with duplicate content, you should consider either redirecting them with 301 redirects or using the canonical tag to give preference to the page with the original content.
But what if you have pages with similar content?
Pages with similar content may not match word for word, but they attract the same audience and basically tell the same story. You can combine all of them into one amazing article.
In fact, Google loves lengthy and well-written content. That should help your rankings and solve the problems related to duplicate pages.
WordPress Generated Duplicate Pages
If you use WordPress then you may have noticed that it generates duplicate pages with Tag and Category Pages.
Technically, both tag and category pages serve the same purpose. So if you do use them, consider dropping one of the two.
In order to avoid duplicate pages generated through Tag and Category Pages, use the “noindex” tag on these pages.
That way they can exist without search engines indexing them.
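The “noindex” tag is a robots meta tag in the page’s head, such as `<meta name="robots" content="noindex">`. In WordPress you would normally switch it on through an SEO plugin setting. To verify that a tag or category page really carries it, a minimal sketch with Python’s built-in HTML parser might look like this. The names are hypothetical, and it only checks the meta tag, not HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, follow".
        self.directives.update(
            part.strip().lower() for part in content.split(",") if part.strip()
        )


def is_noindex(html):
    """Return True if the page's robots meta tag includes noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


if __name__ == "__main__":
    tag_page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(is_noindex(tag_page))
```

Run it against the HTML of a few tag and category pages to confirm the setting took effect.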