How much duplicate content does the average website have?
Based on our 2015 study analyzing on-page SEO issues, here’s what we discovered about the average website crawl from our Site Auditor:
29% of pages crawled had duplicate content.
Here’s how it breaks down. Between early 2013 and mid-2015, marketers used Raven’s Site Auditor tool to crawl and recrawl 888,710 websites in search of on-page SEO issues. We anonymously analyzed this data and found that the average website crawl had:
- 71 pages with duplicate content, out of
- 243 total pages
This gives us our 29% figure (71 ÷ 243 ≈ 0.29), which can be useful in a couple of ways.
First, it gives marketers a broad benchmark when running website audits. It’s helpful to know if a website you’re working on has more or less duplicate content than the average website crawled by Raven.
Second, data like this can be a great educational resource as conversations about content come up with clients and prospects. Education builds trust.
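To make the benchmark comparison concrete, here is a minimal Python sketch of the arithmetic. The 71 and 243 figures come from the study above; the audit numbers for "your site" (40 duplicate pages out of 200) are hypothetical, and the function name is ours for illustration only.

```python
# Benchmark figures from the study above (average website crawl).
AVG_DUPLICATE_PAGES = 71
AVG_TOTAL_PAGES = 243


def duplicate_share(duplicate_pages: int, total_pages: int) -> float:
    """Return the fraction of crawled pages flagged as duplicate content."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return duplicate_pages / total_pages


benchmark = duplicate_share(AVG_DUPLICATE_PAGES, AVG_TOTAL_PAGES)

# Hypothetical audit result for a site you are working on.
site = duplicate_share(duplicate_pages=40, total_pages=200)

print(f"Benchmark: {benchmark:.0%}")  # Benchmark: 29%
print(f"Your site: {site:.0%}")       # Your site: 20%
print("Above average" if site > benchmark else "At or below average")
```

Comparing shares rather than raw page counts keeps the comparison fair between small and large sites.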
So, do we think 29% of all pages on the Internet have duplicate content? We don’t know. But our tool has crawled and analyzed hundreds of thousands of sites that agencies and in-house marketing departments manage, so the percentage is probably higher for sites that don’t have professionals optimizing them.
What is duplicate content?
29% sounds big, but consider what Google classifies as duplicate content:
- Discussion forums that can generate both regular and stripped-down pages targeted at mobile devices
- Store items shown or linked via multiple distinct URLs
- Printer-only versions of web pages
Duplicate content is any content on your website that is the same or very similar to other content across your website or across multiple websites. If content is the same but the URLs are different, it’s duplicate content.
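As a rough illustration of the "same content, different URLs" idea, the Python sketch below groups pages whose text is identical after light normalization. It is only a sketch under stated assumptions: the URLs and page text are made up, the helper names are ours, and it catches exact matches only, not "very similar" content. It is not a description of how Site Auditor or Google detects duplicates.

```python
import hashlib
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def find_exact_duplicates(pages: dict[str, str]) -> list[list[str]]:
    """Group URLs whose normalized body text is identical."""
    groups = defaultdict(list)
    for url, body in pages.items():
        digest = hashlib.sha256(normalize(body).encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Keep only groups where more than one URL shares the same content.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical crawl output: URL -> extracted body text.
pages = {
    "https://example.com/widget": "Blue widget. Ships in two days.",
    "https://example.com/widget?print=1": "Blue  widget. Ships in two days.",
    "https://example.com/about": "About our company.",
}

duplicates = find_exact_duplicates(pages)
print(duplicates)
```

Here the store item and its printer-only URL end up in one group, while the "about" page stands alone.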
In 2011, Google released an algorithm change called Panda that pushed down lower-quality content in search results. Today, Google continues to reward high-quality content, while duplicate or thin content can signal a low-quality site.
Should you worry about duplicate content?
If your duplicate content isn’t malicious, you likely don’t have too much to worry about. Google puts it this way:
Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. When search engines run across duplicate content, they’ll try to rank the most authoritative version.
Search engines do a good job of sorting through duplicate content on their own. This hands-off approach usually works well, but there are some situations in which you’ll want to step in manually.
There are cases when optimizing or cleaning up duplicate content can make a difference. Start by reviewing these 10 duplicate content scenarios and how to solve them. Luckily, there are a lot of options for optimizing duplicate content. You can potentially:
- Delete duplicate content
- Update duplicate content
- Redirect duplicate content
- Specify authority with the canonical link element
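The canonical link element is a `<link rel="canonical" href="...">` tag placed in a page's `<head>` that names the preferred URL for that content. As a minimal sketch, the Python below uses the standard library's HTML parser to check which canonical URL a page declares. The sample HTML, the example.com URL, and the `CanonicalFinder` class name are hypothetical.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of the first <link rel="canonical"> in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Ignore everything except the first canonical <link> tag.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


# Hypothetical printer-only page that points back to the main version.
html = """
<html><head>
<link rel="canonical" href="https://example.com/widget">
</head><body>Blue widget.</body></html>
"""

finder = CanonicalFinder()
finder.feed(html)
print(finder.canonical)  # https://example.com/widget
```

A check like this can be run across every URL in a duplicate group to confirm they all point to the same preferred version.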
The bottom line is that the quality of your content and how it appears are ranking factors for Google, so reviewing potential on-page SEO issues such as duplicate content is smart.
To see how much duplicate content is on your website, run a free audit. You’ll get a report showing which on-page SEO issues are affecting you.
Analyze over 20 different technical SEO issues and create to-do lists for your team while sending error reports to your client.