Scraped Rankings vs Average Rankings

There’s been a lot of debate lately about the necessity of ranking data and the validity of where that ranking data comes from.

I wanted to write a balanced and honest look at the usefulness of rankings and compare the data that comes from Google Webmaster Tools (GWT) with the scraped data that comes from providers like AuthorityLabs and Moz.

What are rankings good for?

In the world of SEO, if you don’t rank, you don’t have (organic, free) traffic. Rankings have their place, and monitoring rankings can provide several benefits:

Knowing the position for campaign-related keywords
Correlating campaign efforts with changes in rankings
Correlating algorithm changes with changes in rankings
Comparing your site rankings with competitors
Analyzing rankings with other data points for advanced tactics

It’s important to point out that rank monitoring is not essential for doing effective SEO. It may be essential for some advanced analysis and tactics using formulas on a spreadsheet, but it’s not necessary for most activities.

An SEO needs to optimize a site for search engines, find relevant and authoritative sites and build relationships. Sprinkle in a little content creation and social sharing, and you sum up – at least at a high level – what modern SEO looks like.

Ranking well for certain terms doesn’t tell SEOs they’re doing a good job. Organic traffic increases and conversions from that traffic tell SEOs they’re doing a good job.

Regardless, many SEOs still want their ranking data, and their clients want it reported to them.

A brief history of Raven and scraped rankings

When Raven first began tracking ranking results, we got the data straight from Google’s own API. Then one day they said, “Nope!”

That’s when we – and every other ranking results provider – had to turn to scraping. We didn’t want to, but we didn’t have any other choice if we were going to continue providing that data.

Google never really liked being scraped and has made it increasingly hard to do over time. So much so that it spawned companies like AuthorityLabs, whose main speciality is scraping Google to get ranking results.

It became so time-consuming and difficult for us to get ranking results that we decided to use AuthorityLabs instead, so we could refocus our time on making Raven better.

AuthorityLabs was and is very good at what it does. So much so that it has become what I believe is now the largest and most reliable provider of scraped rankings in the world – providing millions upon millions of daily ranking results to customers and software vendors worldwide.

In late 2012, the AdWords API Compliance team notified us that we had to remove all scraped data derived from Google from our platform. They specifically listed AuthorityLabs and SEMRush. It turned out that for whatever reason, Google was using its AdWords API as leverage to get software companies like Raven to stop using data that was scraped from Google.

Based on the information we had from reliable sources, we determined that this was the beginning of a much bigger (albeit very slow-going) fight against companies using scraped data. We didn’t see it as an AdWords issue; we saw it as a Google issue.

The long-term success of the Raven platform ultimately relies on having a positive and healthy relationship with Google. We use and rely on its APIs to provide value to our customers. We asked ourselves what the product would look like if we defied Google’s request. For example, what if our access to the Google Analytics API were revoked?

It was both a difficult and easy decision to make.

If you doubt our concern, consider this. Google can at any time revoke access to the software provider you authorized with, or revoke your Google account from accessing its APIs. That means if you’re using Moz for scraped ranking results and you’ve authorized Google Analytics, Google can at any time either blacklist Moz from using its API (revoking all access) and/or remove your ability to authorize Google APIs – with the former being the more likely scenario.

GWT ranking results vs. scraped data

Scraped data certainly has a lot going for it. The biggest pros include the ability to get:

Specific universal rankings
Competitor rankings

However, there are some things it simply can’t get that only Google can provide. For example, scraped rankings cannot provide…

Query impressions, clicks and CTR
Keywords that Google is testing with your site
Average position based on real-life searches

If the campaign for your client includes a push for videos, you’ll want to know if the SERPs are displaying a video result and/or a page result. It’s also nice to know, analyze and keep tabs on where your competitors rank for the same terms you’re competing against.

However, aside from video-centric results, scraped rankings aren’t performance-based, nor are they insightful (at least not on the surface).

Unlike scraped data, GWT lets you know how your search queries actually perform in real-life searches. Real searches are personalized. They include a person’s search history, Google+ connections and other factors that in turn determine what the user will see.

Google makes sense of this diversity by reporting on the average position of a search query. Then it tells you how many times the result has been seen, how many people are clicking on it and its click-through-rate – something that’s close to impossible to perfectly match up with scraped data – or Google Analytics, thanks to (not provided).
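
To make the idea of an averaged position concrete, here is a minimal Python sketch. It assumes a simplified model in which each real-life search records every position the site held and whether the result was clicked, and in which only the best position in each search counts toward the average. The `Search` class, the `summarize` function and the sample values are hypothetical illustrations, not Google's actual method or data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Search:
    """One hypothetical real-life search in which the site appeared."""
    positions: List[int]  # every position the site held in this SERP
    clicked: bool         # whether the searcher clicked the site's result


def summarize(searches: List[Search]) -> Dict[str, float]:
    """Roll individual searches up into impressions, clicks, CTR and average position."""
    if not searches:
        raise ValueError("need at least one search to summarize")
    impressions = len(searches)
    clicks = sum(1 for s in searches if s.clicked)
    # Assumption: only the best (lowest-numbered) position in each search counts.
    best_positions = [min(s.positions) for s in searches]
    return {
        "impressions": impressions,
        "clicks": clicks,
        "ctr": clicks / impressions,
        "avg_position": sum(best_positions) / impressions,
    }


# Made-up searches for a single query; personalization moves the site around.
searches = [
    Search(positions=[2, 22], clicked=True),
    Search(positions=[4], clicked=False),
    Search(positions=[3], clicked=False),
    Search(positions=[3], clicked=True),
]
report = summarize(searches)
print(report)
```

In this toy example the site is seen at positions 2, 4, 3 and 3, so the reported average position is 3.0 even though no single scrape would necessarily show that number.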

Another argument for scraped rankings is that SEOs can see keywords that are barely ranking, especially long-tail keywords. Personally, I don’t find that data to be nearly as useful as GWT’s.

In fact, GWT does something much more interesting and insightful with its results – something many people claim is an example of how bad the data is when it’s actually awesome!

GWT includes keyword phrases ranging anywhere from position 1 to the upper 500s and includes phrases even if their impressions and clicks are less than 10.

That data should be used as valuable insight from Google. For example, any keyword that falls within that range tells me that Google recognizes those phrases as being related to content on my site. More importantly, it tells me that those phrases are being tested in their SERPs – valuable information that scraped data cannot tell me.

Additionally, if keyword phrases have a high rank but few impressions and clicks, that tells me I may need to focus my campaign on more popular phrases or improve my landing page’s meta data and content.
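
One way to act on that observation is to filter an exported query list for phrases that rank well but are rarely seen. The sketch below is only an illustration: the row layout, the `low_volume_high_rank` helper, the thresholds and the sample rows are all assumptions, not a GWT export format or real data.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, float, int]]


def low_volume_high_rank(
    rows: List[Row],
    max_position: float = 10.0,   # assumed cutoff for "high rank" (first page)
    min_impressions: int = 10,    # assumed cutoff, echoing the "<10" threshold
) -> List[str]:
    """Return queries that rank well but have very few impressions."""
    return [
        str(r["query"])
        for r in rows
        if float(r["avg_position"]) <= max_position
        and int(r["impressions"]) < min_impressions
    ]


# Hypothetical rows; the values are invented for the example.
rows: List[Row] = [
    {"query": "example phrase a", "avg_position": 3.0, "impressions": 4, "clicks": 0},
    {"query": "example phrase b", "avg_position": 5.0, "impressions": 900, "clicks": 60},
    {"query": "example phrase c", "avg_position": 48.0, "impressions": 7, "clicks": 0},
]
flagged = low_volume_high_rank(rows)
print(flagged)  # ['example phrase a']
```

Phrases flagged this way are candidates for either swapping in a more popular phrase or reworking the landing page's title and description.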

Testing the data

One of the biggest complaints I’ve read about GWT’s ranking data is that it’s inaccurate. In my experience – aside from occasional anomalies that may occur when clicks and impressions are <10 – the data is quite accurate.

I compared two different sites with 8-9 keyword phrases that would fit a typical targeted campaign. I recorded results from GWT, AuthorityLabs and Moz, and also hand-checked the results in a browser I never use (Opera, which I installed just for this test).

The first site was a mobile computing site with a campaign to rank for phrases related to backing up your Mac. GWT, AuthorityLabs and Moz all had their hits and misses, but most were close enough to the hand-checked results. Three of the ranking results that were the most off from GWT also had impressions and/or clicks that were less than 10, so that was to be expected.

Keyword Phrase	GWT Rank	AuthorityLabs Rank	Moz Rank	Manual Check
backup for mac	20*	–	–	–
backup for macbook	9*	15	12	13
best backup for mac	11	19	17	11
best backup for macbook	5*	5	7	5
best cloud backup for mac	5	8	7	4
cloud backup for mac	19	16	19	16
mac book backup	3*	12	6	7
mac cloud backup	11	12	11	11

* Impressions and/or clicks <10
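
For readers who want to put a number on "close enough", the sketch below computes the mean absolute difference between each source and the manual check, using the values from the table above. The metric and the `mean_abs_diff` helper are my own choice for illustration, not part of the original test, and the first row is skipped because it has no manual result to compare against.

```python
from typing import Dict, List, Optional, Tuple

# (keyword, GWT, AuthorityLabs, Moz, manual check); None marks a "–" in the table.
Row = Tuple[str, Optional[int], Optional[int], Optional[int], Optional[int]]

ROWS: List[Row] = [
    ("backup for mac", 20, None, None, None),
    ("backup for macbook", 9, 15, 12, 13),
    ("best backup for mac", 11, 19, 17, 11),
    ("best backup for macbook", 5, 5, 7, 5),
    ("best cloud backup for mac", 5, 8, 7, 4),
    ("cloud backup for mac", 19, 16, 19, 16),
    ("mac book backup", 3, 12, 6, 7),
    ("mac cloud backup", 11, 12, 11, 11),
]


def mean_abs_diff(rows: List[Row]) -> Dict[str, float]:
    """Average absolute gap between each source's rank and the manual check."""
    columns = {"GWT": 1, "AuthorityLabs": 2, "Moz": 3}
    manual_col = 4
    result: Dict[str, float] = {}
    for name, col in columns.items():
        # Only compare rows where both the source and the manual check have a rank.
        gaps = [
            abs(r[col] - r[manual_col])
            for r in rows
            if r[col] is not None and r[manual_col] is not None
        ]
        result[name] = sum(gaps) / len(gaps)
    return result


diffs = mean_abs_diff(ROWS)
for name, value in diffs.items():
    print(f"{name}: {value:.2f}")
# GWT: 1.71
# AuthorityLabs: 2.86
# Moz: 2.29
```

Seven comparable keywords is a very small sample, so these averages are a rough gauge of this one test rather than a verdict on any provider.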

Next, I tested Raven’s Schema Creator site. In this test AuthorityLabs and Moz were spot on with my manual checks. GWT had three ranking results that were slightly different, most likely attributable to the fact that it reports the average ranking for real-life searches.

Keyword Phrase	GWT Rank	AuthorityLabs Rank	Moz Rank	Manual Check
schema markup	8	8	8	8
schema generator	1	1	1	1
schema creator	1	1	1	1
microdata generator	4	4	4	4
schema.org generator	2	4	4	4
schema	9	12	12	12
google schema	7	9	9	9
schema tool	1	1	1	1
schema maker	1	1	1	1

Which do you need?

There are some SEOs who will always need scraped data. If your methodology includes a quasi-scientific approach to reverse-engineering Google’s SERPs, then you’ll probably always want and need scraped data. The same is true if you need to know where your competitors rank at all times or need to know the exact type of universal result.

But most clients simply want to know how well their site is performing. And most modern SEOs are looking for the real proof that their efforts are paying off – proof that looks like organic traffic and conversions.

That’s why we added results-oriented data – like traffic and goals from GWT and Google Analytics – to Raven’s new ranking report. Combined with rankings data that comes directly from Google (just like your Google Analytics data) and provides unique insights, we think it’s the perfect ranking performance report for clients.

Update: As Remko van der Zwaag mentioned in the comments, you can filter results in GWT by locale. The article originally listed local/geo data as an advantage of scraped data, which is incorrect.

30 Responses to “Rankings Compared: Scraped Rankings vs Average Rankings”

Corey Eulas August 6, 2013
Nice Jon. Love the use of actual data to support your hypothesis & curiosity. Quick question – did the traffic from those keywords that GWT said you were getting match what you were actually seeing in Google Analytics (granted it didn’t fall in not provided). Presumably “mac book backup” traffic data should be quite inaccurate in GWT.
August 6, 2013 at 10:13 am
- Jon Henshaw August 6, 2013
  I just checked GA and it reported 1 visit for the past 30 days. To me, that says “mac book backup” is a low volume search query and/or my title and meta description suck 😉 I can also presuppose that there are actually more visits, but they’re hidden by “not provided” (as I think you were suggesting).
  August 6, 2013 at 10:30 am
victorpan August 6, 2013
I’ve had an embarrassing moment where Google WMT said clicks were say, 3000 – when the actual # of visitors to said page was 1500 in the same period. Since then I’ve had trouble trusting GWT 🙁
August 6, 2013 at 10:51 am
- Jon Henshaw August 6, 2013
  Clicks and Visits are two very different things. It’s possible that the same person clicked a result multiple times, especially since sessions can last hours in GA. More details here https://support.google.com/analytics/answer/1257084?hl=en
  August 6, 2013 at 11:21 am
  - victorpan August 6, 2013
    I’m aware. Do I think the unique visitors of my website keep pressing back to the SERPS and then click on said page again? Some of them might – but the magnitude was 10X. If these were clicks from a display campaign, I’d at least know there’s click fraud detection to get my back. GWT? Not so much. I’m not the only one that doesn’t trust GWT data: http://www.portent.com/blog/analytics/google-webmaster-tools-query-data-is-worthless.htm
    “Welcome to Google Webmaster Tools, where the clicks are made up and the CTR doesn’t matter” – just my two cents.
    August 6, 2013 at 12:19 pm
    - Mike Johnson August 7, 2013
      I have used and relied on the combination of GWT and GA for quite a long time and I have honestly never seen anything like you are describing. I manage over 150 sites and have been doing so since 2005. If you told me this 2 years ago I might have agreed, but now. No. It’s far from perfect, but the data is very good.
      August 7, 2013 at 10:34 am
      - victorpan August 7, 2013
        I agree that the data is far from perfect, and I used to think the data is very good (I ran a test once). I know it’s wrong to judge things from a single incident, but we’re only human. Like you’ll stick to your experiences, I’ll remember mine.
        August 7, 2013 at 11:21 am
Kimber Scott August 6, 2013
Interesting info on GWT rankings data. I was under the impression that if you had 2 pages on the same site ranking for the same phrase that those rankings would be averaged. So if I had a page of my site ranking at #2 and another one at #22 GWT might report it as an average ranking of #11. That’s pretty deceptive to clients who then think they are not ranking at the top of the 1st page. Is that not correct?
August 6, 2013 at 12:26 pm
- Jon Henshaw August 6, 2013
  GWT takes the best (first) result from all searches and averages them out. It does not take the average of multiple results from an individual search. If an individual result had your site ranking at #2 and #22, it would count it as #2.
  August 6, 2013 at 12:58 pm
joeyoungblood August 6, 2013
I’ve seen rankings of 0 (no rankings, no search data) in GWT before, real life was far far different. Here’s a good example of Analytics vs. the Search Query Report: https://twitter.com/YoungbloodJoe/status/278972465374445569
I see this CONSISTENTLY for about 20%-30% of clients’ sites that I get access to. No matter what Google TRIES to convince us is real, the proof is in the data and it shows major issues with the report you use to create your rankings report.
August 6, 2013 at 2:13 pm
- Jon Henshaw August 6, 2013
  That’s the biggest complaint I’ve heard about GWT – its inconsistency. For the majority of the sites I have GWT and GA access to, the results are fairly consistent – not perfect, but close enough. Since I’m not Google, I have no idea why some sites seem totally off or when and if it’s going to get better. My guess is that it might be a combination of the GWT team and the GA team not being synced up (working in their own silos) and it may also have something to do with how GWT handles individual sites. It’s entirely possible that they treat sites differently based on age, authority, flags, etc…just like their algo does.
  August 6, 2013 at 4:04 pm
  - Mike Johnson August 7, 2013
    The best data definitely comes from GWT and GA integration. For whatever reason the GA data is always more accurate IMHO.
    August 7, 2013 at 10:35 am
gregory smith August 7, 2013
There’s still so much left on the table here, but nice post!
August 7, 2013 at 8:43 am
Remko August 7, 2013
Nice writeup. You say a big advantage of Scraped results is that it can be local.
While that is true, GWT provides the same option by geo-filtering. So it’s not really an advantage of Scraping.
Will Raven also provide geo- filtering for GWT rankings?
August 7, 2013 at 2:22 pm
- Jon Henshaw August 7, 2013
  Good point. You’re correct, you can filter by geo in GWT. However, when you filter “all the things” the exported data doesn’t contain the geo data. The only way to get the data for a locale is to filter it and then export it. That means we could do one or the other per campaign in Raven – you can have “all the things” OR data from a specific locale, but not both. I think that’s a good option to have though, especially for a regional campaign. We’ll definitely look into it, but it may have to wait until GWT updates their API to support retrieving the search query data and also either includes the locale in the results or allows us to get results filtered by locale.
  August 7, 2013 at 2:59 pm
  - Remko van der Zwaag August 7, 2013
    It would be great to have that feature in Raven. I work in the Netherlands for local clients mostly. For them (and for me) it’s only interesting what’s going on locally in terms of ranking.
    However, I’m with you guys; if it was up to me I wouldn’t be tracking rankings anymore (but traffic, conversions etc.). But, clients…
    August 7, 2013 at 3:32 pm
David Cohen August 7, 2013
It shouldn’t surprise anybody when nimble and progressive agencies that offer search/SEO line of business completely ditch rank tracking tools and reporting on rank, and in exchange reallocate human and financial resources to using data at the landing page level to inform search strategies for their clients.
As somebody who used to be in charge of finding and hiring SEO agencies, I can say that I lost interest in KW/ranking data years ago because it was mostly unusable and irrelevant data. And executives and boards of directors really couldn’t care less about any of that.
August 7, 2013 at 7:08 pm
Spook SEO August 7, 2013
“GWT team and the GA team not being synced up”
Good point Jon. That’s what I think is causing the inconsistency too. Because of these kinds of inconsistencies, clients sometimes take their own stand on what they think is happening based on GWT when the fact is, their info is far from the truth.
August 7, 2013 at 8:11 pm
David Gaian August 16, 2013
Great article, Jon.
Has there been a blog post at Raven (or can you reference any quality external writings/discussions on the topic you respect) addressing the whole matter of White vs Black Hat SEO practices, esp as relates to use of tool and data sets like Raven?
Thanks Jon
August 16, 2013 at 4:28 pm
- Jon Henshaw August 19, 2013
  White Hat and Black Hat is a misnomer. It’s really about risky tactics. It doesn’t have anything to do with the tools and data you’re using. The only time it would, as an example, is if you were using software that exploited a hole in WordPress that added links to posts on other people’s blogs. Reputable and mainstream marketing apps like Raven don’t do ANYTHING like that.
  August 19, 2013 at 9:55 am
wfhs December 10, 2013
It’s heartbreaking. Agencies just want to be able to provide a holistic view of search to their clients, showing what’s happening across both paid and organic search channels. Google wants to stop them at each turn. If they want to protect Adwords by blocking organic rankings and keywords on a platform level, they could at least make GWT rank data accurate.
I did see a reporting platform called BringShare that offers keywords ranking data from serps.com and Adwords/Analytics connections in a single report builder. Are they going to be told to pull out the external ranking data in the same manner?
December 10, 2013 at 8:24 pm
- Jon Henshaw December 11, 2013
  If they’re using scraped data (which it appears they are) and if they’re using the AdWords API, then I seriously doubt they’ll be able to keep it when the AdWords compliance team does their annual review with them.
  December 11, 2013 at 11:56 am
asher elran March 9, 2014
Well written, but it seems to justify Raven biz’s decision more than anything. Jon, we were testing Raven and now use it on a regular basis; we love the tool, especially the reporting module. However, it’s missing important elements that, for now at least, are a big determining factor for many web-marketers. Yes, I’m referring to organic position tracking. We use most of the techniques you’ve mentioned, but this is in **addition** to valuable organic information our clients expect to see. We obtain the data using RankTraker – the only solution that doesn’t rely on Google’s API. As for the data Google shares with us in Analytics and GWT, we already established that keyword data in GA is useless, which only leaves us with GWT. For those who don’t know, GWT is missing the keywords data as well. It appears that query data has stopped being fully recorded since late September 2013 (see here: http://goo.gl/HoqbCm). IMHO, any keyword level reporting based on Google reports will be deceiving and very inaccurate…and Google wants it this way. We, as the end users of your tool, could lose clients because of this. Instead, like many others, we use other providers that take a firm position to keep scraping global and local data. This could change in the future, but it seems the best and most accurate solution for now in conjunction with landing page level data.
March 9, 2014 at 4:19 pm