Technical SEO: The ABCs of 404 errors
Written by Jeremy Rivera and published
It was 1:15 in the morning as I pulled my Subaru Legacy packed with our luggage, a frightened cat, a bored Boston Terrier/Pug, and a pregnant wife off the freeway into Albuquerque.
After 15 hours on the road to Nashville from California to fill the position of Product Marketing Manager for Raven, it was time to find our hotel for the night, but things weren’t looking quite right. My wife had copied and pasted the address into her navigation, and turn by turn it quickly became apparent that we were in a less than savory neighborhood – and nowhere near a hotel.
We’d just received the real world equivalent of a 404 error, caused by a small typo when they left the “North” off the street name. After locking our doors and redoing the search we eventually made our way to the hotel to get some sleep before continuing our cross-country trek.
When people reach 404 errors on your site, they’re being similarly derailed from their expectations, and you quite often lose that visitor for good – bad news for a blogger or site owner. If you’re starting a formal SEO audit, or just looking for a quick, actionable item of technical SEO, fixing 404 errors is a great place to start.
In this post we’ll look at the most common sources of 404 errors and the recommended ways to correct these errors quickly and efficiently, along with some tools to help make the process a little easier.
Common causes and cures for 404 errors
A server returns a 404 status code when a browser, or a crawling bot like Googlebot navigating its way through the links on your website, requests a page that cannot be found.
This can cause problems of several sorts. For example, if you’ve got inbound links coming to those pages from other websites, you’re losing potential clients and conversions. You’re also losing the value of those links as Google seeks to determine your site’s relevance and authority. To get the best fix in place, it’s helpful to go cause by cause to identify the source of the problem.
One common cause is an href link that was typed incorrectly or incompletely: for example, leaving off the “http://,” adding an extra set of quotation marks on either side of your link, or simply misspelling the URL.
To correct this 404 error, look at the “Linked From” tab on the Error Details popup of the Crawl Errors page in Google Webmaster Tools. Once you see which pages are referencing the broken URL, edit and correct the link. As a follow-up, you should also add a 301 redirect from that malformed URL to the page the link was meant to reach, so that any visitors or bots still requesting the bad URL land on the right page. Last but certainly not least, you should return to Google Webmaster Tools and mark the error as fixed.
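If you have a long list of links to review, a short script can flag the obvious typing mistakes before you open each page. The sketch below is only a rough heuristic: the function name `link_problems` and the list of host suffixes are my own assumptions, and it cannot catch a plain misspelling, which you can only find by requesting the URL.

```python
from urllib.parse import urlsplit

# Straight and curly quotation marks that sometimes get pasted around a URL.
QUOTES = "\"'“”‘’"
# Assumption for this sketch: a short, incomplete list of common host endings.
LIKELY_HOST_SUFFIXES = (".com", ".org", ".net")


def link_problems(href: str) -> list[str]:
    """Return likely typing problems in an href value (empty list if none found)."""
    problems = []
    stripped = href.strip()

    # Extra quotation marks on either side of the link.
    if stripped and (stripped[0] in QUOTES or stripped[-1] in QUOTES):
        problems.append("stray quotation marks")
        stripped = stripped.strip(QUOTES)

    # "www.example.com/page" has no scheme, so a browser treats it as a
    # relative path on the current site and the request ends in a 404.
    has_scheme = bool(urlsplit(stripped).scheme)
    if not has_scheme and not stripped.startswith(("/", "#", "?", ".")):
        first_segment = stripped.split("/", 1)[0].lower()
        if first_segment.startswith("www.") or first_segment.endswith(LIKELY_HOST_SUFFIXES):
            problems.append("missing http:// or https://")

    return problems


if __name__ == "__main__":
    for href in ["http://example.com/page", "www.example.com/page", '"http://example.com/page"']:
        print(href, "->", link_problems(href))
```

Anything the script flags still needs a human look, since a relative link such as about.html is perfectly valid.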
Category or tag removed
When you’re using a blog platform like WordPress and add either a tag or category, the action creates an additional page on your website that has a number of links pointing to it. If you later remove those tags or categories but leave links pointed to those sections, you will send a visitor to a 404 page. If there’s enough value, it may be worth restoring those tags. If they’re not valuable, add a 301 redirect from the removed page to the closest related page.
Page is missing or renamed
Sometimes that 404 error really does exist because the page was accidentally removed or renamed after it had been published and linked to initially. Here’s where you’ll need to decide if you want to recreate or restore that page as a means of fixing the error. If you choose not to restore or recreate the page, then choose the closest related page and use a 301 redirect to send any remaining traffic to that page instead.
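When several pages have moved, it can be easier to keep the old-to-new mapping in one place and generate the redirect rules from it. Here is a minimal sketch that assumes an Apache server with mod_alias enabled and turns a mapping into `Redirect 301` lines for an .htaccess file. The function name `htaccess_redirects` and the example.com paths are made up for illustration.

```python
def htaccess_redirects(moved: dict[str, str], site_root: str) -> list[str]:
    """Build Apache 'Redirect 301' lines from a mapping of old path -> new path."""
    root = site_root.rstrip("/")
    lines = []
    for old_path, new_path in sorted(moved.items()):
        if old_path == new_path:
            # A page that redirects to itself would loop forever, so skip it.
            continue
        # mod_alias matches by path prefix, so "/old-post" also covers "/old-post/extra".
        lines.append(f"Redirect 301 {old_path} {root}{new_path}")
    return lines


if __name__ == "__main__":
    moved_pages = {
        "/old-post/": "/new-post/",
        "/services/retired-offer/": "/services/",
    }
    print("\n".join(htaccess_redirects(moved_pages, "https://www.example.com/")))
```

Paste the printed lines into your .htaccess file, then request each old URL to confirm it lands where you expect.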
Your XML sitemap can also be the culprit. Most platforms create their own XML sitemap file and keep it updated. But if you created and uploaded your own XML sitemap, you may have accidentally referenced a page that doesn’t exist. Update that file and, for good measure, add a 301 redirect from the erroneous URL to a related page!
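For a hand-built sitemap, you can compare its entries against a list of pages you know are live. The sketch below assumes the file follows the standard sitemaps.org format and that you already have the set of live URLs from somewhere (a crawl, a CMS export). The helper names `sitemap_urls` and `stale_entries` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by sitemap files that follow the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> list[str]:
    """Return every <loc> value listed in a sitemap, in document order."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def stale_entries(xml_text: str, live_urls: set[str]) -> list[str]:
    """Return sitemap URLs that are not in the set of pages known to exist."""
    return [url for url in sitemap_urls(xml_text) if url not in live_urls]


if __name__ == "__main__":
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://www.example.com/</loc></url>"
        "<url><loc>https://www.example.com/removed-page/</loc></url>"
        "</urlset>"
    )
    print(stale_entries(sample, {"https://www.example.com/"}))
```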
A couple of additional problems can make a page inaccessible. For example, if the original URL redirected to an https:// URL and those pages are later removed, the request can end in a 500 error rather than a 404. Adding a 301 redirect to a working page solves that error. You may also run into issues when moving from an ASP or PHP site that generates dynamic URLs to a static URL structure, so that pages that used to end with /pageid-12345 now have a more logical format of /topicofthepage. In that case you can add 301 redirects for whole folders and directories, but if every page needs its own rule, maintaining them by hand in the .htaccess file may be too burdensome and a programming solution may be needed.
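What a programming solution looks like depends on your platform, but the core is usually a lookup from the old page ID to the new address. This is a minimal sketch under the assumption that you can export a table of IDs and new slugs. The pattern, the function `new_location` and the sample table are illustrative, not taken from any particular CMS.

```python
import re
from typing import Optional

# Matches old-style paths such as /pageid-12345 (with or without a trailing slash).
PAGE_ID_URL = re.compile(r"^/pageid-(\d+)/?$")


def new_location(old_path: str, slugs_by_id: dict[int, str]) -> Optional[str]:
    """Return the new static path for an old dynamic URL, or None if unknown."""
    match = PAGE_ID_URL.match(old_path)
    if not match:
        return None  # not an old-style URL, so leave it alone
    slug = slugs_by_id.get(int(match.group(1)))
    # Unknown IDs return None so the caller can answer with a 404 or 410 instead.
    return f"/{slug}" if slug else None


if __name__ == "__main__":
    slugs = {12345: "topicofthepage"}
    print(new_location("/pageid-12345", slugs))
    print(new_location("/pageid-99999", slugs))
```

Your application would call a function like this for each incoming request and, when it returns a path, answer with a 301 redirect pointing there.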
You intentionally removed the page
If you discontinue selling a specific product or offering a service and really want to signal that the page and its related content have been removed for good, replace that 404 error with a 410 server response code of ‘Gone’ to speed up the process for crawlers to stop attempting to access that page.
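To keep the different fixes straight, it can help to see them side by side as one decision: live pages answer 200, moved pages 301, intentionally retired pages 410, and everything else 404. This is a simplified sketch of that logic, not how any particular server implements it. The function `status_for` and its arguments are assumptions for illustration.

```python
from http import HTTPStatus


def status_for(
    path: str,
    live_pages: set[str],
    redirects: dict[str, str],
    retired_pages: set[str],
) -> HTTPStatus:
    """Pick the response status for a requested path."""
    if path in live_pages:
        return HTTPStatus.OK  # 200: the page exists
    if path in redirects:
        return HTTPStatus.MOVED_PERMANENTLY  # 301: send visitors to the new page
    if path in retired_pages:
        return HTTPStatus.GONE  # 410: removed on purpose, do not come back
    return HTTPStatus.NOT_FOUND  # 404: nothing known about this path


if __name__ == "__main__":
    live = {"/products/new-widget"}
    moved = {"/products/old-widget": "/products/new-widget"}
    retired = {"/products/discontinued-widget"}
    for path in ["/products/new-widget", "/products/old-widget",
                 "/products/discontinued-widget", "/products/typo"]:
        print(path, int(status_for(path, live, moved, retired)))
```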
Tools for discovering 404 errors
Of course the first step to fixing a problem is identifying that you have a problem. Luckily, there are many different tools that can help you identify broken links without having to manually check and click every link on your site. Here are three tools you can use to identify 404s quickly so you can get them fixed ASAP.
Google Webmaster Tools
This Google tool has slowly evolved to provide a direct view of errors Google has spotted on your site. If you already have a Raven Tools account, you can access Webmaster Tools at Site > Webmaster. Once you authorize the connection, it’s hassle free.
Once you’re in Webmaster Tools, click through to Health > Crawl Errors to get the latest information from Google on the errors it’s seeing on your website. Click “Not Found” to get a table of all of your broken links. If you click on a specific link, you’ll see details on the nature of the error and what pages are linking to this page, which may give you a clue about the cause of the error.
Once you fix a 404 error, it’s recommended that you click “Mark As Fixed” to indicate the issue has been resolved. You can also Fetch the page as Googlebot to confirm that your corrective action was applied properly.
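You can also double-check a fix yourself by requesting the URL and looking at the status code that comes back. The sketch below uses Python's standard http.client module, which does not follow redirects, so a 301 shows up as a 301. The function names and wording of the messages are my own, and it assumes the server answers HEAD requests the same way it answers normal page views, which is not true of every site.

```python
import http.client
from typing import Optional
from urllib.parse import urlsplit


def fetch_status(url: str, timeout: float = 10.0) -> tuple[int, Optional[str]]:
    """Send a HEAD request and return (status code, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def describe(status: int, location: Optional[str]) -> str:
    """Turn a status code into a short note for an audit checklist."""
    if status == 200:
        return "OK: page is live"
    if status == 301:
        return f"301: permanently redirected to {location}"
    if status == 404:
        return "404: still not found"
    if status == 410:
        return "410: intentionally gone"
    return f"unexpected status {status}"


if __name__ == "__main__":
    # Replace with a URL you have just fixed; example.com is a placeholder.
    code, target = fetch_status("https://www.example.com/")
    print(describe(code, target))
```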
Xenu Link Sleuth
Xenu’s Link Sleuth is a free desktop program for Windows that crawls your site and reports broken links.
Screaming Frog SEO Spider
Screaming Frog SEO Spider is a powerful desktop program that gives you your entire site structure, metadata and other pieces of information. But you can also use it to spot your 404 errors while you’re slicing and dicing the data for other action items.
Update by @RavenJeremy 7/26/2012: @pageoneresults shared an extra tool you can use: http://urivalet.com which allows you to query a page, see your server response headers, tweak the user agent and view page load speed, page objects and status of all your on-page links.
Getting the job done
Hunting down 404 errors can be a tedious job, but it’s one of those steps that should be part of any good site audit. So buckle down and fix those pesky 404 errors! Know of some better tools or extra steps for fixes? Let me know in the comments!