It was 1:15 in the morning as I pulled my Subaru Legacy packed with our luggage, a frightened cat, a bored Boston Terrier/Pug, and a pregnant wife off the freeway into Albuquerque.
We were 15 hours into the drive from California to Nashville, where I was about to fill the position of Product Marketing Manager for Raven, and it was time to find our hotel for the night. But things weren’t looking quite right. My wife had copied and pasted the address into her navigation, and turn by turn it became apparent that we were in a less than savory neighborhood, nowhere near a hotel.
We’d just received the real-world equivalent of a 404 error, caused by a small mistake in the address: the “North” had been left off the street name. After locking our doors and redoing the search, we eventually made our way to the hotel to get some sleep before continuing our cross-country trek.
When people reach 404 errors on your site, they’re being similarly derailed from their expectations, and you quite often lose that visitor for good – bad news for a blogger or site owner. If you’re starting a formal SEO audit, or just looking for a quick, actionable item of technical SEO, fixing 404 errors is a great place to start.
In this post we’ll look at the most common sources of 404 errors and the recommended ways to correct these errors quickly and efficiently, along with some tools to help make the process a little easier.
Common causes and cures for 404 errors
A server returns a 404 status code when a browser, or a crawling bot like Googlebot, requests a page that cannot be found, typically while following one of the links on your website.
This can cause problems of several sorts. For example, if you’ve got inbound links coming to those pages from other websites, you’re losing potential clients and conversions. You’re also losing the value of those links as Google seeks to determine your site’s relevance and authority. To get the best fix in place, it’s helpful to go cause by cause to identify the source of the problem.
Malformed URLs
A malformed URL is the result of a link’s href being typed incorrectly or incompletely: for example, leaving off the “http://,” adding an extra set of quotation marks on either side of your link, or simply misspelling the URL.
To correct this 404 error, look at the “Linked From” tab on the Error Details popup of the Crawl Errors page in Google Webmaster Tools (now Google Search Console). Once you see which pages are referencing the broken URL, edit and correct the link on each of them. As a follow-up, you should also add a 301 redirect from the malformed URL to the page it was meant to reach, so that any visitors or bots still requesting the bad URL end up in the right place. Last but certainly not least, return to Google Webmaster Tools and mark the error as corrected.
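As a rough illustration of catching these mistakes before they are published, here is a minimal Python sketch that flags two of the problems mentioned above: stray quotation marks around the href and a missing “http://”. The function name `href_problems` is hypothetical, and the “starts with www.” check is only a heuristic that I am assuming is good enough for a first pass; it will not catch misspelled URLs.

```python
from urllib.parse import urlparse

# Straight and curly quotation marks that sometimes get pasted into an href.
QUOTES = "\"'\u201c\u201d"


def href_problems(href: str) -> list[str]:
    """Return a list of likely problems with an href value (empty if none found)."""
    problems = []
    value = href.strip()

    # Extra quotation marks on either side of the link.
    if value and (value[0] in QUOTES or value[-1] in QUOTES):
        problems.append("stray quotation marks")

    cleaned = value.strip(QUOTES)
    parsed = urlparse(cleaned)

    # "www.example.com/page" has no scheme, so a browser treats it as a
    # relative path on the current site, which usually ends in a 404.
    if not parsed.scheme and not parsed.netloc and cleaned.lower().startswith("www."):
        problems.append("missing http:// or https://")

    return problems


if __name__ == "__main__":
    for candidate in ["www.example.com/page", '"http://example.com/page"', "/about"]:
        print(candidate, "->", href_problems(candidate))
```

Running a check like this over the links in a draft is cheap, but it complements rather than replaces the crawl error report.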
Category or tag removed
When you’re using a blog platform like WordPress and add either a tag or a category, the action creates an additional page on your website that has a number of links pointing to it. If you later remove those tags or categories but leave links pointing to their pages, you will send visitors to a 404 page. If there’s enough value, it may be worth restoring those tags. If they’re not valuable, add a 301 redirect to the closest related page.
Page is missing or renamed
Sometimes that 404 error really does exist because the page was accidentally removed or renamed after it had been published and linked to initially. Here’s where you’ll need to decide if you want to recreate or restore that page as a means of fixing the error. If you choose not to restore or recreate the page, then choose the closest related page and use a 301 redirect to send any remaining traffic to that page instead.
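When there are many missing pages, a small script can propose a redirect target for each one. The sketch below uses Python’s standard `difflib.get_close_matches` to pick the live path whose text is most similar to the missing one. The function name `suggest_redirect_target` and the 0.6 cutoff are my own assumptions, and string similarity is only a stand-in for “closest related page”, so treat the output as a suggestion to review by hand.

```python
from difflib import get_close_matches
from typing import Iterable, Optional


def suggest_redirect_target(
    missing_path: str, live_paths: Iterable[str], cutoff: float = 0.6
) -> Optional[str]:
    """Suggest the live path most similar in spelling to a missing path, or None."""
    matches = get_close_matches(missing_path, list(live_paths), n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    live = ["/blog/fixing-404-errors", "/about", "/contact"]
    print(suggest_redirect_target("/blog/fixing-404-erors", live))
```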
Sitemap errors
Most platforms create their own XML sitemap file and keep it updated. But if you created and uploaded your own XML sitemap, you may have accidentally referenced a nonexistent page. Update that file and, for good measure, add a 301 redirect from the erroneous URL to a related page!
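For a hand-built sitemap, one way to spot these entries is to compare the URLs it lists against the pages you know exist. The sketch below assumes a standard sitemaps.org `urlset` file and assumes you can supply the set of live URLs yourself (for example, from your CMS export); the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> list[str]:
    """Return every <loc> value listed in a sitemap <urlset>."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def urls_missing_from_site(xml_text: str, live_urls: set[str]) -> list[str]:
    """Return sitemap URLs that are not in the set of known live URLs."""
    return [url for url in sitemap_urls(xml_text) if url not in live_urls]


if __name__ == "__main__":
    # Hypothetical sitemap content, for illustration only.
    sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url><loc>https://example.com/</loc></url>
      <url><loc>https://example.com/old-page</loc></url>
    </urlset>"""
    print(urls_missing_from_site(sample, {"https://example.com/"}))
```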
Server errors
There are a couple of additional errors that can make a page inaccessible. For example, if the original URL redirected to an https:// address and those pages are later removed, the request can return a 500 error. You can add a 301 redirect to solve that error. You may also run into issues when moving from an ASP or PHP site that generates dynamic URLs to a static URL structure. This means that pages used to end with /pageid-12345 and now have a more logical format of /topicofthepage. If this is the case, you can add 301 redirects for folders and directories, but on a large site it may be too burdensome to maintain every 301 redirect by hand in the .htaccess file, and a programmatic solution may be needed.
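One possible “programming solution” is to generate the redirect lines from a table of old and new paths rather than typing each one. The sketch below emits Apache `Redirect 301` directives (from mod_alias) that could be pasted into an .htaccess file. The function name `redirect_rules` and the idea that you already have an old-to-new mapping are assumptions on my part.

```python
def redirect_rules(mapping: dict[str, str]) -> list[str]:
    """Build one Apache 'Redirect 301' line per old path, sorted by old path."""
    rules = []
    for old_path, new_path in sorted(mapping.items()):
        if old_path == new_path:
            continue  # Nothing to redirect.
        for path in (old_path, new_path):
            # Paths with whitespace would need quoting; reject them in this sketch.
            if not path.startswith("/") or any(ch.isspace() for ch in path):
                raise ValueError(f"unsupported path: {path!r}")
        rules.append(f"Redirect 301 {old_path} {new_path}")
    return rules


if __name__ == "__main__":
    # Paths taken from the example above; the second pair is hypothetical.
    old_to_new = {
        "/pageid-12345": "/topicofthepage",
        "/pageid-67890": "/another-topic",
    }
    print("\n".join(redirect_rules(old_to_new)))
```

Keep in mind that `Redirect` matches by path prefix, so it is worth testing the generated rules on a staging copy before deploying them.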
You intentionally removed the page
If you discontinue a specific product or service and really want to signal that the page and its related content are gone for good, replace that 404 error with a 410 server response code of ‘Gone.’ This tells crawlers the removal is deliberate, so they stop attempting to access that page sooner.
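To make the difference between these responses concrete, here is a small Python sketch of the decision a site could make for each requested path: 200 for a live page, 301 for a moved page, 410 for a page removed on purpose, and 404 for everything else. The function name `response_for` and the example paths are hypothetical; a real site would make this decision in its server or CMS configuration.

```python
from http import HTTPStatus
from typing import Optional


def response_for(
    path: str,
    live_pages: set[str],
    redirects: dict[str, str],
    removed: set[str],
) -> tuple[HTTPStatus, Optional[str]]:
    """Return the status code, plus a redirect target when the status is 301."""
    if path in live_pages:
        return HTTPStatus.OK, None
    if path in redirects:
        return HTTPStatus.MOVED_PERMANENTLY, redirects[path]
    if path in removed:
        # Intentionally removed: say so explicitly with 410 Gone.
        return HTTPStatus.GONE, None
    return HTTPStatus.NOT_FOUND, None


if __name__ == "__main__":
    status, target = response_for(
        "/products/discontinued-widget",
        live_pages={"/", "/products"},
        redirects={"/old-products": "/products"},
        removed={"/products/discontinued-widget"},
    )
    print(int(status), status.phrase, target)
```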
Tools for discovering 404 errors
Of course, the first step to fixing a problem is identifying that you have a problem. Luckily, there are many different tools that can help you identify broken links without having to manually check and click every link on your site. Here are two tools you can use to identify 404s quickly so you can get them fixed ASAP.
Google Search Console
Google’s Search Console provides a list of 404 pages that it finds as it crawls your site. It’s recommended that once you fix a 404 error you click “Mark As Fixed,” to indicate the issue has been resolved. You can also Fetch the page as Googlebot to confirm that your corrective action was applied properly.
Site Auditor
The quickest way to find all of the 404 errors on your site is to use the Site Auditor. This tool will crawl and analyze every page on your site and report on all the 404 page errors it finds.
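If you only need to spot-check a short list of URLs between full crawls, a few lines of Python can do it. This is a minimal sketch, not a replacement for the tools above: it sends a HEAD request with the standard `urllib` module and reports any URL that answers 404 or 410. The function names are hypothetical, redirects are followed automatically, and network failures such as DNS errors are not handled here.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status code for a URL, using a HEAD request."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is still what we want.
        return error.code


def find_broken(
    urls: Iterable[str], get_status: Callable[[str], int] = fetch_status
) -> dict[str, int]:
    """Map each URL that returns 404 or 410 to its status code."""
    broken = {}
    for url in urls:
        status = get_status(url)
        if status in (404, 410):
            broken[url] = status
    return broken


if __name__ == "__main__":
    # Replace with URLs from your own site.
    print(find_broken(["https://example.com/"]))
```

The `get_status` parameter exists so the checker can be exercised with canned responses instead of live requests.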
Getting the job done
Hunting down 404 errors can be a tedious job, but it’s one of those steps that should be part of any good site audit. So buckle down and fix those pesky 404 errors! Know of some better tools or extra steps for fixes? Let me know in the comments!
Nice writeup Jeremy. Just printed along with a couple of other similar articles to jump in and start reducing 404s and 500s. I’d also add Bing WT to the list to compare with Google WT and see if the errors being detected are consistent. Doesn’t take much time to add and could provide some valuable info that the other missed.
Thanks for writing this helpful article.
One more for you: Using https:// when the page is just http://. We see that often in our crawls.