Every page you visit on the Internet will return something called a ‘status code’, a code consisting of three numbers that communicate to the requester the status of their request for a particular page.

A 404 is ultimately an error message by default and is a very frequent and recognisable message experienced by every single internet user. 404s are not inherently bad, they exist for a very good reason.

Their ambiguous nature however means that search engines (and your users, and your rankings) will often benefit from some direction on what action to take when they come across them. Without this direction and left unmanaged, 404 errors are problematic.

Here are the SEO impacts and the possible solutions...

Status codes

What is it?

Every page you visit on the Internet will return something called a ‘status code’, a code consisting of three numbers that communicate to the requester the status of their request for a particular page.

These can be set by the administrator of a server or be a default server communication based on certain criteria being met (or not met). The following are the most frequently returned status codes:

  • 200 OK: The page you are requesting has been found and here it is.
  • 301 Moved Permanently: The page you have requested has moved permanently from the location you’ve requested it from (Location A) to another location (Location B), and here it is.
  • 302 Found: The page you have requested has moved temporarily from the location you’ve requested it from (Location A) to another location (Location B), and here it is.
  • 404 Not Found: The server you are requesting the page from has acknowledged your request but the page you are requesting could not be found.

The last of these, the 404, is an ambiguous status code as the server cannot find what you are looking for but has made no attempt to contextualise why that might be.

Is it because the page was removed by the webmaster, or the URL was mistyped by a user? Is it because a malformed internal or external link was followed to the failed location from another website?

Or is it because the page was deleted or renamed, intentionally or unintentionally?

A 404 is ultimately an error message by default and is a very frequent and recognisable message experienced by every single Internet user.

You can check whether a URL is delivering a 404 response by using the ‘Fetch as Googlebot’ feature in Google Webmaster Tools, or with one of the many tools, such as Xenu, that crawl your site and identify every 404.

The check is important because many people have 404 pages that look like 404 pages, complete with a standard ‘Something went wrong’ message, but the implementation was incorrect and the response actually delivered is a ‘200 OK’, i.e. it looks like a 404, reads like one, but technically isn’t because the status code returned isn’t a 404.

That is called a ‘Soft 404’ and is far more common than you might think – even Intel has made that mistake and academic institutions too.

Intel 404
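
The check described above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive test: the function names and the list of error phrases are assumptions made for the example, and a real soft 404 may use wording that this list does not catch.

```python
import urllib.error
import urllib.request
from typing import Tuple

# Hypothetical phrases that suggest an error page; extend to suit your own site.
ERROR_PHRASES = ("not found", "something went wrong", "page does not exist")


def classify(status: int, body: str) -> str:
    """Label a response as a real 404, a soft 404, ok, or other."""
    if status in (404, 410):
        return "real-404"
    if status == 200:
        lowered = body.lower()
        if any(phrase in lowered for phrase in ERROR_PHRASES):
            # Looks like an error page but was served with 200 OK.
            return "soft-404"
        return "ok"
    return "other"


def fetch(url: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Return (status code, body). urlopen follows redirects, so this is the final status."""
    request = urllib.request.Request(url, headers={"User-Agent": "status-check-sketch"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as exceptions; the code and body are still available.
        return error.code, error.read().decode("utf-8", errors="replace")
```

Calling `classify(*fetch(url))` on a URL that you know should not exist is a quick way to see whether your own error page returns a real 404.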

What are the SEO impacts?

Firstly, 404s are not inherently bad. They exist for a very good reason and the search engines expect to see them on most sites. Their ambiguous nature however means that search engines (and your users, and your rankings) will often benefit from some direction on what action to take when they come across them.

Without this direction and left unmanaged, 404 errors are problematic for two reasons:

Firstly, 404s often introduce link, page and site integrity and fidelity issues. At the most basic, 404s on your site can break crawl paths and impact on accessibility, and attempts to manage 404s often create even bigger problems, for example when SEOs and webmasters make poor decisions around where to 301 redirect them.

Furthermore, a search engine must make a judgement call on a site in its entirety if it is seeing a huge number of 404s as a percentage of all pages on the site.

Secondly, search engines allocate link equity across the pages of the internet by following links from page to page. A 404 header response breaks that chain, so a search engine needs to decide how to deal with that algorithmically.

Let’s call that a ‘link sink’. The implication of a ‘sunk cost’ is quite intentional, given the marketing and proactivity that may have gone into earning a link that now ends in a 404 on your site.

With big sites, and those that may have accumulated a large number of 404 pages over time, the quantity of lost link juice may be substantial and herding it would be a legitimate and good use of your time, as well as using your default approach to 404s to pre-empt the most common problems.

Ultimately, SEOs and webmasters will very likely have existing 404 problems, deficiencies, and inefficiencies to resolve, but also need to put in place a robust infrastructure and process for it to be as self-maintaining and optimising as possible, particularly for huge sites.

What are the possible solutions?

SEOs and webmasters typically believe that they have four choices with regards to how to manage 404 pages.

1. Do nothing

Search engines are really smart these days and some SEOs and webmasters believe that there’s very little value to be found in trying to manage 404s and that, assuming the site is configured properly, the search engines will pretty much take care of everything.

2. Use a soft 404 rather than a real one

The rationale here for many is that a real 404 cannot be fully manipulated from an SEO perspective because, by its nature, it instructs a search engine to purge the page from its index.

With a soft 404 the page can contain links to your commercial pages and you can ‘funnel’ link equity around the site like the administrator of a complex aqueduct.

3. 301 redirect all 404 pages to the homepage

Some SEOs believe that there should be no 404s returned by the web server...ever.

This school of thought dictates that every 404 be 301 redirected to the homepage automatically as and when they materialise to preserve link equity and also give consumers a starting position if they were to come to the site via that 404.

4. 301 redirect all 404 pages to a related and relevant live page

As above but with some logic to dictate where a 404 page should be redirected based on page relevance, funnelling link equity to arguably more appropriate pages than just the homepage.

In reality, it is a combination of those four solutions that will be right and each solution will be different depending on the site in question.

Guidelines to maximise SEO value

All solutions, however, must be consistent with the following guidelines to maximise SEO value:

1. Do not use soft 404s and test your 404s to make sure that they have been implemented correctly.

Alternatively, you can use a 410 status code rather than a 404. In 2007 Google suggested that it treated 404 and 410 as identical, but by 2009 it suggested that it treated them differently and that a 410 may expedite the purge.

At the very least they’ll be deemed comparable in intent. If you want to create a custom 404 then you can do so with it still being a real one.

You may be tempted to create a custom page that is quirky and innovative so that it can attract links, whose link juice can then be funnelled around the site via links on that custom 404.

This would only work if it were a soft 404 page, not a real one (as the search engines won’t follow links from a real 404 page).

Whilst these can be incredibly cool, they do not come with the other benefits of using real 404s (automatic housekeeping, link juice preservation, intelligent redirection for consumers, etc, etc).

So, if you want a custom, novelty 404, just make sure it returns a real 404 status code, but be willing to forego any links that it might attract. You should just consider it viral marketing, as opposed to SEO marketing.
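
As a rough sketch of a custom page that still returns a real 404, here is a tiny WSGI application in Python. The page contents, the `KNOWN_PAGES` table and the `app` name are assumptions made for illustration; your own framework or server will have its own way of setting the status line.

```python
# Hypothetical site content: paths that exist, mapped to their HTML.
KNOWN_PAGES = {"/": "<h1>Home</h1>"}

# A friendly, branded error page. It can be as quirky as you like.
CUSTOM_404_BODY = (
    "<h1>Page not found</h1>"
    "<p>Try the <a href='/'>homepage</a> or use the search box.</p>"
)


def app(environ, start_response):
    """Serve known pages with 200 and everything else with a real 404."""
    path = environ.get("PATH_INFO", "/")
    if path in KNOWN_PAGES:
        status, body = "200 OK", KNOWN_PAGES[path]
    else:
        # The important part: the custom body goes out with a 404 status line,
        # not 200 OK, so it is not a soft 404.
        status, body = "404 Not Found", CUSTOM_404_BODY
    payload = body.encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ]
    start_response(status, headers)
    return [payload]
```

Any WSGI server can host `app`. Whatever the page looks like, the point is that the status line and the body are set independently of each other.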

2. 404 pages that receive traffic should be redirected

They should be redirected to a page that is the most appropriate to its original topic but that will also not jar with human users if and when they are redirected.

A good way of doing this is to have a search box on your 404 page and see what people search for after arriving at such a page, which then will help determine where those people should be redirected to.

Always remember that in many cases you aren’t just redirecting pages and search engines, but real people, with real money, and real buying intent.

3. 404 pages that have inbound links from other websites should be redirected to pages that are consistent with the anchor text mix of the links 

If there is no page whose content fits the anchor text profile of those links, so that any redirect would conflict with how Google perceives the target page, then 301 redirect the 404 to the sitemap instead (or to the homepage if the anchor text profile of the inbound links is consistent with it).
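
One crude way to approximate ‘consistent with the anchor text mix’ is to compare the words used in the inbound anchors with the words that describe each candidate page. The sketch below does that with a simple word-overlap score. The scoring method, the 0.2 threshold, the `/sitemap` fallback path and all of the names are assumptions for illustration, not how any search engine measures relevance.

```python
import re
from typing import Dict, Iterable, Set


def tokens(text: str) -> Set[str]:
    """Lower-case a string and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: shared words divided by all words."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def choose_redirect_target(
    anchor_texts: Iterable[str],
    candidates: Dict[str, str],
    fallback: str = "/sitemap",
    threshold: float = 0.2,
) -> str:
    """Pick the candidate page whose topic best matches the inbound anchor text.

    `candidates` maps a live path to a short description of its topic.
    If nothing scores at or above `threshold`, fall back to the sitemap.
    """
    anchor_tokens: Set[str] = set()
    for text in anchor_texts:
        anchor_tokens |= tokens(text)

    best_path, best_score = fallback, 0.0
    for path, topic in candidates.items():
        score = overlap(anchor_tokens, tokens(topic))
        if score > best_score:
            best_path, best_score = path, score
    return best_path if best_score >= threshold else fallback
```

For example, anchors such as ‘mobile analytics guide’ would favour a page described as ‘mobile analytics’, while generic anchors like ‘click here’ match nothing and fall through to the sitemap.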

4. Leave 404 pages that have no traffic or link value as they are and the search engines will purge them from the index.

If speed is of the essence then a 410 may expedite matters. Remove all links to those pages from your site though to conserve link equity and improve your user experience. 
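
Finding the internal links that still point at dead pages is easy to automate. Here is a minimal sketch using Python’s standard HTML parser; the function name and the idea of passing in a set of known dead paths are assumptions for the example, and it only matches hrefs exactly as written in the markup.

```python
from html.parser import HTMLParser
from typing import List, Set


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_to_dead_pages(html: str, dead_paths: Set[str]) -> List[str]:
    """Return the hrefs in `html` that point at a known 404 path."""
    collector = LinkCollector()
    collector.feed(html)
    return [href for href in collector.hrefs if href in dead_paths]
```

Run over each page of the site, this gives a list of links to remove or repoint.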

Bespoking an approach based on the guidelines above can be done in a number of ways, including building a custom 404 handler.

This is a method of adding your own custom code to how your server deals with 404s, including conditional arguments before returning the 404 message. For example, you could code your 404 handler to check for the requested URL in a database you might have to determine where to redirect it to.

You could even have the 404 handler check that the URL receives traffic and if it does then to redirect it to a page that has the closest anchor text link profile, and if it doesn’t receive traffic or links to leave it be, etc, etc.
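
The decision logic of such a handler might look something like the following Python sketch. Everything here is hypothetical: the `UrlStats` record, the redirect map and the `decide` function stand in for whatever database and analytics data you actually have, and the sketch only returns the status code and target rather than wiring into a particular web server.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class UrlStats:
    """Hypothetical per-URL figures pulled from analytics and link data."""
    monthly_visits: int = 0
    inbound_links: int = 0


def decide(
    path: str,
    redirect_map: Dict[str, str],
    stats: Dict[str, UrlStats],
    expedite_removal: bool = False,
) -> Tuple[int, Optional[str]]:
    """Return (status code, redirect target or None) for a missing URL."""
    url_stats = stats.get(path, UrlStats())
    has_value = url_stats.monthly_visits > 0 or url_stats.inbound_links > 0
    target = redirect_map.get(path)

    if has_value and target is not None:
        # The page still gets traffic or links and a relevant page is mapped.
        return 301, target
    if expedite_removal:
        # No redirect applies and we want the URL purged quickly.
        return 410, None
    # Default: a real 404. URLs with value but no mapping also land here
    # and are worth reviewing by hand.
    return 404, None
```

Keeping the decision separate from the server code makes it easy to test the rules on a list of URLs before switching them on.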

There are practically no limitations to the power of a customised 404 handler other than determining what the effort versus benefit might be of the coding effort.

Typically it is monstrously large and complex websites that would benefit most from that level of automated intelligence. All sites large or small will however benefit from an optimal and consistent approach to the management of 404s.

Pros and cons of 404 solutions

Published 5 September, 2013 by Andreas Pouros

Andreas Pouros is COO at Greenlight and a contributor to Econsultancy. You can connect with him on Google+.

Comments (4)

Dennis

Good post Andreas. I actually get a lot of questions from clients that are obsessed with 404's that they see in GWT. They worry about the urls that weren't even really there (as scraper sites pointed to a url that never really existed)

I can just point them to your article now lol :)

Keep it up!

over 2 years ago

Kristian Humle Lauritsen

Great extensive post on the area.

Another 404 mistake I have encountered several times is not having Web Analytics installed on your 404 pages. By having this you can more closely monitor the volume of these pages. But more important SEO wise you are able to track the external referrers to the 404 pages and identify external links that can easily be optimised to link to correct pages for optimal link value.

over 2 years ago

J P Nayak

Truly most irritating to get 404 error as a result & worst possible user experience.
Really value reading this post. Cover almost everything related 404 error.

over 2 years ago

Shakhboz Sidikov, Managing Director at Adigmo

This report is very value to read and to get some solutions , thanks for the advices.
Shakhboz Sidikov
http://www.Adigmo.com/

about 2 years ago
