Search Engine Accessibility: Easily Missed Checks That Make a Huge Difference


12th July 2010

I’m not kidding about this: I recently came across a website that was blocking the user agent of a very well known search engine. No one at the company had any idea, and the oversight had been costing them a significant amount of traffic for weeks and weeks.

Time to get the honesty box out: when was the last time you switched user agents? Checked a 304 Not Modified response? Made sure your canonical www redirect was working correctly? Some things are so easily missed in today’s “out of the box” code world. Here are five quick checks that are easily overlooked, but can save hours of head scratching!

Check your canonical redirects and domain inventory

OK, if you’re a seasoned old-timer, there’s nothing new here – but be honest! When was the last time you checked your canonical redirects? Does your “www” redirect in, or out (depending on which you prefer), with a 301 server header response? Mine does – but I just checked SEOgadget’s for the first time in six months. The same tip applies to case redirects, trailing slashes and even your redirected domain inventory. Remember, web server configurations can change, often without the SEO being made aware.
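To make the check above concrete, here’s a minimal sketch of the logic: take the status code and Location header returned by a request to the non-preferred hostname, and classify the result. The function name and the example domain are hypothetical, and a real check would also cover case and trailing-slash variants.

```python
# Hypothetical sketch: classify the response a canonical ("www" vs non-www)
# check receives when requesting the non-preferred hostname.
from urllib.parse import urlparse

def check_canonical_redirect(status: int, location: str, preferred_host: str) -> str:
    """Return a verdict on a canonical redirect response."""
    if status == 301 and urlparse(location).hostname == preferred_host:
        return "ok"                      # permanent redirect to the preferred host
    if status in (302, 307):
        return "temporary-redirect"      # works, but passes less signal than a 301
    if 200 <= status < 300:
        return "no-redirect"             # both hostnames resolve: duplicate content risk
    return "broken"

# e.g. a request to the non-www hostname answered with a 301 to www:
print(check_canonical_redirect(301, "http://www.example.com/", "www.example.com"))  # ok
```

Run this against every domain and hostname variant in your inventory; anything other than “ok” deserves a closer look.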

Test your website is accessible with JavaScript disabled

Some websites won’t serve content to a JavaScript-disabled browser. It’s an age-old problem that crops up from time to time, and a nasty one for search engine traffic. I got an email from a concerned webmaster in Germany who, after migrating to a new website, had lost all of his search engine traffic. JavaScript was the problem. Disabling JavaScript is easy in Firefox, using the SEOmoz toolbar or the Web Developer Toolbar. Browse around your site and make sure all is well.
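You can approximate the same check from a script: a plain HTTP fetch never executes JavaScript, so the raw HTML is roughly what a JavaScript-disabled visitor (or a crawler of this era) receives. This is a hypothetical sketch; the function name and example markup are illustrative.

```python
# Hypothetical sketch: check that key content appears in the raw HTML,
# i.e. before any JavaScript has had a chance to run.
def content_visible_without_js(raw_html: str, expected_phrases: list) -> bool:
    """True if every expected phrase appears in the raw (pre-JavaScript) HTML."""
    return all(phrase in raw_html for phrase in expected_phrases)

# An "empty shell" page that builds everything client-side:
empty_shell = '<html><body><script src="/app.js"></script></body></html>'
# A server-rendered page with the content in the source:
server_rendered = "<html><body><h1>Blue Widgets</h1><p>Our full range.</p></body></html>"

print(content_visible_without_js(empty_shell, ["Blue Widgets"]))      # False
print(content_visible_without_js(server_rendered, ["Blue Widgets"]))  # True
```

If your key landing pages look like the empty shell when fetched this way, search engines are likely seeing the same nothing.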

Periodically browse the internet with a different user agent

In the example at the beginning of this post, I mentioned a single search engine bot’s user agent being restricted from crawling a site. I can’t remember the last time this problem cropped up, it’s so rare! Browsing the internet with your user agent set to, say, MSNbot (or Bingbot from October 2010) can reveal some fascinating oversights, errors or, dare I say, cloaking. SEOmoz’s toolbar and User Agent Switcher both offer the capability to switch user agents in Firefox.
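The same test can be scripted. Here’s a hypothetical sketch that builds a request presenting a crawler’s user agent string; the UA string below is illustrative rather than an exact copy of the real Bingbot header, and the URL is a placeholder.

```python
# Hypothetical sketch: request a page while presenting a search engine
# bot's user agent, to spot UA-based blocking or cloaking.
import urllib.request

BOT_UA = "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"

def bot_request(url: str) -> urllib.request.Request:
    """Build a request that identifies itself as a crawler."""
    return urllib.request.Request(url, headers={"User-Agent": BOT_UA})

req = bot_request("http://www.example.com/")
print(req.get_header("User-agent"))  # the bot UA string set above
# urllib.request.urlopen(req) would then fetch the page "as the bot";
# diff that response against a normal browser fetch to spot differences.
```

A blocked or materially different response for the bot user agent is exactly the kind of oversight (or cloaking) this check is meant to surface.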

Beyond 404s – server header checks that get missed

Beyond checking that your error pages produce a 404 (and that Google Webmaster Tools isn’t reporting too many), you might want to dig into your server header responses a little deeper. For example, a “304 Not Modified” is a response to an If-Modified-Since field in the client request header. In English: some web servers will respond with a “not modified” if the page requested hasn’t changed since the last time it was crawled. I’ve seen 304 responses handled really badly. In one situation, a website was responding normally to all requests except those where the If-Modified-Since header field was present. Instead of returning the correct 304 response, the server collapsed spectacularly with a 403 error. Oops! Test your site with Feed the Bot’s awesome 304 header checker tool (one of my favourite SEO tools, ever). Are 304 responses worth worrying about? Yes, if you have a large site. Bing and Google both support If-Modified-Since requests, and it’s worth comparing crawl coverage for pages with and without the conditional response active.
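A simple way to reproduce this check by hand: request the same URL twice, the second time with an If-Modified-Since header, and compare the two status codes. The sketch below is hypothetical (function names and the example date are mine), but the classification mirrors the scenarios described above.

```python
# Hypothetical sketch: compare an unconditional request's status code with
# the status code returned when If-Modified-Since is present.
import urllib.request

def classify_conditional(status_plain: int, status_conditional: int) -> str:
    """Interpret the pair of status codes from the two requests."""
    if status_conditional == 304:
        return "supports-304"         # server honours If-Modified-Since
    if status_conditional == status_plain == 200:
        return "ignores-conditional"  # harmless, but wastes crawl bandwidth
    return "broken"                   # e.g. the 403 failure described above

def conditional_request(url: str, since: str) -> urllib.request.Request:
    """Build the second, conditional request."""
    return urllib.request.Request(url, headers={"If-Modified-Since": since})

# A healthy server: 200 normally, 304 to the conditional request.
print(classify_conditional(200, 304))  # supports-304
# The broken server from the example above: 200 normally, 403 conditionally.
print(classify_conditional(200, 403))  # broken
```

On a large site, “supports-304” means crawlers can re-verify unchanged pages cheaply instead of re-downloading them; “broken” means conditional requests are actively failing.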

While we’re on the subject of server headers

Ever look out for the X-Robots-Tag? X-Robots-Tag is part of the robots exclusion protocol (REP) and can be found in the server header response of a web page. You can noarchive, noindex and nofollow with an X-Robots-Tag, so it’s probably worth checking to see if something unexpected is lurking. You could even try checking for X-Robots-Tag with (and without) your user agent configured as a search engine… What are your oft-overlooked but seriously handy search engine accessibility checks?
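The X-Robots-Tag check above can be sketched as a small parser that flags restrictive directives in the header value. This is a simplified, hypothetical sketch: real header values can also carry a user-agent prefix (e.g. “googlebot: noindex”), which it doesn’t handle.

```python
# Hypothetical sketch: flag restrictive REP directives in an
# X-Robots-Tag header value.
RESTRICTIVE = {"noindex", "nofollow", "noarchive", "none"}

def restrictive_directives(header_value: str) -> set:
    """Return any restrictive directives found in an X-Robots-Tag value."""
    directives = {d.strip().lower() for d in header_value.split(",")}
    return directives & RESTRICTIVE

print(restrictive_directives("noindex, nofollow"))  # {'noindex', 'nofollow'} (order may vary)
print(restrictive_directives("noodp, noarchive"))   # {'noarchive'}
```

Run it over the headers of your key pages; a non-empty result on a page you want indexed is exactly the “something unexpected lurking” to hunt down.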

Featured Image credit: Arthur Chapman

About the Author, Richard Baxter

Richard Baxter (@richardbaxter) owns SEOgadget, a team of UK SEO consultants helping people and organisations succeed in search. Richard has accrued valuable experience throughout his career in travel, engineering, recruitment, technology startup, retail and events industry SEO.


Written By
This post was written by an author who is not a regular contributor to State of Digital. See all the other regular State of Digital authors here. Opinions expressed in the article are those of the contributor and not necessarily those of State of Digital.