Depending on who you ask, Xenu is either the dictator of the “Galactic Confederacy” who, 75 million years ago, brought billions of his people to Earth in a DC-8-like spacecraft, or a tool that finds broken links. The second option doesn’t sound as interesting, but to you and me it is far more valuable.
Earlier this week I was speaking to a couple of SEOs I know who work in-house, and I was surprised to hear that neither of them had heard of Xenu. I was even more surprised since they both work for large brands whose sites have over 50k pages indexed, and if anyone can benefit from Xenu it is large sites. Xenu was originally built as a broken link checker but, as I will illustrate, you can use it as a basis not only for technical and on-page analysis but also for client questions and discovery. A lot of our regular readers will most probably be familiar with Xenu, although there are hopefully some takeaways for everybody.
Load up Xenu and enter the website you want it to crawl. Before you click run, go into the more options section in the bottom left and make sure you tick ‘treat redirections as errors’, as this will identify the HTTP status codes for all redirects (important if you want to benefit from amending 302s to 301s).
Once you have saved this file, open up an Excel spreadsheet, click File => Open, find your text document and click Open. Follow the import instructions to transform your text document into an editable and manageable Excel spreadsheet.
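If you would rather script this step than click through Excel, here is a minimal Python sketch of the same idea. It assumes the saved text document is tab-separated with a header row; the column names used here (Address, Status-Code, Size, Title, Level, Links In) and the sample rows are my assumptions for illustration, so check them against your own export.

```python
import csv
import io


def load_xenu_export(handle):
    """Read a tab-separated export into a list of dicts, one per URL."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [dict(row) for row in reader]


# Tiny inline sample standing in for the real export (hypothetical data
# and hypothetical column names).
SAMPLE = (
    "Address\tStatus-Code\tSize\tTitle\tLevel\tLinks In\n"
    "http://example.com/\t200\t20480\tHome\t0\t120\n"
    "http://example.com/old\t302\t512\tOld page\t2\t3\n"
)

rows = load_xenu_export(io.StringIO(SAMPLE))
for row in rows:
    print(row["Address"], row["Status-Code"])
```

To read a real file, pass an open file object instead of the inline sample, for example open("export.txt", newline="") where export.txt is whatever you named your saved document.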
For anyone wanting to skip the main course and dive headfirst into the dessert, simply go to the status codes column. Its drop-down will give you a list of status codes, but for quick actionable issues you are looking for the following:
-2 This is an external link the tool has found
302 A temporary redirect
301 A permanent redirect
404 Page not found
-2 informs you of an external link on your site.
If you have just taken on a new client, this can be valuable in three main ways. Firstly, you get a nicely organised list of all the external links. Secondly, you gain an insight into the existing relationships the site has. Lastly, it allows you to follow up and add these as potential link prospects for the later stages of your campaign.
Filtering on 302 will give you an insight into the historical changes of the site and will highlight all 302 links. More often than not these redirects are not really temporary and are not passing on the link equity to the new pages.
Implementing a 301 redirect instead will pass the value to the new pages and, depending on how many there are, could give your domain/page a slight bump.
A 404 identifies pages that could not be found by the server.
Apart from fixing these links, one tip is to find the URL within the Xenu tool, right-click and use the Wayback Machine to see what content (if any) used to be published there. You might find it is seasonal content, in which case you should read the article on what to do with old seasonal content and then try implementing “living URLs” as explained by Michael Grey.
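The status code filtering above can also be sketched in Python. This is only an illustration: it assumes each row is a dict with Address and Status-Code keys (hypothetical column names, check your own export), and the sample rows are made up.

```python
from collections import defaultdict

# The four codes called out above as quick, actionable issues.
QUICK_WIN_CODES = {"-2", "302", "301", "404"}


def group_by_status(rows, status_column="Status-Code"):
    """Return {status code: [URLs]} for the quick-win codes only."""
    groups = defaultdict(list)
    for row in rows:
        code = (row.get(status_column) or "").strip()
        if code in QUICK_WIN_CODES:
            groups[code].append(row["Address"])
    return dict(groups)


# Hypothetical sample rows standing in for a real crawl.
rows = [
    {"Address": "http://example.com/", "Status-Code": "200"},
    {"Address": "http://example.com/sale", "Status-Code": "302"},
    {"Address": "http://example.com/gone", "Status-Code": "404"},
    {"Address": "http://partner.example.org/", "Status-Code": "-2"},
    {"Address": "http://example.com/moved", "Status-Code": "302"},
]

groups = group_by_status(rows)
for code, urls in groups.items():
    print(code, len(urls), urls)
```

Each list can then be handed off separately: the 302s to whoever manages redirects, the 404s for fixing, and the external links for prospecting.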
SEOs have been saying this for years now, but improving site speed is important, not just for SEO purposes but for the end user. As explained in this article, the faster a site is, the more pages a user is likely to view, so go away and make sure you action the tips on optimising site speed. I use Xenu and Firebug to test site speed in the early stages of an audit. Start by sorting the spreadsheet by the ‘size’ column, in descending order.
Xenu displays the file size in bytes, which can be a bit overwhelming. To get a more layman-friendly idea of the size, use a simple formula that divides by 1024 to convert the bytes to kilobytes and make things a bit easier to read. Dividing this new column by 1024 again will give you the size in megabytes, which most people are familiar with and which gives you a better idea of whether a file is too big.
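As a rough sketch of that conversion and the descending sort in Python: the code below divides by 1024 at both steps (my assumption; use 1000 if you prefer decimal units), and the Size and Address column names plus the sample rows are hypothetical.

```python
def bytes_to_kb(size_bytes):
    """Convert bytes to kilobytes."""
    return size_bytes / 1024


def bytes_to_mb(size_bytes):
    """Convert bytes to megabytes by dividing by 1024 twice."""
    return bytes_to_kb(size_bytes) / 1024


def largest_pages(rows, top_n=10, size_column="Size"):
    """Return (URL, size in bytes) pairs, biggest first, skipping blank sizes."""
    sized = []
    for row in rows:
        raw = (row.get(size_column) or "").strip()
        if raw.isdigit():
            sized.append((row["Address"], int(raw)))
    sized.sort(key=lambda pair: pair[1], reverse=True)
    return sized[:top_n]


# Hypothetical sample rows; the last one has no size recorded.
rows = [
    {"Address": "http://example.com/", "Size": "20480"},
    {"Address": "http://example.com/video", "Size": "5242880"},
    {"Address": "http://partner.example.org/", "Size": ""},
]

for address, size in largest_pages(rows):
    print(f"{address}: {bytes_to_kb(size):.1f} KB ({bytes_to_mb(size):.2f} MB)")
```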
Make a note of files that are too large and use Firebug to test how long it takes to load individual elements of a page. Think about compressing files or hosting video content on its own unique URL to reduce page load times of important pages.
Duplicate (duplicate) Content (content)
The recent Panda update means it is important to remove or amend duplicate content pages and filter out the poorly performing pages of your site. The process of reassessing “low quality” pages, especially for large sites, should be a part of every SEO’s strategy. Methods of removing low quality content have been discussed through this methodology and you can also use analytics to help.
Slightly off point but valuable nevertheless, use Xenu to check internal duplicate content, as these pages can be fixed with redirects, canonical tags or noindex tags. Some pages can also be improved, for example a new product page that automatically picked up the default title tag setting that is common on many CMSs. Manually checking your site would obviously take a lifetime, so Xenu is great for speeding this process up by 5 years or so. In your spreadsheet click on the dropdown of the title column and deselect all. Now you can go through and click on the titles one at a time and see whether each produces one URL or several. If several URLs use the same title tag then they most probably have the exact same content on the page too. Look at this content and see how it can be dealt with or made more unique.
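Clicking through titles one at a time gets tedious on a big site, so here is a minimal Python sketch that groups URLs by title and keeps only the titles shared by more than one URL. The Title and Address column names and the sample rows are assumptions for illustration, and a shared title is only a hint of duplicate content, not proof.

```python
from collections import defaultdict


def duplicate_titles(rows, title_column="Title"):
    """Return {title: [URLs]} for titles used by more than one URL."""
    by_title = defaultdict(list)
    for row in rows:
        # Normalise case and whitespace so trivial differences still match.
        title = (row.get(title_column) or "").strip().lower()
        if title:
            by_title[title].append(row["Address"])
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}


# Hypothetical sample rows standing in for a real crawl.
rows = [
    {"Address": "http://example.com/shoes", "Title": "Shoes | Example"},
    {"Address": "http://example.com/new-1", "Title": "New Product"},
    {"Address": "http://example.com/new-2", "Title": "new product "},
    {"Address": "http://example.com/image.png", "Title": ""},
]

dupes = duplicate_titles(rows)
for title, urls in dupes.items():
    print(title, urls)
```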
Analyse the “Money” Pages
What do we mean by money pages? Key pages which contain your core set of keyphrases and are more likely to result in conversions. These are the most important pages of a site so it is crucial they are not only reached within as few clicks as possible but also linked to effectively throughout the site. This is particularly important for larger sites which have a range of silos within their architecture.
The level column equates to how many clicks away from the homepage the content is. Sorting your data by level allows you to identify “money” pages which may be more hidden than ideal. Preferably all key pages should be located within 1-2 levels of the root. If you find you have key pages hidden 3, 4 or 5 levels down, make a note and come back to address the information architecture.
Always remember that information architecture shouldn’t remain static. It should change in line with customers’ needs, and tools such as analytics, Google trends & insights should be utilised to sculpt the architecture.
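The level check can be sketched in Python as follows. The list of money pages is something you supply yourself, and the Level and Address column names, the sample URLs and the default threshold of 2 levels are assumptions taken from the rule of thumb above.

```python
def deep_money_pages(rows, money_urls, max_level=2, level_column="Level"):
    """Return (URL, level) for money pages deeper than max_level, deepest first."""
    deep = []
    for row in rows:
        if row["Address"] not in money_urls:
            continue
        level = int(row[level_column])
        if level > max_level:
            deep.append((row["Address"], level))
    return sorted(deep, key=lambda pair: pair[1], reverse=True)


# Hypothetical crawl rows and a hand-picked set of money pages.
rows = [
    {"Address": "http://example.com/", "Level": "0"},
    {"Address": "http://example.com/red-shoes", "Level": "1"},
    {"Address": "http://example.com/blue-shoes", "Level": "4"},
    {"Address": "http://example.com/green-shoes", "Level": "3"},
    {"Address": "http://example.com/terms", "Level": "5"},
]
money_urls = {
    "http://example.com/red-shoes",
    "http://example.com/blue-shoes",
    "http://example.com/green-shoes",
}

hidden = deep_money_pages(rows, money_urls)
for address, level in hidden:
    print(level, address)
```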
One process that should be continuously monitored is the internal site search feature within your analytics software. If you are looking to insert additional internal linking or create new navigational areas this can be a goldmine. This process enables you to identify what users are actively searching for that they can’t find easily enough within your site. Out of this process could come cross linking opportunities, new categorisation, or new internal links such as ‘top selling products’ to drive visitors to pages they want quicker. This article by Avinash Kaushik on internal site search is old but still has some very relevant points.
Sorting your spreadsheet now by the ‘Links in’ column in descending order will highlight pages that have been either ignored or forgotten about, as they will sit at the bottom of the list.
Imagine a pyramid built of champagne glasses: if you pour champagne into the top glass, you expect it to flow down slowly, filling each layer of glasses a little less than the layer above. This is how your data should look, and if it doesn’t then something is wrong and it should be addressed.
Since the Panda update, the importance of individual page link metrics has increased, and pages located deep within the site without any links are treated negatively, or at least less favourably than they used to be when pages would rank purely from being hosted on domains with strong metrics.
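One rough way to test the champagne pyramid idea is to average the inbound links at each level and to list the pages with very few links in. The Python sketch below does both; the Links In, Level and Address column names, the threshold of 3 links and the sample rows are all assumptions for illustration.

```python
from collections import defaultdict


def average_links_in_by_level(rows, level_column="Level", links_column="Links In"):
    """Return {level: average inbound links}, ordered from the root downwards."""
    totals = defaultdict(lambda: [0, 0])  # level -> [sum of links, page count]
    for row in rows:
        level = int(row[level_column])
        totals[level][0] += int(row[links_column])
        totals[level][1] += 1
    return {level: total / count for level, (total, count) in sorted(totals.items())}


def under_linked_pages(rows, min_links_in=3, links_column="Links In"):
    """Return (URL, inbound links) for pages below the threshold, fewest first."""
    flagged = [
        (row["Address"], int(row[links_column]))
        for row in rows
        if int(row[links_column]) < min_links_in
    ]
    return sorted(flagged, key=lambda pair: pair[1])


# Hypothetical sample rows standing in for a real crawl.
rows = [
    {"Address": "http://example.com/", "Level": "0", "Links In": "120"},
    {"Address": "http://example.com/shoes", "Level": "1", "Links In": "40"},
    {"Address": "http://example.com/hats", "Level": "1", "Links In": "20"},
    {"Address": "http://example.com/shoes/red", "Level": "2", "Links In": "1"},
]

averages = average_links_in_by_level(rows)
neglected = under_linked_pages(rows)
print(averages)
print(neglected)
```

If the averages do not fall as the level gets deeper, or a key page shows up in the under-linked list, that is the cue to look at the internal linking.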
One way of increasing the amount of internal links to a page would be using a variety of tactical linking methods such as:
- Standard navigation
- Ancillary navigation (footers)
- Breadcrumb trails
- Cross linking from similar product pages
- Listings of previous on-site search results
- Implementing widgets to pull through most popular products
- Using blog content to deep link
These are very simple yet very effective methods. They are utilised mainly by B2C organisations as they are very customer centric, although if you are clever and creative enough you can find many ways of cross linking.
There is a lot more you can do with Xenu so feel free to tweet me but in the meantime that should be enough to set you on your way.