In recent months there have been a few occasions where I’ve had to emphasise the importance of clear, well-structured URLs for websites. As any SEO knows, having the right keywords in the actual URL of a webpage will make that page more relevant, and will help the page rank better in search results. Yet not everyone is entirely aware of this.
I believe the value of optimised URLs goes beyond simple relevance, though. To me, a properly structured page URL carries a range of benefits that extend beyond SEO. URLs are also part of the user experience, aid site maintenance and content management, and benefit a company’s offline marketing in very real ways.
When we think of a website’s structure, we often think of a tree-like relationship between pages, as per this popular graphic from Moz:
However, it’s not always evident how this should relate to each page’s URL. For me, a good URL structure is one that conveys clear meaning and intent, describes the page’s place within the overall site structure, and offers clear navigational options.
That means a page’s URL needs to be hierarchical; there needs to be a clear parent-child relationship in the URL, identifying the page’s relationship with content that sits ‘above’ it in the site’s reversed tree, and opening logical options for URLs of deeper pages.
Let’s explain this by means of an example, one that I like to use in my lectures and workshops: a website that sells safety boots for construction and warehouse workers.
As I’m a fan of Caterpillar boots, let’s take one of their popular safety boots as an example product, and clarify what would make for a great hierarchical URL for this product page.
Main Category URLs
First of all we’ll need to rely on our keyword research to see what types of keywords people use to search for this sort of safety boot. A quick search using Google’s Keyword Planner reveals that ‘safety boots’ is popular, but not as popular as ‘work boots’. However, ‘work boots’ has a strong seasonal element, whereas ‘safety boots’ doesn’t, which leads me to suspect they might not necessarily mean the same thing. And indeed, ‘work boots’ are primarily intended for outdoor use, whereas ‘safety boots’ is more generic and can refer to both indoor and outdoor work boots.
There’s also a strong geographic aspect to the popularity of these keywords, with ‘work boots’ the preferred keyword in the USA.
Which keyword do we want to use? Let’s keep it simple and stick to ‘safety boots’ for now, avoiding any issues around cultural differences and semantic variances.
So our top-level category will be ‘Safety Boots’. This gives us a pretty straightforward category page URL:
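As a rough illustration of how a category name could be turned into such a URL, here is a minimal Python sketch. The domain www.example.com and the helper names slugify and category_url are assumptions made for this example, not details of any real site.

```python
import re

BASE_URL = "https://www.example.com"  # hypothetical domain, for illustration only


def slugify(name: str) -> str:
    """Lowercase a name and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", name.lower()))


def category_url(category: str) -> str:
    """Build the URL of a top-level category page."""
    return f"{BASE_URL}/{slugify(category)}/"


print(category_url("Safety Boots"))
# https://www.example.com/safety-boots/
```

Keeping the slug lowercase and hyphen-separated means the keyword from the research stays readable in the URL itself.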
Next we’ll want to have a think about what kind of subcategories we want to identify. Again, we’ll need to rely on keyword research to ensure what we choose as logical subcategories aligns with people’s search behaviour.
Our keyword research shows that there’s a strong brand element to searches for safety boots, with users often looking for specific manufacturers like Dr Martens and Caterpillar. The popularity of brand keywords is significantly higher than that of searches for specific attributes, such as ‘steel toes’ or ‘non-slip soles’.
So our first subcategories will be boot brands:
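A small sketch of how brand subcategories might hang off the main category in the URL. The site tree, the domain and the function names are all hypothetical; only the two brand names come from the keyword research described above.

```python
import re

BASE_URL = "https://www.example.com"  # hypothetical domain

# Hypothetical slice of the site tree: category -> brand subcategories.
SITE_TREE = {"Safety Boots": ["Caterpillar", "Dr Martens"]}


def slugify(name):
    """Lowercase a name and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", name.lower()))


def category_urls(tree):
    """Return each category URL followed by the URLs of its subcategories."""
    urls = []
    for category, brands in tree.items():
        parent = f"{BASE_URL}/{slugify(category)}/"
        urls.append(parent)
        for brand in brands:
            # The child URL extends the parent URL, mirroring the hierarchy.
            urls.append(f"{parent}{slugify(brand)}/")
    return urls


for url in category_urls(SITE_TREE):
    print(url)
# https://www.example.com/safety-boots/
# https://www.example.com/safety-boots/caterpillar/
# https://www.example.com/safety-boots/dr-martens/
```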
Now with two levels of hierarchy, we have created a site structure where theoretically any given product could be accessed within three clicks from the homepage. For me that’s an ideal scenario, but it does leave us with a dilemma: do we add further subcategorisation to enable category pages for feature-specific attributes, or do we rely on a different type of product filtering to enable users to find what they’re looking for?
There’s no one-size-fits-all solution; it really depends on your specific situation, requirements and constraints. Basically you have two choices: add one more level of categorisation in a clear URL hierarchy, or rely on URL parameters to filter product lists down more narrowly:
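To make the two choices concrete, here is a hedged sketch that builds both kinds of URL for the same filter. The subcategory URL, the attribute slug and the parameter names (toe, waterproof) are assumptions chosen for illustration.

```python
from urllib.parse import urlencode

# Hypothetical brand subcategory page.
SUBCATEGORY = "https://www.example.com/safety-boots/caterpillar/"


def hierarchical_filter_url(attribute_slug):
    """Option 1: the attribute becomes a third level in the URL path."""
    return f"{SUBCATEGORY}{attribute_slug}/"


def parameter_filter_url(**filters):
    """Option 2: attributes become query parameters on the subcategory URL."""
    # Sorting keeps the parameter order stable, so one filter set maps to one URL.
    return f"{SUBCATEGORY}?{urlencode(sorted(filters.items()))}"


print(hierarchical_filter_url("steel-toe"))
# https://www.example.com/safety-boots/caterpillar/steel-toe/
print(parameter_filter_url(waterproof="yes", toe="steel"))
# https://www.example.com/safety-boots/caterpillar/?toe=steel&waterproof=yes
```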
Either one works, though the former option – sub-subcategorisation in hierarchical URLs – throws up a secondary issue: that of product URLs containing categorisation elements. More on that below.
Again, in an ideal scenario, your product pages are the children of your categories and subcategories, so should follow that hierarchical structure:
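One way to check that a product URL really is hierarchical is to see whether its parent pages can be read straight off the path. The sketch below assumes a hypothetical product slug (example-boot) and domain; the helper name ancestor_urls is likewise made up for this example.

```python
from urllib.parse import urlsplit

# Hypothetical product page under category and brand subcategory.
PRODUCT_URL = "https://www.example.com/safety-boots/caterpillar/example-boot/"


def ancestor_urls(url):
    """Return the URLs of every page 'above' this one, homepage first."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    root = f"{parts.scheme}://{parts.netloc}/"
    ancestors = [root]
    # Each shorter prefix of the path is the URL of a parent page.
    for depth in range(1, len(segments)):
        ancestors.append(root + "/".join(segments[:depth]) + "/")
    return ancestors


for url in ancestor_urls(PRODUCT_URL):
    print(url)
# https://www.example.com/
# https://www.example.com/safety-boots/
# https://www.example.com/safety-boots/caterpillar/
```

If every URL this returns is a real, meaningful page, users can navigate upwards simply by trimming the URL.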
As far as URLs go, this one is about as perfect as it can get for the purpose of SEO as well as for usability. Just by looking at that URL, you already know what kind of page you’ll be visiting. There’s no ambiguity; the URL is completely self-evident and descriptive.
In terms of SEO, the ranking benefit of such a URL should be obvious. It contains two of the most relevant keywords that people search for, and has a clear categorisation hierarchy that allows Google to understand what it is looking at. Even before the page is crawled, Google has several relevance values to associate with the page’s content.
As per above, we could try to make the page URL even more search engine friendly by adding a third layer of categorisation, based on specific product attributes that people search for, such as ‘steel toes’ and ‘waterproof’.
However, when we add a third level of categorisation based on product attributes, we end up with a single product that can easily belong to multiple categories. We thus risk creating duplicate product versions:
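The duplication problem can be sketched in a few lines. The product slug and attribute slugs here are hypothetical; the point is only that one product yields one URL per attribute category it belongs to.

```python
# Hypothetical brand subcategory page.
BRAND_URL = "https://www.example.com/safety-boots/caterpillar/"


def product_urls(product_slug, attribute_slugs):
    """Return one product URL per attribute category the product sits in."""
    return [f"{BRAND_URL}{attr}/{product_slug}/" for attr in attribute_slugs]


# A single boot that is both steel-toed and waterproof gets two addresses.
urls = product_urls("example-boot", ["steel-toe", "waterproof"])
for url in urls:
    print(url)
# https://www.example.com/safety-boots/caterpillar/steel-toe/example-boot/
# https://www.example.com/safety-boots/caterpillar/waterproof/example-boot/
```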
We would then be forced to either canonicalise on a single product URL (which we really don’t want to do, as it kills the URL’s relevance for the non-canonical attribute) or revert to root URLs for products instead:
This is not ideal, as we also lose a lot of the hierarchical URL’s keyword value.
URL Parameters Are Okay
If we instead use parameters in subcategory URLs to filter product lists on attributes, we can still use the full SEO-friendly hierarchical product URL containing the main category and subcategory. The subcategory’s URL with parameters would show a filtered list, and we can ensure Google can crawl and index this URL, thus still conveying some measure of keyword value:
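A minimal sketch of how a parameterised subcategory URL could drive a filtered list while every product keeps its single hierarchical URL. The catalogue, slugs and parameter names are invented for illustration.

```python
from urllib.parse import parse_qsl, urlsplit

BRAND_PATH = "/safety-boots/caterpillar/"  # hypothetical subcategory path

# Hypothetical catalogue with two filterable attributes.
PRODUCTS = [
    {"slug": "example-boot-a", "toe": "steel", "waterproof": "yes"},
    {"slug": "example-boot-b", "toe": "composite", "waterproof": "yes"},
]


def filtered_product_paths(list_url):
    """Return the product paths that match the filters in the list URL."""
    filters = dict(parse_qsl(urlsplit(list_url).query))
    matches = [
        p for p in PRODUCTS
        if all(p.get(name) == value for name, value in filters.items())
    ]
    # The product path never contains the filter, so there is no duplication.
    return [f"{BRAND_PATH}{p['slug']}/" for p in matches]


print(filtered_product_paths("https://www.example.com/safety-boots/caterpillar/?toe=steel"))
# ['/safety-boots/caterpillar/example-boot-a/']
```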
I also have a preference for using URL parameters for attribute-based filtering, as often we don’t necessarily want Google to index these filtered product lists. Many parameters, like pricing and size, are useful for website visitors to narrow product lists, but have very limited SEO value and can in fact cause crawl optimisation issues on your website.
By using parameters you enable easier crawl control – you can simply add those parameters that have limited SEO value (and which you don’t want crawled) to your robots.txt file, while you keep those you do want crawled unaffected.
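As a sketch of what those robots.txt rules might look like, the snippet below generates Disallow lines for a list of parameters. The parameter names price and size are assumptions based on the pricing and size examples above; your own parameter names will differ. The rules rely on the * wildcard, which Google’s crawler supports but which not every crawler is guaranteed to honour.

```python
# Hypothetical parameter names with limited SEO value.
BLOCKED_PARAMETERS = ["price", "size"]


def robots_txt_rules(parameters):
    """Build robots.txt rules that block URLs carrying the given parameters."""
    lines = ["User-agent: *"]
    for name in parameters:
        # Cover the parameter both as the first (?name=) and a later (&name=) one.
        lines.append(f"Disallow: /*?{name}=")
        lines.append(f"Disallow: /*&{name}=")
    return "\n".join(lines)


print(robots_txt_rules(BLOCKED_PARAMETERS))
# User-agent: *
# Disallow: /*?price=
# Disallow: /*&price=
# Disallow: /*?size=
# Disallow: /*&size=
```

Parameters that are left out of the list stay crawlable, which is exactly the selective control described above.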
In such a case, even when your website allows users to select multiple attributes to filter products by (i.e. faceted navigation), you still have full control over which pages Google can crawl and index, thus reducing crawl waste and preventing index issues.
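When users can combine several filters, it helps to be able to reason about which combinations remain crawlable. The sketch below models that intent in Python; it is not how a search engine evaluates robots.txt, and the blocked parameter names are again assumptions.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical parameters we do not want crawled.
BLOCKED_PARAMETERS = {"price", "size"}


def is_crawlable(url):
    """Return True if the URL carries none of the blocked parameters."""
    query = urlsplit(url).query
    names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
    return names.isdisjoint(BLOCKED_PARAMETERS)


base = "https://www.example.com/safety-boots/caterpillar/"
print(is_crawlable(base + "?toe=steel"))           # True
print(is_crawlable(base + "?toe=steel&size=10"))   # False
```

A check like this could be run over a crawl export to estimate how many faceted URLs your rules would actually exclude.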
Site Structure and Information Architecture
All of the above makes one thing really clear: you need to think about your website’s content long before you start coding a single line of HTML. It’s absolutely crucial to understand how your website’s content needs to be structured and made part of a coherent, meaningful architecture that has a place for all your existing content, but also allows for natural growth and expansion of your website.
Failing to plan your site structure is likely to lead to all kinds of issues – not just for SEO – as there’s a very real risk your website will not be set up properly to weather all the challenges that it may face.
This is why I’m such a big proponent of applying information architecture best practices to website design. Unfortunately this is often an entirely overlooked or, at best, hastily skimmed aspect of a website’s design brief. And yet it has such a crucial role to play in a website’s success.
One of the best works on information architecture as it relates to websites is O’Reilly’s so-called Polar Bear book: Information Architecture for the World Wide Web. The updated 4th edition has just been released, with an expanded remit to include all forms of digital design:
I can’t recommend this book strongly enough. If you care about websites and how they’re used and made successful, this is truly a must-read book.
On average I find myself mentioning this book several times a year in workshops and conference talks, and it still surprises me that so few industry professionals have read any of the book’s four editions, or any text on information architecture in general. To me, this is crucial knowledge to have, as it encourages us to think about website structures in ways that allow us to maximise utility and enable scalable growth.
Often we as SEOs do not have the luxury of getting involved in website projects early on and giving our input on how the site is designed. Unfortunately, that means we often have to work with less than ideal structures, and find ourselves applying fixes to issues with content design and hierarchy that could so easily have been prevented.
If and when we do have the opportunity to get stuck in during the early phases of a new website project, our most important input revolves around the design of the site’s information architecture, and – by extension – the hierarchical URLs that should emerge from that architecture. Our job as SEOs is made so much easier if a website is created with IA best practices in mind, and we should not allow web designers and developers to get away with shoddy implementations.
Clear, human-readable URLs that conform to a well thought-out website architecture are not optional features – they’re an essential ingredient for all successful websites.