Killing your darlings – a practical guide for cutting the dead wood

We write a lot at Click Consult, and have done since I started at the agency some three years back. While we try never to publish an article just for publishing’s sake, over time there’s an accrual of what Philip K. Dick might have termed ‘kipple’. There are blogs which covered big news events, blogs which cover aspects of the industry which are no longer relevant, blogs which stopped getting traffic 18 months back. It turns out that blogs, like all things, adhere to the second law of thermodynamics – they tend toward chaos.

I’ve seen a few articles recommending the removal of various pages, but these tend to link the endeavour, erroneously, to the conservation of crawl budget. This isn’t one of those and, unless your site runs into the millions of pages, you should probably ignore the ones that do. There is, however, as I learned after getting two-thirds of the way through this one, an Ahrefs blog that deals with the subject rather well, though with different approaches in places. I’ve tried to use a variety of tools and techniques, which should hopefully make this endeavour repeatable by someone without access to one or two of them.

This is simply about getting rid of the chaff – the stuff that just isn’t required any more, pages and articles which might result in keyword cannibalisation, or which may provide information that, while correct at the time of writing, no longer makes sense to the user. It’s predominantly an endeavour akin to a spring clean, and one which should be carried out if only for the benefit to user experience (UX).

I’ll be making reference to bits and pieces of my previous article on report automation – for the sake of not torturing readers of that previous article with more description of various formulae. If you haven’t read it – it’s over here. If you have – here are some relatively simple steps for killing your darlings.

Step one – The Setup

The setup is a combination of data entry and data retrieval. I used five tools for this review – Google Analytics, Google Sheets, the Google Analytics add-on for Sheets, Moz’s Link Explorer and Screaming Frog. It’s possible to carry out the review using just the free Google tools, but you want your review to be as robust as possible, and the more data sources you have, the better the judgements you can make.

In fact, were I to have infinite time, I’d have added Deep Crawl, Ahrefs and SEMrush data into the initial review, but sadly I am but one man and had to complete the endeavour before websites are rendered useless by the inevitable collapse of western civilisation (this is also the reason I’ve started the process on one sub-folder rather than on the site as a whole).

What data to use

To give myself the best chance of making informed decisions, I ran the blog through Screaming Frog, exported the results and essentially copied and pasted the values I needed into a Google Sheet. The columns I kept are shown below.

[Image: blog cull spreadsheet, part 1]

This gave me a list of all the blog titles, their URLs, the word count and status code. The second word count field is an adjusted figure that excludes the approximate word count of the template (the sidebars, footer, menu bar etc.).

From there, I used the Google Analytics add-on for Sheets to pull data for both a ‘12 month’ and an ‘all time’ date range, then used the ‘Blog URL’ from the Screaming Frog cells and some extensive CONCATENATE to create lookup formulae that pull the report data into the summary sheet (again – how to do this can be found in my previous blog, with some more information in another blog here). To calculate a better indication of the word count for each blog, I used a simple =SUM, subtracting the total word count of a template page (the menu, sidebar info, footer copy etc.) from the Screaming Frog number, but the rest is raw data from the tool.
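For anyone who prefers a script to lookup formulae, the join described above could be sketched in JavaScript roughly as follows. This is not the Sheets set-up itself: the field names (`url`, `wordCount`, `sessions`), the template word count of 350 and the sample rows are all assumptions made for illustration.

```javascript
// Hypothetical template word count (menu, sidebar, footer); measure your own.
const TEMPLATE_WORD_COUNT = 350;

// Join crawl rows to analytics rows by URL and adjust the word count.
function buildSummary(crawlRows, analyticsRows, templateWords = TEMPLATE_WORD_COUNT) {
  // Index sessions by URL so each lookup is a single Map read.
  const sessionsByUrl = new Map(analyticsRows.map((row) => [row.url, row.sessions]));
  return crawlRows.map((page) => ({
    url: page.url,
    title: page.title,
    statusCode: page.statusCode,
    rawWordCount: page.wordCount,
    // Never report a negative count for very short pages.
    bodyWordCount: Math.max(0, page.wordCount - templateWords),
    // Pages missing from the analytics export are treated as having no sessions.
    sessions12Months: sessionsByUrl.get(page.url) ?? 0,
  }));
}

// Illustrative rows only, not real data.
const crawl = [
  { url: '/blog/post-a/', title: 'Post A', statusCode: 200, wordCount: 1250 },
  { url: '/blog/post-b/', title: 'Post B', statusCode: 200, wordCount: 300 },
];
const analytics = [{ url: '/blog/post-a/', sessions: 420 }];
const summary = buildSummary(crawl, analytics);
```

Matching on the exact URL string is the fragile part of this approach, just as it is with lookup formulae: trailing slashes and protocol prefixes need to be consistent between the two exports.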

[Image: blog cull spreadsheet, part 2]

Sadly, because the Moz Link Explorer API is so far above my skill level as to be invisible to the naked eye, I then had to slog through each row, manually searching for and inputting the Page Authority (PA), link count and average link Domain Authority (DA).

[Image: blog cull spreadsheet, part 3]

With this data gathered, I added filters to the second row, and began the next step.

Step two – review, redirect and delete

The easiest place to begin is with the removals that require no real value judgement – the common sense ones. These tend to fall into one of four categories (though, doubtless, there are more – these four are as good a place to start as any), all of which will have, more often than not, low or minimal traffic during the preceding 12 months:

  • No longer needed – these are things like staff profiles for former employees.
  • Out of date – these can be things like ‘industry reviews’, ‘best practice’ and other types of content with an expiration date.
  • Thin or sub-par content – these tend to be company announcements and similarly lightweight content types, or attempts to deal with a complex topic that have not gone well.
  • Overlapping content – this is content where keyword research may have been done for each piece, but nobody took into account that it was targeting the same keywords as other pieces.

I’m going to break these down and give some ideas as to how to deal with each type, but while I can offer general advice, every brand will have a different content strategy, so will need to assess their content honestly for signs of each type.

No longer needed

Steve was great – he was the life and soul of the party, great at his job, such a good sport. Sadly, Steve left the business ten years ago and nobody remembers him – except Google, who still features his smiling face (what a face) in the image results for your brand. The reason is – someone profiled Steve for the newsletter and his witty ramblings are sat on page 50 of your blog, like Dorian Gray – never ageing, never showing a sign of the stress and hair loss that Steve has no doubt experienced since.

There is a ton of this type of blog all over the internet – doing little but taking up terabytes of space on some server farm. While it probably isn’t doing much harm to your SEO, it’s undoubtedly useless and, from a user perspective, actively misleading. So what do you do?

Is there a page that serves the same end?

In the case of staff profiles, this can be a full staff page; or in the case of a service or technology no longer offered or used – what replaced it? By redirecting content to these pages, you can not only clear unnecessary pages (making it easier for users to find the right one), but also ensure you’re providing the right information in the unlikely event someone goes looking for Steve in the future and, if the page has inbound links – some of the authority they pass can be directed toward the relevant page.

If there is no similar page…

Does one need to be created? The chances are that the post was created for a legitimate purpose and, unless your brand has changed considerably in the interim, that purpose probably remains. If so, create a page that deals with the content in a current and factual manner that would be useful to users. If there was no real reason for the post, then it can be: (a) deleted, if it receives no traffic (you’ll be able to see this from the data gathered in the setup stage), or (b) redirected to the homepage with a pop up that explains why the post no longer exists. A 404 will probably send better signals from an SEO perspective, though, so a custom 404 page would be better in most cases. There are dozens of ways to do the pop up, but the JavaScript below should work if you create a branded pop up HTML file for it to open.

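function checkRef() {
  // document.referrer holds the previous page; after a server-side 301 that is normally the page that linked to the old URL, not the old URL itself.
  if (document.referrer.indexOf('your-site.com/the-page-youwant-to-redirect-from/') > -1) {
    // Browsers may block pop ups that are not opened by a user action.
    window.open('your-pop-up-code.html', 'your-site.com', 'toolbar=0,location=0,directories=0,status=0,menubar=0,scrollbars=1,resizable=1,width=450,height=450,left=30,top=80');
  }
}
window.addEventListener('load', checkRef);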

Out of date

These are also incredibly common – especially in the digital sector, where best practice changes from month to month. The main issue to consider with posts that are out of date is that they were almost certainly keyword targeted and are likely (at least more likely) to have garnered links as a result, so while the fix is fairly simple and obvious, there are additional considerations.

What to do…

The answer, in most cases, is to update the piece or, where this is more trouble than it’s worth (if the content requires completely rewriting), to replace it with a new page and 301 redirect the old URL to it.

Take the below example – it’s for a blog that, to all intents and purposes, advertises a downloadable resource (bad for EAT optimisation). Obviously, the set up will give you a lot of information – you’ll see the amount of traffic, the PA etc., but you don’t want to lose out on the keywords and visibility that such a blog delivers.

The easiest way to find out what needs to be present in the replacement page is to use Google Search Console (if you have Searchmetrics or Ahrefs, you can enter the URL and find the ranking keywords with ease). In the Performance tab, set a 12 month date range, add your URL as a page filter and then filter position by ‘smaller than ten’ to see which queries rank on the first page and are, therefore, generating traffic.
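If you export the query data rather than reading it in the interface, the same filter can be applied in a few lines. The sketch below assumes each exported row has been parsed into an object with `query`, `clicks` and `position` fields; those names, the `pageOneQueries` helper and the sample rows are illustrative assumptions rather than a fixed export format.

```javascript
// Return the queries ranking above the position threshold, busiest first.
// The default mirrors the 'smaller than ten' filter described above.
function pageOneQueries(rows, maxPosition = 10) {
  return rows
    .filter((row) => row.position < maxPosition)
    .sort((a, b) => b.clicks - a.clicks)
    .map((row) => row.query);
}

// Illustrative rows only, not real data.
const exportedRows = [
  { query: 'example query one', clicks: 40, position: 3.2 },
  { query: 'example query two', clicks: 5, position: 14.8 },
  { query: 'example query three', clicks: 75, position: 8.9 },
];

// The replacement page should serve every query in this list.
const mustCover = pageOneQueries(exportedRows);
```

Sorting by clicks simply puts the queries you can least afford to lose at the top of the checklist.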

[Images: Search Console Performance tab chart; URL filter dialogue pop up; results filtered for page one keywords]

Once you’ve established the traffic generating keywords, it’s important that your new page serves these keywords at least as well as the existing one, or you’ll risk losing the traffic the page generates.

Thin or sub-par content

Using the spreadsheet set up in step one, you can filter content to below a certain threshold. I chose 400 words – while this is above the minimum recommended (250-300 word) limit, I tend to feel that anything below 400 words is probably not painting a full picture, or is not thorough enough. Obviously, for things like product descriptions, this limit is probably too high, but, as the audit is at this point only dealing with blog content (content aiming to answer questions, offer opinions or discuss best practices and techniques) the limit seems fair.

For content such as the resource advert mentioned earlier, the answer was reasonably simple: a 301 redirect to an improved resource landing page, with anything worth keeping from the article transferred to that landing page before redirection.

However, because – for a long time – brands were advised that having content was good, and that as long as it was over 250-300 words, any content would do, there is a lot of half formed content online. By this I mean that brands were staking claims to keywords for the sake of it – writing short, generic, often very similar content just to have content that targeted a specific keyword.

As time and staff moved on, these same keywords were often targeted again and again by new staff, and these small, linguistic homunculi stacked up over time.

What to do…

The easiest thing to do is to make lists of these half formed things, group them by topics or themes and once you’ve completed the organisational part of the project you can then look to assemble a real article – one that addresses the search term and provides value for the user – from the component parts. Not only is this likely to fare better with search engines, it becomes a piece of content you can be proud of and your users will be pleased to read, rather than a collection of a dozen fairly pointless pages serving no real purpose.

Overlapping content

While the content above obviously falls into this category too, I’m treating it as separate because even for brands actively pursuing best practice – i.e. writing high quality articles on important aspects of their industry – there are times when you will, through no fault of your own, fall victim to keyword cannibalisation.

One of the many ranking factors that had been ignored (at least until the recent EAT and quality updates to the Google algorithm) is authority. Google wants to return the best possible answer to every query, so what are you telling the search engine by offering multiple pages for the same query? While some crossover is to be expected, every effort should be made to limit the number of pieces of long-form, targeted content that pursue the same keyword.

This is the area where you’re likely to experience the most pain when deleting and redirecting – the chances are, you’ll have put significant effort in to writing each, and the piece ranking the best may not necessarily be your favourite of the bunch. However, there is nothing wrong with really long content – provided it’s easy to navigate (for a simple way to do this, you can read the – pretty short – piece I wrote on using id tags for subsection linking). So, while you may lose pages dear to you, you can preserve the best sections by building out the best ranking page (or the page with the most links/highest PA) with the content from your other articles.

What to do…

For the sake of simplicity, I searched the setup sheet for focus key terms and partial key terms (which should appear in the URLs of any onsite content) and filtered by word count and session count. Anything with fewer than 100 sessions in the past year and fewer than 500 words that hadn’t already been dealt with was checked against industry terms, and a list of URLs was collected for each term that could potentially be combined into an improved piece of content. For example, we had a number of ‘top tips’ blogs on elements of a single subject – these are in the process of being combined and updated into a more substantial and useful piece.
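The filter-and-group step above could be sketched as follows. The thresholds of 100 sessions and 500 words come from the text; the `groupMergeCandidates` helper, the field names and the sample pages are assumptions for illustration, and matching a term against the URL slug is only as reliable as your URL naming.

```javascript
// Group low-traffic, short posts by the key term found in their URL.
function groupMergeCandidates(pages, terms, limits = { maxSessions: 100, maxWords: 500 }) {
  const groups = {};
  for (const term of terms) {
    // Key terms usually appear hyphenated in a URL slug.
    const slug = term.toLowerCase().trim().replace(/\s+/g, '-');
    const matches = pages.filter((page) =>
      page.sessions12Months < limits.maxSessions &&
      page.bodyWordCount < limits.maxWords &&
      page.url.toLowerCase().includes(slug));
    // A single page has nothing to be combined with.
    if (matches.length > 1) groups[term] = matches.map((page) => page.url);
  }
  return groups;
}

// Illustrative pages only, not real data.
const auditedPages = [
  { url: '/blog/link-building-tips-one/', sessions12Months: 40, bodyWordCount: 320 },
  { url: '/blog/link-building-tips-two/', sessions12Months: 12, bodyWordCount: 280 },
  { url: '/blog/link-building-guide/', sessions12Months: 900, bodyWordCount: 2100 },
  { url: '/blog/ppc-basics/', sessions12Months: 30, bodyWordCount: 410 },
];
const mergeGroups = groupMergeCandidates(auditedPages, ['link building', 'ppc']);
```

The output is only a shortlist: each group still needs a human read-through to decide which page becomes the combined piece and which pages are redirected to it.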

Deletion

Once the redirects are in place, the only thing left to do is to delete the old content so it doesn’t show in the blog regardless of your good work. Personally, and because I always worry about deleting content, I shifted the pages to drafts – I can delete them later if I need to, but I can’t use them later if I’ve deleted them.

Step three – keep track and repeat

Firstly, if – for whatever reason – you’ve decided to 404 any of your content rather than redirecting or using the pop up method, it’s important to use your tool of choice to find and remove any internal links to the deleted pages. So long as you’ve followed the rest of the advice, however, the pages you’ve redirected to should perform the same function as the pages redirected to them, so existing internal links should be fine (though it can’t hurt to check if you have time).
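As a rough sketch of that check, the function below takes a list of internal links and a list of removed URLs and reports which pages still link to each removed URL. The `source` and `destination` field names, the `linksToRemovedPages` helper and the sample links are assumptions; most crawlers can export something similar, but the column names will differ.

```javascript
// Map each removed URL to the pages that still link to it.
function linksToRemovedPages(internalLinks, removedUrls) {
  const removed = new Set(removedUrls);
  const report = new Map();
  for (const link of internalLinks) {
    if (!removed.has(link.destination)) continue;
    if (!report.has(link.destination)) report.set(link.destination, []);
    report.get(link.destination).push(link.source);
  }
  return report;
}

// Illustrative links only, not real data.
const internalLinks = [
  { source: '/blog/post-a/', destination: '/blog/meet-steve/' },
  { source: '/blog/post-b/', destination: '/blog/post-a/' },
  { source: '/about/', destination: '/blog/meet-steve/' },
];
const toFix = linksToRemovedPages(internalLinks, ['/blog/meet-steve/']);
```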

Once this is either done or not done, you should repeat your initial crawl to make sure you’ve not missed any redirects, metas etc. from the new posts, and make notes of the blogs you’ve created (especially those that have been updated, as these will doubtless need to be updated again). This will prevent lengthy redirect chains in future: if the replacement URL is later replaced itself, the old redirects should be edited to point at the new URL rather than redirecting on and on, as chains can cause issues in terms of both UX and ranking.
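If you keep your redirects in a simple old-URL-to-new-URL list, collapsing chains can be automated. The sketch below assumes the redirects are held as a plain object keyed by the old URL; the `flattenRedirects` helper and the example URLs are hypothetical.

```javascript
// Rewrite every redirect so it points straight at its final destination.
function flattenRedirects(redirects) {
  const flattened = {};
  for (const source of Object.keys(redirects)) {
    const seen = new Set([source]);
    let target = redirects[source];
    // Follow the chain until we reach a URL that is not itself redirected.
    while (Object.prototype.hasOwnProperty.call(redirects, target)) {
      if (seen.has(target)) {
        throw new Error(`Redirect loop detected at ${target}`);
      }
      seen.add(target);
      target = redirects[target];
    }
    flattened[source] = target;
  }
  return flattened;
}

// Illustrative redirect list: the 2017 guide was itself replaced later.
const redirectMap = {
  '/blog/old-guide/': '/blog/guide-2017/',
  '/blog/guide-2017/': '/blog/guide-2019/',
};
const flatRedirects = flattenRedirects(redirectMap);
```

Here both old URLs end up pointing directly at `/blog/guide-2019/`, and the loop check stops a mistaken pair of redirects that point at each other from being written back out.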

How often should this process be done?

If your brand is like most, then this may be the first time this process has been completed and, as such, it will have been fairly time consuming. The time it takes increases in proportion to the time since it was last carried out, so I’d recommend leaving no more than twelve months (and preferably less) between reviews.

[Image: session totals, January 2019]

Results so far

Microsoft Word tells me I began writing this shortly after lunch on August the 13th 2018 (it’s been one of those things that I’ve had to tinker with over time), so I can’t provide the immediate traffic increase charts that others carrying out a similar process have achieved.

However, the majority of the work (barring some compilation pieces undertaken in August and September) took place throughout January 2019. January is historically one of our best performing months and this process can’t take all the credit (Click Consult is a search and digital marketing agency, so we’re forever optimising), but there has been a 65% increase in sessions YoY over January 2018, which had previously been our best January, and August through December all outperformed the previous year for organic sessions.

Anything to add? Let me know in the comments.

John Warner

About John Warner

John is an internal marketer at Click Consult where he spends his time accruing industry certifications, tinkering with code, plotting strategy and writing articles on all aspects of search marketing. He also contributes to free quarterly search magazine 'Go Viral' and is the occasional host of the on.click podcast (available on iTunes and Stitcher).