Perfectly Imperfect Data: How to make data-informed decisions with rubbish data

14th June 2018

No marketing strategy should be based on gut feel alone. Marketing decisions need a strong foundation of data beneath them. Testing, measuring and iterating is a fantastic way to make smart business decisions and helps you to invest wisely in the channels which perform for you. But what do you measure? There are thousands of options available for collecting data, and sources will often disagree when you’re trying to determine the value of each moving part in your strategy. Just take a look at how many different attribution models are available, and that is only a drop in the ocean compared to the huge number of sources and data points accessible to us.

But are we about to be left drowning under this sea of numbers? I’d like to argue that all of the data we see each day is imperfect. No single source gives us all the information we need to a degree of accuracy we can trust. Instead, we need to make the best of the imperfect: find the context and the trends amongst the absolute numbers, and make informed decisions with data which is perfectly imperfect.

The Prime Example: Google Analytics & Search Console

Google is the prime example when it comes to restricted data, with SEOs attempting to find insight in a less than perfect environment.

[not provided] vs. search analytics

When I think about missing data, the first thing that comes to mind is the substantial difference between clicks in Search Analytics and Google Analytics’ traffic data for Google organic search. It’s clear that a significant volume of traffic is missing from Search Analytics, and there is no way to determine exactly what percentage of visits this sampled data is based upon. It’s frustrating to say the least. Similarly, trying to align (not provided) traffic with the metrics in Search Analytics rarely works because the two datasets differ. Whilst multiple tools attempt to map this, it’s always a ‘best estimate’ rather than hard science.

The key to moving beyond the challenges of poor data is to focus on comparing and contrasting the data. Regular exports of impressions, CTR and clicks will help you to identify trends within the data, rather than relying on absolute numbers. If impressions for all brand terms have dropped by 40%, that points to a drop in brand awareness for your strategy to address. By giving up the dream of completely quantifiable changes in site performance, and moving to a more trend-based measurement process, your analysis will become quicker (though marginally less accurate), allowing you to focus on moving forwards with strategy.
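As a rough illustration of this trend-based comparison, here is a minimal Python sketch that compares two exports and reports the percentage change per query. The function names, the dictionary layout and the figures are all assumptions made for the example; a real export would first need to be read from a CSV or an API response into this shape.

```python
def pct_change(previous, current):
    """Relative change from previous to current, as a percentage."""
    if previous == 0:
        return None  # no baseline, so the change is undefined
    return (current - previous) / previous * 100


def compare_periods(previous_rows, current_rows):
    """Compare two exports keyed by query.

    Each export is a dict: query -> {"impressions": int, "clicks": int}.
    Only queries present in both periods are compared.
    """
    report = {}
    for query in sorted(previous_rows.keys() & current_rows.keys()):
        prev, curr = previous_rows[query], current_rows[query]
        # CTR is derived here rather than read from the export
        prev_ctr = prev["clicks"] / prev["impressions"] if prev["impressions"] else 0
        curr_ctr = curr["clicks"] / curr["impressions"] if curr["impressions"] else 0
        report[query] = {
            "impressions": pct_change(prev["impressions"], curr["impressions"]),
            "clicks": pct_change(prev["clicks"], curr["clicks"]),
            "ctr": pct_change(prev_ctr, curr_ctr),
        }
    return report


# Illustrative figures only, for a hypothetical brand term
last_month = {"acme shoes": {"impressions": 1000, "clicks": 100}}
this_month = {"acme shoes": {"impressions": 600, "clicks": 54}}

trend_report = compare_periods(last_month, this_month)
for query, changes in trend_report.items():
    print(query, changes)
```

The point of the sketch is that the output contains only relative movements, so it stays useful even when the absolute numbers behind it are sampled or incomplete.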

Demographic information

The audience demographic section of Google Analytics should be a good source of additional information on the audience, helping to build personas of those using the website and to show how behaviour differs between user groups. In previous years, this data was extremely helpful to marketers and business owners as it provided clear breakdowns of how elements like age or gender would impact the experience on site.

However, over the years (in particular the past 3 months) there has been a significant reduction in the sample sizes used in these audiences. Often we see as little as 15-20% of total traffic being used for a sample. That’s not enough to make informed business decisions.

So how do we combat this? It would be great if there were a workaround to gain more data on the actual customers on our websites, but with GDPR and more regulations on customer privacy being introduced, it’s certainly hard to envisage a way this could be done. Instead, I’d recommend running separate user testing experiments on different audience segments to identify any trends in behaviour. For example, you could test a younger audience on a tablet against an older audience on mobile. User testers may not be real website visitors and customers, but testing does allow you to gain a solid set of data in a controlled environment to base your strategy on.

Keyword Planner

The lack of accurate data produced by Google Keyword Planner is almost comical. There was the restriction of data on low-spend and no-budget AdWords accounts, as well as the 2016 updates which grouped the search volumes of similar keywords. It’s common knowledge that we cannot trust the numbers in this tool to be accurate: the definition of imperfect data.

However, that doesn’t render the information which Google Keyword Planner provides completely useless. Instead, I believe it’s important to use it as a trendline and for benchmarking. In particular, the month-by-month breakdowns add context on whether a search volume is high or low compared to the rest of the year. Looking at just one keyword in isolation makes it hard to know how far to trust the numbers, but by creating groupings and keyword sets you can generate your own comparisons.

For example, if you’re selling pencil cases you could compare a set of keywords based on colour against a set based on size to determine which attribute is most important to feature in your titles, headings, metadata and so on.
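To make the pencil case comparison concrete, here is a small Python sketch that totals the reported volumes for each keyword set and works out each set’s share. The keywords, the volumes and the function names are invented for illustration; the shares are what matter, not the absolute figures.

```python
def group_totals(volumes, groups):
    """Sum the reported volume for each named keyword set."""
    return {
        name: sum(volumes.get(keyword, 0) for keyword in keywords)
        for name, keywords in groups.items()
    }


def group_shares(totals):
    """Express each set's total as a share of all sets combined."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {name: total / grand_total for name, total in totals.items()}


# Hypothetical monthly volumes, as a keyword tool might report them
volumes = {
    "red pencil case": 500,
    "blue pencil case": 700,
    "large pencil case": 300,
    "small pencil case": 100,
}
groups = {
    "colour": ["red pencil case", "blue pencil case"],
    "size": ["large pencil case", "small pencil case"],
}

totals = group_totals(volumes, groups)
shares = group_shares(totals)
print(totals)
print(shares)
```

With these made-up numbers the colour set carries three quarters of the combined volume, which would suggest leading with colour in titles and headings even if each individual figure is only approximate.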

Since I started writing this article, Verve have also released a free tool called the Keyword Cleaner, which works by simplifying complex variations of keywords (e.g. pluralisations or reordered phrases) to calculate a base search volume across a group of similar terms. This provides a much more realistic estimate of true search volumes to use in forecasting and measuring market size.

Even so, this only provides an estimate by aggregating the keyword volumes across the group. It’s clear that getting accurate data for this is hard, and even with excellent tools, consumer behaviour can be complex.
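To show the general idea of collapsing keyword variations, here is a deliberately naive Python sketch. It is not how the Keyword Cleaner itself works, which I have not seen described in detail. The plural handling (stripping a trailing "s"), the choice to take the largest volume in a group rather than the sum, and all of the names and figures are assumptions made for this example.

```python
from collections import defaultdict


def normalise(keyword):
    """Reduce a keyword to a crude canonical form.

    Lower-cases, strips a trailing "s" from longer words (a naive
    stand-in for real singularisation) and sorts the words so that
    reordered phrases produce the same key.
    """
    tokens = []
    for word in keyword.lower().split():
        if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
            word = word[:-1]
        tokens.append(word)
    return " ".join(sorted(tokens))


def group_variants(volumes):
    """Group keyword -> volume pairs by their normalised form."""
    groups = defaultdict(dict)
    for keyword, volume in volumes.items():
        groups[normalise(keyword)][keyword] = volume
    return dict(groups)


def base_volume(group):
    """Pick one figure to represent a group of variants.

    Assumption: variants may each be reported with the same grouped
    figure, so the maximum is used to avoid counting it several times.
    """
    return max(group.values())


# Hypothetical volumes for illustration
keyword_volumes = {
    "pencil case": 1000,
    "pencil cases": 1000,
    "cases pencil": 1000,
    "large pencil case": 300,
}

variant_groups = group_variants(keyword_volumes)
for key, group in variant_groups.items():
    print(key, sorted(group), base_volume(group))
```

Whether to take the maximum, the sum or something else depends on how your source reports close variants, so treat that line as the one to adjust first.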

We can’t dismiss the fact that people may Google ‘package holiday’ more often than ‘inclusive holiday’ based purely on subliminal messaging in their day-to-day lives. With this in mind, remember that trends and consumer behaviour change continuously before purchase.

Multi-channel Funnels

Consider: how many times do you visit a website before you purchase?

It might be that you see an item you’re interested in on Instagram via an influencer and immediately view it on site, but don’t purchase because of bank card faff whilst you’re commuting. Then maybe later that day your mind wanders back to that item and you visit on desktop directly, but your boss comes over so you stop shopping and get back to work. Four days later at home, vaguely remembering the site’s name, you do a search to find the item and immediately convert.

This is a typical conversion path for users. They visit sites across multiple devices, and via multiple channels.

There’s been a huge increase in shopping in this way as more and more channels and devices become available to us. In particular, on big ticket items it’s rare that a consumer will immediately convert. Instead, multiple visits will need to be part of this journey and weighed up against one another.

This is a huge problem for those of us trying to attribute value to each channel in our marketing strategy, as a channel may never be the last click before conversion despite being crucial in the decision-making process. Add in the fact that these users may be swapping devices, which makes them near-impossible to track, and we end up at a loss as to which channel did what for each pound we make.

Whilst a black box attribution system or even a custom-made funnel might seem like the answer here, there will always be limitations to using these. Be it cost, accuracy or customer data restrictions, sometimes it can be hard to understand what these external attribution systems are really worth to you. Where these aren’t practical, using multiple attribution models at once can be an easy win. Rather than reporting on last click alone, duplicate your reports so that you can toggle between first click, equal distribution, weighted and last click models. This will help you see the range of minimum and maximum values each channel could have had in your purchase funnel and inform your approach accordingly.
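The idea of looking at a range across models can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "weighted" model here is a position-based split giving 40% each to the first and last touch and sharing the rest across the middle, the channel names and revenue are invented, and every path is assumed to contain at least one visit. Your analytics platform’s own models may weight things differently.

```python
from collections import defaultdict


def first_click(path):
    """All credit to the first channel in the path."""
    return {path[0]: 1.0}


def last_click(path):
    """All credit to the final channel in the path."""
    return {path[-1]: 1.0}


def equal_distribution(path):
    """Credit split evenly across every visit in the path."""
    credit = defaultdict(float)
    for channel in path:
        credit[channel] += 1 / len(path)
    return dict(credit)


def weighted(path, edge_weight=0.4):
    """Position-based split (an assumed weighting for this sketch)."""
    if len(path) <= 2:
        return equal_distribution(path)
    credit = defaultdict(float)
    credit[path[0]] += edge_weight
    credit[path[-1]] += edge_weight
    middle = path[1:-1]
    for channel in middle:
        credit[channel] += (1 - 2 * edge_weight) / len(middle)
    return dict(credit)


MODELS = {
    "first click": first_click,
    "last click": last_click,
    "equal distribution": equal_distribution,
    "weighted": weighted,
}


def channel_ranges(conversions):
    """Return channel -> (min, max) attributed revenue across all models.

    conversions is a list of (path, revenue) pairs, where path is the
    ordered list of channels visited before purchase.
    """
    per_model = []
    for model in MODELS.values():
        totals = defaultdict(float)
        for path, revenue in conversions:
            for channel, share in model(path).items():
                totals[channel] += share * revenue
        per_model.append(totals)
    channels = {channel for path, _ in conversions for channel in path}
    return {
        channel: (
            min(totals.get(channel, 0.0) for totals in per_model),
            max(totals.get(channel, 0.0) for totals in per_model),
        )
        for channel in channels
    }


# The journey described earlier: Instagram, then direct, then search
conversions = [(["social", "direct", "organic search"], 100.0)]
ranges = channel_ranges(conversions)
for channel, (low, high) in sorted(ranges.items()):
    print(f"{channel}: between {low:.2f} and {high:.2f}")
```

A wide gap between the minimum and maximum for a channel is itself useful information: it tells you how much your view of that channel depends on the model you happen to be reporting with.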

Offline to Online

The majority of brands don’t rely solely on an online presence. It’s important that they diversify their marketing to cover offline channels (e.g. TV, radio, print and out of home) to gain the maximum brand attention. This is becoming increasingly important as brand becomes more significant in how websites rank organically. Offline, though, is notoriously hard to measure. There are no tracking pixels or click-to-buys on a newspaper ad.

This means offline can boost online sales, with customers who have seen an advert arriving ready to purchase on their first visit to the site. Your online metrics will look excellent in comparison to spend, but your offline marketing will appear to have had little impact. This isolation of budgets is part of the problem, as it causes channels to operate separately rather than collaborate towards one revenue goal. To keep on top of offline users coming onto the site, unique discount codes tied to each advert could be used to identify the number of purchases made by customers who have seen the ads. However, this doesn’t take into account users who visit but don’t convert, users who forget the code, or the knock-on effect on traffic in later months. Those problems are much harder to solve, and we are still looking for the perfect solution.

However, online can also boost offline. If you sell a particularly high-ticket item or something tactile which users like to see and feel before purchase (for example a sofa or a bed), then it’s likely that after browsing online they will choose to visit the store to make their final decision. Your online performance channels will then report less revenue than the value they truly had in the funnel. It’s important to look for solutions here which will help you to keep track of online customers when they move to the offline environment, for example by allowing them to book in-store consultations or providing a specific code for them to take to the store for a discount on the product they’ve looked at online. This can close the loop on much of the online experience for your data collection, though inevitably some users will be missed with this process.

All in all, there is no complete solution for closing the data gap between offline and online. Each method risks losing individual customers and failing to fully attribute each sale to the appropriate channels. Alongside the above, it’s essential to accurately track all activities in a single calendar. This means plotting all marketing channels and their activities against total revenue to identify particularly influential activities across the board, rather than just in their own channel.
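As one possible way to put the single calendar to work, here is a small Python sketch that compares average daily revenue during each activity against the days when nothing was running. The dates, revenue figures and activity names are invented, and the function name is my own. It assumes there are at least some quiet days to act as a baseline, and it only shows that revenue moved at the same time as an activity, not that the activity caused it.

```python
from datetime import date, timedelta
from statistics import mean


def daterange(start, end):
    """Yield each date from start to end inclusive."""
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)


def uplift_by_activity(daily_revenue, activities):
    """Relative uplift in mean daily revenue during each activity.

    daily_revenue: dict of date -> total revenue across all channels.
    activities: list of (name, start_date, end_date) from the calendar.
    The baseline is the mean of days with no activity running.
    """
    active_days = {
        day for _, start, end in activities for day in daterange(start, end)
    }
    quiet = [rev for day, rev in daily_revenue.items() if day not in active_days]
    baseline = mean(quiet)
    report = {}
    for name, start, end in activities:
        window = [
            daily_revenue[day]
            for day in daterange(start, end)
            if day in daily_revenue
        ]
        report[name] = mean(window) / baseline - 1
    return report


# Hypothetical ten days of total revenue, keyed by day of the month
revenue_by_day = {1: 100, 2: 100, 3: 100, 4: 150, 5: 150,
                  6: 150, 7: 100, 8: 120, 9: 120, 10: 100}
daily_revenue = {date(2018, 6, d): rev for d, rev in revenue_by_day.items()}

activities = [
    ("TV advert", date(2018, 6, 4), date(2018, 6, 6)),
    ("Influencer campaign", date(2018, 6, 8), date(2018, 6, 9)),
]

uplift = uplift_by_activity(daily_revenue, activities)
for name, change in uplift.items():
    print(f"{name}: {change:+.0%} against quiet days")
```

Because it uses total revenue rather than one channel’s figures, the same comparison works for offline activity that never shows up in your online tracking.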

Ecommerce vs. Amazon

In 2017, Amazon accounted for nearly 44% of all ecommerce sales in the US, an enormous market share to be held by just one brand. Beyond this, across retail as a whole Amazon took 4% of the revenue in the US in 2017, reflecting its claim on the significant growth in online transactions. So if you’re a business with an ecommerce store as well as a presence on Amazon, how do you collate the data across channels?

The simple answer is, you can’t. Not accurately anyway. You may run a fantastic influencer campaign on social media which generates a whole tonne of brand awareness, but this might not translate directly into sales on your site. Instead, in a few months’ time there may be an increase in searches for your brand name from users who land on your Amazon store and convert there. How many of those Amazon conversions were driven by your influencer campaign rather than natural growth or other activities? Amazon has a significant advantage with brand recognition, a secure site, a trusted returns policy and fast delivery, which often means users are quicker to convert there than direct on your site. However, that doesn’t mean you should cease all your online marketing outside Amazon as, of course, these channels all interact.

To overcome this, the process is similar to understanding the impact of offline and online marketing upon one another. It’s key to have a detailed timeline of every branch of marketing and the activities which took place, plotted against sales from each channel you operate on.

Final Thoughts

Marketing is a tough sector to be in when it comes to finding truly accurate data. It can be easy to get trapped in the cycle of finding more and more data points, spending too much of your time analysing and ending up with choice paralysis when the time comes to put together the strategy. Instead of fixating on absolute numbers, I propose that we move towards much more comparative metrics: looking at changes and trends across the different areas we work in, and finding controllable test environments to identify core areas to focus upon.

“Half the money I spend on advertising is wasted; the trouble is I don’t know which half.”
– John Wanamaker


Written By
Hannah Thorpe is a Director at, with 3 years’ experience in content marketing and technical SEO so far. is a digital marketing agency which works across SEO, PPC, Content Marketing and Digital PR.