Recovering From Data Overload in Technical SEO

Data-driven SEO is a worthy goal: the better the data from which we draw conclusions, the better our results. However, in the world of big data, the volume of data available to SEOs has exploded. The ability to choose the right data to use, the best metrics to track, and the most appropriate methods of analysis is often what makes the difference between implementing SEO best practices and offering a high-performance SEO strategy.

Below, I’ll show you some of the high-level tactics for focusing on data that matters for a website.

Living with an increasing number of data sources

The key idea to keep in mind is something you already know: SEO uses data to measure, monitor, and improve website performance.

As we’ve built and gained access to reporting, recording, and analytics tools, it has become increasingly complex to perform efficient reporting that provides practical and actionable insights.

First, there’s the issue of too many tools. In and of itself, drawing information from multiple sources can be a good thing:

  • Using different sources for different types of data often means that each tool’s creators have been able to specialize in the techniques and technologies specific to the data they report on.
  • When no means of reliable, direct measurement is possible, using multiple sources for the same data can allow you to obtain a better picture of the data.
  • Few data sources can provide all of the data you might want or need.

But multiple data sources and multiple tools mean you need a way to bring all of the analyses into the same report. Depending on the specific tools you use and your technical ability, the current solutions hinge on heavy use of spreadsheets, Google Data Studio, business intelligence solutions, SEO data platforms, and APIs (for custom solutions).
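For the custom route, the core task is usually a join on URL. Here is a minimal sketch, assuming each tool's export has already been loaded into a dictionary keyed by URL path. The source names, metric names, and sample values are all hypothetical.

```python
from collections import defaultdict

def merge_by_url(**sources):
    """Combine several {url: {metric: value}} mappings into one record per URL."""
    merged = defaultdict(dict)
    for source_name, rows in sources.items():
        for url, metrics in rows.items():
            for metric, value in metrics.items():
                # Prefix each metric with its source so identical names do not collide.
                merged[url][f"{source_name}.{metric}"] = value
    return dict(merged)

# Hypothetical sample data standing in for exports from three tools.
crawl = {"/shoes/": {"depth": 1, "status": 200}, "/faq/": {"depth": 2, "status": 200}}
logs = {"/shoes/": {"bot_hits": 42}}
rankings = {"/shoes/": {"position": 3.5}, "/faq/": {"position": 18.0}}

report = merge_by_url(crawl=crawl, logs=logs, rankings=rankings)
for url, row in sorted(report.items()):
    print(url, row)
```

URLs missing from a source simply lack that source's metrics, which is itself useful: a page present in the crawl but absent from the logs has not been visited by bots in the period covered.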

And second, there’s the volume of data. “Website performance” covers a huge breadth of information. We could be talking about daily ranking fluctuations, or we could be tracking HTTP response codes. And we might need this data for the full site, for each site section, or on a per-page scale. No usable report can cover all of this information, and not all of it is pertinent. Even worse, the data that is most useful for one site (or even one part of a single site) is not always useful for another.

Some sources of technical SEO data and examples of the types of data available.

To avoid data overload, focus on the data with the most impact. Being able to pinpoint and obtain this data will significantly improve the impact of your technical SEO.

Start with business-based data

Take a step back. When looking at the sheer amount of data available, it’s too easy to lose sight of the big picture.

The purpose of most corporate websites is to increase brand awareness, to gain qualified leads, and to sell products or services.

To develop a data-driven SEO strategy, start by identifying the metrics that allow you to measure and predict performance on these website objectives.

Then, give priority to SEO actions that improve performance on data points related to these metrics.

In practice, this might involve prioritizing metrics like:

  • Geographic reach. Data that supports geographic reach includes the (often-unreliable) IP localization of website visitors, the physical location of new leads and prospects, and SERP studies for the targeted regions, particularly when different countries or languages are involved.
  • User journey information. Different stages of the user journey imply differences in keywords and in conversion type and location. Some stages may be under-optimized.
  • Tracking conversion through the sales funnel. Certain traffic may tend to lead to prospecting or to a sale more easily than other types of traffic. If further information on the sales funnel is available, look at the profile of conversions that lead to successful sales.
  • Sales and revenue. Some sales are more profitable than others. Some products are easier to sell than others. Taking advantage of this sort of knowledge can allow for SEO quick wins with a direct impact on revenue.
  • Seasonality. Businesses with seasonal fluctuations in activity should see trends that evolve throughout the year. Studying bot activity and delays between when you publish a page, when it’s discovered, and when it’s first visited can show you when to start pushing seasonal SEO for maximal effect.
Google 5-year trends for “chocolate” with clear seasonality. Source: Google Trends.
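The delays mentioned under seasonality can be estimated from three dates per page: publication, first bot hit, and first organic visit. The sketch below assumes you can extract those dates from your CMS, server logs, and analytics. The records shown are invented for illustration.

```python
from datetime import date
from statistics import median

# Hypothetical records: (published, first bot hit, first organic visit).
pages = [
    (date(2023, 9, 1), date(2023, 9, 4), date(2023, 9, 15)),
    (date(2023, 9, 10), date(2023, 9, 12), date(2023, 10, 1)),
    (date(2023, 9, 20), date(2023, 9, 27), date(2023, 10, 4)),
]

def median_delays(records):
    """Return the median days from publication to discovery and to first visit."""
    discovery = [(crawled - published).days for published, crawled, _ in records]
    first_visit = [(visited - published).days for published, _, visited in records]
    return median(discovery), median(first_visit)

discovery_days, visit_days = median_delays(pages)
print(f"Median days to first bot hit: {discovery_days}")
print(f"Median days to first organic visit: {visit_days}")
```

If the median time to a first organic visit is two weeks, seasonal pages probably need to go live at least that far ahead of the peak shown in the trend data.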

Find the pertinent SEO data

Not all data is useful for all websites. There are tricks to finding what data matters on your site and for your objectives.

Cross-analysis is a great way to understand how one type of data correlates with another. Cross-analysis takes data from different sources and combines it in a single analysis. For example, you could combine data on SERP positions for each URL with data on page load time. The results will help establish whether or not there is a correlation between the two types of data.

Comparing duplicate content and canonical URLs (crawl data) with bot hits (server log data) to understand the effect of duplicate content on getting pages crawled and indexed.

Keep only data that correlates with metrics you want to track and influence. A lack of correlation suggests that even the best optimization of that data point is unlikely to have an effect on the metrics you’re trying to influence.
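As a rough illustration of this correlation check, the sketch below computes a Pearson coefficient between page load time and average SERP position. The two lists are hypothetical values, aligned by URL. Pearson is only one possible choice: a rank-based coefficient may suit position data better.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical per-URL values, in the same URL order in both lists.
load_time_seconds = [0.8, 1.2, 2.5, 3.1, 4.0]
average_position = [2, 5, 9, 14, 21]

r = pearson(load_time_seconds, average_position)
print(f"Correlation between load time and position: {r:.2f}")
```

A coefficient near 1 or -1 points to a relationship worth investigating, while a value near 0 suggests the data point is a candidate for dropping from your reports. Correlation alone does not prove that one metric drives the other.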

One effective way to run cross-analysis is to group your site’s pages intelligently, and look at performance for different metrics for each page group.

Many SEO solutions group pages in your site by first-level directory:

https://www.my-example-site.com/first-level-directory/slug.html

Often, however, this is not meaningful in the context of what pages do and which pages are important to you.
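For reference, the default grouping by first-level directory amounts to something like the following sketch. The function name and the "(root)" label are my own, not taken from any particular tool.

```python
from urllib.parse import urlparse

def first_level_directory(url):
    """Return the first path segment of a URL, or "(root)" for top-level files."""
    path = urlparse(url).path
    segments = [s for s in path.split("/") if s]
    if not segments:
        return "(root)"  # the home page
    if len(segments) == 1 and not path.endswith("/"):
        return "(root)"  # a file directly under the root
    return segments[0]

print(first_level_directory(
    "https://www.my-example-site.com/first-level-directory/slug.html"
))
```

This is easy to compute, which explains its popularity, but it only reflects how URLs were laid out, not what the pages are for.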

Some examples of groupings include:

  • Type of page. An e-commerce site, for example, might break down into promotional landing pages, category pages, FAQ pages, product pages, and blog pages. By grouping pages by their role on the site, it becomes easier to prioritize actions. You may be less interested in optimizing FAQ pages, but concerned about fixing technical issues on product and promotional pages as quickly as possible.
Differences between 503 responses to bots on product pages vs static pages on an e-commerce website.
  • SERP position. Grouping pages by ranking position can reveal differences between top-ranking pages and pages that are important to your strategy, but that struggle to rank as well. Key differences between pages that rank well and pages that struggle are often key factors for your site.
On this site, pages that rank well have more internal links pointing to them.
  • Publication date. For blogs and media or news websites, understanding how article recency affects performance will help you target the most effective optimizations to improve readership and rankings. It can also allow you to make informed decisions about how to reduce crawl waste and redirect that crawl activity to the pages that matter.
Recency distribution of articles on a media website by page depth.
  • Price or profit margin. By viewing a website by the profit margin that can be earned on products and services, it is possible to uncover elements that hinder your marketing strategy. For example, if high-profit items are inadvertently placed at extremely deep levels of your site, it can prevent them from being crawled and ranked quickly.
After optimization, best-selling price-ranges were placed no more than 2 clicks from the home page.
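Custom groupings like these can often be expressed as a short list of URL rules. The sketch below groups pages by type and then computes the share of 503 responses served to bots in each group. The URL patterns, group names, and log entries are all hypothetical and would need to match your own site's structure.

```python
import re
from collections import Counter

# Hypothetical URL patterns; the first match wins, so order matters.
PAGE_TYPE_RULES = [
    ("product", re.compile(r"^/p/\d+")),
    ("category", re.compile(r"^/c/")),
    ("blog", re.compile(r"^/blog/")),
    ("faq", re.compile(r"^/faq")),
]

def page_type(path):
    """Return the name of the first rule matching the URL path."""
    for name, pattern in PAGE_TYPE_RULES:
        if pattern.search(path):
            return name
    return "other"

def error_rate_by_type(bot_hits):
    """bot_hits: iterable of (path, status). Returns the 503 share per page type."""
    totals, errors = Counter(), Counter()
    for path, status in bot_hits:
        group = page_type(path)
        totals[group] += 1
        if status == 503:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}

# Hypothetical bot hits extracted from server logs.
hits = [
    ("/p/123-red-shoe", 503),
    ("/p/456-blue-shoe", 200),
    ("/faq/shipping", 200),
    ("/c/shoes/", 200),
]
rates = error_rate_by_type(hits)
print(rates)
```

The same `page_type` function can be reused for any other metric, so every report is segmented in the same way.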

Finally, focus on areas that have the potential to make a big impact: trends, exceptions and edge cases. These areas can be found using visual analysis of graphics or even machine learning that uses your URLs to predict performance on different metrics.

Trends show sitewide issues, where correcting a single problem can positively affect the entire site. For example, you might find a trend related to page speed if all types of pages show issues with page speed, or if rankings correlate closely with page speed.

Trends: On this site, slow pages are less likely to rank well.

Exceptions are often cases where a single URL or type of page is disproportionately affected by a single metric. These cases stand out as being unlike other pages on your site: either they rank or convert exceptionally well, or they do so exceptionally poorly. Understanding the source of the issue and applying that knowledge to the rest of the site can unlock significant changes.

Exceptions: One group of pages on this site is not constructed like the others and is more susceptible to thin content.
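One simple way to surface exceptions like this is to compare each page group against the median of all groups. The sketch below flags only unusually high values. The threshold of twice the median and the thin-content shares are assumptions chosen for illustration, not recommended values.

```python
from statistics import median

def flag_exceptions(metric_by_group, ratio=2.0):
    """Return the groups whose metric is at least `ratio` times the median."""
    baseline = median(metric_by_group.values())
    if baseline <= 0:
        return []
    return sorted(
        group for group, value in metric_by_group.items()
        if value >= ratio * baseline
    )

# Hypothetical share of thin-content pages in each page group.
thin_content_share = {
    "product": 0.04,
    "category": 0.05,
    "blog": 0.03,
    "store-locator": 0.41,
}

print(flag_exceptions(thin_content_share))
```

The median is used rather than the mean so that the exceptional group does not pull the baseline toward itself.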

The third area with potential for big impact is edge cases: the lowest or highest percentiles for a given metric. Commonly, 10-20% of a website’s pages are responsible for bringing in the vast majority of organic traffic. Optimizing this smaller subset of pages, for example, will have more impact on organic traffic than optimizing the handful of pages that earn the least organic traffic at the moment.

Edge cases: The 10% of the pages that earn the most impressions on this site are responsible for nearly all clicks from the SERPs.
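To check whether your own site follows this pattern, you can rank pages by impressions and measure the share of clicks earned by the top slice. This is a minimal sketch. The (impressions, clicks) pairs are invented and would normally come from a search performance export.

```python
def top_share_of_clicks(pages, fraction=0.10):
    """pages: list of (impressions, clicks). Returns the share of all clicks
    earned by the top `fraction` of pages when ranked by impressions."""
    ranked = sorted(pages, key=lambda page: page[0], reverse=True)
    total_clicks = sum(clicks for _, clicks in ranked)
    if total_clicks == 0:
        return 0.0
    top_n = max(1, int(len(ranked) * fraction))  # always keep at least one page
    return sum(clicks for _, clicks in ranked[:top_n]) / total_clicks

# Hypothetical (impressions, clicks) pairs for ten pages.
pages = [
    (9000, 800), (1200, 60), (900, 40), (700, 30), (500, 20),
    (400, 20), (300, 10), (200, 10), (100, 5), (50, 5),
]

share = top_share_of_clicks(pages)
print(f"Top 10% of pages by impressions earn {share:.0%} of clicks")
```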

Build metrics that make sense

Key data is not always raw data.

Instead of tracking long dashboards of raw data, simplify by developing north star metrics and optimized KPIs. Metrics of this type have multiple advantages: they are easier to understand and to link to business goals, and they are specifically adapted to your site and your business.
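As one possible example of such a metric, the sketch below folds three raw data points into a single figure: the share of strategic pages that bots have crawled recently and that rank on the first page. The definition, the field names, and the sample pages are hypothetical. Your own north star metric should reflect your business goals.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Page:
    url: str
    strategic: bool            # flagged as important to the business
    crawled_recently: bool     # seen in bot logs during the chosen period
    position: Optional[float]  # average SERP position, None if not ranking

def strategic_visibility(pages, max_position=10):
    """Share of strategic pages that are crawled and rank within max_position."""
    strategic = [p for p in pages if p.strategic]
    if not strategic:
        return 0.0
    visible = [
        p for p in strategic
        if p.crawled_recently and p.position is not None and p.position <= max_position
    ]
    return len(visible) / len(strategic)

# Hypothetical sample pages.
sample = [
    Page("/p/1", True, True, 3.0),
    Page("/p/2", True, True, 8.5),
    Page("/p/3", True, False, 4.0),
    Page("/p/4", True, True, None),
    Page("/faq", False, True, 2.0),
]

kpi = strategic_visibility(sample)
print(f"Strategic visibility: {kpi:.0%}")
```

A single percentage like this is easier to follow over time than the three underlying reports, and a drop immediately tells you which pages to investigate.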

Where to go from here

The next steps are simple:

  • Implement quick wins. Better, more focused data will reveal issues you might not be aware you had. As with any SEO strategy, take care of the easy-to-solve problems with the biggest ROI first.
  • Be patient. Data-driven decisions require a history of data to be reliable. Furthermore, in SEO, seasonal search and market trends can affect performance, making it hard to judge data on a month-to-month basis.
  • Refine your data. As Google algorithms, market trends, search habits, and even your own site evolve, you may need to periodically revisit the data you’ve selected.

The way past data overload in technical SEO is through the data. By concentrating on the data that carries the most weight on your site, you can reduce the indicators you track. More importantly, you can improve the effectiveness of your SEO through a better understanding of the impact of your optimizations and a more goal-oriented focus.

OnCrawl

About OnCrawl

OnCrawl is an award-winning technical SEO platform that combines your content, log files and search data at scale so that you can open Google’s black box and build an SEO strategy with confidence.