SEO Correlation Studies: Are We Looking At Them Wrong?

Every now and again a new SEO ranking factor correlation study is published, showing which elements of SEO correlate best with high rankings in Google. Take for example this 2013 study from Searchmetrics, or the (now-retired) Open Algorithm research.

In and of themselves, these correlation studies are usually not a bad thing. Performed correctly, they can provide valuable data about SEO.

The problem usually arises when people start to create hypotheses about the causal relationships between rankings and the correlated factors. Recently this became very clear in Cyrus Shepard’s thorough and well-researched post that showed a strong correlation between Google+ and high rankings in Google.

Moz +1 correlation study

That caused quite a stir, and elicited a response from Matt Cutts on Hacker News:

“If you make compelling content, people will link to it, like it, share it on Facebook, +1 it, etc. But that doesn’t mean that Google is using those signals in our ranking.”

While this may read like a denouncement of +1s, it doesn’t actually contest Cyrus’s statement that activity on the Google+ platform, such as shares, can result in higher rankings. After all, Google+ shares do seem to pass link value, as they’re followed links.

Now, aside from the questionable wisdom of using a Google-controlled platform to manipulate rankings in Google’s search engine (to me it seems a bit like volunteering to be a rat in Google’s SEO maze) there is another issue here: what if we’re looking at these correlation studies the wrong way?

In 2011 Moz also published a correlation study that seemed to show Facebook shares correlated with high rankings in search. This too was a contentious post, as that correlation study also resulted in many SEOs proclaiming that Facebook shares helped rank websites higher.

Moz Facebook correlation study

But here too a counter-claim emerged from Matt Cutts when he said in an interview that Google has limited access to Facebook’s data and so could not make optimal use of Facebook shares as a ranking factor.

It occurred to me that we as the SEO community, in our enduring efforts to find easy answers and quick solutions, probably look at these correlation studies from a flawed perspective. We’re mistaking the noise for the signal, and putting the cart before the horse.

Prediction, not Causation

Aside from turning the tables and interpreting these correlated factors as not causing higher rankings, but resulting from them, we could also view them from a third perspective. What if a high number of Facebook shares, tweets, and +1s are not signals that Google takes into account, but predictors of improved visibility in search?

The exact causal relationship between social shares and high rankings is likely very complex and muddled, relying on many interconnected factors. Instead of attempting to dissect these in detail, and risking that our in-depth research becomes obsolete the next time Google releases an update or rolls out something like the Transition Rank patent, we should see the correlation studies as prediction signals.

A webpage that receives a lot of social shares is statistically more likely to also rank higher in search. That is undisputed. But instead of seeing the social shares as the cause of the rankings, we should interpret them as a prediction signal. A new blog post that goes viral and receives a lot of social attention can be forecast to have lasting search engine visibility. It’s not a certainty (the correlation is only around 0.3 after all) but together with the other SEO correlation factors you could predict a piece of content’s success in Google SERPs with some degree of confidence.
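To give a feel for what a correlation of around 0.3 means as a prediction signal, here is a minimal sketch in Python. It is purely illustrative: the data is simulated, not taken from any of the studies mentioned, and it assumes a simple linear (Pearson-style) relationship between a "social shares" score and a "ranking" score. The function names are made up for this example.

```python
import math
import random


def explained_variance(r: float) -> float:
    """Fraction of variance in one variable accounted for by the other (r squared)."""
    return r * r


def simulate_pairs(r: float, n: int, seed: int = 0):
    """Simulate n (shares_score, ranking_score) pairs with true correlation r.

    Both scores are standardised, made-up numbers. This is an assumption
    for illustration, not real SEO data.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        shares = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        ranking = r * shares + math.sqrt(1 - r * r) * noise
        pairs.append((shares, ranking))
    return pairs


def pearson(pairs) -> float:
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


def hit_rate(pairs) -> float:
    """Share of top-quartile pages by shares that end up above the median ranking score."""
    n = len(pairs)
    shares_cutoff = sorted(x for x, _ in pairs)[int(0.75 * n)]
    median_ranking = sorted(y for _, y in pairs)[n // 2]
    top = [y for x, y in pairs if x >= shares_cutoff]
    return sum(1 for y in top if y > median_ranking) / len(top)


if __name__ == "__main__":
    pairs = simulate_pairs(r=0.3, n=20000)
    print("sample correlation:", round(pearson(pairs), 3))
    print("variance explained:", round(explained_variance(0.3), 2))
    print("top-quartile hit rate:", round(hit_rate(pairs), 3))
```

Under these assumptions, heavily shared pages beat the median ranking score more often than a coin flip would suggest, yet social shares alone account for only about 9% of the variance. That is the sense in which the correlation is a useful but uncertain predictor.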

By placing the correlation factors in this predictive context, we do not risk conflating correlation with causation, and we make a more honest assessment of the value of social sharing in an SEO framework. It also emphasises the inherent uncertainty in SEO, where we can never guarantee results.

I welcome future well-performed correlation studies, as these will give us more data to work with. But we should stop seeking causal relationships, as I believe that to be a futile effort that even seasoned scientists would struggle with. Instead we need to interpret these correlations as predictions, with all the uncertainty that entails. Only then will we be able to see the true value of these studies, and incorporate their findings in our SEO tactics appropriately.