How search engines might define quality
A lot of factors play a role in SEO, from a single line of code on your website to major changes in the way you run your business. One of the most important factors, however, is ‘high quality content’, often referred to as linkworthy content, fresh content or whatever other term you can make up. But how would you define quality content? It’s not easy to describe, and for search engines it’s even harder. Search engines can’t tell whether content is high or low quality simply by reading it; they have to rely on signals to estimate its quality. In this article I will explain the signals search engines might use to define quality.
First of all, keep in mind that search engines score documents as a whole; they can’t score the quality of parts of a document. Documents are what search engines show as results, so documents are what has to be scored.
Domain level signals
The domain that hosts a document says a lot about the overall quality of the content on that domain. Search engines use several kinds of domain signals to help determine the quality of a specific document.
Domain age: The domain age simply tells search engines how long a domain has existed. Registration length may matter too: search engines reason that valuable (legitimate) domains are often paid for several years in advance, whereas illegitimate domains are rarely used for more than a year.
Existence of domain in directories: Some directories where submissions are edited by hand might still have a positive impact on the quality of a site as seen by search engines.
Prior site ranking: If your domain has previously ranked for a specific keyword, search engines might consider the domain a high quality source for topics related to that keyword.
TrustRank and PageRank: The TrustRank of a domain indicates how much overall trust search engines place in a website. PageRank is a signal for the linkworthiness of the complete domain. Both factors are signals for the quality and trust of each document on that domain.
Document level signals
The document itself holds the most information about the quality of the document. Search engines use different kinds of signals on-page and off-page.
TrustRank: TrustRank is a number that indicates how much search engines trust that the document is not spam. “TrustRank is a method for separating reputable, good pages on the Web from web spam”. TrustRank is based on the distance (in links) of the document from hand-selected trusted sites. The closer the document is to trusted sites, the higher the probability that it is trustworthy itself. Trustworthiness can be a signal for quality.
Number of inbound links: The number of inbound links may be a signal for the quality of information in a specific document. The more people link to a document, the more likely it is that it contains high quality information.
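The link-distance idea described above can be sketched in a few lines of Python. This is only an illustration of the simplified description in this article, not how any search engine implements it; the published TrustRank method propagates trust from the seed set with a PageRank-style iteration rather than a plain hop count. The link graph, the domain names and the `decay` parameter are all made up for the example.

```python
from collections import deque


def link_distance_from_seeds(links, seeds):
    """Return the fewest link hops from any trusted seed to each reachable page."""
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in distance:  # first visit is the shortest path in a BFS
                distance[target] = distance[page] + 1
                queue.append(target)
    return distance


def trust_score(distance, decay=0.5):
    """Turn hop counts into scores in (0, 1]; `decay` is an assumed parameter.

    Pages that are unreachable from the seeds are simply absent from the result.
    """
    return {page: decay ** hops for page, hops in distance.items()}


# Hypothetical link graph: page -> pages it links to
links = {
    "trusted.example": ["a.example", "b.example"],
    "a.example": ["c.example"],
    "b.example": [],
    "c.example": [],
    "spam.example": ["c.example"],
}

distances = link_distance_from_seeds(links, ["trusted.example"])
scores = trust_score(distances)
```

Note that `spam.example` links out to a reachable page but receives no score itself, because trust in this sketch only flows along links that start at the trusted seeds.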
Link profile of inbound links: Although the number of links might be an indication of the quality of the information, the types of links might matter even more. There are a few factors to take into account here. One is the trust of the source, which can be based on the TrustRank or PageRank of the linking page or domain. But there are also other signals, like the domain extension: .edu and .gov domains tend to have a higher trust than other domain extensions.
Links in the body text of a page appear to be of more relevance than footer links. A link in the body is more likely to lead visitors to the linked page, so a site owner would presumably only place it there if the linked page is trusted.
Another factor concerning the link profile could be how ‘natural’ it is. Here you can think of variance in anchor texts, the use of site-wide links, same-IP links etc.
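To make ‘variance in anchor texts’ a bit more concrete, here is a minimal Python sketch that summarizes how concentrated a set of inbound anchor texts is. The function name, the returned fields and the sample anchors are assumptions made for illustration; search engines do not publish how, or whether, they compute anything like this.

```python
from collections import Counter


def anchor_text_profile(anchors):
    """Summarize inbound anchor texts: how many, how varied, how concentrated."""
    normalized = [a.strip().lower() for a in anchors if a.strip()]
    if not normalized:
        return {"total": 0, "distinct": 0, "top_anchor": None, "top_share": 0.0}
    counts = Counter(normalized)
    top_anchor, top_count = counts.most_common(1)[0]
    return {
        "total": len(normalized),
        "distinct": len(counts),
        "top_anchor": top_anchor,
        # Share of all links that use the single most common anchor text
        "top_share": top_count / len(normalized),
    }


# Hypothetical inbound anchors: 8 of 10 links use the same commercial phrase
anchors = ["cheap shoes"] * 8 + ["Example Shop", "click here"]
profile = anchor_text_profile(anchors)
```

A high `top_share` for a commercial phrase is the kind of pattern that could look unnatural, but where to draw the line is a judgment call this sketch leaves open.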
Uniqueness of the document: Content that is mainly copied has no added value for users. A document must have a certain amount of unique content to be classified as high quality at all.
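One common way to estimate how much of a document is copied is to compare overlapping word sequences (‘shingles’) with those of documents that are already known. The Python sketch below assumes that approach purely for illustration; the shingle size of three words and the function names are arbitrary choices, and nothing here is confirmed to be what search engines do.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences of length `size`."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def unique_fraction(document, known_documents, size=3):
    """Fraction of the document's shingles not found in any known document."""
    doc_shingles = shingles(document, size)
    if not doc_shingles:
        return 0.0
    seen = set()
    for other in known_documents:
        seen |= shingles(other, size)
    return len(doc_shingles - seen) / len(doc_shingles)


# Hypothetical documents
new_page = "the quick brown fox jumps over the lazy dog"
already_indexed = ["the quick brown fox sleeps"]
fraction = unique_fraction(new_page, already_indexed)
```

Here two of the seven shingles on the new page also appear in the indexed document, so five sevenths of the page counts as unique under this measure.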
Staleness of document: The freshness or staleness of a document could influence its perceived quality. For some kinds of topics older information could be outdated, while recent information covers the topic much better. Freshness is not only determined by the age of the document (when it was crawled for the first time), but also by the freshness of links, the latest changes to the document, the growth of links to the document and changes in anchor texts. There is more on the freshness of documents and changing content on SEO by the Sea.
Outbound linking: Linking to quality sites from your document might not help your rankings much, but linking to ‘bad neighborhoods’ will definitely harm them. Yahoo! says: “Hyperlinks intended to help people find interesting, related content, when applicable.”
Over-optimization/keyword stuffing: Search engines are to some extent able to recognize ‘natural texts’, and they can spot over-optimization and keyword stuffing in a second. Both definitely degrade the perceived quality of a document. All search engines claim they value content created primarily for users and only secondarily for search engines.
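A very rough way to look at keyword stuffing yourself is keyword density: the share of all words on a page taken up by each term. The Python sketch below is an assumption-laden illustration, not a description of how search engines detect stuffing; the tiny stop word list and the sample text are made up, and no particular density is known to be a cutoff.

```python
import re
from collections import Counter

# Tiny illustrative stop word list; a real one would be much longer
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "on"}


def keyword_density(text, top_n=3):
    """Return the `top_n` most frequent non-stop words with their share of all words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return []
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [(word, count / len(words)) for word, count in counts.most_common(top_n)]


# Hypothetical page text
text = "Cheap shoes! Buy cheap shoes online. Cheap shoes, cheap prices."
density = keyword_density(text)
```

In this sample the word ‘cheap’ makes up four of the ten words, which would read as unnatural to most people, let alone to a search engine.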
Load time: Although it’s a bit of a long shot, load time can also be a quality signal for search engines. A faster loading document creates a better user experience, so the overall valuation of the document’s quality might be higher. That doesn’t mean the information on the page is better, but the user might perceive it as better.
User behavior signals
User behavior could be a strong indicator of the quality of a document, and search engines have filed multiple patents supporting this theory. The signals include factors like CTR in the SERPs, bounce rates, time on page, overall traffic to a page, bookmarking etc. Despite their potential strength, search engines haven’t used these signals that much. The main reasons are that the signals are noisy and easy to spam.
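The noise problem is easy to see with CTR: one click on one impression is a 100% click-through rate, but it says almost nothing. A generic way to dampen this is to pull the raw rate towards an expected value until enough impressions have been collected. The Python sketch below shows that idea; the function name, the prior CTR of 10% and the prior weight of 100 impressions are assumptions for illustration, not values from any search engine.

```python
def smoothed_ctr(clicks, impressions, prior_ctr=0.1, prior_weight=100):
    """Click-through rate pulled towards `prior_ctr` when impressions are few.

    `prior_weight` acts like that many imaginary impressions at the prior rate.
    """
    if clicks < 0 or impressions < 0 or clicks > impressions:
        raise ValueError("need 0 <= clicks <= impressions")
    return (clicks + prior_ctr * prior_weight) / (impressions + prior_weight)


# One lucky click barely moves the estimate away from the 10% prior ...
few_data = smoothed_ctr(clicks=1, impressions=1)
# ... while a large sample dominates it
much_data = smoothed_ctr(clicks=500, impressions=1000)
```

Smoothing only helps with noise from small samples; it does nothing against the second problem, which is that clicks can be faked.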
There are probably even more factors that search engines consider when scoring the quality of a document. Can you add any?