From what I heard, online marketers and especially SEOs might sometimes be interested in getting links pointing to a self-owned (or a client’s) website. OK, fun aside. When building links, one of the things that really annoys me is people trying to cheat you – and I’m not talking about a PR 4 vs. a PR 6 link, but more about stuff like removing a link after striking a deal (which is more or less the easiest to detect), cloaking content based on user-agents/IPs, or using methods like X-Robots headers or the robots Meta tag to prevent pages from being indexed.
This article is a small checklist of things you’ll have to watch out for or, if possible, that your link-building backend should keep an eye on (and really make sure you run these checks on a regular basis, say at least once a week):
To have some URL patterns to show, I’m going to use www.domain.com (as the root domain) and www.domain.com/path/content.html (the page where the link is placed) in the following examples. And just to be sure: I take it as basic knowledge that the link you agreed to have placed on that specific page is implemented as a hyperlink using the good old <a>-tag!
First of all I’d recommend making sure that the URL is not disallowed using robots.txt (which would be located at www.domain.com/robots.txt). In our case we’d have to watch out for multiple patterns including:
You get the idea – and also remember to check ALL relevant user-agents, because sometimes these pages are only disallowed for a specific search engine, let’s say Google.
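To make the robots.txt check a bit more concrete, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The list of user-agents and the example robots.txt are my own assumptions for illustration (not taken from a real site); in practice you would download www.domain.com/robots.txt and pass its text in.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical list of crawlers to test; adjust to the engines you care about.
USER_AGENTS = ["Googlebot", "Bingbot", "*"]


def blocked_agents(robots_txt, url, agents=USER_AGENTS):
    """Return the user-agents that are NOT allowed to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Made-up robots.txt that disallows the path for Google only.
EXAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /path/

User-agent: *
Disallow:
"""

print(blocked_agents(EXAMPLE_ROBOTS_TXT, "http://www.domain.com/path/content.html"))
```

Checking every agent in one pass is what catches the "disallowed for one search engine only" case; a check against `*` alone would report the page as fine.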
2. HTTP status code & X-Robots header
The second thing to do is to send a simple HTTP HEAD request to the URL where the link is placed and look at the response. You should especially pay attention to:
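A rough sketch of such a HEAD check in Python could look like this. Which response details to flag is my assumption, based on the heading of this section: any status other than 200, and `noindex`, `nofollow` or `none` in an X-Robots-Tag header. The function names `fetch_head` and `head_problems` are made up for this example.

```python
import re
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Do not follow redirects, so 301/302 responses stay visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch_head(url, user_agent="Mozilla/5.0"):
    """Send a HEAD request; return (status, list of X-Robots-Tag header values)."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": user_agent}
    )
    try:
        with opener.open(request, timeout=10) as response:
            return response.status, response.headers.get_all("X-Robots-Tag") or []
    except urllib.error.HTTPError as error:
        # Redirects and 4xx/5xx responses end up here.
        return error.code, error.headers.get_all("X-Robots-Tag") or []


def head_problems(status, robots_headers):
    """Return human-readable problems for one HEAD response."""
    problems = []
    if status != 200:
        problems.append(f"unexpected status {status}")
    found = set()
    for value in robots_headers:
        # Also matches agent-specific values such as "googlebot: noindex".
        found.update(re.findall(r"\b(noindex|nofollow|none)\b", value.lower()))
    problems.extend(f"X-Robots-Tag: {directive}" for directive in sorted(found))
    return problems


# Usage (needs network access):
# status, robots = fetch_head("http://www.domain.com/path/content.html")
# print(head_problems(status, robots))
```

Keeping the evaluation separate from the request makes it easy to re-run the same checks with different user-agents.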
2. Meta robots & canonical tag
The next things to watch out for are the Meta robots tag as well as the canonical tag. Both can be set in a way that your link is completely worthless:
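Here is a small sketch of how both tags could be checked with Python's built-in `html.parser`. What counts as a problem is my assumption: a robots Meta tag containing `noindex`, `nofollow` or `none`, and a canonical tag pointing to a URL other than the page itself. The names `HeadDirectives` and `page_problems` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class HeadDirectives(HTMLParser):
    """Collect robots Meta directives and the canonical URL of a page."""

    def __init__(self):
        super().__init__()
        self.robots = []
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        values = {name: (value or "") for name, value in attrs}
        if tag == "meta" and values.get("name", "").lower() in ("robots", "googlebot"):
            content = values.get("content", "")
            self.robots += [part.strip().lower() for part in content.split(",")]
        elif tag == "link" and "canonical" in values.get("rel", "").lower().split():
            self.canonical = values.get("href")


def page_problems(html, page_url):
    """Return problems found in the robots Meta tag and the canonical tag."""
    parser = HeadDirectives()
    parser.feed(html)
    parser.close()
    problems = [
        f"meta robots: {directive}"
        for directive in parser.robots
        if directive in ("noindex", "nofollow", "none")
    ]
    if parser.canonical:
        # Resolve relative canonical URLs against the page URL before comparing.
        resolved = urljoin(page_url, parser.canonical)
        if resolved.rstrip("/") != page_url.rstrip("/"):
            problems.append(f"canonical points to {resolved}")
    return problems
```

For example, a page carrying `<meta name="robots" content="noindex, follow">` would be reported with `meta robots: noindex`.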
3. The rel-attribute
Let’s move on to the link-level, shall we? As mentioned earlier the obvious thing to do is to check for the link itself – this could be done by using a regular expression – this would also allow you to validate the href-attribute and of course the anchor text value.
Additionally you’d have access to other attributes being used, for example rel=(*), to detect “nofollow”ed links (if you care about that).
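A possible sketch of that link-level check is shown below. Note that I'm swapping the regular expression for Python's built-in `html.parser` here, simply because attribute order and quoting vary a lot in real-world HTML; a regex would work too. The names `LinkFinder` and `check_link` as well as the example URLs are made up.

```python
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Collect every <a> tag with its href, rel tokens and anchor text."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            values = dict(attrs)
            self._current = {
                "href": values.get("href") or "",
                "rel": (values.get("rel") or "").lower().split(),
                "text": "",
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            # Collapse whitespace so "My  Anchor" equals "My Anchor".
            self._current["text"] = " ".join(self._current["text"].split())
            self.links.append(self._current)
            self._current = None


def check_link(html, target_href, expected_text):
    """Report whether the link exists, has the agreed anchor text and is nofollowed."""
    finder = LinkFinder()
    finder.feed(html)
    finder.close()
    for link in finder.links:
        if link["href"].rstrip("/") == target_href.rstrip("/"):
            return {
                "found": True,
                "anchor_ok": link["text"] == expected_text,
                "nofollow": "nofollow" in link["rel"],
            }
    return {"found": False, "anchor_ok": False, "nofollow": False}
```

Splitting the rel value into tokens matters because values like `rel="external nofollow"` are quite common.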
Another thing that seems to be getting quite popular at the moment is cloaking links from search engines. Basically, some webmasters deliver the content including the hyperlink to users but remove the link when a crawler accesses the website (to reduce the number of outbound links, I guess). The most common tactic is still doing it on a per user-agent basis, which is pretty easy to detect – but if it gets more advanced, like IP-based delivery, you’d probably also have to check the Google cached version of that specific page to really ensure the link is present.
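The per user-agent case could be sketched like this: fetch the page once as a browser and once as a crawler, then compare whether the link shows up in both. The user-agent strings are just examples, the substring match is deliberately crude, and `fetch` and `cloaked_for` are hypothetical names. IP-based cloaking will not be caught this way.

```python
import urllib.request

# Example user-agent strings; swap in whatever crawlers matter to you.
USER_AGENTS = {
    "browser": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}


def fetch(url, user_agent):
    """Download a page with the given User-Agent header and return its HTML."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def cloaked_for(pages, target_href):
    """Given {label: html}, return the labels that do not see the link
    although at least one other label does."""
    has_link = {label: target_href in html for label, html in pages.items()}
    if not any(has_link.values()):
        return []  # link is gone for everybody: removed, not cloaked
    return sorted(label for label, present in has_link.items() if not present)


# Usage (needs network access):
# url = "http://www.domain.com/path/content.html"
# pages = {label: fetch(url, agent) for label, agent in USER_AGENTS.items()}
# print(cloaked_for(pages, "http://www.example.com/"))
```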
5. More stuff to watch out for
- Is that page being indexed at all? After a while (especially when it’s a new page) you should definitely search Google for that specific URL to make sure the page is in the index – if not, is that page being linked internally so it gets any link juice from the domain?
- It could also be very interesting to monitor the number of outbound links on www.domain.com/path/content.html – because if the number rises (big-time) it might be bad for you (probably you’ll get less link power, maybe the page from that point onwards is turned into an “obvious link-selling page”?)
- And last but not least it might be interesting to monitor whether some of the common “ppc”-keywords / bad-words show up on that page. Maybe the page got hacked and you’re now in a very bad neighborhood? I’d love to know.
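The last two points could be monitored with something along these lines. The bad-word list is a made-up placeholder (bring your own), "outbound" here simply means "points to a different host than the page itself", and `PageStats` and `page_stats` are hypothetical names. Store the numbers from each run and compare them over time.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Placeholder list, purely for illustration.
BAD_WORDS = ("casino", "viagra", "payday loan")


class PageStats(HTMLParser):
    """Collect distinct outbound link targets and the text of a page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.host = urlparse(page_url).netloc.lower()
        self.outbound = set()
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        absolute = urljoin(self.page_url, href)
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.netloc.lower() != self.host:
            self.outbound.add(absolute)

    def handle_data(self, data):
        self.text.append(data)


def page_stats(html, page_url):
    """Return the number of distinct outbound links and the bad-words found."""
    stats = PageStats(page_url)
    stats.feed(html)
    stats.close()
    text = " ".join(stats.text).lower()
    hits = sorted(
        word for word in BAD_WORDS
        if re.search(r"\b" + re.escape(word) + r"\b", text)
    )
    return {"outbound_links": len(stats.outbound), "bad_words": hits}
```

A sudden jump in `outbound_links` between two weekly runs, or a non-empty `bad_words` list, would be the signal to take a closer look at the page by hand.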
I hope this helps a little bit when trading links or building a tool to make sure those links stay in place. If you don’t have a tool to do all the tasks at once, here are some Firefox add-ons to do some of that work for you:
Other add-on recommendations or more things you’d check for? Looking forward to your feedback in the comment section!