This Wednesday many sites around the web will turn black in protest against a proposed US bill, the Stop Online Piracy Act (SOPA). Even though the White House is said not to support SOPA, the bill might still become law, which would be disastrous for the online industry. It is supposed to protect intellectual property online, but it can also be seen as a form of Internet censorship. Some of the big players on the web, like Wikipedia and Reddit, will therefore turn black on Wednesday to make a statement.
Maybe you are making that same statement. If so, there are a few things to think about. What many site owners don’t realize is that “blacking out your site” might harm your rankings in Google. Googlebot, after all, doesn’t stop crawling. If it can’t reach your site, or reaches it but finds nothing ‘but black’ while you are joining the protest, you might lose your rankings.
Luckily Google has Pierre Far. This Googler, who is speaking at the next Thinkvisibility by the way, is not only a nice guy in person, he also thinks along with webmasters. He has posted some helpful tips on his Google+ profile for webmasters who plan to participate in the blackout.
Far’s tips include, among others:
Webmasters should return a 503 HTTP status code
This goes for all the URLs participating in the blackout (parts of a site or the whole site). Far says this helps in two ways:
a. It tells us it’s not the “real” content on the site and won’t be indexed.
b. Because of (a), even if we see the same content (e.g. the “site offline” message) on all the URLs, it won’t cause duplicate content issues.
Keep in mind that Googlebot’s crawl rate will drop when it sees a spike in 503 responses. According to Far, however, it will recover quickly once the site is back online.
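To make the 503 tip concrete, here is a minimal sketch in Python of a server that answers every URL with a 503 while still serving robots.txt normally. It is only an illustration, not Far's or Google's code: the names `blackout_response`, `BlackoutHandler` and `serve`, the message text, the stand-in robots.txt body and the optional `Retry-After` value of one hour are all assumptions made for this example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical values for illustration; none of these come from Far's post.
RETRY_AFTER_SECONDS = 3600
BLACKOUT_BODY = b"This site is offline today in protest against SOPA.\n"
ROBOTS_TXT_BODY = b"User-agent: *\nDisallow:\n"  # stand-in for your unchanged robots.txt


def blackout_response(path):
    """Return (status, headers, body) for a request path during the blackout."""
    # robots.txt keeps answering 200 with its usual contents.
    if path.split("?", 1)[0] == "/robots.txt":
        return 200, {"Content-Type": "text/plain"}, ROBOTS_TXT_BODY
    # Every other URL gets a 503 (Service Unavailable) plus the protest message.
    headers = {"Content-Type": "text/plain", "Retry-After": str(RETRY_AFTER_SECONDS)}
    return 503, headers, BLACKOUT_BODY


class BlackoutHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, headers, body = blackout_response(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the sketch locally; call serve() by hand to try it out."""
    HTTPServer(("127.0.0.1", port), BlackoutHandler).serve_forever()
```

On a real site you would more likely set this up in your web server or CMS configuration, but the idea is the same: the status code, not the page content, tells the crawler that the blackout page is temporary.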
Be careful with your robots.txt
Googlebot will stop all crawling of the site if the robots.txt file itself returns a 503 status code. This crawling block will continue until Googlebot sees an acceptable status code for robots.txt fetches.
Don’t block Googlebot’s crawling with a “Disallow: /”. This has a high chance of causing crawling issues for much longer.
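If you want to double-check your robots.txt before the blackout starts, a small sketch like the one below can help. It uses Python's standard `urllib.robotparser` to test whether a given robots.txt response would let Googlebot fetch the root of the site. The function name `robots_txt_is_safe` is made up for this example, and treating only a 200 as an acceptable status is a cautious assumption of this sketch, not something the tips above spell out.

```python
from urllib.robotparser import RobotFileParser


def robots_txt_is_safe(status, body, user_agent="Googlebot"):
    """Return True if this robots.txt response should not block crawling of "/"."""
    # Assumption: only a 200 is treated as acceptable in this sketch.
    if status != 200:
        return False
    parser = RobotFileParser()
    parser.parse(body.splitlines())
    # A blanket "Disallow: /" makes can_fetch() return False for the root.
    return parser.can_fetch(user_agent, "/")
```

For example, `robots_txt_is_safe(200, "User-agent: *\nDisallow: /\n")` is false, while the same call with `Disallow: /admin/` is true because the root stays crawlable.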
Monitor your Crawl Errors section in Webmaster Tools
In the weeks after the blackout, be sure to monitor your Crawl Errors section “to ensure there aren’t any unexpected lingering issues.”
Don’t change too many things
Far makes it explicitly clear that you should not make too many changes if you are participating. Keep it simple. Don’t change the DNS settings, don’t change the contents of the robots.txt file and don’t alter the crawl rate setting in Webmaster Tools (WMT), he says.
These tips are also very helpful if you are not participating. They can, for example, be used when you are doing server maintenance or moving your site from one server to another. For now: thank you Pierre, and for those participating in the blackout: good luck!