Last week Google announced that the number of valid copyright infringement notices a site receives will become a negative ranking factor within the SERPs. The idea is that the more valid copyright complaints Google gets about a site, the more difficult it will be for that site to rank naturally. It’s a great move in many ways; however, I question its effectiveness for written content, and I feel that tackling the problem properly would need an element of automation.
In the announcement they showed that the volume of copyright complaints has spiked alarmingly over the last year. In fact, they now get more copyright removal notices in a single day than they got in the whole of 2009. To illustrate the scale of the problem, Google got over 5 million requests from copyright owners in July 2012. Unsurprisingly, the main complainants are major software and music companies, with Microsoft heading up the list.
Also unsurprisingly, the top infringers seem to be torrent websites and other file sharing sites. While these sites are clearly upsetting major corporations, it is unclear how successful Google will be at enforcing copyright for the little guys.
Articlesbase.com, for example, has been involved in 194 requests, with 310 URLs requested to be removed, since April 2011. I personally find it exceptionally difficult to believe that this is even 1% of the scale of copyright abuse from people copying other content and dropping it on Articlesbase with a link back to their site. If we were to monitor every article that we create and submit a copyright infringement notice for each copy we found, we would possibly need someone on it full-time, and that cost couldn’t be justified for us or our clients.
Determining the copyright owner programmatically from the initial crawl date also has a great many holes in it, as the date a page is first crawled often depends on how regularly the site is crawled. It is speculated that Google will use rel=author mark-up more and more to determine copyright ownership; however, that will also have its drawbacks.
The only way I can think of to fix the problem programmatically is for Google to provide a Google+ SOAP interface which correlates back to the authorship mark-up and allows you to set the publish date. It would eliminate the effectiveness of scrapers overnight, as the timestamp on a scraped copy would differ from the one the author registered. It would also encourage more people to engage with Google+ when distributing notification of new content, which would be an added benefit for Google.
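To make the difference concrete, here is a minimal sketch of the two approaches. It is purely illustrative: no such Google+ interface exists, and the `Copy` record, its `registered_published` field and both function names are made up for this example. It assumes the registered timestamp is the one the author submitted when publishing.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Copy:
    """One copy of an article as a search engine might see it (hypothetical record)."""
    url: str
    first_crawled: datetime
    # Hypothetical: publish date the author registered through an authorship interface.
    registered_published: Optional[datetime] = None


def original_by_crawl_date(copies: List[Copy]) -> Copy:
    """Crawl-date heuristic: whichever copy was crawled first is treated as the original."""
    return min(copies, key=lambda c: c.first_crawled)


def original_by_registered_date(copies: List[Copy]) -> Optional[Copy]:
    """Registered-date approach: only copies with a registered publish date are considered."""
    registered = [c for c in copies if c.registered_published is not None]
    if not registered:
        return None
    return min(registered, key=lambda c: c.registered_published)


# The author's site is crawled infrequently; the scraper's site is crawled often.
author = Copy(
    url="https://example.com/original-article",
    first_crawled=datetime(2012, 8, 14, 9, 0),
    registered_published=datetime(2012, 8, 10, 8, 30),
)
scraper = Copy(
    url="https://example.org/scraped-copy",
    first_crawled=datetime(2012, 8, 11, 17, 45),
)

print(original_by_crawl_date([author, scraper]).url)
print(original_by_registered_date([author, scraper]).url)
```

In this made-up scenario the crawl-date heuristic credits the scraper, simply because its site was crawled three days before the author’s, while the registered publish date points back to the author.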
It would be great to hear your thoughts on the issue, and whether you feel the scale of copyright infringement can be managed more effectively.