SES Berlin 2010: Duplicate Content Issues

We’re back after the lunch break with another session at SES Berlin – this time covering the various issues around duplicate content. First on is Lutz Ulrich from Google, so let’s see what the “official” version is.

Ulrich starts by showing a DC example on http://www.royal.gov.uk/ (vs. http://www.royal.gov.uk/home.aspx) – so even the British Monarchy seems to have issues – nice! So let’s get on it. Does Google penalize DC? Google says no… if you do it by accident, Google doesn’t penalize it! But if you do it on purpose (like doorway pages on various domains or similar), this will get caught and handled “accordingly”. And surely it can cause problems as well: if DC resides on more than one domain, Google will crawl all of them and basically join them. So it’s a massive waste of crawling resources. And while the crawlers are busy getting through the dupes, the good content might not get indexed. You don’t want that to happen. I guess no news in here, sorry folks.

So how to avoid DC? Here are some tips from Ulrich:

Keep URL structures really simple!
Have one URL to contain all relevant information.
Careful when doing URL rewrites – a lot of times (especially when done wrong) it can cause more problems than it solves.
Create a sitemap (also important to help Google find the canonical version)
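
To make the "one URL per page" idea a bit more concrete, here is a minimal Python sketch of how you could spot URL variants that point to the same content, like the royal.gov.uk example above. This is not something shown in the session. The parameter names and default document names are pure assumptions for illustration, and every site will need its own list.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumptions for illustration only: which parameters are irrelevant and
# which file names are default documents depends entirely on the site.
IGNORED_PARAMS = {"sessionid", "utm_source", "utm_medium", "utm_campaign"}
DEFAULT_DOCUMENTS = {"home.aspx", "index.html", "index.php"}


def canonicalize(url: str) -> str:
    """Map URL variants of the same page to a single canonical form."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = parts.netloc.lower()

    # Drop a trailing default document so "/home.aspx" and "/" collapse.
    path = parts.path or "/"
    head, _, tail = path.rpartition("/")
    if tail.lower() in DEFAULT_DOCUMENTS:
        path = head + "/"

    # Remove ignored parameters and sort the rest for a stable order.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    # The fragment is never sent to the server, so it is dropped.
    return urlunsplit((scheme, host, path, urlencode(query), ""))


def group_duplicates(urls):
    """Group URLs by canonical form; groups with more than one entry are DC candidates."""
    groups = {}
    for url in urls:
        groups.setdefault(canonicalize(url), []).append(url)
    return {canon: members for canon, members in groups.items() if len(members) > 1}
```

Run `group_duplicates` over a crawl export or your server logs and every group it returns is a set of URLs you probably want to redirect to one version (or at least list only once in your sitemap).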

And as a closing tip: you can also use Google Webmaster Tools to tell Google which URL parameters to ignore when crawling your site.

All right, next one up is Markus Hövener – and these are the main take-aways:

DC can hurt – looking at it from a user’s perspective. Sometimes mobile websites show up in regular web search results. Click on one of these results and you often get a page without navigation, etc. – you’re basically stuck. Not a good user experience.

DC in AT vs. CH vs. DE – a lot of times when searching in AT you’ll get the DE version. But let’s say it’s an ecommerce site – a lot of times you can only buy from the shop for the country where you live, so the DE result is of no use to you. Good point…

Markus states that in his opinion – if a website has no unique content at all – it should be penalized. He also believes that something like that is already in place. And if not, according to him, it should be 😉 Markus also suggests not redirecting to a separate mobile site (like m.domain.com) but rather doing device detection and serving all content on one consistent URL pattern. Can be risky, from what I have seen. And last but not least: do not let Google decide which version is the canonical one – they’re right ~98% of the time. But if they’re not, you’re screwed!
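
For those wondering what "device detection on one URL" might look like, here is a rough Python sketch. It is not from Markus's slides. The user-agent pattern is a crude assumption for illustration and `render` is a hypothetical helper, not part of any framework. It also hints at why I call this risky: if the detection guesses wrong, the visitor gets the wrong template.

```python
import re

# Assumption: a deliberately crude pattern for illustration. Real device
# detection needs a maintained device database and will still misfire.
MOBILE_PATTERN = re.compile(
    r"Mobile|Android|iPhone|iPod|BlackBerry|Opera Mini", re.IGNORECASE
)


def is_mobile(user_agent: str) -> bool:
    """Guess from the User-Agent header whether the client is a mobile device."""
    return bool(MOBILE_PATTERN.search(user_agent or ""))


def render(path: str, user_agent: str):
    """Return (headers, body) for a request; the URL is the same for every device."""
    template = "mobile" if is_mobile(user_agent) else "desktop"
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        # Tells caches that the response differs by User-Agent,
        # so a cached desktop page is not handed to a phone.
        "Vary": "User-Agent",
    }
    body = f"<!-- {template} template for {path} -->"
    return headers, body
```

The point of the design is that there is no second URL at all, so there is nothing that could show up as a duplicate in the index.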

Well… one more to go: Last speaker in this session is Christoph Burseg. He is going to talk about Google News and duplicate content issues. Christoph states that there are NO duplicate content issues in News – so push in whatever you have, it cannot hurt you! Another interesting fact is that if you’re a small publisher with unique content you probably won’t be shown, but a “big source” like spiegel.de can re-publish content and will rank, even if it’s a dupe. Interesting stuff. To sum it up: There are players in Google News that can publish whatever they want – even if it’s not unique. Seems to be unfair…

And that’s it for now folks, back in a bit!