SES New York 2011: Duplicate Content & Multiple Site Issues
The last session of SES New York 2011 that I’ll cover is the one about duplicate content & multiple site issues. On the panel we have Brian Ussery of Search Discovery Inc., Eric Enge of Stone Temple Consulting and Tiffany Oberoi of Google’s evil (just kidding, Matt :-)) search quality team. The session is moderated by Anne Kennedy.
Tiffany Oberoi is starting the session.
Duplicate content does not trigger a penalty; the duplicates are simply filtered out. She starts with some tips to reduce the number of pages with duplicate text:
- Someone is using my content, should I worry? A: Get him to link to you.
- Add unique & valuable content
- Don’t publish stub articles
Country-specific content: Google states that it has no problem with that. I’ll come back to this at the end.
- Consider using country-specific top-level domains (TLDs) to separate content.
- Avoid using URL parameters for country-specific content
- Set geotargeting in Webmaster Tools
Furthermore, she suggests tracking down duplicates with Webmaster Tools: look for duplicate descriptions and titles and fix them.
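If you have your own crawl data, the same check can be sketched outside Webmaster Tools. The snippet below is only an illustration: the `pages` dictionary and the function name `find_duplicates` are made up for this example, and the URLs are placeholders.

```python
from collections import defaultdict


def find_duplicates(pages, field):
    """Group URLs whose `field` (e.g. "title") matches after
    collapsing whitespace and lowercasing."""
    groups = defaultdict(list)
    for url, meta in pages.items():
        key = " ".join(meta.get(field, "").split()).lower()
        groups[key].append(url)
    # Keep only non-empty values shared by more than one URL.
    return {key: urls for key, urls in groups.items() if key and len(urls) > 1}


# Hypothetical crawl result: URL -> title and meta description.
pages = {
    "http://example.com/": {"title": "Acme Widgets", "description": "Buy widgets."},
    "http://example.com/index.html": {"title": "Acme  Widgets", "description": "Buy widgets."},
    "http://example.com/about": {"title": "About Acme", "description": "Who we are."},
}

duplicate_titles = find_duplicates(pages, "title")
duplicate_descriptions = find_duplicates(pages, "description")
```

Every group that comes back is a set of pages competing with the same title or description, which is what the report in Webmaster Tools points you to.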
What’s in a name (or url)?
Don’t split up your reputation! Use canonical URLs.
- Duplicate content makes crawling inefficient, and because Google only has a limited crawl budget for your site, important pages may not get crawled.
- Dilution of link juice
Tiffany now presents some solutions to avoid duplicate content:
1. 301 redirects
- good for the user
- it will transfer link juice
2. Canonical tags: when you can’t do a 301 redirect.
FAQs about the canonical tag:
- Should the pages be identical? No, but they have to be very similar
- Should I use canonical for http and https? Yes
- Does it work across hosts / subdomains? Yes
- Does it work across domains? Yes (at least at Google)
Watch out for:
- Infinite loop of canonicals
- Pointing to a URL that doesn’t exist
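Both pitfalls can be checked mechanically once you know which canonical each page declares. This is a rough sketch, not a tool shown in the session: the `check_canonicals` function and the `site` mapping are hypothetical, and "exists" here simply means "is a key in the mapping".

```python
def check_canonicals(canonical_of):
    """canonical_of maps each known URL to the URL in its canonical tag.

    Returns (loops, missing):
      loops   - URLs whose canonical chain cycles without settling
      missing - canonical targets that are not known URLs
    """
    loops, missing = set(), set()
    for start in canonical_of:
        seen = [start]
        current = start
        while True:
            target = canonical_of[current]
            if target == current:
                break  # self-referencing canonical: the chain settles here
            if target not in canonical_of:
                missing.add(target)  # points at a URL we do not know
                break
            if target in seen:
                loops.add(start)  # came back to an earlier URL
                break
            seen.append(target)
            current = target
    return loops, missing


# Hypothetical site: path -> canonical target.
site = {
    "/a": "/b",
    "/b": "/a",                      # /a and /b point at each other
    "/shoes?sort=price": "/shoes",
    "/shoes": "/shoes",              # healthy: resolves to itself
    "/old": "/gone",                 # target does not exist
}

loops, missing = check_canonicals(site)
```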
Identify non-content-related parameters. Let Google ignore these parameters in Webmaster Tools.
A 301 redirect is the best choice. If you cannot redirect, use canonical tags. Parameter handling in Webmaster Tools only works for Google, but it is easier to set up.
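To make the idea of non-content-related parameters concrete, here is a small sketch that strips them from a URL so that all variants collapse to one address. The parameter names in `IGNORED_PARAMS` are assumptions for illustration only; which parameters leave the content unchanged depends entirely on your own site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters that do not change the page content.
IGNORED_PARAMS = {"sessionid", "utm_source", "utm_medium", "utm_campaign", "ref"}


def strip_ignored_params(url, ignored=IGNORED_PARAMS):
    """Return the URL without ignored query parameters (and without fragment)."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ignored
    ]
    # Rebuild the URL; an empty query string yields no trailing "?".
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


clean = strip_ignored_params(
    "http://example.com/shoes?color=red&sessionid=abc123&utm_source=news"
)
```

The cleaned URL is the one you would use as the 301 target or as the canonical tag value; parameters that do change the content (like `color` here) are kept.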
Next speaker is Brian Ussery
Q: Why is duplicate content bad?
A: It negatively affects crawl efficiency and dilutes link juice.
- Default page / link consistency
Example: googlestore.com, googlestore.com/default.aspx, the https version, dev.googlestore.com
Look at Google’s cached version of each variant to check whether the URL is the same.
- Product descriptions
- Product-level categorization – watch out for sorting parameters and the like.
- Faceted navigation – be careful that the facets don’t outrank the original
- Syndication / Scraping:
Press release syndication:
Create a unique and original version of each press release, hosted on your site
Publish your press release 3 to 6 hours before syndicating.
Link from the syndicated version to the original unique version posted on your site
Focus on editor buy-in…
1. Link consistency
2. Use unique product descriptions
3. Understand the impact of categorization and faceted navigation
4. Syndicate wisely.
The last speaker of the session is Eric Enge, with a technical presentation called “Going deep with Dupes”.
He first shows where to find the patents about duplicate content: http://patft.uspto.gov/netahtml/PTO/srchnum.htm
Google will try to evaluate the content part of your page.
Some of Eric’s remarks on duplicate content issues:
1. If you have manufacturer supplied text on your page, add user generated content to it!
2. Watch for shingles: if you have the same content sorted on different parameters, you’ll have duplicate content.
3. Faceted Navigation Solutions: use the canonical tag. The fallback is to noindex alternate sort orders.
4. Simple Database Substitution: substitute just some words automatically. This doesn’t work anymore.
5. Simple Synonym Substitution: Eric says this doesn’t work anymore either.
A mix of shingle shuffling and synonym substitution doesn’t work either.
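To get a feeling for why swapping a few words doesn’t help, here is a minimal sketch of textbook word shingling with Jaccard similarity. This is the generic technique, not Google’s actual filter and not code from Eric’s slides; the shingle size of 3 and the sample sentences are arbitrary choices for the example.

```python
def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(text_a, text_b, size=3):
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 0.0  # both texts shorter than one shingle
    return len(a & b) / len(a | b)


original = "the quick brown fox jumps over the lazy dog near the river bank"
# Same sentence with a single synonym swapped in.
swapped = "the fast brown fox jumps over the lazy dog near the river bank"
unrelated = "cheap flights to new york are available every single day now"

similarity_swapped = jaccard(original, swapped)
similarity_unrelated = jaccard(original, unrelated)
```

One swapped word only changes the few shingles that contain it, so `similarity_swapped` stays far above `similarity_unrelated`, and the page still looks like a near duplicate.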
Duplicate content filters are query-specific. Eric thinks that in 2011 Google will focus on a few aspects: original author, link equity score, recency.
Eric closes his talk with a few words about the Panda update. The Panda update is similar to a duplicate content filter, but in his opinion Panda is more of a quality update than just a filter.
To finish, I would like to add a personal note. Google’s Tiffany Oberoi confirmed once again at a conference that they don’t have problems with duplicate content on different country TLDs. I’m at the conference with a few top German SEOs, and we all know that Google still has problems with cross-country duplicate content (for example between .de and .at, or .co.uk and .ie). Of course you should set the geotargeting correctly in Webmaster Tools and you should have backlinks from the right country (something nobody on the panel mentioned!), but unfortunately, in my experience, Google can still have duplicate content issues. I hope they’ll fix it soon!