
Duplicate Content and Multiple Site Issues

24 February 2011 by Sam Murray


More and more site owners are concerned that they might be getting penalised accidentally or overtly because of duplicate content.

Do they have cause for worry? Certainly: many in-house and external SEOs have experienced duplicate content issues, and 90% of those arose through natural means such as syndicating content or catalogue-driven product pages that vary only in the colour of the product.

This session looks at the various issues and explores potential solutions.

Mikkel Svendsen and the Myths surrounding Duplicate Content

As the title suggests, Mikkel starts by stating that there are various myths surrounding duplicate content; luckily for us, he has broken them down into the main seven.

1.    You don’t have to deal with duplicate pages as search engines do just fine
Wrong.
They do deal with it; however, you need to be aware that search engines can filter out important pages, which can lead to a loss of organic traffic.

2.    Google will brutally punish you for duplicate content
Wrong.
The key factor is to understand the difference between being punished and being filtered.
The difference is crucial to how you fix it.

3.    If it ain’t broke don’t fix it
Wrong.
People seem to think duplicate content is OK if you are not getting filtered or punished. However, Mikkel says it is like a landmine: people think it's fine as long as you don't step on it, but sooner or later it will explode.

4.    Duplicate content is only a problem across domains – not within your own domain
Wrong.
Search engines will try to filter out duplicate pages if they pollute the index. When a duplicate makes no sense from a user's point of view, they will filter it.
A website should never be accessible on more than one domain – use your brand domain as the canonical one.

Multiple domains
301 redirect every secondary domain to the canonical one (a minimal sketch follows).
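In practice this usually lives in the web-server config, but as a minimal illustration – my sketch, not from Mikkel's slides, with www.example.com standing in for your brand domain – a catch-all 301 in Node.js would look like this:

var http = require('http');

// Answer every request on a secondary domain with a permanent redirect
// to the same path on the canonical brand domain.
http.createServer(function (req, res) {
  res.writeHead(301, { 'Location': 'http://www.example.com' + req.url });
  res.end();
}).listen(80);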

Subdomains
Make sure content is only accessible through one subdomain at a time.

Test domains
Password-protect them or block them with a robots.txt file. Mikkel notes that an old test site with exactly the same content can take a while to filter out if you haven't blocked search engines from indexing it before the real site launches (sketch below).
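A robots.txt with “Disallow: /” works, but blocked URLs can still linger in the index, so a belt-and-braces variant – a hypothetical sketch of mine, not from the talk – is to send an X-Robots-Tag noindex header on every staging response, ideally behind a password as well:

var http = require('http');

// Hypothetical staging server: every response tells search engines not
// to index or follow this copy of the site.
http.createServer(function (req, res) {
  res.setHeader('X-Robots-Tag', 'noindex, nofollow');
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Staging copy - not for the index.');
}).listen(8080);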

HTTP or HTTPS
Search engines index both, so make each page available over only one scheme.

5.    Just implement the canonical tag and everything will be fine
Wrong.

Problems with the canonical tag (a quick sketch of the tag follows this list):
–    It works more slowly than a 301.
–    It requires the page to be crawled and indexed first; a 301 doesn't.
–    It doesn't work perfectly yet – there have been too many examples of it failing even when implemented perfectly.
–    You have to identify all your duplicate content anyway, and if you already know where the problem is, why not fix it properly?
–    Google's parameter handling (in Webmaster Tools) is another option to consider.
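For reference, the tag itself is just a <link rel="canonical" href="…"> element in the page's <head>. As a hypothetical sketch (mine, not Mikkel's – the stripped parameter names are illustrative assumptions), here is one way to derive the canonical URL by removing tracking parameters:

// Strip parameters that create duplicate URLs without changing content,
// then emit the matching <link> element for the <head>.
function canonicalFor(rawUrl) {
  var u = new URL(rawUrl);
  ['utm_source', 'utm_medium', 'utm_campaign', 'sessionid'].forEach(function (p) {
    u.searchParams.delete(p);
  });
  return '<link rel="canonical" href="' + u.origin + u.pathname + u.search + '">';
}

console.log(canonicalFor('http://www.example.com/shoes/?colour=red&utm_source=news'));
// -> <link rel="canonical" href="http://www.example.com/shoes/?colour=red">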

6.    If you stay below a certain % of duplicated text then you will be fine
Wrong.

There is a lot more to it than just a benchmark percentage.

It is about context and link data (which I find an interesting theory)

The example Mikkel gives is a piece that is featured in two newspapers: the context is taken into account. The papers have different audiences, and by analysing their link data search engines can see that the two sites operate in very different circles, so they can determine that although the content is very similar, each copy warrants its uniqueness.

Search engines also filter out boilerplate content (such as disclaimers and footer copy) and other content repeated across many pages. They don't always filter out the entire page; often they just ignore the duplicated content they find.

Technical issues

  • Tracking – campaign parameters such as utm_source codes
  • Session IDs – used in URLs as a substitute for cookies

How to fix campaign and affiliate tracking URLs

Add this line to your Google Analytics tracking script:

pageTracker._setAllowAnchor(true);

Then change the ? before the tracking parameters in your campaign URLs to a #. Search engines treat everything after the # as a fragment, so the tagged URLs will no longer be indexed as duplicates.
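For context, here is roughly where that line sits in the classic ga.js snippet – a hedged sketch, with UA-XXXXXX-X and the campaign values as placeholders:

var pageTracker = _gat._getTracker('UA-XXXXXX-X');
pageTracker._setAllowAnchor(true); // read campaign parameters after # instead of ?
pageTracker._trackPageview();

// Campaign URLs then take the form
// http://www.example.com/landing#utm_source=newsletter&utm_medium=email
// so the tracking parameters no longer create extra indexable URLs.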

Avoid duplicate content using RSS

  • Never put the entire post in your feed
  • Use an abstract instead (see the sketch after this list)
  • In WordPress, use the “more” tag to truncate the post
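Outside WordPress, trimming the feed body to an abstract is straightforward; a hypothetical JavaScript sketch (the word cut-off is an arbitrary assumption):

// Put the abstract, not the full post, into the feed item body.
function feedAbstract(postText, maxWords) {
  var words = postText.split(/\s+/);
  if (words.length <= maxWords) return postText;
  return words.slice(0, maxWords).join(' ') + '…';
}

console.log(feedAbstract('Full text of the blog post goes here and on and on…', 5));
// -> "Full text of the blog…"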

7.    It is virtually impossible to stop all internal duplication

The catch-all solution

Unfortunately I didn't catch all of his slide as he was running out of time, but I am sure he will share it if you tweet him @demib.


Fantomaster

Fantomaster didn't have a presentation but instead spoke to us freely about his thoughts on duplicate content. This was remarkably refreshing – it was more like one of those “An audience with…” talk shows. Luckily, having seen Fantomaster speak many times before, I knew he would be captivating, and his years of experience shone through.

I thought it would be valuable to pick out some of the best quotes and most thought-provoking ideas he touched on…

Looked at empirically and scientifically, when it comes to duplicate content we SEOs are just groping around in the dark.

We obviously get no specific details from the search engines, so we are just making educated, logical guesses.

Some forms of duplicate content occur through the search engines' use of stop words: if content is stripped of pronouns and prepositions, a lot of it can end up looking similar.

If you don't want stop words to be stripped out of a search query, put them in inverted commas or prefix them with plus signs.

Article spinning

Just because spun text reads differently to you does not mean it does to a search engine.

If you use different tools you will get different results – the % similarity of two pieces of content cannot be determined reliably, so the best thing to do is make your content as unique as possible.

Google has given some information on what they look for, if you want to believe them: shingles technology – phrases that are compared rather than individual words – plus synonyms and LSI.
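To make the shingles idea concrete – a hypothetical sketch, not Google's actual algorithm; the 4-word shingle size and the comparison are my assumptions – here is phrase-level comparison of two texts:

// Split a text into overlapping 4-word phrases ("shingles") and measure
// how much two shingle sets overlap; a high score suggests near-duplicates.
function shingles(text, size) {
  var words = text.toLowerCase().split(/\s+/).filter(Boolean);
  var set = {};
  for (var i = 0; i + size <= words.length; i++) {
    set[words.slice(i, i + size).join(' ')] = true;
  }
  return Object.keys(set);
}

function jaccard(a, b) {
  var inB = {};
  b.forEach(function (s) { inB[s] = true; });
  var shared = a.filter(function (s) { return inB[s]; }).length;
  return shared / (a.length + b.length - shared);
}

var pageA = shingles('red shoes for sale in our online shoe shop today', 4);
var pageB = shingles('blue shoes for sale in our online shoe shop today', 4);
console.log(jaccard(pageA, pageB)); // 0.75 - high overlap, likely near-duplicates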

Duplicate content often happens due to scaling.

Red shoes, black shoes, blue shoes – mainly an issue for catalogue-driven sites.

Penalty vs filter
A filter is not pushing you into hell, just not letting you into heaven.

Switching key phrases in otherwise identical content does not work – that is not how search engines determine duplicate content.

Shoe example:

Consider adding below-the-fold pieces of copy on shoe history, shoes in ballet, etc. to make pages that are more often than not very similar unique.

Put this copy high up in the HTML/CSS structure of the page, but have it appear below the fold.

Vary product descriptions.
On a 100,000-page site, as much as 20% of pages can be orphaned – due to a poor CMS or poor user knowledge (a quick way to spot them is sketched below).
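A quick, hypothetical way to spot orphans (my sketch – both URL lists are placeholders you would export from your CMS and from a crawl of your internal links):

// Pages the CMS knows about but no internal link ever reaches are orphans.
var cmsUrls = ['/shoes/red/', '/shoes/blue/', '/shoes/old-range/'];
var linkedUrls = ['/shoes/red/', '/shoes/blue/'];

var linked = {};
linkedUrls.forEach(function (u) { linked[u] = true; });

var orphans = cmsUrls.filter(function (u) { return !linked[u]; });
console.log(orphans); // -> ['/shoes/old-range/']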

Automatically generated content is an option for varying copy, but 99% of it is illiterate – poor for conversion.

Link building – another example of where duplicate content happens

Where people fail the most is in titles – you need to be language-savvy to succeed; synonyms alone won't cut it. Think in terms of phrases rather than individual words.

Ensure variance – the objective is links, not a Shakespeare prize for writing.

AUTHORED BY: Sam Murray

Sam Murray graduated from university with a BA (Hons) in Marketing in 2007 and wrote his 10,000-word dissertation on search marketing. Sam is a freelance search manager.

  • Sabrina

    Thanks for the tips. Always good to read through something like this and check if I'm not missing anything. Quick “newbie” question from me: I recently analyzed some SEO pages from a company that says it specializes in SEO. Thing is, it turned out all pages featured the exact same text, only the keywords were replaced. So text 1 would start with “Looking for [keyword 1]?”, text 2 would be “Looking for [keyword 2]?” etc. So the total text (about 200 words) is the same every time, just with different keywords. This is totally wrong, right? Or is changing the main keywords enough to make a text unique? Seems to me it's a stupid thing to do… but apparently an “SEO expert” is doing it, so now I'm confused.

  • tina

    hi,

    I have a question about this:
    I have an ecommerce website, and recently I was thinking of a better domain name that might bring more traffic from Google.
    At the same time I want to keep my original one, because it's already been out there for a year and I have traffic and, more importantly, buyers.

    So here is the question: will I be penalised if I duplicate the products/categories but change the content on each page?

    Thanks

    • http://www.david-whitehouse.org/ David Whitehouse

      Hey Tina,

      If you have to ask, you probably will. I would say you are better off 301 redirecting it to the new website – you can read more about how I deal with duplicate content issues here.

  • William Maia

    Haha David, “links not shakespeare prize for writing” – excellent!!

  • http://hire-seoservices.com/ Zain

    Can you recommend a tool to check for duplicate content pages on my site? I have an ecommerce site with thousands of pages, and Google Webmaster Tools only shows duplicate descriptions and titles.

    Thanks!

  • Rob – RPL Driving School

    A free way of checking for duplicate content is Copyscape.

    One of my sites suddenly dropped in the rankings. After research I found my test site was visible to search engines and I suspect that was part of it. I’ve just done a 301 from the test site to the new one so hopefully that’s helped…

