
June 10, 2008

SEO 101 - Why Duplicate content is a problem for search engines

Duplicate content can be a major issue for most sites. Search engines do not like duplicate content and may penalize or ban sites that contain it. It can arise in several ways:

• The "www.sitename.com" and "sitename.com" versions of your site could both be indexed, so the search engines list two copies of your site. The problem arises because some people linking to your site use "www.sitename.com" and others just "sitename.com". By default, your server accepts both names and sends them to the same site, even though, according to standards, they are really two different sites. Unless you detect and correct this issue, you will end up with both copies indexed.

• A single blog post will appear in several places: the month archive, the category archive, the home page, etc.

• Sites selling goods tend to use the descriptions supplied by the wholesaler. As most goods are sold on multiple sites, each site will essentially duplicate the content found on the other sites.

• Affiliates can unintentionally cause an issue. Affiliates are third parties that resell products or services for a main site. If the affiliates use text from the main site, it can be viewed as duplicate content. Also, if they use spamming techniques, it can reflect poorly on the main site and result in penalties.
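The first item above, the "www" versus non-"www" split, is commonly handled by choosing one host name and permanently (301) redirecting the other to it. The following is a minimal sketch of that decision in Python, not a definitive implementation. It assumes "www.sitename.com" is the preferred name; the function name `canonical_redirect` and the two constants are hypothetical, and a real site would usually configure the redirect in the web server itself.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the "www" form is the one we want the search engines to index.
PREFERRED_HOST = "www.sitename.com"
# Every host name the server answers for (hypothetical values).
SITE_HOSTS = {"sitename.com", "www.sitename.com"}


def canonical_redirect(url):
    """Return the URL to redirect to, or None if no redirect is needed.

    None is returned when the URL already uses the preferred host or
    does not belong to this site at all. Explicit ports are ignored
    in this sketch.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in SITE_HOSTS or host == PREFERRED_HOST:
        return None
    # Keep scheme, path, query and fragment; swap only the host name.
    return urlunsplit(
        (parts.scheme, PREFERRED_HOST, parts.path, parts.query, parts.fragment)
    )
```

A server-side handler would call this for each request and, when it returns a URL, answer with a 301 status and that URL in the Location header, so that links to either name end up crediting a single copy of the site.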

Duplicate content causes issues for the search engines because:

• It takes additional bandwidth to download the duplicated pages.

• It takes additional storage to store the duplicated pages.

• It takes additional processor cycles to process the duplicated pages.

• It takes additional time to scan the search engine’s index because it contains duplicate pages.

• It does not provide a better experience for the searcher. No searcher wants to see near-identical pages taking up the top 10 results for a search. Search engines want to provide unique and relevant pages, so they try to eliminate duplication.
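This post does not say how search engines decide that two pages are near-identical. One well-known textbook approach is to compare the sets of overlapping word sequences ("shingles") on each page. The sketch below illustrates that idea only and makes no claim about what any particular search engine does; the function names and the shingle length of three words are assumptions chosen for the example.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word sequences found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short texts are treated as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(page_a, page_b):
    """Jaccard similarity of two pages' shingle sets, from 0.0 to 1.0."""
    a, b = shingles(page_a), shingles(page_b)
    if not a and not b:
        return 1.0  # two empty pages are trivially identical
    return len(a & b) / len(a | b)
```

Two sites that both publish the same wholesaler description would score at or near 1.0, while unrelated pages score close to 0.0. You can run a check like this on your own pages to find duplicates before a search engine does.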

All of these issues raise the search engines' costs or lower the quality of their results. Search engines are particularly keen to rid themselves of this nuisance, and whole sites have disappeared from the listings because of it. Duplicate content is one of the main reasons pages end up in the supplemental index on Google.

The supplemental index is used by Google when the query entered doesn't return enough results from the main index. Therefore, any query satisfied by the supplemental index is, by definition, a low-volume, low-competition search phrase. The supplemental index is where pages go to die; it is usually filled with old or removed pages (404) and pages that are not worth indexing properly.

Thanks to the SEMPO Institute for this information.


