Gooruze


WarrenDuff
The SEs will spider a variety of different file formats - Google boasts that it will index 13 common file types (http://www.google.com/help/faq_filetypes.html). However, if you were to make, say, a whitepaper available in all of these file formats, would you hit duplicate content issues?
Answers

 
 

Re: Duplicate Content in Different File Formats

andybeal

October 2007

It depends on the authority of your site and the backlinks to the content. You can have both an HTML page and a PDF rank if you have enough links to each. Take a look at this double listing for my site. The content in the PDF is (virtually) identical to the content on the HTML page.

If you have lots of duplicate content (and not enough links to the content), it's best to exclude the versions that don't benefit you (using robots.txt, as described in the other answer).

Re: Duplicate Content in Different File Formats

RKF
Rating: 4.00 (Good)

October 2007

Yes, if you publish the same content in multiple formats, the copies will count as duplicate content.

If by "issues" you mean penalties, then the answer is no. Penalties for duplicate content are quite rare. What will happen is that the search engines get to decide which version they display in the SERPs. I have seen cases where a PDF document ranks instead of the web page with the same content.

A good solution for this is to host all your alternate-format documents in a subfolder (e.g. /pdf/) and use your robots.txt file to prevent that folder from being spidered.
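As a rough illustration of the subfolder approach, here is a minimal sketch that embeds a two-line robots.txt and checks it with Python's standard urllib.robotparser. The example.com URLs, the whitepaper file names, and the is_crawlable helper are made up for illustration; only the /pdf/ folder comes from the suggestion above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: asks every crawler to stay out of /pdf/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdf/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Illustrative URLs only: one copy inside /pdf/, one HTML copy outside it.
    for url in (
        "https://example.com/pdf/whitepaper.pdf",
        "https://example.com/whitepaper.html",
    ):
        status = "allowed" if is_crawlable(ROBOTS_TXT, url) else "blocked"
        print(f"{url}: {status}")
```

Running a check like this before uploading the file is a cheap way to confirm that the HTML version stays open to spiders while the duplicates in the subfolder do not.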

© Copyright 2007 Gooruze ™ | Built by Market United