Gooruze


LandonPorter
Ok, what exactly is a scraper and how do I deal with it after it happens?
 
 

Answers

 
 

Re: What is a scraper?

WarrenDuff
5.00 (Excellent)

November 12th

Effectively, a content scraper is stealing your content - the text and the images - generally to reproduce it on another site. They steal it for a reason: generally to build out content quickly so they can rank and take traffic. I differentiate a content stealer from, say, a blogger who is reproducing the content, by whether they acknowledge and provide a link. No link, they are stealing. An email scraper (as described well by Brian) is stealing email addresses, and I totally agree with his ways of overcoming it.

How can you stop them? Well, here are some ideas, but they all have trade-offs:

1. Put your content in a secure area. A bad solution for SEO!
2. See if you can identify which IP address they are coming from and ban the IP. A reactive strategy that is not going to be very effective, as a content stealer is going to be masking their IP.
3. Invest lots of time and money in lawyers to protect your content.

I see it a bit like shoplifting. We all know it happens and we don't like it, but it is almost impossible to stop (now that I think about it, sounds a bit like click fraud really, doesn't it ;)). Keep up your efforts to introduce fresh and relevant content and beat them at the traffic game.
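For anyone who wants to try the IP-ban idea anyway, here is a minimal sketch in Python of the check a site could run on each request. The blocklist entries are made-up placeholders (documentation-reserved ranges, not real scrapers), and `is_blocked` is a hypothetical helper name; how you hook it into your web server is up to your setup.

```python
import ipaddress

# Hypothetical blocklist: these are documentation-reserved ranges,
# used here only as placeholders for addresses you have identified.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),    # a whole range
    ipaddress.ip_network("198.51.100.17/32"),  # a single address
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Malformed address: this sketch treats it as not blocked.
        return False
    return any(addr in net for net in BLOCKED_NETWORKS)
```

As noted above, this is reactive: a scraper that rotates or masks its IP will simply come back from an address that is not on the list.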

Re: What is a scraper?

duncanriley
4.29 (Good)

November 13th

I'm fascinated by your belief that scraping content and then providing a link back makes it somehow OK. It's still content theft, and no amount of attribution makes that OK... and I say this from a legal standpoint as well, as it's copyright theft. You can quote from someone else legally, but if you're simply ripping the content it's illegal, and pretty immoral as well.

I'd recommend a DMCA notice if someone is stealing your content... presuming that the stolen content is hosted in the US. If it's overseas you're pretty much screwed, although some hosting companies will respond positively to a copyright request.

Re: What is a scraper?

andybeal

November 16th

If you've ever tried filing a DMCA notice, you'll know it's easier to just ignore the scrapers than to try to fight them. It's like whack-a-mole: it is easier for them to launch another scraper site than for you to submit a DMCA notice.

Re: What is a scraper?

WarrenDuff

November 14th

Hi Duncan

Brian put it a bit more elegantly than I did, but that is what I meant by attributing the author and providing a link. That is how the blogosphere seems to work: almost a de facto standard.

Not saying it makes it right or legal. But it does happen.


Re: What is a scraper?

BrianChappell
5.00 (Excellent)

November 13th

Good point Duncan, it technically is illegal. However, in the blogosphere, as I am sure you are well aware, people often "quote" pieces of posts and then link back. I would say 99 times out of 100 the original author wouldn't mind. That's how I took "I differentiate a content stealer from, say, a blogger who is reproducing the content, by whether they acknowledge and provide a link," at least.

Here is an example of a DMCA notice I just dug up, worth noting, if you ever need to contact a site about stealing your content.

Re: What is a scraper?

duncanriley
5.00 (Excellent)

November 13th

Brian,
it's not technically illegal, it's just illegal, at least in the English-speaking world. The difference (in US terms) is between theft and fair use. I could quote a big chunk of a post on my site, wrap it in commentary, link out, and we'd be OK. If I stole a post 100% and that's all I was posting, that's not fair use, that's copyright theft. As for people liking having their content stolen, please introduce me to these mystery people! :-) Everyone I have known in the content creation business hates scrapers. Everyone loves a link and someone talking about their post (and even heavily quoting it), but scraping is just stealing, and it also runs the risk of damaging your place in Google due to duplicate content... although on that I'd like to hear more, because it doesn't seem to be quite as big an issue these days. But that's a gut feeling and I have no proof.

Re: What is a scraper?

andybeal

November 16th

I totally agree, Duncan. I don't care if you do link back to my site; if you steal 100% of my post without permission, it's theft. Fair use limits you to just an extract of the original post AND you have to add to the conversation. Simply scraping the first half of my posts is just as bad. Now, if you call me a jerk and explain why, then that's fair use.

Re: What is a scraper?

asinGer
4.00 (Good)

November 12th

An email scraper is a tool used to collect large numbers of email addresses at a time. Some believe this method is unethical. Many forums and websites protect their email addresses against this method; however, many older sites are vulnerable.

Re: What is a scraper?

BrianChappell
4.00 (Good)

November 12th

Basically, if your website is crawlable and indexable, and the email address is in plain text like blahblah@emailservice.com, then it is vulnerable. It doesn't really matter if it's a new or an old site.

That's why it's best to make your email an image or use code to hide it from scrapers.
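As one illustration of "use code to hide it", here is a small Python sketch that rewrites an address as decimal HTML entities, so the raw page source no longer contains the plain-text address while browsers still render it normally. The function names `obfuscate_email` and `mailto_link` are made up for this example, and this is only one of several possible approaches.

```python
def obfuscate_email(address: str) -> str:
    """Encode every character as a decimal HTML entity, e.g. 'a' -> '&#97;'."""
    return "".join(f"&#{ord(ch)};" for ch in address)


def mailto_link(address: str) -> str:
    """Build a mailto link whose href and label are both entity-encoded."""
    encoded = obfuscate_email(address)
    return f'<a href="mailto:{encoded}">{encoded}</a>'


# Example: generate the markup to paste into a page template.
snippet = mailto_link("blahblah@emailservice.com")
```

Keep in mind this only stops naive scrapers that search the raw HTML for a plain address; a scraper that decodes entities first will still find it, which is why an image is the stronger option.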

Re: What is a scraper?

BrianChappell
4.00 (Good)

November 12th

Hmm. A little bit of a vague question, but essentially, in the context you posted it:

You probably posted content on your website. A bot came by and scraped your content, then posted it on its own site.

Generally, if it's a big enough problem, you can try contacting the owner of the site directly, or contact their host and let them know what the site is doing.

If you post a lot of content on your site, it's typically good to have a link back to your site somewhere in each post. That way you get a direct link to your site from the scraped copy, which should greatly help you avoid a duplicate content penalty.
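If your posts are generated by a script or template, the link back can be added automatically. Below is a rough Python sketch, assuming each post is available as an HTML string; `add_source_link` and the footer wording are hypothetical, not a feature of any particular blogging platform.

```python
from html import escape


def add_source_link(post_html: str, title: str, url: str) -> str:
    """Append a short attribution paragraph linking back to the original post."""
    footer = (
        f'<p>Originally published as '
        f'<a href="{escape(url, quote=True)}">{escape(title)}</a>.</p>'
    )
    return post_html + "\n" + footer


# Example with a placeholder URL.
result = add_source_link("<p>Hello</p>", "My Post", "https://example.com/my-post")
```

A scraper that copies the post body verbatim carries the link along with it. One that strips links will not, so treat this as a help rather than a guarantee.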
 
 


© Copyright 2007 Gooruze ™ | Built by Market United