Thus spake Matthew (pumpkin@xxxxxxxxx):

> On 13/02/11 19:09, scroogle@xxxxxxxxxxx wrote:
> >I've been fighting two different Tor users for a week. Each is
> >apparently having a good time trying to see how quickly they
> >can get results from Scroogle searches via Tor exit nodes.
> >The fastest I've seen is about two per second. Since Tor users
> >are only two percent of all Scroogle searches, I'm not averse
> >to blocking all Tor exits for a while when all else fails.
> >These two Tor users were rotating their search terms, and one
> >also switched his user-agent once. You can see why I might be
> >tempted to throw my "block all Tor" switch on occasion --
> >sometimes there's no other way to convince the bad guy that
> >he's not going to succeed.
>
> For the less than knowledgeable people amongst us (e.g. me) who want to
> learn a bit more: what was the rationale for those two Tor users doing what
> they did? What do they get from it?

I second this.

Daniel, if you can find a way to fingerprint these bots, my suggestion
would be to observe the types of queries they are running (perhaps
for some of their earlier runs from when you could ban them by user
agent?).

One of the things Google does is actually decide your 'Captchaness'
based on the content of your queries. Well, at least I suspect that's
what they are doing, because I have been able to more reliably
reproduce torbutton Captcha-related bugs when I try hard to write
queries like robots that are looking for php sites to exploit.

I would love to hear more about the types of scrapers that abuse Tor.
Or rather, I would like to see if someone can at least identify
rational behavior behind scrapers that abuse Tor.

Some of it could also be misdirected malware that is operating from
within Torified browsers. Some of it could also be deliberately
torified malware.

Google won't tell us any of this, obviously ;).

-- 
Mike Perry
Mad Computer Scientist
fscked.org evil labs