[seqfan] Re: evil crawlers (123people and the like)

Joerg Arndt arndt at jjj.de
Tue Jun 29 11:54:07 CEST 2010

There are several such sites
(123people_com, pipl_com, yasni_com).
The contents are generated by crawling the web.

to get the picture.

If the names of the crawlers can be identified,
we can block them (provided they heed robots.txt).
Ah, here we go: on the (German) web page
we find:
-------- robots.txt ---------
User-agent: MyOnID
Disallow: /

User-agent: 123People
Disallow: /

User-agent: Pipl
Disallow: /

User-agent: Yasni
Disallow: /
-------- robots.txt ---------
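
A quick way to check what these rules would do, as a minimal sketch
only: it feeds the four entries above to Python's standard
urllib.robotparser. The path /wiki/Main_Page and the agent name
"Googlebot" are assumptions made up for the test, and the result only
matters for crawlers that actually honor robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The four entries quoted above, verbatim.
PEOPLE_SEARCH_RULES = """\
User-agent: MyOnID
Disallow: /

User-agent: 123People
Disallow: /

User-agent: Pipl
Disallow: /

User-agent: Yasni
Disallow: /
"""


def is_allowed(agent, path, rules=PEOPLE_SEARCH_RULES):
    """Return True if `agent` may fetch `path` under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical path and agent names, used for illustration only.
    for agent in ("MyOnID", "123People", "Pipl", "Yasni", "Googlebot"):
        print(agent, is_allowed(agent, "/wiki/Main_Page"))
```

The named crawlers are refused everywhere, while an agent with no
matching entry (and no "User-agent: *" entry present) stays allowed.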

I strongly suggest adding these lines to
(and the corresponding AT&T file).

Wait a second, we have:
-------- robots.txt ---------
User-agent: *
Disallow: /
Disallow: /w/
Disallow: /wiki/Special:Search
Disallow: /wiki/Special:Random
-------- robots.txt ---------

We certainly do not want to block _all_ search engines, do we?
That is, I suggest removing the line
Disallow: /
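
To see the effect of that one line, here is a small sketch (again
with Python's standard urllib.robotparser) comparing the file as it
stands with the same file minus "Disallow: /". The agent name
"Googlebot" and the two paths are assumptions chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# The wildcard entry as quoted above.
AS_IS = """\
User-agent: *
Disallow: /
Disallow: /w/
Disallow: /wiki/Special:Search
Disallow: /wiki/Special:Random
"""

# The same entry with the blanket "Disallow: /" removed.
WITHOUT_BLANKET_RULE = """\
User-agent: *
Disallow: /w/
Disallow: /wiki/Special:Search
Disallow: /wiki/Special:Random
"""


def can_crawl(rules, agent, path):
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical agent and paths, used for illustration only.
    for label, rules in (("as is", AS_IS), ("fixed", WITHOUT_BLANKET_RULE)):
        for path in ("/wiki/Main_Page", "/w/index.php"):
            print(label, path, can_crawl(rules, "Googlebot", path))
```

With the blanket rule every path is off limits to every well-behaved
crawler; without it, ordinary pages become crawlable again while /w/
stays blocked.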

Btw. our fine Fritzl friends are:
% whois 123people.com

[owner-c] fname:             Helga
[owner-c] lname:             Bernold
[owner-c] org:               123people
[owner-c] address:           Stronsdorf 24
[owner-c] city:              Stronsdorf
[owner-c] pcode:             2153
[owner-c] country:           AT
[owner-c] state:             Austria
[owner-c] phone:             +43-664-4398603
[owner-c] fax:               +43-2526-6710
[owner-c] email:             domains at 123people.com

[admin-c] fname:             Martin
[admin-c] lname:             Stemeseder
[admin-c] org:               123people
[admin-c] address:           Linke Wienzeile 8/29
[admin-c] city:              Wien
[admin-c] pcode:             1060
[admin-c] country:           AT
[admin-c] state:             AT
[admin-c] phone:             +43-664-4398603
[admin-c] fax:               +43-2526-6710
[admin-c] email:             domains at 123people.com

* Jaume Oliver i Lafont <joliverlafont at gmail.com> [Jun 29. 2010 10:30]:
> Hello all,
> This message is not about numbers, but about misuse of the OEIS from a
> third party, regarding personal data privacy.
> There is something that appears to harvest e-mail addresses from , replace (AT) with
> @ and then publish the result on the web.
> My example is at http://www.123people.com/s/jaume+oliver
> I assume they get the data from here because of the result when
> clicking on the e-mail address.
> Checking for other contributors gives similar results.
> I complained and received a response that did not match the complaint;
> chances are they did not even read my text.
> Regards,
> Jaume Oliver
> _______________________________________________
> Seqfan Mailing list - http://list.seqfan.eu/
