Robots Identification
This article serves to identify visitors that spider the web automatically.
2007-Nov-23 11:00 — Setup of trap
2007-Dec-14 10:30 — Last update
It is somewhat unlikely that you are a human visitor, because no person I know would click on the spot you needed to click to find this page. However, it is possible that a search engine provided you with this link. In that case you most likely will not find what you were looking for, at least not on this page.
For a robot, though, finding the link is no problem, and once it has found the link it will usually follow it. This makes the page ideal for its stated purpose, namely to identify those automatic travellers. I want to know them because I want to ban some of them from my pages, to reduce overall Internet traffic and also useless traffic on my server. That said, I want to make it clear that I do not want to ban all robots from my site, but I will certainly attempt to block e-mail harvesters and the like.
I will first attempt a blockade via the Robots Exclusion Standard (RES), and if that is of no avail, I will take more drastic measures.
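To illustrate what such a blockade looks like from the robot's side, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules and the agent name `ExampleHarvester` are hypothetical and are not taken from this site's actual exclusion file; the sketch only shows how a well-behaved robot would interpret rules of this kind.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical exclusion rules: "ExampleHarvester" is an invented agent name,
# used only for illustration. Every other robot is allowed everywhere.
ROBOTS_TXT = """\
User-agent: ExampleHarvester
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://herbert.gandraxa.com/herbert/rid.asp"
    for agent in ("ExampleHarvester/1.0", "msnbot/1.0"):
        print(agent, "allowed" if is_allowed(agent, page) else "excluded")
```

Note that the standard is purely advisory: a robot that ignores the rules is not stopped by them, which is why a fallback is needed for those that do not comply.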
So, if you happen to be a human or a useful robot: thanks for your visit, and have a nice day. The others can't read anyway, so there is no point in telling them anything.
If you are human and are still on this page, you may be curious to know which robots are involved. For you I maintain this table:
[Note: Updates are made manually, so the three columns Visits, First visit and Last visit reflect the state at the date of the last update. Expect updates to be infrequent and irregular.]
| IPs | Organization | User Agent string | Visits [1] | First visit [1] | Last visit | In RES [2] | Follows RES | Blocked IPs |
| 83.138.172.72 | Rackspace Managed Hosting, San Antonio, TX | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322) | 1 | 14.12.2007 09:58:46 | 14.12.2007 09:58:46 | No | n/a | |
| 65.55.165.40 | Microsoft Corp, Redmond, WA | Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322) | 1 | 12.12.2007 20:59:55 | 12.12.2007 20:59:55 | No | n/a | |
| 24.73.96.230 | Road Runner HoldCo LLC, Herndon, VA | Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1) | 1 | 12.12.2007 18:35:27 | 12.12.2007 18:35:27 | No | n/a | |
| 65.55.212.26 | Microsoft Corp, Redmond, WA | msnbot-media/1.0 (+http://search.msn.com/msnbot.htm) | 1 | 07.12.2007 03:31:12 | 07.12.2007 03:31:12 | No | n/a | |
| 64.208.172.181 | Global Crossing, Phoenix, AZ | ia_archiver | 1 | 04.12.2007 03:47:20 | 04.12.2007 03:47:20 | No | n/a | |
| 65.54.165.35-65.55.208.27 | Microsoft Corp, Redmond, WA | msnbot/1.0 (+http://search.msn.com/msnbot.htm) | 9 | 25.11.2007 07:24:44 | 12.12.2007 20:58:11 | No | n/a | |
| 74.6.26.119 | Inktomi Corporation, Sunnyvale, CA | Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp) | 18 | 23.11.2007 15:15:03 | 13.12.2007 17:12:03 | No | n/a | |
| 66.249.65.208 | Google Inc., Mountain View, CA | Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) | 4 | 23.11.2007 12:50:34 | 08.12.2007 19:26:38 | No | n/a | |
| 66.249.65.208 | Google Inc., Mountain View, CA | Mediapartners-Google | 2 | 23.11.2007 11:03:50 | 01.12.2007 07:18:01 | No | n/a | |
[1] Since 23.11.2007 11:00
[2] If a robot has not been added to the RES (Robots Exclusion Standard) rules, that robot is welcome or at least tolerated.
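The Visits, First visit and Last visit columns could in principle be derived from raw hit records instead of being updated by hand. The following is only a sketch under stated assumptions: the record layout (IP, user agent, timestamp) and the function name `summarize` are hypothetical, and grouping by user agent string is my reading of how the table is organised. The sample records are the two Mediapartners-Google visits and the single ia_archiver visit listed above.

```python
from collections import defaultdict
from datetime import datetime

STAMP = "%d.%m.%Y %H:%M:%S"  # timestamp format used in the table

# Hypothetical record layout: (ip, user_agent, timestamp).
# The values are taken from the Mediapartners-Google and ia_archiver rows.
HITS = [
    ("66.249.65.208", "Mediapartners-Google", "23.11.2007 11:03:50"),
    ("66.249.65.208", "Mediapartners-Google", "01.12.2007 07:18:01"),
    ("64.208.172.181", "ia_archiver", "04.12.2007 03:47:20"),
]


def summarize(hits):
    """Group hits by user agent and compute visits, first and last visit."""
    grouped = defaultdict(list)
    for ip, agent, stamp in hits:
        grouped[agent].append((datetime.strptime(stamp, STAMP), ip))

    summary = {}
    for agent, seen in grouped.items():
        times = [when for when, _ in seen]
        summary[agent] = {
            "ips": sorted({ip for _, ip in seen}),
            "visits": len(seen),
            "first": min(times).strftime(STAMP),
            "last": max(times).strftime(STAMP),
        }
    return summary


if __name__ == "__main__":
    for agent, row in summarize(HITS).items():
        print(agent, row["visits"], row["first"], row["last"])
```

Parsing the timestamps into `datetime` objects before comparing matters here, because the day-first format does not sort correctly as plain text.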