A Collection of Search Engine Crawler (Spider) User-Agent Strings

Baidu crawler

    * Baiduspider+(+http://www.baidu.com/search/spider.htm)

For example, in an access log:

172.16.10.113 1000 - - [13/Mar/2013:00:00:55 +0800] "GET /sitemapdata/sitemap_hangzhou.xml HTTP/1.0" 304 - "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" "#Sess: -" 
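To pull the User-Agent out of a log line like the one above, a minimal sketch in Python follows. It assumes the User-Agent is the third double-quoted field, which holds for this particular line (request, referer, User-Agent, session) but depends on how the server's log format is configured. The function name extract_user_agent is made up for illustration.

```python
import re

# The access log line shown above, copied verbatim.
LOG_LINE = (
    '172.16.10.113 1000 - - [13/Mar/2013:00:00:55 +0800] '
    '"GET /sitemapdata/sitemap_hangzhou.xml HTTP/1.0" 304 - "-" '
    '"Mozilla/5.0 (compatible; Baiduspider/2.0; '
    '+http://www.baidu.com/search/spider.html)" "#Sess: -" '
)


def extract_user_agent(line):
    """Return the third double-quoted field of a log line, or None.

    Assumption: the quoted fields appear in the order request, referer,
    User-Agent, as in the sample line. Other log formats will differ.
    """
    quoted = re.findall(r'"([^"]*)"', line)
    return quoted[2] if len(quoted) >= 3 else None


if __name__ == "__main__":
    print(extract_user_agent(LOG_LINE))
```

Splitting on quoted fields is cruder than a full log-format parser, but it is enough when the format is fixed and none of the fields contain an embedded double quote.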

Google crawler

    * Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

    * Googlebot/2.1 (+http://www.googlebot.com/bot.html)

    * Googlebot/2.1 (+http://www.google.com/bot.html)

 

Yahoo crawlers (Yahoo China and Yahoo US headquarters, respectively)

    * Mozilla/5.0 (compatible; Yahoo! SlurpChina; http://misc.yahoo.com.cn/help.html)

    * Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)

 

Sina iAsk crawler

    * iaskspider/2.0(+http://iask.com/help/help_index.html)

    * Mozilla/5.0 (compatible; iaskspider/1.0; MSIE 6.0)

 

Sogou crawlers

    * Sogou web spider/3.0(+http://www.sogou.com/docs/help/webmasters.htm#07)

    * Sogou Push Spider/3.0(+http://www.sogou.com/docs/help/webmasters.htm#07)

 

NetEase crawler

    * Mozilla/5.0 (compatible; YodaoBot/1.0;http://www.yodao.com/help/webmaster/spider/;)

 

MSN crawler

    * msnbot/1.0 (+http://search.msn.com/msnbot.htm)
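As a rough illustration of how this list might be used, the sketch below maps a User-Agent string to a crawler name by case-insensitive substring match. The tokens are taken from the strings listed above; the names CRAWLER_TOKENS and identify_crawler are hypothetical. Keep in mind that a User-Agent header can be set to anything by the client, so a match only tells you what the client claims to be.

```python
# Lowercase substrings taken from the User-Agent strings listed above,
# paired with a display name for each crawler.
CRAWLER_TOKENS = [
    ("baiduspider", "Baidu"),
    ("googlebot", "Google"),
    ("yahoo! slurp", "Yahoo"),   # also matches "Yahoo! SlurpChina"
    ("iaskspider", "Sina iAsk"),
    ("sogou", "Sogou"),          # matches both the web and push spiders
    ("yodaobot", "NetEase"),
    ("msnbot", "MSN"),
]


def identify_crawler(user_agent):
    """Return the name of the crawler the User-Agent claims to be, or None."""
    ua = user_agent.lower()
    for token, name in CRAWLER_TOKENS:
        if token in ua:
            return name
    return None


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "Sogou Push Spider/3.0(+http://www.sogou.com/docs/help/webmasters.htm#07)",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    ]
    for sample in samples:
        print(identify_crawler(sample), "<-", sample)
```

Substring matching is used instead of exact comparison because, as the entries above show, the same crawler appears with several different version numbers and URL suffixes.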
