Blocking Baidu and Google Crawlers with Nginx

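Our game's test environment runs on Nginx and is reachable from the public internet for testing. This afternoon, while searching for the game's promotional information, I was surprised to find a link to the test server in the search results.
To stop this, I added the following restriction in Nginx to reject requests from these pesky crawlers.
Edit the nginx.conf file; the configuration is as follows: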
server {
    listen 80;
    server_name test.game.com;

    # Return 403 when the User-Agent matches a listed crawler or tool
    # (~* is a case-insensitive regex match) or is empty (^$).
    if ($http_user_agent ~* "qihoobot|Baiduspider|Googlebot|Googlebot-Mobile|Googlebot-Image|Mediapartners-Google|Adsbot-Google|Feedfetcher-Google|Yahoo! Slurp|Yahoo! Slurp China|YoudaoBot|Sosospider|Sogou spider|Sogou web spider|MSNBot|ia_archiver|Tomato Bot|FeedDemon|JikeSpider|Indy Library|Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|CoolpadWebkit|Java|Feedly|UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|oBot|jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|YisouSpider|HttpClient|MJ12bot|heritrix|EasouSpider|Ezooms|^$") {
        return 403;
    }
}
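After adding this, restart Nginx so the change takes effect.
Simulated test, using curl's -A option to set the User-Agent (a listed crawler gets 403, while an unlisted string such as '360' gets 200):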
[root@~]# curl -I -A 'Baiduspider' test.game.com
HTTP/1.1 403 Forbidden
Server: nginx
Date: Thu, 30 Apr 2015 05:32:57 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 168
Connection: keep-alive

[root@~]# curl -I -A 'Googlebot' test.game.com
HTTP/1.1 403 Forbidden
Server: nginx
Date: Thu, 30 Apr 2015 05:33:03 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 168
Connection: keep-alive

[root@~]# curl -I -A '360' test.game.com
HTTP/1.1 200 OK
Server: nginx
Date: Thu, 30 Apr 2015 05:37:46 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
Vary: Accept-Encoding
X-Powered-By: PHP
Set-Cookie: PHPSESSID=fsma8aauuc4817k15tqbog4ko0; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
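
The results above follow from how the pattern is matched. The following is only a rough Python sketch of that matching, assuming Python's re module behaves like Nginx's PCRE for this simple alternation. The names BLOCKED_AGENTS and is_blocked are made up for illustration and are not part of Nginx.

```python
import re

# Same alternation as in the Nginx rule. Nginx's `~*` operator is a
# case-insensitive regex match, approximated here with re.IGNORECASE.
BLOCKED_AGENTS = re.compile(
    "qihoobot|Baiduspider|Googlebot|Googlebot-Mobile|Googlebot-Image|"
    "Mediapartners-Google|Adsbot-Google|Feedfetcher-Google|Yahoo! Slurp|"
    "Yahoo! Slurp China|YoudaoBot|Sosospider|Sogou spider|Sogou web spider|"
    "MSNBot|ia_archiver|Tomato Bot|FeedDemon|JikeSpider|Indy Library|"
    "Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|CoolpadWebkit|Java|Feedly|"
    "UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|oBot|"
    "jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|YisouSpider|"
    "HttpClient|MJ12bot|heritrix|EasouSpider|Ezooms|^$",
    re.IGNORECASE,
)


def is_blocked(user_agent: str) -> bool:
    """Return True if the rule would answer 403 for this User-Agent."""
    # search() finds the pattern anywhere in the string, like an
    # unanchored Nginx regex; the `^$` branch only matches an empty string.
    return BLOCKED_AGENTS.search(user_agent) is not None


# The three User-Agent values from the curl test, plus an empty one.
for agent in ("Baiduspider", "Googlebot", "360", ""):
    print(repr(agent), "->", 403 if is_blocked(agent) else 200)
```

Because the match is unanchored and case-insensitive, short entries such as oBot or Java also match any User-Agent that merely contains them, so it is worth checking a few real browser strings before relying on the list.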
