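When we analyze the access log, we often find requests we would rather not serve, for example from low-value search engine spiders. These crawlers can be blocked. On a busy site their requests can make up a large share of the traffic, by my estimate as much as half, and a crawler request loads the server exactly like a request from a human visitor.
Take a look at the access log: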
[root@zhangmengjunlinux ~]# tail /usr/local/apache2/logs/test.com-access_
test.com-access_20151230_log test.com-access_20151231_log test.com-access_20160101_log test.com-access_log
[root@zhangmengjunlinux ~]# tail /usr/local/apache2/logs/test.com-access_20160101_log
192.168.140.2 - - [01/Jan/2016:11:34:15 +0800] "GET /admin.php?action=recyclebin HTTP/1.1" 403 211 "http://www.test.com/home.php?mod=space&do=notice&view=manage" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.99 Safari/537.36 2345Explorer/6.4.0.10751"
192.168.140.2 - - [01/Jan/2016:11:34:11 +0800] "GET /misc.php?mod=patch&action=pluginnotice&inajax=1&ajaxtarget=plugin_notice HTTP/1.1" 200 63 "http://www.test.com/forum.php" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.99 Safari/537.36 2345Explorer/6.4.0.10751"
192.168.140.2 - - [01/Jan/2016:11:34:12 +0800] "GET /misc.php?mod=patch&action=pluginnotice&inajax=1&ajaxtarget=plugin_notice HTTP/1.1" 200 63 "http://www.test.com/home.php?mod=space&do=notice&view=manage" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.99 Safari/537.36 2345Explorer/6.4.0.10751"
192.168.140.100 - - [01/Jan/2016:11:34:54 +0800] "HEAD http://www.test.com/data/info.php HTTP/1.1" 403 - "-" "curl/7.19.7 (i386-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2"
127.0.0.1 - - [01/Jan/2016:12:53:57 +0800] "HEAD http://www.test.com/data/info.php HTTP/1.1" 200 - "-" "curl/7.19.7 (i386-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2"
127.0.0.1 - - [01/Jan/2016:13:29:43 +0800] "HEAD http://www.test.com/data/info.php HTTP/1.1" 403 - "-" "curl/7.19.7 (i386-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2"
192.168.140.100 - - [01/Jan/2016:13:29:55 +0800] "HEAD http://www.test.com/data/info.php HTTP/1.1" 403 - "-" "curl/7.19.7 (i386-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2"
192.168.140.100 - - [01/Jan/2016:13:30:07 +0800] "HEAD http://www.test.com/ HTTP/1.1" 403 - "-" "curl/7.19.7 (i386-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2"
192.168.140.2 - - [01/Jan/2016:13:30:19 +0800] "GET / HTTP/1.1" 403 202 "-" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"
192.168.140.2 - - [01/Jan/2016:13:30:19 +0800] "GET /favicon.ico HTTP/1.1" 403 213 "http://www.test.com/" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"
Some of these requests come from curl: "curl/7.19.7 (i386-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2"
Others come from Chrome: "... Chrome/39.0.2171.99 Safari/537.36 2345Explorer/6.4.0.10751"
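Reading the log by eye works for a dozen lines, but on a real site it helps to count requests per User-Agent. Below is a minimal sketch, assuming the log uses the combined format shown above; the function name ua_summary is made up for illustration. Splitting each line on double quotes makes the User-Agent the sixth field, which is only approximate if a request or referer itself contains a quote.

```shell
# Count requests per User-Agent in a combined-format access log.
# With '"' as the field separator, the User-Agent is field 6.
ua_summary() {
    awk -F'"' '{ print $6 }' "$1" | sort | uniq -c | sort -rn
}

# Example call, using the log file from the listing above:
# ua_summary /usr/local/apache2/logs/test.com-access_20160101_log
```

The most frequent User-Agent strings appear at the top, which makes it easier to decide which ones are worth blocking.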
Let's try blocking them.
[root@zhangmengjunlinux data]# vim /usr/local/apache2/conf/extra/httpd-vhosts.conf
<IfModule mod_rewrite.c>
RewriteEngine on
# Existing rule: redirect the alias domains to www.test.com
RewriteCond %{HTTP_HOST} ^www\.aaa\.com$ [OR]
RewriteCond %{HTTP_HOST} ^www\.bbb\.com$
RewriteRule ^/(.*)$ http://www.test.com/$1 [R=301,L]
# New rule: refuse any request whose User-Agent contains curl or chrome
RewriteCond %{HTTP_USER_AGENT} ^.*curl.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*chrome.* [NC]
RewriteRule .* - [F]
</IfModule>
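The three lines to add are the two RewriteCond lines that test %{HTTP_USER_AGENT} and the RewriteRule that follows them. This also relies on mod_rewrite: the conditions come first, then the rule they guard. [NC] makes the match case-insensitive, [OR] joins a condition to the next one with a logical OR, and [F] returns 403 Forbidden.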
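Before touching the server, we can roughly check which User-Agent strings these two conditions would catch. The sketch below approximates them with a case-insensitive grep; ua_would_be_blocked is a hypothetical helper, and the Wget string is just a sample value. Since ^.*curl.* matches any string that contains curl, the unanchored pattern curl|chrome is equivalent for this purpose.

```shell
# Approximate the two RewriteCond patterns with a case-insensitive grep.
# Returns 0 (true) when the User-Agent string would hit the [F] rule.
ua_would_be_blocked() {
    printf '%s\n' "$1" | grep -Eiq 'curl|chrome'
}

for ua in \
    "curl/7.19.7 (i386-redhat-linux-gnu) libcurl/7.19.7" \
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36" \
    "Wget/1.12 (linux-gnu)"
do
    if ua_would_be_blocked "$ua"; then
        echo "blocked: $ua"
    else
        echo "allowed: $ua"
    fi
done
```

grep -E uses POSIX extended regular expressions rather than the PCRE syntax Apache uses, so this is only a sanity check for simple patterns like these.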
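First visit the site with Chrome. Its User-Agent contains "Chrome", so the request should be refused with 403 Forbidden.
Then access it with curl: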
[root@zhangmengjunlinux ~]# curl -x192.168.140.100:80 www.test.com -I
HTTP/1.1 403 Forbidden
Date: Fri, 01 Jan 2016 05:42:09 GMT
Server: Apache/2.2.31 (Unix) PHP/5.3.27
Content-Type: text/html; charset=iso-8859-1
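To confirm that the 403 really depends on the User-Agent and not on the client address, we can repeat the request with different User-Agent strings using curl's -A option. This is a minimal sketch; status_for_ua is a hypothetical helper, and the proxy address and host name are the ones from the transcript above.

```shell
# Print the HTTP status code returned for a given User-Agent string.
# -A sets the User-Agent header, -w prints only the status code.
status_for_ua() {
    curl -s -o /dev/null -w '%{http_code}\n' -A "$1" -x 192.168.140.100:80 www.test.com
}

# Example calls (run these against the test server):
# status_for_ua "curl/7.19.7"    # contains "curl", so the [F] rule applies
# status_for_ua "Mozilla/5.0"    # matches neither condition
```

If the rule is active, the first call should report 403. The second string is not caught by this rule, although other access controls on the virtual host may still refuse it.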