Shell Commands for Analyzing Nginx Logs on Linux and Counting Requests per IP

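Nginx log analysis is one of the most common tasks in day-to-day operations work. As part of routine system monitoring, you regularly check whether any IP's request count looks abnormal, so that malicious access can be caught early.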

Suppose the nginx log looks like this:

140.205.201.39 - - [09/Apr/2015:06:43:45 +0800] "\x00\x0E8\xC5;.\x5CC\xE3W\xCD\x00\x00\x00\x00\x00" 400 181 "-" "-"
220.181.108.104 - - [09/Apr/2015:07:00:20 +0800] "GET /xref/linux-3.18.6/ HTTP/1.1" 502 181 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
216.244.66.249 - - [09/Apr/2015:07:01:43 +0800] "GET /robots.txt HTTP/1.1" 502 181 "-" "Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, [email protected])"
211.0.154.125 - - [09/Apr/2015:07:11:39 +0800] "GET / HTTP/1.1" 502 181 "-" "Mozilla/5.0 (Windows NT 5.1; rv:6.0.2) Gecko/20100101 Firefox/6.0.2"
180.76.15.27 - - [09/Apr/2015:07:11:39 +0800] "GET /robots.txt HTTP/1.1" 502 181 "-" "Mozilla/5.0 (Windows NT 5.1; rv:6.0.2) Gecko/20100101 Firefox/6.0.2"
180.76.15.152 - - [09/Apr/2015:07:30:33 +0800] "GET / HTTP/1.1" 502 181 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
180.76.15.15 - - [09/Apr/2015:07:31:11 +0800] "GET / HTTP/1.1" 502 181 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
216.244.66.249 - - [09/Apr/2015:07:31:40 +0800] "GET /robots.txt HTTP/1.1" 502 181 "-" "Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, [email protected])"
216.244.66.249 - - [09/Apr/2015:07:35:30 +0800] "GET /robots.txt HTTP/1.1" 502 181 "-" "Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, [email protected])"
180.76.15.157 - - [09/Apr/2015:07:44:03 +0800] "GET /s?path=/sbin/lilo&project=linux-3.18.6 HTTP/1.1" 502 181 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
220.181.108.111 - - [09/Apr/2015:07:45:20 +0800] "GET /xref/linux-3.18.6/ HTTP/1.1" 502 181 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
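
Every command in this article depends on how awk splits a log line on whitespace. As a rough sketch, assuming the log uses nginx's default combined format, the fields of interest are $1 (client IP), $4 (timestamp, with a leading bracket), $6 (method, with a leading quote), $7 (request path) and $9 (status code). The helper name show_fields is made up for illustration:

```shell
# show_fields: read log lines on stdin and print the fields used in this article.
# Assumption: nginx's default "combined" log format, split on whitespace by awk.
show_fields() {
    awk '{
        sub(/^\[/, "", $4)    # $4 looks like "[09/Apr/2015:07:00:20"; drop the "["
        sub(/^"/, "", $6)     # $6 looks like "\"GET"; drop the leading quote
        printf "ip=%s time=%s method=%s path=%s status=%s\n", $1, $4, $6, $7, $9
    }'
}

# Usage (path taken from the commands in this article):
#   head -3 /home/shiyanlou/access.log | show_fields
```

For the 220.181.108.104 line in the sample this prints ip=220.181.108.104 time=09/Apr/2015:07:00:20 method=GET path=/xref/linux-3.18.6/ status=502. The first sample line is a malformed request with no spaces inside the quotes, so its status code lands in $7 instead of $9. This is why the commands that read $7 as a path first restrict themselves to GET requests.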
 

Below are shell commands for several kinds of request counts:

1. The 5 IPs with the most requests on April 10, 2015:

grep "10/Apr/2015" /home/shiyanlou/access.log | awk '{print $1}' | sort | uniq -c | sort -k 1 -n -r | head -5 | awk '{print $2}' > output1.txt
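
The pipeline above can be wrapped in a small function so the date and the number of IPs become parameters. This is only a sketch: the name top_ips and its argument order are invented here, and it assumes the combined log format, where the timestamp is the fourth whitespace-separated field. Unlike the grep version, it compares the date against the timestamp field only, and it keeps the counts in the output.

```shell
# top_ips LOG DAY [N]: print "count ip" for the N busiest client IPs on DAY.
# DAY uses the log's own date form, e.g. 10/Apr/2015. N defaults to 5.
top_ips() {
    log=$1 day=$2 n=${3:-5}
    # Match DAY against the timestamp field ($4) only, so a date that happens
    # to appear in a URL or user agent is not counted.
    awk -v day="$day" 'index($4, "[" day ":") == 1 { print $1 }' "$log" |
        sort | uniq -c | sort -k1,1nr | head -n "$n" |
        awk '{ print $1, $2 }'    # normalise the padding that uniq -c adds
}

# Usage:
#   top_ips /home/shiyanlou/access.log 10/Apr/2015 5
```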

2. All IP addresses with at least 10 requests on April 11, 2015:

grep "11/Apr/2015" /home/shiyanlou/access.log | awk '{print $1}' | sort | uniq -c | awk '$1 >= 10 {print $2}' | sort -n -r > output2.txt
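
The same threshold count can also be done in a single awk pass with an associative array instead of sort | uniq -c. The following is a sketch under the same combined-format assumption. The function name ips_at_least is hypothetical, and the output is sorted as plain text only to make the order predictable.

```shell
# ips_at_least LOG DAY [MIN]: print every client IP with at least MIN requests
# on DAY (e.g. 11/Apr/2015). MIN defaults to 10.
ips_at_least() {
    log=$1 day=$2 min=${3:-10}
    awk -v day="$day" -v min="$min" '
        index($4, "[" day ":") == 1 { hits[$1]++ }    # count per IP for that day
        END { for (ip in hits) if (hits[ip] >= min) print ip }
    ' "$log" | sort
}

# Usage:
#   ips_at_least /home/shiyanlou/access.log 11/Apr/2015 10 > output2.txt
```

Counting in awk avoids sorting every matching line, which can matter on large logs. Only the final list of IPs is sorted.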

 

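3. The 10 most frequent requests in the log file (the part after GET on each line), for example /s?defs=ascii&project=linux-3.18.6. The output must contain no blank lines and must exclude requests for static files and images such as /robots.txt, *.js, *.css, and *.png.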

grep '"GET ' /home/shiyanlou/access.log | awk '{print $7}' | grep -Ev '\.(txt|js|css|png)(\?|$)' | sort | uniq -c | sort -k 1 -n -r | head -10 | awk '{print $2}' > output3.txt

 

4. The request paths of all requests in the log file that returned status 404:

awk '$9 == 404 {print $7}' /home/shiyanlou/access.log | grep -Ev '\.(txt|js|css|png)(\?|$)' | sort -u > output4.txt
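
Searching the whole line for 404 also matches lines where 404 appears in a path or a byte count. One way to avoid that is to split each line on double quotes, so that the request and the status code are read from their own sections. The sketch below assumes the combined format and that the request itself contains no literal double quote (nginx normally escapes one as \x22). The name paths_with_status is invented for illustration, and unlike the command above it does not exclude static files.

```shell
# paths_with_status LOG [STATUS]: print the distinct request paths that were
# answered with STATUS (default 404).
paths_with_status() {
    log=$1 status=${2:-404}
    awk -F'"' -v status="$status" '{
        split($2, req, " ")    # req[1]=method, req[2]=path, req[3]=protocol
        split($3, res, " ")    # res[1]=status code, res[2]=bytes sent
        if (res[1] == status && req[2] != "") print req[2]
    }' "$log" | sort -u
}

# Usage:
#   paths_with_status /home/shiyanlou/access.log 404 > output4.txt
```

Malformed requests such as the first sample line have no path inside the quotes, so req[2] is empty and the line is skipped instead of producing a blank entry.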

 
