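Goal: monitor network traffic and raise an alert when it reaches a threshold. Nagios handles the alerting.
Main tool: check_traffic.sh
Script location: https://github.com/cloved/check_traffic/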
============== client:m153
cd /usr/local/nagios/libexec || cd /usr/local/nrpe/libexec
chown nagios:nagios check_traffic.sh && chmod a+x check_traffic.sh
yum -y install net-snmp net-snmp-utils bc  # the script needs snmpwalk and bc
echo "rocommunity cacti 127.0.0.1" >> /etc/snmp/snmpd.conf  # new line; "cacti" is the community name used in place of public. Without it the interface check below times out
/etc/init.d/snmpd reload
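Running the echo line above twice appends the community line twice. A minimal sketch of an idempotent append follows; the helper name append_once is made up for this example and is not part of check_traffic.sh or net-snmp.

```shell
#!/bin/sh
# append_once FILE LINE (hypothetical helper)
# Appends LINE to FILE only if FILE has no identical line yet.
append_once() {
    file=$1
    line=$2
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    if grep -qxF -- "$line" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' "$line" >> "$file"
}

# Usage with the path and community name from the step above (needs root):
# append_once /etc/snmp/snmpd.conf "rocommunity cacti 127.0.0.1"
```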
List the network interfaces:
/usr/local/nagios/libexec/check_traffic.sh -V 2c -C cacti -H 127.0.0.1 -L
List Interface for host 127.0.0.1.
Interface index 1 orresponding to lo
Interface index 2 orresponding to eth0
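Interface index 3 orresponding to eth1
Note the mapping: index 3 corresponds to eth1. On this server eth1 is the external NIC and eth0 is the internal one. The server has two IPs, the second on the alias eth1:1; aliases apparently cannot be monitored.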
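The index lookup can also be scripted. The sketch below assumes the -L output has exactly the shape shown above (including the script's "orresponding" spelling); if_index is a made-up helper name.

```shell
#!/bin/sh
# if_index NAME (hypothetical helper)
# Reads the output of `check_traffic.sh ... -L` on stdin and prints the
# index of interface NAME. Exits non-zero if NAME is not listed.
if_index() {
    awk -v name="$1" '
        $1 == "Interface" && $2 == "index" && $NF == name {
            print $3
            found = 1
            exit
        }
        END { exit !found }
    '
}

# Usage:
# /usr/local/nagios/libexec/check_traffic.sh -V 2c -C cacti -H 127.0.0.1 -L | if_index eth1
```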
Test that data is collected correctly (optional):
/usr/local/nagios/libexec/check_traffic.sh -V 2c -C cacti -H 127.0.0.1 -I 3 -w 1024,100 -c 2048,200 -K -B
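Running this creates files under /var/tmp/. Delete them after testing: a test run as root leaves files the nagios user cannot read or write, and Nagios then reports: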
check_traffic;UNKNOWN;SOFT;1;Unknown - Read or Write File /var/tmp/check_traffic_127.0.0.1_3.hist_dat_root__64 Error with user uid=497(nagios) gid=498(nagios) groups=498(nagios).
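The cleanup can be sketched as below. The file name pattern is inferred from the error message above, and clean_hist is a made-up helper name; it assumes the plugin recreates its history files on the next run, as it did during the manual test.

```shell
#!/bin/sh
# clean_hist DIR HOST (hypothetical helper)
# Deletes check_traffic history files for HOST directly under DIR.
# Pattern assumed from the error message:
#   check_traffic_<host>_<index>.hist_dat_<user>__<bits>
clean_hist() {
    dir=$1
    host=$2
    find "$dir" -maxdepth 1 -type f \
        -name "check_traffic_${host}_*.hist_dat*" -exec rm -f {} +
}

# Usage after a manual test as root:
# clean_hist /var/tmp 127.0.0.1
```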
Edit nrpe.cfg and add one of the two command lines below, whichever matches your install path:
vi /usr/local/nagios/etc/nrpe.cfg
command[check_traffic]=/usr/local/nagios/libexec/check_traffic.sh -V 2c -C cacti -H 127.0.0.1 -I 3 -w 1024,1024 -c 2048,2048 -K -b
command[check_traffic]=/usr/local/nrpe/libexec/check_traffic.sh -V 2c -C cacti -H 127.0.0.1 -I 3 -w 1024,1024 -c 2048,2048 -K -b
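Reload NRPE (the kill fallback only stops the process, so start nrpe again afterwards): /etc/init.d/nrpe reload || kill `ps -ef | grep '[n]rpe' | awk '{print $2}'`
Options: -V SNMP version, -C community name, -H host, -I interface index, -K report in Kbps, -b bits per second. For high-traffic links use M instead of K.
See the comments in the script for more options.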
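To make the -w and -c pairs concrete, here is a rough sketch of how an "in,out" threshold pair can map to a Nagios state. It is not the logic of check_traffic.sh itself: the function name, the strict greater-than comparison, and the integer-only rates are all assumptions for illustration.

```shell
#!/bin/sh
# traffic_state IN OUT WARN_IN,WARN_OUT CRIT_IN,CRIT_OUT (hypothetical helper)
# Prints OK, WARNING or CRITICAL and returns the usual Nagios exit
# code (0, 1 or 2). Rates and thresholds are integers in the same unit.
traffic_state() {
    rate_in=$1
    rate_out=$2
    warn_in=${3%,*};  warn_out=${3#*,}   # split "in,out" at the comma
    crit_in=${4%,*};  crit_out=${4#*,}
    if [ "$rate_in" -gt "$crit_in" ] || [ "$rate_out" -gt "$crit_out" ]; then
        echo CRITICAL
        return 2
    fi
    if [ "$rate_in" -gt "$warn_in" ] || [ "$rate_out" -gt "$warn_out" ]; then
        echo WARNING
        return 1
    fi
    echo OK
    return 0
}

# Usage with the thresholds from nrpe.cfg above:
# traffic_state 500 1500 1024,1024 2048,2048
```

Under these assumptions either direction crossing its threshold is enough to change the state, which is why the in and out limits are given separately.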
============ server: m44
vi /usr/local/nagios/etc/objects/m153.cfg  # add the service definition
define service {
host_name 1.1.1.153
service_description check_traffic
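check_period 24x7 ; check around the clock
max_check_attempts 4 ; number of consecutive failed checks before an alert is sent
normal_check_interval 3 ; regular check interval: every 3 minutes
retry_check_interval 2 ; after a failure, recheck every 2 minutes
contact_groups sagroup
notification_interval 10 ; minutes
notification_period 24x7 ; notify around the clock; custom periods can be defined in timeperiods.cfg
notification_options c,r ; notify on critical and recovery only; w (warning) and u (unknown) are not configured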
check_command check_nrpe!check_traffic
}
[10-24-2013 14:50:42] SERVICE ALERT: 1.1.1.1;check_traffic;OK;HARD;4;OK - The Traffic In is 32Kbps, Out is 29Kbps, Total is 61Kbps. The Check Interval is 180s
[10-24-2013 14:47:42] SERVICE ALERT: 1.1.1.1;check_traffic;CRITICAL;HARD;4;Critical - The Traffic In is 43Kbps, Out is 279Kbps, Total is 322Kbps. The Check Interval is 117s
[10-24-2013 14:45:52] SERVICE ALERT: 1.1.1.1;check_traffic;CRITICAL;SOFT;3;Critical - The Traffic In is 61Kbps, Out is 639Kbps, Total is 700Kbps. The Check Interval is 123s
[10-24-2013 14:43:42] SERVICE ALERT: 1.1.1.1;check_traffic;CRITICAL;SOFT;2;Critical - The Traffic In is 74Kbps, Out is 909Kbps, Total is 983Kbps. The Check Interval is 120s
[10-24-2013 14:41:42] SERVICE ALERT: 1.1.1.1;check_traffic;CRITICAL;SOFT;1;Critical - The Traffic In is 55Kbps, Out is 428Kbps, Total is 483Kbps. The Check Interval is 180s
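The log above (newest entry first) shows the first failure as SOFT;1 at 14:41:42 and the HARD;4 state at 14:47:42, six minutes later. That matches max_check_attempts 4 with retry_check_interval 2: three retries of two minutes each. A small arithmetic sketch, assuming the default Nagios interval_length of 60 seconds (so intervals are minutes); minutes_to_hard is a made-up helper name.

```shell
#!/bin/sh
# minutes_to_hard MAX_CHECK_ATTEMPTS RETRY_CHECK_INTERVAL (hypothetical helper)
# Minutes between the first failed check and the HARD state:
# the first attempt is the failure itself, the rest are retries.
minutes_to_hard() {
    echo $(( ($1 - 1) * $2 ))
}

minutes_to_hard 4 2   # prints 6
```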
Restart Nagios (verify the configuration first, then start the daemon):
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
/usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg -d &
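The two commands above can be chained so that the daemon only starts when the configuration check passes. This is a sketch; start_if_valid is a made-up wrapper, not a Nagios tool, and it takes both commands as strings.

```shell
#!/bin/sh
# start_if_valid VERIFY_CMD START_CMD (hypothetical wrapper)
# Runs START_CMD only if VERIFY_CMD exits with status 0.
start_if_valid() {
    verify_cmd=$1
    start_cmd=$2
    if sh -c "$verify_cmd" >/dev/null 2>&1; then
        sh -c "$start_cmd"
    else
        echo "configuration check failed; not starting" >&2
        return 1
    fi
}

# Usage with the paths from this section:
# start_if_valid \
#   "/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg" \
#   "/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg"
```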