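Nagios traffic monitoring and alerting

Goal: monitor network traffic and raise an alert when it reaches a threshold. Nagios handles the alerting.

Main tool: check_traffic.sh

Script URL: https://github.com/cloved/check_traffic/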


==============  client:m153

cd /usr/local/nagios/libexec || cd /usr/local/nrpe/libexec 

chown nagios:nagios check_traffic.sh && chmod a+x check_traffic.sh

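yum -y install net-snmp net-snmp-utils bc   # the script needs snmpwalk and bc

echo "rocommunity cacti 127.0.0.1" >> /etc/snmp/snmpd.conf   # appends one line; "cacti" is the community string (used in place of "public"). Without this line the interface check below times out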

/etc/init.d/snmpd reload
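
Running the echo above twice leaves a duplicate rocommunity line in snmpd.conf. A minimal sketch of an idempotent append follows; append_once is a hypothetical helper name, not part of net-snmp or check_traffic.sh.

```shell
#!/bin/sh
# append_once LINE FILE
# Append LINE to FILE only if that exact line is not already present.
# Hypothetical helper for illustration.
append_once() {
    line=$1
    file=$2
    # -q: quiet, -x: match the whole line, -F: fixed string, not a regex
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (run as root on the client):
#   append_once "rocommunity cacti 127.0.0.1" /etc/snmp/snmpd.conf
```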

List the network interfaces and their indexes:

/usr/local/nagios/libexec/check_traffic.sh -V 2c -C cacti -H 127.0.0.1 -L       

List Interface for host 127.0.0.1.

Interface index 1 orresponding to  lo

Interface index 2 orresponding to  eth0

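Interface index 3 orresponding to  eth1

Note the index-to-interface mapping: index 3 is eth1. On this host eth1 is the external NIC and eth0 the internal one. The server has two IPs, the second on the alias eth1:1; aliases do not seem to be monitorable.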
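
The index can also be looked up by interface name instead of read off by eye. This is a sketch that assumes the -L output keeps the format shown above; iface_index is a hypothetical helper name.

```shell
#!/bin/sh
# iface_index NAME
# Read `check_traffic.sh -L` output on stdin and print the index of NAME.
# Prints nothing if NAME is not listed. Hypothetical helper for illustration.
iface_index() {
    awk -v name="$1" \
        '$1 == "Interface" && $2 == "index" && $NF == name { print $3; exit }'
}

# Example:
#   /usr/local/nagios/libexec/check_traffic.sh -V 2c -C cacti -H 127.0.0.1 -L | iface_index eth1
```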

Test that data is collected correctly (optional):

/usr/local/nagios/libexec/check_traffic.sh -V 2c -C cacti -H 127.0.0.1 -I 3 -w 1024,100 -c 2048,200 -K -B

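Running this creates a history file under /var/tmp/. Delete that file once the test is done. It belongs to the user who ran the test (root here), so the nagios user cannot read or write it and the check will report: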

check_traffic;UNKNOWN;SOFT;1;Unknown - Read or Write File /var/tmp/check_traffic_127.0.0.1_3.hist_dat_root__64 Error with user uid=497(nagios) gid=498(nagios) groups=498(nagios).
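
A small cleanup sketch for the leftover history files. It assumes the files follow the check_traffic_*.hist_dat_* naming seen in the error above; clean_hist is a hypothetical helper name.

```shell
#!/bin/sh
# clean_hist [DIR]
# Delete check_traffic history files in DIR (default /var/tmp).
# Only files matching the naming pattern from the error message are removed.
clean_hist() {
    dir=${1:-/var/tmp}
    # rm -f stays silent if the glob matches nothing
    rm -f "$dir"/check_traffic_*.hist_dat_*
}

# Example (run as root after the manual test):
#   clean_hist /var/tmp
```

Running the manual test as the nagios user in the first place (for example through sudo -u nagios) should avoid the ownership problem, though that was not tried here.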

Edit nrpe.cfg and add one of the two command lines below, whichever matches the install path:

vi /usr/local/nagios/etc/nrpe.cfg 

command[check_traffic]=/usr/local/nagios/libexec/check_traffic.sh -V 2c -C cacti -H 127.0.0.1 -I 3 -w 1024,1024 -c 2048,2048 -K -b

command[check_traffic]=/usr/local/nrpe/libexec/check_traffic.sh -V 2c -C cacti -H 127.0.0.1 -I 3 -w 1024,1024 -c 2048,2048 -K -b

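Reload NRPE. If the init script is not available, the fallback kills the nrpe process, which then has to be started again:

/etc/init.d/nrpe reload || kill `ps -ef | grep [n]rpe | awk '{print $2}'`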

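Options: -V SNMP version, -C community string, -H host, -I interface index, -K units in K (Kbps), -b rates in bits per second (-B for bytes); use -M for high-traffic links.

See the comments in the script for more options.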
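
To illustrate how the -w and -c pairs map to a Nagios state, here is a simplified sketch. It assumes each pair is read as inbound,outbound (my reading of the script's comments) and that rates are integers in the chosen unit. It is not the logic of check_traffic.sh itself, and traffic_status is a hypothetical name.

```shell
#!/bin/sh
# traffic_status IN OUT WARN_IN,WARN_OUT CRIT_IN,CRIT_OUT
# Print OK, WARNING or CRITICAL and return the standard Nagios plugin
# exit code (0, 1 or 2). Illustration only, integer rates assumed.
traffic_status() {
    rate_in=$1
    rate_out=$2
    warn_in=${3%,*}; warn_out=${3#*,}
    crit_in=${4%,*}; crit_out=${4#*,}
    if [ "$rate_in" -gt "$crit_in" ] || [ "$rate_out" -gt "$crit_out" ]; then
        echo CRITICAL; return 2
    elif [ "$rate_in" -gt "$warn_in" ] || [ "$rate_out" -gt "$warn_out" ]; then
        echo WARNING; return 1
    fi
    echo OK; return 0
}

# Example with the thresholds from nrpe.cfg:
#   traffic_status 32 29 1024,1024 2048,2048
```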


============  server: m44

vi /usr/local/nagios/etc/objects/m153.cfg   # add the service

define service {

        host_name             1.1.1.153

        service_description   check_traffic

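        check_period          24x7 ; check around the clock

        max_check_attempts    4    ; number of failed checks before an alert is sent

        normal_check_interval 3    ; regular check interval: every 3 minutes

        retry_check_interval  2    ; after a failure, recheck every 2 minutes

        contact_groups        sagroup

        notification_interval   10    ; minutes

        notification_period     24x7  ; notify around the clock; custom periods can be defined in timeperiods.cfg

        notification_options    c,r   ; notify on critical and recovery only; w (warning) and u (unknown) are not set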

        check_command      check_nrpe!check_traffic

        }

Sample entries from the Nagios log, newest first:

[10-24-2013 14:50:42] SERVICE ALERT: 1.1.1.1;check_traffic;OK;HARD;4;OK - The Traffic In is 32Kbps, Out is 29Kbps, Total is 61Kbps. The Check Interval is 180s

[10-24-2013 14:47:42] SERVICE ALERT: 1.1.1.1;check_traffic;CRITICAL;HARD;4;Critical - The Traffic In is 43Kbps, Out is 279Kbps, Total is 322Kbps. The Check Interval is 117s

[10-24-2013 14:45:52] SERVICE ALERT: 1.1.1.1;check_traffic;CRITICAL;SOFT;3;Critical - The Traffic In is 61Kbps, Out is 639Kbps, Total is 700Kbps. The Check Interval is 123s

[10-24-2013 14:43:42] SERVICE ALERT: 1.1.1.1;check_traffic;CRITICAL;SOFT;2;Critical - The Traffic In is 74Kbps, Out is 909Kbps, Total is 983Kbps. The Check Interval is 120s

[10-24-2013 14:41:42] SERVICE ALERT: 1.1.1.1;check_traffic;CRITICAL;SOFT;1;Critical - The Traffic In is 55Kbps, Out is 428Kbps, Total is 483Kbps. The Check Interval is 180s
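
The log above matches the timing in the service definition. The first failed check is a SOFT state, and each of the remaining attempts comes one retry interval later, so the HARD state (and the notification) follows after (max_check_attempts - 1) * retry_check_interval minutes. The sketch below assumes the default Nagios interval_length of 60 seconds; minutes_to_hard is a hypothetical name.

```shell
#!/bin/sh
# minutes_to_hard MAX_CHECK_ATTEMPTS RETRY_CHECK_INTERVAL
# Minutes between the first failed check and the HARD state.
minutes_to_hard() {
    echo $(( ($1 - 1) * $2 ))
}

minutes_to_hard 4 2   # prints 6 (14:41:42 SOFT;1 to 14:47:42 HARD;4 in the log)
```

A problem that starts right after a passing check is first noticed up to normal_check_interval (3 minutes) later, so the worst case from onset to notification is roughly 9 minutes.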


Restart Nagios (verify the configuration first, then start the daemon):

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

/usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg -d &
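
The two commands can be chained so that the daemon is only started when the configuration check passes. This is a sketch under the paths used above; start_if_valid and the NAGIOS_BIN / NAGIOS_CFG variables are hypothetical names.

```shell
#!/bin/sh
# Paths taken from this article; override through the environment if needed.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}

# start_if_valid
# Run the config check (-v) and start the daemon (-d) only if it succeeds.
start_if_valid() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null; then
        "$NAGIOS_BIN" -d "$NAGIOS_CFG"
    else
        echo "config check failed, not starting Nagios" >&2
        return 1
    fi
}

# Example:
#   start_if_valid
```

This only starts the daemon; an instance that is already running has to be stopped first.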
