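The monitored Linux host needs nagios-plugins and NRPE installed; the monitored Windows host needs NSClient++.
1: Monitoring a remote Linux host
Installing the software on the monitored host was covered earlier, so it is skipped here.
On the monitored host, edit nrpe.cfg so that it looks like this: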
[root@xen etc]# grep -v '#' nrpe.cfg |grep -v '^$'
log_facility=daemon
pid_file=/var/run/nrpe.pid
server_port=5666
nrpe_user=nagios
nrpe_group=nagios
allowed_hosts=192.168.0.138    # the server that is allowed to fetch data from this host
dont_blame_nrpe=0
allow_bash_command_substitution=0
debug=0
command_timeout=60
connection_timeout=300
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_xvda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/xvda1
command[check_xvda2]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/xvda2    # checks I defined myself
command[check_xvda3]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/xvda3
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
command[check_ping]=/usr/local/nagios/libexec/check_ping -H 127.0.0.1 -w 100.0,20% -c 500.0,60%
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 30% -c 10%
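Every name inside command[...] is what the Nagios server later passes to check_nrpe with -c, so it helps to list exactly which names the file exposes. The following is a minimal sketch, not part of NRPE itself: parse_nrpe_commands and SAMPLE are hypothetical names, and the sample text is a shortened copy of the listing above.

```python
import re

# Matches lines such as:
#   command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
COMMAND_RE = re.compile(r"^command\[(?P<name>[^\]]+)\]=(?P<line>.+)$")


def parse_nrpe_commands(text):
    """Return {command_name: command_line} for every command[...] entry."""
    commands = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        match = COMMAND_RE.match(line)
        if match:
            commands[match.group("name")] = match.group("line").strip()
    return commands


# Shortened copy of the nrpe.cfg shown in this post (illustration only).
SAMPLE = """\
server_port=5666
allowed_hosts=192.168.0.138
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_xvda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/xvda1
"""

if __name__ == "__main__":
    for name, line in parse_nrpe_commands(SAMPLE).items():
        print(f"{name:12s} -> {line}")
```

Any check_command on the server side that refers to a name missing from this list will fail on the monitored host, so comparing the two lists is a quick sanity check.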
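Restart nrpe.
Now switch to the Nagios server. Because the data is fetched through NRPE, the server also needs NRPE installed so that the check_nrpe plugin is available (the installation was covered earlier). Then run: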
[root@kcw libexec]# /usr/local/nagios/libexec/check_nrpe -H 192.168.0.156    # check that the server can reach the monitored host
NRPE v2.15
Define the check_nrpe command
vim /usr/local/nagios/etc/objects/commands.cfg    # add the following
define command {
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
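To see what this definition does, recall that a service's check_command is split on "!": the first part selects the command, and the remaining parts become $ARG1$, $ARG2$ and so on. The sketch below imitates that substitution in a simplified way; it is not Nagios code. expand_check_command is a hypothetical helper, the $USER1$ value is an assumption based on the plugin path used in this post, and escaped "!" characters are ignored.

```python
import re


def expand_check_command(check_command, command_lines, host_address, user1):
    """Expand a check_command such as "check_nrpe!check_load" into a command line.

    command_lines maps command_name to its command_line template.
    """
    name, *args = check_command.split("!")
    line = command_lines[name]
    macros = {"$USER1$": user1, "$HOSTADDRESS$": host_address}
    for index, value in enumerate(args, start=1):
        macros[f"$ARG{index}$"] = value
    for macro, value in macros.items():
        line = line.replace(macro, value)
    # Arguments that were not supplied expand to nothing.
    return re.sub(r"\$ARG\d+\$", "", line).rstrip()


# The command definition from this post.
COMMANDS = {"check_nrpe": "$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$"}

if __name__ == "__main__":
    print(
        expand_check_command(
            "check_nrpe!check_load",
            COMMANDS,
            host_address="192.168.0.156",
            user1="/usr/local/nagios/libexec",  # assumed value of $USER1$
        )
    )
```

With these inputs the result is the same kind of check_nrpe call that was run by hand above, with -c check_load appended.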
Define the host and services (the Nagios server must include this file, linuxhost.cfg). Add this line to the main Nagios configuration file, nagios.cfg:
cfg_file=/usr/local/nagios/etc/objects/linuxhost.cfg
[root@kcw objects]# grep -v '#' linuxhost.cfg |grep -v '^$'
define host{
        use                     linux-server,host-pnp   ; Name of host template to use
                                                        ; This host definition will inherit all variables that are defined
                                                        ; in (or inherited by) the linux-server host template definition.
        host_name               linuxhost
        alias                   my linux host
        address                 192.168.0.156
        }
define service{
        use                     generic-service,srv-pnp ; Name of service template to use
        host_name               linuxhost
        service_description     CHECK PING
        check_command           check_nrpe!check_ping
        }
define service{
        use                     generic-service,srv-pnp ; Name of service template to use
        host_name               linuxhost
        service_description     CHECK SWAP
        check_command           check_nrpe!check_swap
        }
define service{
        use                     generic-service,srv-pnp ; Name of service template to use
        host_name               linuxhost
        service_description     Root Partition
        check_command           check_nrpe!check_disk
        }
define service{
        use                     generic-service,srv-pnp ; Name of service template to use
        host_name               linuxhost
        service_description     CHECK Users
        check_command           check_nrpe!check_users
        }
define service{
        use                     generic-service,srv-pnp ; Name of service template to use
        host_name               linuxhost
        service_description     check Total Processes
        check_command           check_nrpe!check_total_procs
        }
define service{
        use                     generic-service,srv-pnp ; Name of service template to use
        host_name               linuxhost
        service_description     check zombie Processes
        check_command           check_nrpe!check_zombie_procs
        }
define service{
        use                     generic-service,srv-pnp ; Name of service template to use
        host_name               linuxhost
        service_description     Check Load
        check_command           check_nrpe!check_load
        }
define service{
        use                     generic-service,srv-pnp ; Name of service template to use
        host_name               linuxhost
        service_description     xvda1 Partition
        check_command           check_nrpe!check_xvda1
        }
define service{
        use                     generic-service,srv-pnp ; Name of service template to use
        host_name               linuxhost
        service_description     xvda2 Partition
        check_command           check_nrpe!check_xvda2
        }
define service{
        use                     generic-service,srv-pnp ; Name of service template to use
        host_name               linuxhost
        service_description     xvda3 Partition
        check_command           check_nrpe!check_xvda3
        }
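The service blocks differ only in service_description and in the NRPE command name, so they can be generated instead of typed by hand. This is a rough sketch under that assumption; render_services, SERVICE_TEMPLATE and CHECKS are hypothetical names, and the generated text would still need to be saved to linuxhost.cfg and verified by Nagios.

```python
# Template for one service block; doubled braces are literal braces for str.format.
SERVICE_TEMPLATE = """\
define service{{
        use                     generic-service,srv-pnp
        host_name               {host_name}
        service_description     {description}
        check_command           check_nrpe!{nrpe_command}
        }}
"""


def render_services(host_name, checks):
    """Render one service block per (description, nrpe_command) pair."""
    blocks = [
        SERVICE_TEMPLATE.format(
            host_name=host_name, description=description, nrpe_command=nrpe_command
        )
        for description, nrpe_command in checks
    ]
    return "\n".join(blocks)


# A few of the checks from this post: (service_description, command name in nrpe.cfg).
CHECKS = [
    ("CHECK PING", "check_ping"),
    ("CHECK SWAP", "check_swap"),
    ("Root Partition", "check_disk"),
]

if __name__ == "__main__":
    print(render_services("linuxhost", CHECKS))
```

Each nrpe_command must match a command[...] name in nrpe.cfg on the monitored host, otherwise the check cannot run there.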
You can also put the hosts in one file and the services in another:
ll hosts.cfg services.cfg
-rw-r--r-- 1 nagios nagios 1216 Feb 16 14:56 hosts.cfg
-rw-rw-r-- 1 nagios nagios 5029 Feb 16 14:47 services.cfg

cat hosts.cfg
define host{
        use                     linux-server,host-pnp
        host_name               web1
        alias                   web1
        address                 2.2.2.2
        }
define host{
        use                     linux-server,host-pnp
        host_name               web2
        alias                   web2
        address                 1.1.1.1
.....

cat services.cfg
define service{
        use                     generic-service,srv-pnp
        host_name               web1,web2,master1,slave1
        service_description     Current Load
        check_command           check_nrpe!check_load
        }
define service{
        use                     generic-service
        host_name               web1,web2,master1,slave1
        service_description     Disk Usage
        check_command           check_nrpe!check_disk
.....
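2: Monitoring a Windows host
First download NSClient++.
Download page: http://nsclient.org/nscp/downloads
Installation is just a matter of clicking Next through the wizard, so it is skipped here. The only things you have to fill in are the IP of the Nagios server and the NSClient++ password (for security I leave mine out). Tick all the checkboxes; each one enables a different check method.
After installation, check_nt is enabled by default and listens on port 12489, while NRPE listens on port 5666.
On the Nagios server, try running check_nt to see whether it can connect to the Windows host: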
[root@kcw libexec]# ./check_nt -H 192.168.0.137 -p 12489 -v UPTIME
System Uptime - 2 day(s) 4 hour(s) 46 minute(s) |uptime=3166    # this shows that data can be fetched from the Windows host
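Plugin output has two parts separated by "|": the human-readable status text and the performance data that graphing add-ons consume. In the output above, uptime=3166 is the uptime in minutes (2 days, 4 hours and 46 minutes). The sketch below splits such a line; split_plugin_output and parse_perfdata are hypothetical helpers that cover only the simple label=value[unit] form, not the full performance data syntax.

```python
import re

# label=value[unit]; anything after the first ';' (thresholds, min, max) is ignored here.
PERF_RE = re.compile(r"(?P<label>'[^']+'|[^\s=]+)=(?P<value>-?[\d.]+)(?P<uom>[^;\s]*)")


def split_plugin_output(output):
    """Split one line of plugin output into (status_text, perfdata_text)."""
    text, _, perf = output.partition("|")
    return text.strip(), perf.strip()


def parse_perfdata(perf):
    """Return {label: (value, unit)} for each metric in the perfdata text."""
    metrics = {}
    for match in PERF_RE.finditer(perf):
        label = match.group("label").strip("'")
        metrics[label] = (float(match.group("value")), match.group("uom"))
    return metrics


# The check_nt output shown in this post.
LINE = "System Uptime - 2 day(s) 4 hour(s) 46 minute(s) |uptime=3166"

if __name__ == "__main__":
    text, perf = split_plugin_output(LINE)
    print(text)
    print(parse_perfdata(perf))
```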
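On the Nagios server, define the check_nt command in commands.cfg:
vim /usr/local/nagios/etc/objects/commands.cfg
# 'check_nt' command definition
# The $...$ tokens are macros, i.e. variables. -p sets the port, -s sets the password
# (leave it out if no password is set), and -v selects what to check.
define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }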
Define the host
It is a good idea to back up the template files that ship with Nagios before changing them.
Edit the Windows template file to define the host and its service:
vim /usr/local/nagios/etc/objects/windows.cfg
define host{
        use             windows-server,host-pnp ; Inherit default values from a template
        host_name       winserver               ; The name we're giving to this host
        alias           My Windows Server       ; A longer name associated with the host
        address         192.168.0.137           ; IP address of the host
        }
define service{
        use                     generic-service,srv-pnp
        host_name               winserver
        service_description     D:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l d -w 80 -c 90
        }
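In the service above, -w 80 and -c 90 are the warning and critical thresholds for the percentage of used space on drive D:. The sketch below shows how such thresholds map to the standard plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). disk_state is a hypothetical function, and treating a value equal to the threshold as a breach is an assumption about check_nt, not something this post confirms.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def disk_state(percent_used, warn=80.0, crit=90.0):
    """Map a used-space percentage to a plugin state.

    Assumption: reaching a threshold exactly already counts as a breach.
    """
    if percent_used < 0 or percent_used > 100:
        return UNKNOWN  # not a valid percentage
    if percent_used >= crit:
        return CRITICAL
    if percent_used >= warn:
        return WARNING
    return OK


if __name__ == "__main__":
    for used in (42.0, 85.0, 93.5):
        print(used, disk_state(used))
```

Nagios reads only the exit code to decide the service state; the text the plugin prints is for display.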
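When that is done, enable the windows.cfg file:
vim /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/windows.cfg    # add this line
Check the configuration for syntax errors. Here check appears to be a local alias; with this install prefix the stock equivalent should be /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg.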
[root@www share]# check

Nagios Core 3.5.1
Copyright (c) 2009-2011 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-30-2013
License: GPL

Website: http://www.nagios.org
Reading configuration data...
   Read main config file okay...
Processing object config file '/usr/local/nagios/etc/objects/commands.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/contacts.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/timeperiods.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/templates.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/localhost.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/linuxhost.cfg'...
   Read object config files okay...

Running pre-flight check on configuration data...

Checking services...
        Checked 24 services.
Checking hosts...
        Checked 2 hosts.
Checking host groups...
        Checked 1 host groups.
Checking service groups...
        Checked 0 service groups.
Checking contacts...
        Checked 1 contacts.
Checking contact groups...
        Checked 1 contact groups.
Checking service escalations...
        Checked 0 service escalations.
Checking service dependencies...
        Checked 0 service dependencies.
Checking host escalations...
        Checked 0 host escalations.
Checking host dependencies...
        Checked 0 host dependencies.
Checking commands...
        Checked 25 commands.
Checking time periods...
        Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
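The two lines that matter in this output are Total Warnings and Total Errors: restarting only makes sense when the error count is zero. A small sketch of that gate follows. preflight_totals and safe_to_restart are hypothetical helpers, and the sample text is just the tail of the output above.

```python
import re


def preflight_totals(output):
    """Return (warnings, errors) parsed from the pre-flight check output."""
    warnings = re.search(r"Total Warnings:\s*(\d+)", output)
    errors = re.search(r"Total Errors:\s*(\d+)", output)
    if not warnings or not errors:
        raise ValueError("pre-flight summary not found in output")
    return int(warnings.group(1)), int(errors.group(1))


def safe_to_restart(output):
    """True when the pre-flight check reported no errors."""
    _, errors = preflight_totals(output)
    return errors == 0


# Tail of the verification output shown in this post.
SAMPLE_OUTPUT = """\
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
"""

if __name__ == "__main__":
    print(preflight_totals(SAMPLE_OUTPUT), safe_to_restart(SAMPLE_OUTPUT))
```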
Restart nagios.
The result is shown in the figure below.
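The Windows monitoring above uses check_nt, which is the default.
To monitor through check_nrpe instead, edit the NSClient++ configuration file in its installation directory on the Windows host, enable the corresponding allowed IP and port settings, and then restart NSClient++. The details are omitted here.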