8.1 Download the required files
nagios-3.4.1.tar.gz
nagios-plugins-1.4.16.tar.gz
ndoutils-1.5.2.tar.gz
npc2.0.4.tar.gz (this one is hard to find)
nrpe-2.13.tar.gz
8.2 Install Nagios and the Nagios plugins
tar zxvf nagios-3.4.1.tar.gz
cd nagios-3.4.1
./configure --prefix=/var/www/localhost/htdocs/nagios
make all
mkdir /var/www/localhost/htdocs/nagios
useradd nagios
passwd nagios
groupadd nagios
usermod -G nagios nagios
usermod -a -G nagios apache
chown -R nagios:nagios /var/www/localhost/htdocs/nagios
make install
make install-init
make install-commandmode
make install-config
cd ..
tar zxvf nagios-plugins-1.4.16.tar.gz
cd nagios-plugins-1.4.16
./configure --prefix=/var/www/localhost/htdocs/nagios/
make
make install
8.3 Edit httpd.conf
vi /etc/apache2/httpd.conf
Add the following:
# setting for nagios 20120815
ScriptAlias /nagios/cgi-bin /var/www/localhost/htdocs/nagios/sbin
<Directory "/var/www/localhost/htdocs/nagios/sbin">
    Options ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "Nagios Access"
    AuthType Basic
    # Authentication file used for access to this directory
    AuthUserFile /var/www/localhost/htdocs/nagios/etc/htpasswd
    Require valid-user
</Directory>
Alias /nagios /var/www/localhost/htdocs/nagios/share
<Directory "/var/www/localhost/htdocs/nagios/share">
    Options None
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "Nagios Access"
    AuthType Basic
    # Authentication file used for access to this directory
    AuthUserFile /var/www/localhost/htdocs/nagios/etc/htpasswd
    Require valid-user
</Directory>
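Apache has to reload httpd.conf before the new aliases take effect. A minimal sketch, assuming the control script is apache2ctl (it may be apachectl on other systems) with its usual configtest and restart actions; the function name apply_apache_config and the APACHECTL variable are illustrative only:

```shell
#!/bin/sh
# Assumption: apache2ctl is on PATH; override APACHECTL if yours is named differently.
APACHECTL="${APACHECTL:-apache2ctl}"

# Restart Apache only when the edited configuration passes the syntax check.
apply_apache_config() {
    if "$APACHECTL" configtest; then
        "$APACHECTL" restart
    else
        echo "httpd.conf has syntax errors; Apache was not restarted" >&2
        return 1
    fi
}
```

Run apply_apache_config as root after saving httpd.conf. Checking the syntax first means a typo in the new Directory blocks does not take the web server down.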
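8.4 Add an authenticated user
htpasswd -c /var/www/localhost/htdocs/nagios/etc/htpasswd nagios
View the contents of the password file:
less /var/www/localhost/htdocs/nagios/etc/htpasswd
At this point the Nagios home page is already reachable:
http://192.168.254.123/nagios
After logging in, however, nothing except the home page opens until Nagios itself is configured.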
8.5 Configure Nagios
cd /var/www/localhost/htdocs/nagios/etc
vi nagios.cfg
#cfg_file=/usr/local/nagios/etc/localhost.cfg
cfg_file=/usr/local/nagios/etc/contactgroups.cfg
cfg_file=/usr/local/nagios/etc/contacts.cfg
cfg_file=/usr/local/nagios/etc/hostgroups.cfg
cfg_file=/usr/local/nagios/etc/hosts.cfg
cfg_file=/usr/local/nagios/etc/services.cfg
cfg_file=/usr/local/nagios/etc/timeperiods.cfg
check_external_commands=1
command_check_interval=10s
#command_check_interval=-1
Note: the cfg_file paths above follow the default /usr/local/nagios layout. Each one must point to where the file actually lives under the prefix used in this installation.
vi cgi.cfg
authorized_for_system_information=nagiosadmin,nagios
authorized_for_configuration_information=nagiosadmin,nagios
authorized_for_system_commands=nagios
authorized_for_all_services=nagiosadmin,nagios
authorized_for_all_hosts=nagiosadmin,test,nagios
authorized_for_all_service_commands=nagiosadmin,nagios
authorized_for_all_host_commands=nagiosadmin,nagios
cd objects
ls
The following configuration files are listed:
commands.cfg services.cfg windows.cfg switch.cfg contacts.cfg localhost.cfg templates.cfg printer.cfg timeperiods.cfg
Back up the stock files, then start editing:
mv contacts.cfg contacts.cfg.backup
mv timeperiods.cfg timeperiods.cfg.backup
vi timeperiods.cfg (optional; the stock timeperiods.cfg template is already very complete)
define timeperiod{
    timeperiod_name 24x7
    alias           24 Hours A Day, 7 Days A Week
    sunday          00:00-24:00
    monday          00:00-24:00
    tuesday         00:00-24:00
    wednesday       00:00-24:00
    thursday        00:00-24:00
    friday          00:00-24:00
    saturday        00:00-24:00
}
vi contacts.cfg
define contact{
    contact_name                    nagios
    alias                           nagios admin
    service_notification_period     24x7
    host_notification_period        24x7
    service_notification_options    w,u,c,r
    host_notification_options       d,u,r
    service_notification_commands   notify-service-by-email
    host_notification_commands      notify-host-by-email
    pager                           137********
    address1                        CHN
    address2                        SHA
}
vi contactgroups.cfg
define contactgroup{
    contactgroup_name   nagios
    alias               nagios Administrators
    members             nagios
}
vi hosts.cfg
define host{
    host_name               localhost
    alias                   localhost
    address                 192.168.254.123
    check_command           check-host-alive
    max_check_attempts      5
    check_period            24x7
    contact_groups          nagios
    notification_interval   10
    notification_period     24x7
    notification_options    d,u,r
}
vi hostgroups.cfg
define hostgroup{
    hostgroup_name  hostgroups
    alias           hostgroups
    members         localhost
}
vi services.cfg
# service definition
define service{
    host_name               localhost
    service_description     check-host-alive
    check_command           check-host-alive
    max_check_attempts      5
    normal_check_interval   3
    retry_check_interval    2
    check_period            24x7
    notification_interval   10
    notification_period     24x7
    notification_options    w,u,c,r
    contact_groups          nagios
}
cd /var/www/localhost/htdocs/nagios/
bin/nagios -v etc/nagios.cfg
A successful check ends with:
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
If Total Errors is not 0, fix the configuration as the messages indicate. Once Total Errors is 0, start the daemon with the command below. It can be put in a script and added to system startup; use full paths there, because the command shown here is run relative to the current directory.
bin/nagios -d etc/nagios.cfg
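The startup script mentioned above could look roughly like this. It is only a sketch: the function name start_nagios and the NAGIOS_HOME variable are made up for illustration, and the default path is the prefix used in this guide. It relies on nagios -v exiting with a non-zero status when the pre-flight check fails, which is also how the stock init script decides whether to start.

```shell
#!/bin/sh
# Assumption: Nagios was installed under the prefix used in this guide.
NAGIOS_HOME="${NAGIOS_HOME:-/var/www/localhost/htdocs/nagios}"

# Validate the configuration first, then start the daemon with full paths.
start_nagios() {
    bin="$NAGIOS_HOME/bin/nagios"
    cfg="$NAGIOS_HOME/etc/nagios.cfg"
    if "$bin" -v "$cfg" >/dev/null 2>&1; then
        "$bin" -d "$cfg"
    else
        echo "pre-flight check failed; run: $bin -v $cfg" >&2
        return 1
    fi
}
```

Call start_nagios from whatever boot mechanism the system provides. Because every path is absolute, the script does not depend on the current directory.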
Installation is complete. Visit http://192.168.254.123/nagios
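If the check reports the following error:
Contact group 'admins' specified in service 'Hosts' for host 'windows server' is not defined anywhere!
Solution:
8.6.1.1 Change the admins group in templates.cfg to nagios, the group defined in contactgroups.cfg.
8.6.2.1 Or go the other way and change contact_groups nagios to admins in objects/services.cfg (the contact group must then be defined under the name admins).
Reference: http://yahoon.blog.51cto.com/13184/41897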
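Goal: have Nagios monitor services on a Windows server, as shown in the figure below.
NSClient download: http://nsclient.org/nscp/downloads (the version used here is 0.3.8, x64).
Installation needs little explanation: double-click the installer on Windows. The installer asks for the Nagios server address; enter yours (192.168.254.123 here). The password is optional (left blank here). Select all the other options. The default installation path is c:\program files.
Open nsc.ini and make the following changes.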
8.7.3 Edit the Nagios configuration on the monitoring server
cd /var/www/localhost/htdocs/nagios
vi etc/objects/windows.cfg
This is essentially the stock configuration with minor changes.
define host{
    use         windows-server
    host_name   windows-server
    alias       My Windows Server
    address     192.168.254.1
}
Host settings.
define hostgroup{
    hostgroup_name  windows-servers
    alias           Windows Servers
}
Host group.
define service{
    use                 generic-service
    host_name           windows-server
    service_description NSClient++0.3.8;Version
    check_command       check_nt!CLIENTVERSION
}
NSClient version check.
define service{
    use                 generic-service
    host_name           windows-server
    service_description Uptime
    check_command       check_nt!UPTIME
}
Uptime check.
define service{
    use                 generic-service
    host_name           windows-server
    service_description CPU Load
    check_command       check_nt!CPULOAD!-l 5,80,90
}
CPU check: warning at 80%, critical at 90%.
define service{
    use                 generic-service
    host_name           windows-server
    service_description Memory Usage
    check_command       check_nt!MEMUSE!-w 80 -c 90
}
Memory check: warning at 80%, critical at 90%.
define service{
    use                 generic-service
    host_name           windows-server
    service_description C:\Drive Space
    check_command       check_nt!USEDDISKSPACE!-l c -w 90 -c 95
}
C: drive usage check: -l is followed by the drive letter; warning at 90%, critical at 95%.
define service{
    use                 generic-service
    host_name           windows-server
    service_description netlogon
    check_command       check_nt!SERVICESTATE!-d SHOWALL -l netlogon
}
The stock configuration monitors the W3SVC service (IIS). The test machine is my PC, which has no IIS, so I switched to the netlogon service. All service-state checks follow this format.
Restart the Nagios service.
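8.7.3.1 A CRITICAL error appears
I do not remember the exact message. After a long search, the cause turned out to be the McAfee firewall on Windows blocking the connection; switching to a virtual machine's IP address fixed it. The same problem can also occur across network segments. In that case edit commands.cfg and append -t 30 to the command; when -t is omitted, the default timeout is 10 seconds.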
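Before changing commands.cfg, the longer timeout can be tried by hand from the monitoring server. This is a sketch under assumptions: check_nt is taken to live under libexec of the prefix used in this guide, and NSClient++ is taken to listen on port 12489, the port the stock check_nt command definition uses. The function name nt_probe and the CHECK_NT variable are illustrative.

```shell
#!/bin/sh
# Assumption: plugin path under the prefix used in this guide.
CHECK_NT="${CHECK_NT:-/var/www/localhost/htdocs/nagios/libexec/check_nt}"

# Ask NSClient++ for its version, waiting up to <timeout> seconds (default 30).
nt_probe() {
    host="$1"
    timeout="${2:-30}"
    # Assumption: 12489 is the NSClient++ port; change it if nsc.ini says otherwise.
    "$CHECK_NT" -H "$host" -p 12489 -v CLIENTVERSION -t "$timeout"
}
```

For example, nt_probe 192.168.254.1 queries the Windows host defined earlier. If this answers with -t 30 but times out with the default, appending -t 30 to the check_nt command line in commands.cfg is the matching permanent change.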
8.7.3.2 The following message appears
NSClient - ERROR: Could not get data for 5 perhaps we don't collect data this far back?
NSClient - ERROR: Could not get value
Solution:
Open CMD, change to the NSClient++ installation directory, and run:
nsclient++ /test
lodctr /r
nsclient++ /test
Reference: http://www.nsclient.org/nscp/wiki/FAQ
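The principle is the same as monitoring the local host with Nagios. The difference is that NRPE and the Nagios plugins must be installed on the monitored machine to run the checks there; the monitoring server then fetches the results and displays them.
First, because I tested under VMware, I simply cloned the monitoring host for convenience and named the clone Testclient. The original monitoring host is Test (IP address 192.168.254.123, running the full cacti + nagios + ntop monitoring stack). Boot Testclient.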
Edit /etc/conf.d/net to change the IP address:
vi /etc/conf.d/net
modules=("ifconfig")
config_eth0=("192.168.254.124 netmask 255.255.255.0 brd 192.168.254.255")
routes_eth0=("default via 192.168.254.2")
Change the hostname (optional):
hostname Testclient
Add the nagios user and set its password (can be skipped on a clone):
useradd nagios
passwd nagios
Install the Nagios plugins (can be skipped on a clone):
cd nagios-plugins-1.4.16
./configure --prefix=/var/www/localhost/htdocs/nagios/
make && make install
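Install the NRPE monitoring software:
cd nrpe-2.13
./configure --prefix=/var/www/localhost/htdocs/nagios/
make all
Install the check_nrpe plugin:
make install-plugin
As noted earlier, only the monitoring server needs check_nrpe; the monitored machine does not. It is installed here for testing purposes.
Install the daemon:
make install-daemon
Install the configuration file:
make install-daemon-config
Install the xinetd script:
make install-xinetd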
Edit the script:
vi /etc/xinetd.d/nrpe
service nrpe
{
    flags           = REUSE
    socket_type     = stream
    port            = 5666
    wait            = no
    user            = nagios
    group           = nagios
    server          = /var/www/localhost/htdocs/nagios/bin/nrpe
    server_args     = -c /var/www/localhost/htdocs/nagios/etc/nrpe.cfg --inetd
    log_on_failure  += USERID
    disable         = no
    only_from       = 127.0.0.1 192.168.254.123
}
Start the NRPE service:
cd /var/www/localhost/htdocs/nagios/
ln -s etc/nrpe.cfg nrpe
vi nrpe
Set allowed_hosts:
allowed_hosts=127.0.0.1,192.168.254.123
bin/nrpe -c etc/nrpe.cfg -d
Check the status:
netstat -at | grep nrpe
tcp 0 0 *:nrpe *:* LISTEN
netstat -an | grep :5666
tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN
OK!
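The same status check can be scripted so that it yields an exit status instead of a line to read by eye. A minimal sketch; port_is_listening is a made-up helper name, and it simply parses netstat -an style output from standard input:

```shell
#!/bin/sh
# Succeed only if the netstat output on stdin shows a TCP socket
# in LISTEN state whose local address ends in :<port>.
port_is_listening() {
    awk -v p=":$1" '
        $1 ~ /^tcp/ && $NF == "LISTEN" &&
        substr($4, length($4) - length(p) + 1) == p { found = 1 }
        END { exit !found }
    '
}
```

Typical use on the monitored machine: netstat -an | port_is_listening 5666 && echo "nrpe is listening". Matching on the end of the local address avoids false hits on ports such as 15666.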
Add the check commands:
command[check_users]=/var/www/localhost/htdocs/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/var/www/localhost/htdocs/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/var/www/localhost/htdocs/nagios/libexec/check_disk -w 20% -c 10% -p /
command[check_zombie_procs]=/var/www/localhost/htdocs/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/var/www/localhost/htdocs/nagios/libexec/check_procs -w 150 -c 200
command[check_free_swap]=/var/www/localhost/htdocs/nagios/libexec/check_swap -w 20% -c 10%
The exact command syntax can be displayed with /nagios/libexec/check_nrpe -h. Note that the parts marked in green are for use on the monitoring host itself.
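Once the commands are defined, they can be exercised from the monitoring server before any Nagios object is written. A sketch, assuming check_nrpe sits under libexec of the prefix used in this guide; nrpe_query and CHECK_NRPE are illustrative names, while -H and -c are check_nrpe's own host and command options:

```shell
#!/bin/sh
# Assumption: plugin path under the prefix used in this guide.
CHECK_NRPE="${CHECK_NRPE:-/var/www/localhost/htdocs/nagios/libexec/check_nrpe}"

# nrpe_query <host> [command]
# Without a command name the daemon just reports its version,
# which is a quick way to confirm allowed_hosts lets this server in.
nrpe_query() {
    host="$1"
    shift
    if [ "$#" -eq 0 ]; then
        "$CHECK_NRPE" -H "$host"
    else
        "$CHECK_NRPE" -H "$host" -c "$1"
    fi
}
```

For example, nrpe_query 192.168.254.124 tests the connection, and nrpe_query 192.168.254.124 check_free_swap runs one of the commands defined above.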
Restart the NRPE service.
This part is simple; it works the same way as monitoring the local host.
Define the host:
define host{
    host_name               LinuxClient
    alias                   ZhengzhouPC
    address                 192.168.254.124
    check_command           check-host-alive
    max_check_attempts      5
    check_period            24x7
    contact_groups          nagios
    notification_interval   10
    notification_period     24x7
    notification_options    d,u,r
}
Define the services; one of them is shown here:
define service{
    host_name               LinuxClient
    service_description     checkfreeswap
    check_command           check_nrpe!check_free_swap
    max_check_attempts      5
    normal_check_interval   3
    retry_check_interval    2
    check_period            24x7
    contact_groups          nagios
    notification_interval   10
    notification_period     24x7
    notification_options    w,u,c,r
}
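The remaining services differ only in the NRPE command name, so they can be generated rather than typed. A sketch only: nrpe_service is a hypothetical helper, the service_description is simply set to the command name, and the other fields copy the definition above.

```shell
#!/bin/sh
# Print one service definition for <host> that runs <nrpe_command> through check_nrpe.
nrpe_service() {
    cat <<EOF
define service{
    host_name               $1
    service_description     $2
    check_command           check_nrpe!$2
    max_check_attempts      5
    normal_check_interval   3
    retry_check_interval    2
    check_period            24x7
    contact_groups          nagios
    notification_interval   10
    notification_period     24x7
    notification_options    w,u,c,r
}
EOF
}

# One block per command defined in nrpe.cfg on the client.
for c in check_users check_load check_hda1 check_zombie_procs \
         check_total_procs check_free_swap; do
    nrpe_service LinuxClient "$c"
done
```

Redirect the output into the services file and rerun bin/nagios -v etc/nagios.cfg before restarting. This assumes a check_nrpe command taking the remote command name as its first argument is defined in commands.cfg on the monitoring server.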
Finally, a screenshot of everything working.