Building a Nagios Monitoring Platform on Linux


I. Deploy the Nagios monitoring server

# Install Nagios and the packages it depends on
[root@nagios ~]# yum install -y httpd php gcc glibc glibc-common net-snmp nagios nagios-plugins nagios-plugins-all nrpe nagios-plugins-nrpe gd gd-devel openssl openssl-devel
# Create the account and password used to log in to Nagios
[root@nagios ~]# htpasswd -c /etc/nagios/passwd nagiosadmin
New password:
Re-type new password:
Adding password for user nagiosadmin
# Validate the configuration file
[root@nagios ~]# nagios -v /etc/nagios/nagios.cfg

Nagios Core 3.5.1
Copyright (c) 2009-2011 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-30-2013
License: GPL

Website: http://www.nagios.org
Reading configuration data...
   Read main config file okay...
Processing object config file '/etc/nagios/objects/commands.cfg'...
Processing object config file '/etc/nagios/objects/contacts.cfg'...
Processing object config file '/etc/nagios/objects/timeperiods.cfg'...
Processing object config file '/etc/nagios/objects/templates.cfg'...
Processing object config file '/etc/nagios/objects/localhost.cfg'...
Processing object config directory '/etc/nagios/conf.d'...
   Read object config files okay...

Running pre-flight check on configuration data...

Checking services...
        Checked 8 services.
Checking hosts...
        Checked 1 hosts.
Checking host groups...
        Checked 1 host groups.
Checking service groups...
        Checked 0 service groups.
Checking contacts...
        Checked 1 contacts.
Checking contact groups...
        Checked 1 contact groups.
Checking service escalations...
        Checked 0 service escalations.
Checking service dependencies...
        Checked 0 service dependencies.
Checking host escalations...
        Checked 0 host escalations.
Checking host dependencies...
        Checked 0 host dependencies.
Checking commands...
        Checked 24 commands.
Checking time periods...
        Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
[root@nagios ~]# service httpd start
Starting httpd:                                            [  OK  ]
[root@nagios ~]# service nagios start
Starting nagios: done.
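
The pre-flight output above ends with a "Total Errors" summary line. As a minimal sketch, assuming you want a script to act on that summary, the hypothetical helper `preflight_ok` below reads the output of `nagios -v` on standard input and succeeds only when the summary reports zero errors. The function name is an illustration and not part of Nagios.

```shell
#!/bin/sh
# preflight_ok (hypothetical helper): read `nagios -v` output on stdin and
# return success only if a "Total Errors:" line is present and reports 0.
preflight_ok() {
    awk '/^Total Errors:/ { found = 1; errors = $3 }
         END { if (found && errors == 0) exit 0; exit 1 }'
}

# Possible usage on the monitoring server (not executed here):
#   nagios -v /etc/nagios/nagios.cfg | preflight_ok && service nagios restart
```

Output that lacks the summary line is treated as a failure, so a truncated or unexpected run does not trigger a restart.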

Open the Nagios web interface in a browser and log in with the account and password created earlier.

[Image: wKiom1V9IzHzlxilAAJgh6QFWEU914.jpg]


After logging in, you can view the monitored hosts and the monitored services separately.

[Image: wKiom1V9I32R4QAwAATFkOodFEs728.jpg]


Monitored hosts: by default, only the monitoring server itself is listed.

[Image: wKiom1V9I9OCi2xHAARbzcu1i-A149.jpg]


Monitored services:

[Image: wKioL1V9JcfiSHP7AAYOxGN0dRU799.jpg]



II. Now that the monitoring server is up and running, configure a monitored host and add it to Nagios

# Install the required packages on the client
[root@web1 ~]# yum install -y nagios-plugins nagios-plugins-all nrpe nagios-plugins-nrpe openssl openssl-devel
[root@web1 ~]# vim /etc/nagios/nrpe.cfg
................................
# Add the addresses of the hosts that are allowed to connect here
allowed_hosts=127.0.0.1,192.168.1.132
.................................
# Start the service
[root@web1 ~]# service nrpe start
Starting nrpe:                                             [  OK  ]
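
The `allowed_hosts` edit above can also be scripted when several clients need the same change. The following is a minimal sketch, assuming the file already contains a non-empty `allowed_hosts=` line. The function name `add_allowed_host` is hypothetical. It appends an address only if that address is not already listed.

```shell
#!/bin/sh
# add_allowed_host FILE ADDRESS (hypothetical helper)
# Append ADDRESS to the allowed_hosts line of an nrpe.cfg-style file,
# leaving the file unchanged if the address is already present.
add_allowed_host() {
    file=$1
    addr=$2
    # Extract the current comma-separated list.
    current=$(sed -n 's/^allowed_hosts=//p' "$file")
    case ",$current," in
        *",$addr,"*) return 0 ;;   # already listed, nothing to do
    esac
    # "&" in the replacement stands for the whole matched line.
    sed "s/^allowed_hosts=.*/&,$addr/" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

# Possible usage on the monitored host (not executed here):
#   add_allowed_host /etc/nagios/nrpe.cfg 192.168.1.132 && service nrpe restart
```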

# Nagios defines hosts through configuration files, so we only need to create a configuration file for the new host
[root@nagios ~]# cd /etc/nagios/objects/
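# This directory holds several configuration files with different purposes; we use the default localhost configuration as the template for the new host's file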
[root@nagios objects]# ls
commands.cfg  contacts.cfg  localhost.cfg  printer.cfg  switch.cfg  templates.cfg  timeperiods.cfg  windows.cfg
[root@nagios ~]# vim /etc/nagios/conf.d/web1.cfg
# Basic host and service check definitions
define host{
        use                     linux-server
        host_name               web1
        alias                   web1.com
        address                 192.168.1.130
        }

define service{
        use                             generic-service
        host_name                       web1
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        max_check_attempts              5
        normal_check_interval           1
        notification_interval           60
        }
        
define service{
        use                             generic-service
        host_name                       web1
        service_description             SSH
        check_command                   check_ssh
        notifications_enabled           0
        }
       
define service{
        use                             generic-service
        host_name                       web1
        service_description             HTTP
        check_command                   check_http
        notifications_enabled           1
        contact_groups                  admins
        notification_period             24x7
        notification_options            w,u,c,r
        }
# Check the configuration files for errors
[root@nagios ~]# nagios -v /etc/nagios/nagios.cfg

Nagios Core 3.5.1
Copyright (c) 2009-2011 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-30-2013
License: GPL

Website: http://www.nagios.org
Reading configuration data...
   Read main config file okay...
Processing object config file '/etc/nagios/objects/commands.cfg'...
Processing object config file '/etc/nagios/objects/contacts.cfg'...
Processing object config file '/etc/nagios/objects/timeperiods.cfg'...
Processing object config file '/etc/nagios/objects/templates.cfg'...
Processing object config file '/etc/nagios/objects/localhost.cfg'...
Processing object config file '/etc/nagios/objects/windows.cfg'...
Processing object config directory '/etc/nagios/conf.d'...
Processing object config file '/etc/nagios/conf.d/web1.cfg'...
   Read object config files okay...

Running pre-flight check on configuration data...

Checking services...
        Checked 25 services.
Checking hosts...
        Checked 3 hosts.
Checking host groups...
        Checked 2 host groups.
Checking service groups...
        Checked 0 service groups.
Checking contacts...
        Checked 1 contacts.
Checking contact groups...
        Checked 1 contact groups.
Checking service escalations...
        Checked 0 service escalations.
Checking service dependencies...
        Checked 0 service dependencies.
Checking host escalations...
        Checked 0 host escalations.
Checking host dependencies...
        Checked 0 host dependencies.
Checking commands...
        Checked 25 commands.
Checking time periods...
        Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check

[root@nagios ~]# /etc/init.d/nagios restart
Running configuration check...done.
Stopping nagios: done.
Starting nagios: done.
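
Because each monitored host gets its own file under /etc/nagios/conf.d, the same host and PING definitions can be generated for further hosts. The following is a minimal sketch. The function name `gen_host_cfg` and the example host `web2` with its address are hypothetical, while the templates `linux-server` and `generic-service` and the `check_ping` thresholds are copied from web1.cfg above.

```shell
#!/bin/sh
# gen_host_cfg NAME ALIAS ADDRESS (hypothetical helper)
# Print a host definition and a PING service modelled on web1.cfg.
gen_host_cfg() {
    cat <<EOF
define host{
        use                     linux-server
        host_name               $1
        alias                   $2
        address                 $3
        }

define service{
        use                     generic-service
        host_name               $1
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }
EOF
}

# Possible usage on the monitoring server (not executed here):
#   gen_host_cfg web2 web2.com 192.168.1.131 > /etc/nagios/conf.d/web2.cfg
#   nagios -v /etc/nagios/nagios.cfg
```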

On the Hosts page, a new host named web1 now appears.

[Image: wKioL1V9kuyhcW_IAALzpH9QWp8950.jpg]

Figure 1

[Image: wKiom1V-ObHzJIaVAAbXm2hU42A660.jpg]

Figure 2

The basic setup is now complete.

Additional features:

1. Monitor system load and disk status

# Edit the configuration on the monitoring server
[root@nagios ~]# vim /etc/nagios/objects/commands.cfg
..............................
# Add the following to the configuration
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }
..............................


[root@nagios ~]# vim /etc/nagios/conf.d/web1.cfg
# Add checks for system load and disk status
....................................
define service{
        use                             generic-service
        host_name                       web1
        service_description             check_load
        check_command                   check_nrpe!check_load
        max_check_attempts 5
        normal_check_interval 1
        }


define service{
        use                             generic-service
        host_name                       web1
        service_description             check_disk_hda1
        check_command                   check_nrpe!check_hda1
        max_check_attempts 5
        normal_check_interval 1
        }
[root@nagios ~]# nagios -v /etc/nagios/nagios.cfg
..........................................
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
[root@nagios ~]# /etc/init.d/nagios restart
Running configuration check...done.
Stopping nagios: done.
Starting nagios: done.
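
In a `check_command` value, Nagios separates the command name from its arguments with `!`, and the arguments fill `$ARG1$`, `$ARG2$`, and so on in the matching `command_line`. The sketch below only illustrates that splitting. The function name `split_check_command` is hypothetical and the code does not call Nagios.

```shell
#!/bin/sh
# split_check_command SPEC (hypothetical helper)
# Show how a check_command value breaks down into the command name
# and the $ARGn$ macros that Nagios substitutes into command_line.
split_check_command() {
    printf '%s\n' "$1" | awk -F'!' '{
        print "command_name=" $1
        for (i = 2; i <= NF; i++)
            print "$ARG" (i - 1) "$=" $i
    }'
}

split_check_command 'check_nrpe!check_load'
```

For `check_nrpe!check_load`, the command name is `check_nrpe` and `$ARG1$` is `check_load`, so the `command_line` defined in commands.cfg runs `check_nrpe -H` with the host address and `-c check_load`.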


Edit the configuration on the monitored host

[root@web ~]# vim /etc/nagios/nrpe.cfg
..........................................
# Change the trailing /dev/hda1 to /dev/sda1
command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sda1
.........................................
[root@web ~]# /etc/init.d/nrpe restart
Shutting down nrpe:                                        [  OK  ]
Starting nrpe:                                             [  OK  ]
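
The name that the monitoring server passes after `check_nrpe!` has to match a `command[...]` entry in nrpe.cfg on the monitored host. As a minimal sketch, the hypothetical helper `list_nrpe_commands` below prints the command names and command lines defined in such a file, which makes a mismatch easy to spot.

```shell
#!/bin/sh
# list_nrpe_commands FILE (hypothetical helper)
# Print "name -> command line" for every command[name]=... entry in FILE.
list_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=\(.*\)$/\1 -> \2/p' "$1"
}

# Possible usage on the monitored host (not executed here):
#   list_nrpe_commands /etc/nagios/nrpe.cfg
```

Commented-out entries and other settings such as `allowed_hosts` are skipped because only lines that start with `command[` match.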


Check the browser again: the two newly configured checks are now working.


[Image: wKiom1V-YSCByLKRAAZTfzdK98g236.jpg]


2. Configure alerts

# Edit the configuration on the monitoring server
[root@nagios ~]# vim /etc/nagios/objects/contacts.cfg

define contact{
        contact_name                    nagiosadmin            
        use                             generic-contact        
        alias                           Nagios Admin        
        email                           nagios@localhost   <== change this to your email address
        }
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
        }


# Edit the service to be monitored
[root@nagios ~]# vim /etc/nagios/conf.d/web1.cfg
.................................................

# Service definition for monitoring HTTP
define service{
        use                             generic-service
        host_name                       web1
        service_description             HTTP
        check_command                   check_http
        notifications_enabled           1
        contact_groups                  admins
        notification_period             24x7
        notification_options            w,u,c,r
        }
..................................................

[root@nagios ~]# nagios -v /etc/nagios/nagios.cfg

Nagios Core 3.5.1
Copyright (c) 2009-2011 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-30-2013
License: GPL

...........................................................

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
[root@nagios ~]# /etc/init.d/nagios restart
Running configuration check...done.
Stopping nagios: done.
Starting nagios: done.
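
The `notification_options` value `w,u,c,r` in the HTTP service selects which service states trigger a notification. The sketch below spells out those four letters. The function name `describe_notification_options` is hypothetical, and any other letter is reported as not covered by this sketch.

```shell
#!/bin/sh
# describe_notification_options LIST (hypothetical helper)
# Explain each letter of a comma-separated service notification_options value.
describe_notification_options() {
    old_ifs=$IFS
    IFS=,
    for opt in $1; do
        case $opt in
            w) echo "w: WARNING" ;;
            u) echo "u: UNKNOWN" ;;
            c) echo "c: CRITICAL" ;;
            r) echo "r: RECOVERY" ;;
            *) echo "$opt: not covered by this sketch" ;;
        esac
    done
    IFS=$old_ifs
}

describe_notification_options 'w,u,c,r'
```

With this setting, the `admins` contact group is notified when the HTTP service enters a WARNING, UNKNOWN, or CRITICAL state and again when it recovers.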

# Install and start the mail service used to send alerts
[root@nagios ~]# yum install -y sendmail
[root@nagios ~]# /etc/init.d/sendmail start
Starting sendmail:                                         [  OK  ]
Starting sm-client:                                        [  OK  ]


To test the alert email, stop the httpd service on the client.

[root@web ~]# /etc/init.d/httpd stop
Stopping httpd:                                            [  OK  ]



