centos6+nagios3.3.1+nrpe2.12

Nagios User Guide

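1. Introduction to Nagios

1.1 What is Nagios

  Nagios is an application for monitoring systems and networks. It watches the hosts and services you specify and sends alerts both when their state gets worse and when it recovers.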

 

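1.2 Features of Nagios

  Monitors network services (SMTP, POP3, HTTP, NNTP, ping)

  Monitors host resources (CPU load, disk usage, memory usage, etc.)

  A simple plugin design lets users easily add checks for their own services

  Parallel service checks

  Network hierarchy can be defined with "parent" hosts, which describe the relationships between hosts and let Nagios tell hosts that are down apart from hosts that are unreachable

  Contacts are notified when a service or host problem occurs and when it is resolved (by email, SMS, or a user-defined method)

  Event handlers can be defined to run when a host or service event occurs, for example to gather more information for locating the problem

  Automatic log rotation

  Support for redundant monitoring hosts

  An optional web interface shows the current network status, notification and problem history, log files, etc.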

 

1.3 What Nagios can do

Monitor Windows hosts

Monitor Linux/Unix hosts

Monitor NetWare servers

Monitor routers and switches

Monitor printers

Monitor publicly available services

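2. Installing Nagios

  2.1 Requirements

Hardware: Nagios has no special hardware requirements; an ordinary server or host can run it.

System: the machine must be able to run Linux (or Unix) and must have a C compiler.

Software: a web server (Apache)

Thomas Boutell's GD library and its development libraries (gd gd-devel glibc glibc-common)

The GCC compiler (gcc)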

 

2.2 Software to download

     Nagios core: nagios-cn-3.3.1.tar.tar.gz (used in this document)

     Nagios plugins: nagios-plugins-1.4.15.tar.gz  nrpe-2.12.tar.gz

English version download: http://www.nagios.org/download/

     Chinese version download: http://sourceforge.net/projects/nagios-cn/

 

2.3 Installing Nagios

  2.3.1 Create the nagios user and group

    #adduser nagios

    #mkdir /usr/local/nagios

    #chown nagios.nagios /usr/local/nagios

 

     2.3.2 Configure the build

#tar zxvf nagios.3.3.1.tar.gz

#cd nagios

         #./configure --prefix=/usr/local/nagios --with-nagios-user=nagios --with-nagios-group=nagios --with-command-group=nagios

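      2.3.3 Build and install

        #make all

        #make install     Installs the main program, CGIs, HTML files, etc.

        #make install-commandmode  Sets permissions on the directory that holds the external command file

        #make install-config  Copies the sample configuration files into the Nagios installation directory

        #make install-init   Installs the Nagios init script into init.d so that Nagios can be started at boot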

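      2.3.4 Check the result after building and installing

# ls /usr/local/nagios
bin  etc  libexec  sbin  share  var

If the directories above exist, Nagios was installed successfully.
The six directories serve the following purposes:
bin          Nagios executables (the nagios binary)
etc          Configuration files; after a fresh install it holds only the sample configuration files
libexec      Plugin programs and scripts
sbin         CGI files, which the web interface needs in order to run external commands
share        Web page files
var          Log files, the PID (lock) file, and other runtime files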
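The directory check above can also be scripted. The following is a minimal sketch, assuming the /usr/local/nagios prefix used in this guide; the function name check_nagios_dirs is made up for illustration and is not part of Nagios.

```shell
# check_nagios_dirs is a hypothetical helper, not shipped with Nagios.
# It prints every expected subdirectory that is missing under a prefix
# and returns 1 if anything is missing, 0 otherwise.
check_nagios_dirs() {
    prefix=${1:-/usr/local/nagios}   # install prefix assumed from this guide
    missing=0
    for d in bin etc libexec sbin share var; do
        if [ ! -d "$prefix/$d" ]; then
            echo "missing: $prefix/$d"
            missing=1
        fi
    done
    return "$missing"
}
```

Called without arguments it inspects /usr/local/nagios, for example: check_nagios_dirs && echo "layout looks complete".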

 

      2.3.5 Add Nagios to the Apache configuration

        #vim /etc/httpd/conf/httpd.conf

        Append the following to the end of the configuration file:

# Map the /nagios URLs onto the CGI and HTML directories
ScriptAlias /nagios/cgi-bin "/usr/local/nagios/sbin"
Alias /nagios "/usr/local/nagios/share"

<Directory "/usr/local/nagios/sbin">
    Options ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "Nagios Access"
    AuthType Basic
    AuthUserFile /usr/local/nagios/etc/htpasswd.users
    Require valid-user
</Directory>

<Directory "/usr/local/nagios/share">
    Options None
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "Nagios Access"
    AuthType Basic
    AuthUserFile /usr/local/nagios/etc/htpasswd.users
    Require valid-user
</Directory>

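2.3.6 Create the HTTP authentication file for the user nagiosadmin

        #/usr/bin/htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

 2.3.7 Restart Apache

  #service httpd restart

 2.3.8 Open the Nagios monitoring page

    Open a browser and go to: http://127.0.0.1/nagios/

    If the Nagios web interface loads after you log in as nagiosadmin, the installation so far is working.

2.3.9 Configure Nagios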
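Verify that the configuration is valid and that Nagios can run:
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Next, go through the main Nagios configuration file:
# vi /usr/local/nagios/etc/nagios.cfg
Uncomment the following lines. Every file listed must exist, or the -v check above reports an error.

cfg_file=/usr/local/nagios/etc/objects/contactgroups.cfg
// path to the contact group configuration file

cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
// path to the contact configuration file

cfg_file=/usr/local/nagios/etc/objects/hostgroups.cfg
// path to the host group configuration file

cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
// path to the host configuration file

cfg_file=/usr/local/nagios/etc/objects/services.cfg
// path to the service configuration file

cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
// path to the time period configuration file

Change check_external_commands=0 to 1. This allows actions from the web interface such as restarting Nagios and stopping host/service checks.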
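To confirm that an edit took effect, it can help to print the active (uncommented) value of a directive. The following is a minimal sketch for key=value style files such as the Nagios main configuration; get_directive is a hypothetical helper name, not a Nagios tool.

```shell
# get_directive is a hypothetical helper: print the value of every
# uncommented key=value line whose key matches the given name.
get_directive() {
    # $1 = directive name, $2 = configuration file
    awk -F= -v key="$1" '
        /^[ \t]*#/ { next }               # skip comment lines
        {
            name = $1
            gsub(/[ \t]/, "", name)       # tolerate spaces around the name
            if (name == key) {
                sub(/^[^=]*=/, "")        # keep everything after the first =
                sub(/^[ \t]+/, "")        # trim leading spaces of the value
                print
            }
        }
    ' "$2"
}
```

For example, get_directive check_external_commands /usr/local/nagios/etc/nagios.cfg should print 1 after the change, and get_directive cfg_file on the same file lists every object file that Nagios will load.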


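cgi.cfg
This file controls the CGI scripts.
#vi /usr/local/nagios/etc/cgi.cfg
Make sure use_authentication=1.
The authorized_for_* directives define what each logged-in user is allowed to do.
Setting them all to nagiosadmin is enough; you can also use another username that you created with htpasswd.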

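objects
The objects directory contains the following configuration files:
commands.cfg  localhost.cfg  printer.cfg  templates.cfg    windows.cfg
contacts.cfg     switch.cfg   timeperiods.cfg
commands.cfg      defines the available commands and what each one does
localhost.cfg     the default monitoring definitions for the local host
contacts.cfg      defines the contacts and how alerts are sent to them
templates.cfg     defines the templates that host and service definitions inherit through "use"
timeperiods.cfg   defines the time periods used for alerting
and so on.
Many of these files already ship with basic monitoring behavior configured and can be changed whenever needed.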

 

For a concrete configuration example, see /usr/local/nagios/etc/objects/localhost.cfg

 

2.4 Using Nagios plugins

2.4.1 Installing the Nagios plugins

#tar zxvf nagios-plugins-1.4.15.tar.gz

#cd nagios-plugins-1.4.15

#./configure --prefix=/usr/local/nagios   nagios-plugins is installed into the Nagios home directory

#make

#make install

#ls /usr/local/nagios/libexec (checks that the plugins were installed; on success this directory contains many executables)

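 Follow-up checks:

Check once more that the Nagios home directory is owned by nagios and not by root.
If the owner is wrong:
#chown -R nagios.nagios /usr/local/nagios
The nagios user does not need a login shell, so for better security:
#vi /etc/passwd
nagios:x:500:500::/home/nagios:/bin/bash
change it to:
nagios:x:500:500::/home/nagios:/sbin/nologin

After that the nagios user can no longer log in to a shell.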
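Instead of editing /etc/passwd by hand, the shell can usually also be changed as root with usermod -s. Either way, the result can be confirmed by reading the shell field of the account entry. A minimal sketch follows; login_shell_of is a made-up helper name.

```shell
# login_shell_of is a hypothetical helper: print the 7th (login shell)
# field of a user's entry in a passwd-format file.
login_shell_of() {
    # $1 = user name, $2 = passwd file (defaults to /etc/passwd)
    awk -F: -v user="$1" '$1 == user { print $7 }' "${2:-/etc/passwd}"
}
```

Running login_shell_of nagios after the change should print the nologin path rather than /bin/bash.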

 

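  2.4.2 Modify the Nagios plugin configuration

1)   Once the plugins are installed on the monitoring server, add the check_nrpe command definition to the command configuration:

# vim /usr/local/nagios/etc/objects/commands.cfg

#check nrpe

define command{
        command_name check_nrpe
        command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }

2) If check_nrpe is not present in /usr/local/nagios/libexec, install NRPE first and then copy check_nrpe from NRPE's libexec directory into /usr/local/nagios/libexec.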

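3. Running Nagios

   3.1 Check the configuration files

To verify the configuration files, run the following on the command line:

# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

….

Total Warnings: 0

Total Errors:   0

If there are any errors, the error messages are printed in the output.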
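When the verification runs from a script, the summary lines shown above can be checked automatically. The following is a minimal sketch that only relies on the "Total Errors:" line in that output; preflight_ok is a hypothetical helper name.

```shell
# preflight_ok is a hypothetical helper: read the text printed by
# "nagios -v" on stdin and succeed only if a "Total Errors:" line
# is present and reports 0.
preflight_ok() {
    awk '
        /^Total Errors:/ { seen = 1; errors = $3 }
        END {
            if (seen && errors == 0) exit 0
            exit 1
        }
    '
}
```

A possible use is /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | preflight_ok && echo "configuration OK", run before restarting or reloading Nagios.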

3.2 Starting and stopping Nagios

3.2.1 Start Nagios

Start with the init script:

#/etc/rc.d/init.d/nagios start

Start manually:

 #/usr/local/nagios/bin/nagios -d  /usr/local/nagios/etc/nagios.cfg

Reload Nagios:

#/etc/rc.d/init.d/nagios reload

Stop Nagios:

#/etc/rc.d/init.d/nagios stop (or kill the process directly)

 

4. Client installation and setup

4.1 Client installation

4.1.1 Add the nagios user
#useradd nagios

4.1.2 Install xinetd

 #yum install xinetd

4.1.3 Install the Nagios plugins
#tar -zxvf nagios-plugins-1.4.15.tar.gz
#cd nagios-plugins-1.4.15
#./configure --prefix=/usr/local/nagios
#make
#make install

4.1.4 Install NRPE
#tar -zxvf nrpe-2.12.tar.gz
#cd nrpe-2.12
#./configure --prefix=/usr/local/nagios
#make all
#make install-plugin
#make install-daemon
#make install-daemon-config
#make install-xinetd
    Installs NRPE as an xinetd service

 

  4.1.5 Fix ownership

   # chown -R nagios.nagios /usr/local/nagios

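4.2 Modify the configuration

1) Edit the NRPE xinetd configuration and add the address of the monitoring server:
#vi /etc/xinetd.d/nrpe
only_from  = 127.0.0.1 10.1.1.14

Note that the addresses here must be separated by spaces.

2) Edit the services file and add the NRPE port
#vi /etc/services
nrpe            5666/tcp                        #NRPE

3) Edit the nrpe.cfg configuration file

#vim /usr/local/nagios/etc/nrpe.cfg

  allowed_hosts= 127.0.0.1

  change it to: allowed_hosts= 127.0.0.1,10.1.1.14

4) Restart the xinetd service
#service xinetd restart

5) Check that the service is listening

# netstat -antp|grep 5666

tcp        0      0 :::5666                     :::*                        LISTEN      16690/xinetd     

6) Test the NRPE service
#/usr/local/nagios/libexec/check_nrpe -H localhost
NRPE v2.12

Note: if you see "Connection refused by host", you may need to install the OpenSSL packages with yum install openssl*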
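Step 3 above is easy to get wrong, because allowed_hosts is comma-separated while only_from in step 1 is space-separated. The following is a minimal sketch for checking that an address is really listed in nrpe.cfg; host_allowed is a hypothetical helper name, and it only compares literal addresses.

```shell
# host_allowed is a hypothetical helper: succeed if an address appears in
# the comma-separated allowed_hosts list of an nrpe.cfg style file.
# It compares entries as plain strings and does not understand networks.
host_allowed() {
    # $1 = IP address to look for, $2 = path to nrpe.cfg
    awk -F= -v ip="$1" '
        /^[ \t]*allowed_hosts[ \t]*=/ {
            n = split($2, hosts, ",")
            for (i = 1; i <= n; i++) {
                gsub(/[ \t]/, "", hosts[i])   # trim spaces around each entry
                if (hosts[i] == ip) found = 1
            }
        }
        END {
            if (found) exit 0
            exit 1
        }
    ' "$2"
}
```

With the addresses used in this guide, host_allowed 10.1.1.14 /usr/local/nagios/etc/nrpe.cfg should succeed once the file has been edited.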

 



Appendix:

The following is check_mem.pl

#! /usr/bin/perl -w
#
# check_mem v1.4 plugin for nagios
#
# uses the output of `free` to find the percentage of memory used
#
# Copyright Notice: GPL
#
# History:
#
# v1.4 Garrett Honeycutt
#       + Fixed PerfData output to adhere to standards and show crit/warn values
#
# v1.3 Rouven Homann
#       + Memory installed, used and free displayed in verbose mode
#       + Bit Code Cleanup
#
# v1.2 Rouven Homann
#       + Bug fixed where verbose output was required (nrpe2)
#       + Bug fixed where perfomance data was not displayed at verbose output
#       + FindBin Module used for the nagios plugin path of the utils.pm
#
# v1.1 Rouven Homann
#       + Status Support (-c, -w)
#       + Syntax Help Informations (-h)
#       + Version Informations Output (-V)
#       + Verbose Output (-v)
#       + Better Error Code Output (as described in plugin guideline)
#
# v1.0 Garrett Honeycutt
#       + Initial Release

use strict;
use FindBin;
use lib $FindBin::Bin;
use utils qw($TIMEOUT %ERRORS &print_revision &support);
use vars qw($PROGNAME);
use Getopt::Long;
use vars qw($opt_V $opt_h $verbose $opt_w $opt_c);

$PROGNAME = "check_mem";
sub print_help ();
sub print_usage ();

Getopt::Long::Configure('bundling');
GetOptions ("V"   => \$opt_V, "version"    => \$opt_V,
        "h"   => \$opt_h, "help"       => \$opt_h,
        "v" => \$verbose, "verbose"  => \$verbose,
        "w=s" => \$opt_w, "warning=s"  => \$opt_w,
        "c=s" => \$opt_c, "critical=s" => \$opt_c);

if ($opt_V) {
    print_revision($PROGNAME,'$Revision: 1.4 $');
    exit $ERRORS{'UNKNOWN'};
}

if ($opt_h) {
    print_help();
    exit $ERRORS{'UNKNOWN'};
}

# Both thresholds are required; print_usage() exits with UNKNOWN otherwise.
print_usage() unless (($opt_c) && ($opt_w));

# Extract the numeric part of each threshold. A list-context match is used
# because the behaviour of "my $x = ... if ..." is undefined in Perl.
my ($critical) = $opt_c =~ /([0-9]+)/;
my ($warning)  = $opt_w =~ /([0-9]+)/;

my $verbose = $verbose;

my ($mem_percent, $mem_total, $mem_used) = &sys_stats();
my $free_mem = $mem_total - $mem_used;

# Print the status line plus performance data and exit with the matching code.
if ($mem_percent>$critical) {
    if ($verbose) { print "CRITICAL: $mem_percent\% Used Memory - Total: $mem_total MB, used: $mem_used MB, free: $free_mem MB | MemUsed=$mem_percent\%;$warning;$critical\n";}
    else { print "CRITICAL: $mem_percent\% Used Memory | MemUsed=$mem_percent\%;$warning;$critical\n";};
    exit $ERRORS{'CRITICAL'};
} elsif ($mem_percent>$warning) {
    if ($verbose) { print "WARNING: $mem_percent\% Used Memory - Total: $mem_total MB, used: $mem_used MB, free: $free_mem MB | MemUsed=$mem_percent\%;$warning;$critical\n";}
    else { print "WARNING: $mem_percent\% Used Memory | MemUsed=$mem_percent\%;$warning;$critical\n";};
    exit $ERRORS{'WARNING'};
} else {
    if ($verbose) { print "OK: $mem_percent\% Used Memory - Total: $mem_total MB, used: $mem_used MB, free: $free_mem MB | MemUsed=$mem_percent\%;$warning;$critical\n"; }
    else { print "OK: $mem_percent\% Used Memory | MemUsed=$mem_percent\%;$warning;$critical\n";};
    exit $ERRORS{'OK'};
}

# Returns (percent used, total MB, used MB) parsed from the output of `free`.
sub sys_stats {
    my ($mem_total, $mem_used);

    # Total comes from the "Mem" line; used comes from the last line that
    # mentions "cache" (the buffers/cache-adjusted line in this free output).
    chomp($mem_total = `free -mt | grep Mem | awk '{print \$2}'`);
    chomp($mem_used = `free -mt | grep cache | tail -1 | awk '{print \$3}'`);

    my $mem_percent = ($mem_used / $mem_total) * 100;

    return (sprintf("%.0f",$mem_percent),$mem_total,$mem_used);
}

sub print_usage () {
    print "Usage: $PROGNAME [-w <warn>] [-c <crit>] [-v] [-h]\n";
    exit $ERRORS{'UNKNOWN'} unless ($opt_h);
}

sub print_help () {
    print_revision($PROGNAME,'$Revision: 1.4 $');
    print "Copyright (c) 2005 Garrett Honeycutt/Rouven Homann\n";
    print "\n";
    print_usage();
    print "\n";
    print "-w <warn> = Memory usage to activate a warning message.\n";
    print "-c <crit> = Memory usage to activate a critical message.\n";
    print "-v = Verbose Output.\n";
    print "-h = This screen.\n\n";
    support();
}

The following is check_df.pl
#!/usr/bin/perl -w
#
#check_df
#
# AUTHOR:
#   Dan Harkless       http://harkless.org/dan/software/
# COPYRIGHT:
#   This file is Copyright (C) 2005 by Dan Harkless, and is released under the
#   GNU General Public License <http://www.gnu.org/copyleft/gpl.html>.
#
# USAGE:
#   % check_df [<df_options>] [<percentage>]
#
# EXAMPLE:
#   % check_df -H 99
#
# DESCRIPTION: 
#   A script you can call from 'cron' to output (and thus email) a 'df' listing
#   if any of the filesystems' usage is greater than or equal to the percentage
#   you specify on the commandline.  If no percentage is specified, '95' is
#   assumed.
#
#   Any option strings (beginning with '-') will be passed along to df.
#
# DATE        MODIFICATION
# ==========  ==================================================================
# 2005-08-01  Added ability to pass options to df and removed default use of -k.
# 2004-11-17  Original.


$percentage = 95;

while (scalar @ARGV > 0) {
    $option = shift;

    if ($option =~ /^-/) {
        push @df_options, $option;
    }
    else {
        $percentage = $option;
    }
}

open(DF, "df @df_options|") or die;

use vars qw($header_line);
$header_line = <DF>;
push @df_output, $header_line;

while (<DF>) {
    push @df_output, $_;

    if (/(\d{1,3})%/) {
        if ($1 >= $percentage) {
            $at_or_above_percentage = 1;
        }
    }
}

close DF;

if ($at_or_above_percentage) {
    print @df_output;
}
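According to its own DESCRIPTION, check_df.pl is meant to be called from cron and only prints a df listing, whereas Nagios plugins such as check_mem.pl above report their state through the exit code. The following is a rough sketch of a disk check in that plugin style, not a replacement for the script above; df_status and its output wording are made up for illustration, and mount points containing spaces are not handled.

```shell
# df_status is a hypothetical helper, not part of check_df.pl or Nagios.
# It reads `df -P` output on stdin and reports the fullest filesystem using
# the plugin exit codes 0 = OK, 1 = WARNING, 2 = CRITICAL.
df_status() {
    # $1 = warning threshold (percent), $2 = critical threshold (percent)
    awk -v warn="$1" -v crit="$2" '
        BEGIN { max = -1 }
        NR == 1 { next }                  # skip the df header line
        {
            use = $5
            sub(/%$/, "", use)            # "92%" -> "92"
            if (use + 0 > max) { max = use + 0; mount = $6 }
        }
        END {
            if (max >= crit + 0)      { state = "CRITICAL"; code = 2 }
            else if (max >= warn + 0) { state = "WARNING";  code = 1 }
            else                      { state = "OK";       code = 0 }
            printf "DISK %s - %s at %d%% used\n", state, mount, max
            exit code
        }
    '
}
```

It would be fed from df, for example df -P | df_status 80 90, where 80 and 90 are example thresholds.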
