Nagios&Cacti

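Nagios + Cacti is honestly not as easy to use as Zabbix, but for service monitoring that only needs alerting and no charts, Nagios is a good fit. Because of an IDC migration, I redeployed our old Nagios + Cacti environment from scratch.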

Nagios:

  • Preparation:
apt-get install autoconf gcc libc6 build-essential bc gawk dc gettext \
libmcrypt-dev libssl-dev make unzip apache2 apache2-utils php5 libgd2-xpm-dev
/usr/sbin/useradd -m -s /bin/bash nagios # create the nagios user
/usr/sbin/groupadd nagcmd # create the nagcmd group (a group, not a user), used for running external commands
/usr/sbin/usermod -a -G nagcmd nagios
/usr/sbin/usermod -a -G nagcmd www-data
  • Installation:
tar zxvf nagios-4.3.1.tar.gz
cd nagios-4.3.1
./configure --prefix=/opt/nagios --with-command-group=nagcmd --with-httpd-conf=/etc/apache2/sites-enabled
make all
make install
make install-init
make install-config
make install-commandmode
update-rc.d nagios defaults # enable start at boot; the make install-* steps above set up the init script and configs
  • Nagios directory layout:
# ls /opt/nagios
bin  etc  libexec  log  sbin  share  var

The main Nagios configuration files live under etc, and the plugins go under libexec.

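  • Configuring Nagios:
    Our Nagios mainly monitors server hardware status, such as whether disks are healthy, and every check goes through NRPE to reduce the load on the monitoring server. Nagios configuration is modular: you can split it into multiple files and register each one in the main nagios.cfg as needed.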
# You can specify individual object config files as shown below:
cfg_file=/opt/nagios/etc/objects/commands.cfg
cfg_file=/opt/nagios/etc/objects/contacts.cfg
cfg_file=/opt/nagios/etc/objects/timeperiods.cfg
cfg_file=/opt/nagios/etc/objects/templates.cfg
#
cfg_file=/opt/nagios/etc/objects/service.cfg
cfg_file=/opt/nagios/etc/objects/group.cfg
# Definitions for monitoring the local (Linux) host
#cfg_file=/opt/nagios/etc/objects/localhost.cfg
cfg_file=/opt/nagios/etc/objects/host_debian.cfg
cfg_file=/opt/nagios/etc/objects/host_centos.cfg
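
Since every object file has to be registered by hand, it is easy to list a file that does not exist yet. The following is a minimal sketch, assuming the cfg_file= lines look like the ones above; check_cfg_files is a hypothetical helper name and not part of Nagios. It prints each registered file and returns non-zero if any is missing.

```shell
#!/bin/bash
# Sketch only: check_cfg_files is a hypothetical helper, not a Nagios tool.
# It reads the uncommented cfg_file= lines of a nagios.cfg and reports
# whether each referenced file exists.
check_cfg_files() {
    local main_cfg=$1 missing=0 path
    while IFS= read -r path; do
        # Skip the empty line produced when no cfg_file entries are found.
        [ -n "$path" ] || continue
        if [ -f "$path" ]; then
            echo "OK       $path"
        else
            echo "MISSING  $path"
            missing=1
        fi
    done <<EOF
$(sed -n 's/^cfg_file=//p' "$main_cfg")
EOF
    return "$missing"
}

# Example (path taken from this post):
#   check_cfg_files /opt/nagios/etc/nagios.cfg
```

Commented-out entries such as #cfg_file=... are ignored because the sed pattern only matches lines that start with cfg_file=.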

Then edit the corresponding files. Suppose I want to add a Linux server and monitor its disk hardware. The steps are:
1. Edit commands.cfg and add the matching command:

# check hardware Disk
define command{
        command_name check_storage_disk_nrpe
        command_line /opt/nagios/libexec/check_storage_disk_nrpe $HOSTADDRESS$ check_storage_disk
}

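Put the corresponding script under libexec. Roughly speaking, Nagios asks the remote machine (via NRPE) to run its check_storage_disk command, and check_storage_disk is a monitoring script that lives on the remote machine.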

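#!/bin/bash
# Wrapper around check_nrpe: runs a remote NRPE command and maps its
# output to Nagios exit codes (0 OK, 2 CRITICAL, 3 UNKNOWN).
PLUGINS=/opt/nagios/libexec
CHECK_NRPE=$PLUGINS/check_nrpe
if [ $# -lt 2 ];then
    echo "Usage: $0 host command"
    exit 2
fi
host=$1
comm=$2
#command_line    $USER1$/check_snmp_traffic $HOSTADDRESS$ public 3 " > 80 " " > 90 "
# -n disables SSL, -p is the NRPE port on the remote host
res=$("$CHECK_NRPE" -H "$host" -n -p 57000 -c "$comm")
if [ $? -ne 0 ];then
    # quote ${res}: unquoted multi-word output would break the test
    if [ "${res}" == "CHECK_NRPE: Socket timeout after 10 seconds." ];then
        # NOTE: a timeout is reported as OK (exit 0), so unreachable hosts do not alert here
        echo "connect failed"
        exit 0
    else
        echo "Check Storage UNKNOWN"
        exit 3
    fi
fi
if [ "${res}" == "Storage Disk Normal" ];then
    echo "Check Storage OK"
    exit 0
else
    echo "${res}"
    exit 2
fi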
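
The part of this wrapper that is easiest to get wrong is the mapping from check_nrpe's result to a Nagios state. As a rough sketch, the same mapping can be pulled out into a function that takes the exit status and the output text, so it can be exercised without a remote host. map_nrpe_result is a hypothetical name, and unlike the script above it matches any "Socket timeout" message rather than only the 10-second one.

```shell
#!/bin/bash
# Sketch only: map_nrpe_result is a hypothetical helper.
# $1 = exit status of check_nrpe, $2 = its output text.
# Prints the plugin message and returns the Nagios state code
# (0 OK, 2 CRITICAL, 3 UNKNOWN).
map_nrpe_result() {
    local status=$1 output=$2
    if [ "$status" -ne 0 ]; then
        case $output in
            "CHECK_NRPE: Socket timeout"*)
                # Same choice as the wrapper: a timeout does not raise an alert.
                echo "connect failed"
                return 0
                ;;
            *)
                echo "Check Storage UNKNOWN"
                return 3
                ;;
        esac
    fi
    if [ "$output" = "Storage Disk Normal" ]; then
        echo "Check Storage OK"
        return 0
    fi
    # Any other text from the remote script is passed through as CRITICAL.
    echo "$output"
    return 2
}

# Example:
#   map_nrpe_result 0 "Storage Disk Normal"   # prints "Check Storage OK"
```

If unreachable hosts should show up in the UI, returning 3 in the timeout branch instead of 0 is the obvious variation.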

The NRPE plugin can be downloaded from nagios.org.
Then register the service in service.cfg:

define service{
        use                             local-service
        hostgroup_name                  debian_servers
        service_description             hardware_disk_check
        check_command                   check_storage_disk_nrpe
        }

Then create the host and host group definitions:

define hostgroup{
        hostgroup_name  debian_servers
        alias           servers
        members         test
        }
define host{
        use                     linux-server
        host_name              test
        alias                   01
        address                 192.168.1.1
        }
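
When many servers need to be added, writing each host block by hand gets tedious. Here is a small sketch that prints a definition in the same shape as the one above; gen_host is a hypothetical helper, and linux-server is simply the template used in this post.

```shell
#!/bin/bash
# Sketch only: gen_host is a hypothetical helper that prints one
# "define host" block. Arguments: host_name, alias, address.
gen_host() {
    local name=$1 alias=$2 address=$3
    cat <<EOF
define host{
        use                     linux-server
        host_name               $name
        alias                   $alias
        address                 $address
        }
EOF
}

# Example (file name taken from the cfg_file list in this post):
#   gen_host db01 "database 01" 192.168.1.2 >> /opt/nagios/etc/objects/host_debian.cfg
```

The new host_name still has to be added to the members line of the relevant hostgroup, otherwise the hostgroup-based service will not apply to it.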

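Nagios web login is authenticated through Apache htpasswd, which is simple: to change a password, update the htpasswd file used by the CGIs. To change the Nagios login user, you need to update Apache's htpasswd file and also the authorized users in cgi.cfg.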
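
As a sketch of the cgi.cfg half of that change: the authorized_for_* lines in cgi.cfg hold comma-separated user names, so renaming the login user means rewriting those lines. rename_cgi_user below is a hypothetical helper, it assumes plain user names without regex metacharacters, and the htpasswd file path in the comments is an assumption about where this install keeps it.

```shell
#!/bin/bash
# Sketch only: rename_cgi_user is a hypothetical helper.
# Usage: rename_cgi_user OLD NEW FILE
# Prints FILE with OLD replaced by NEW, but only on authorized_for_* lines
# and only where OLD is a whole entry in the comma-separated list.
rename_cgi_user() {
    local old=$1 new=$2 file=$3
    sed -E "/^authorized_for_/ s/(=|,)${old}(,|\$)/\1${new}\2/g" "$file"
}

# Example (paths are assumptions for a /opt/nagios install):
#   htpasswd /opt/nagios/etc/htpasswd.users ops
#   rename_cgi_user nagiosadmin ops /opt/nagios/etc/cgi.cfg > /tmp/cgi.cfg.new
#   diff /opt/nagios/etc/cgi.cfg /tmp/cgi.cfg.new   # review before replacing
```

Writing to a separate file and reviewing the diff first is deliberate: a broken cgi.cfg locks everyone out of the web interface.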
Then check the Nagios configuration:

/opt/nagios/bin/nagios -v /opt/nagios/etc/nagios.cfg 

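Then start Nagios.
A source build may not leave you with a service script under init (make install-init is the step that installs one).
Here is one for reference: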

#!/bin/sh
# 
# chkconfig: 345 99 01
# description: Nagios network monitor
#
# File : nagios
#
# Author : Jorge Sanchez Aymar
# 
# Changelog :
#
# 1999-07-09 Karl DeBisschop 
#  - setup for autoconf
#  - add reload function
# 1999-08-06 Ethan Galstad 
#  - Added configuration info for use with RedHat's chkconfig tool
#    per Fran Boon's suggestion
# 1999-08-13 Jim Popovitch 
#  - added variable for nagios/var directory
#  - cd into nagios/var directory before creating tmp files on startup
# 1999-08-16 Ethan Galstad 
#  - Added test for rc.d directory as suggested by Karl DeBisschop
# 2000-07-23 Karl DeBisschop 
#  - Clean out redhat macros and other dependencies
# 2003-01-11 Ethan Galstad 
#  - Updated su syntax (Gary Miller)
#
# Description: Starts and stops the Nagios monitor
#              used to provide network services status.
#
  
status_nagios ()
{

    if test -x $NagiosCGI/daemonchk.cgi; then
        if $NagiosCGI/daemonchk.cgi -l $NagiosRunFile; then
                return 0
        else
            return 1
        fi
    else
        if ps -p $NagiosPID > /dev/null 2>&1; then
                return 0
        else
            return 1
        fi
    fi

    return 1
}


printstatus_nagios()
{

    if status_nagios $1 $2; then
        echo "nagios (pid $NagiosPID) is running..."
    else
        echo "nagios is not running"
    fi
}


killproc_nagios ()
{

    kill $2 $NagiosPID

}


pid_nagios ()
{

    if test ! -f $NagiosRunFile; then
        echo "No lock file found in $NagiosRunFile"
        exit 1
    fi

    NagiosPID=`head -n 1 $NagiosRunFile`
}


# Source function library
# Solaris doesn't have an rc.d directory, so do a test first
if [ -f /etc/rc.d/init.d/functions ]; then
    . /etc/rc.d/init.d/functions
elif [ -f /etc/init.d/functions ]; then
    . /etc/init.d/functions
fi

prefix=/opt/nagios
exec_prefix=${prefix}
NagiosBin=${exec_prefix}/bin/nagios
NagiosCfgFile=${prefix}/etc/nagios.cfg
NagiosStatusFile=${prefix}/var/status.dat
NagiosRetentionFile=${prefix}/var/retention.dat
NagiosCommandFile=${prefix}/var/rw/nagios.cmd
NagiosVarDir=${prefix}/var
NagiosRunFile=${prefix}/var/nagios.lock
NagiosLockDir=/var/lock/subsys
NagiosLockFile=nagios
NagiosCGIDir=${exec_prefix}/sbin
NagiosUser=nagios
NagiosGroup=nagios
          

# Check that nagios exists.
if [ ! -f $NagiosBin ]; then
    echo "Executable file $NagiosBin not found.  Exiting."
    exit 1
fi

# Check that nagios.cfg exists.
if [ ! -f $NagiosCfgFile ]; then
    echo "Configuration file $NagiosCfgFile not found.  Exiting."
    exit 1
fi
          
# See how we were called.
case "$1" in

    start)
        echo -n "Starting nagios:"
        $NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
        if [ $? -eq 0 ]; then
            su - $NagiosUser -c "touch $NagiosVarDir/nagios.log $NagiosRetentionFile"
            rm -f $NagiosCommandFile
            touch $NagiosRunFile
            chown $NagiosUser:$NagiosGroup $NagiosRunFile
            $NagiosBin -d $NagiosCfgFile
            if [ -d $NagiosLockDir ]; then touch $NagiosLockDir/$NagiosLockFile; fi
            echo " done."
            exit 0
        else
            echo "CONFIG ERROR!  Start aborted.  Check your Nagios configuration."
            exit 1
        fi
        ;;

    stop)
        echo -n "Stopping nagios: "

        pid_nagios
        killproc_nagios nagios

        # now we have to wait for nagios to exit and remove its
        # own NagiosRunFile, otherwise a following "start" could
        # happen, and then the exiting nagios will remove the
        # new NagiosRunFile, allowing multiple nagios daemons
        # to (sooner or later) run - John Sellens
        #echo -n 'Waiting for nagios to exit .'
        for i in 1 2 3 4 5 6 7 8 9 10 ; do
            if status_nagios > /dev/null; then
            echo -n '.'
            sleep 1
            else
            break
            fi
        done
        if status_nagios > /dev/null; then
            echo ''
            echo 'Warning - nagios did not exit in a timely manner'
        else
            echo 'done.'
        fi

        rm -f $NagiosStatusFile $NagiosRunFile $NagiosLockDir/$NagiosLockFile $NagiosCommandFile
        ;;

    status)
        pid_nagios
        printstatus_nagios nagios
        ;;

    checkconfig)
        printf "Running configuration check..."
        $NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
        if [ $? -eq 0 ]; then
            echo " OK."
        else
            echo " CONFIG ERROR!  Check your Nagios configuration."
            exit 1
        fi
        ;;

    restart)
        printf "Running configuration check..."
        $NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
        if [ $? -eq 0 ]; then
            echo "done."
            $0 stop
            $0 start
        else
            echo " CONFIG ERROR!  Restart aborted.  Check your Nagios configuration."
            exit 1
        fi
        ;;

    reload|force-reload)
        printf "Running configuration check..."
        $NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
        if [ $? -eq 0 ]; then
            echo "done."
            if test ! -f $NagiosRunFile; then
                $0 start
            else
                pid_nagios
                if status_nagios > /dev/null; then
                    printf "Reloading nagios configuration..."
                    killproc_nagios nagios -HUP
                    echo "done"
                else
                    $0 stop
                    $0 start
                fi
            fi
        else
            echo " CONFIG ERROR!  Reload aborted.  Check your Nagios configuration."
            exit 1
        fi
        ;;

    *)
        echo "Usage: nagios {start|stop|restart|reload|force-reload|status|checkconfig}"
        exit 1
        ;;

esac
  
# End of this script

Then log in to the web interface and verify.

Cacti:

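Cacti handles the graphing. Nagios can produce graphs through pnp4nagios, but the experience is not great. Cacti is quite good for customized monitoring charts, even though both rely on rrdtool underneath.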

  • Preparation:
apt-get install rrdtool  php5 mysql-server

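php5 actually needs more packages than this; more on that later.
After downloading Cacti, extract it and enter the directory, then log in to MySQL and import the Cacti tables: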

mysql> create database cacti;
Query OK, 1 row affected (0.00 sec)
mysql> use cacti;
mysql> source cacti.sql;
mysql> GRANT ALL PRIVILEGES ON cacti.* TO 'cacti'@'127.0.0.1' IDENTIFIED BY 'cacti';

Edit the configuration file:

vi include/config.php
$database_type     = 'mysql';
$database_default  = 'cacti';
$database_hostname = '127.0.0.1';
$database_username = 'cacti';
$database_password = 'cacti';
$database_port     = '3306';
$database_ssl      = false;
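
A mismatch between these values and the GRANT statement (for example localhost versus 127.0.0.1) is a common reason the installer cannot connect. As a rough sketch, the settings can be read back out of config.php and used for a quick connection test; cacti_db_setting is a hypothetical helper and assumes the simple one-assignment-per-line layout shown above.

```shell
#!/bin/bash
# Sketch only: cacti_db_setting is a hypothetical helper.
# Usage: cacti_db_setting KEY FILE
# Prints the value of $database_KEY from a Cacti config.php, with or
# without surrounding single quotes.
cacti_db_setting() {
    local key=$1 file=$2
    sed -nE "s/^\\\$database_${key}[[:space:]]*=[[:space:]]*'?([^';]*)'?;.*/\\1/p" "$file"
}

# Example: try the same credentials Cacti will use (prompts for the password).
#   cfg=include/config.php
#   mysql -h "$(cacti_db_setting hostname "$cfg")" \
#         -P "$(cacti_db_setting port "$cfg")" \
#         -u "$(cacti_db_setting username "$cfg")" -p \
#         "$(cacti_db_setting default "$cfg")" -e 'SHOW TABLES;'
```

If the mysql command lists the Cacti tables, the database side is ready and any remaining installer errors are on the PHP side.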

Then open ip/cacti in a browser and the installation wizard appears.
The default user is admin with password admin.


[Image 1: Nagios&Cacti]

The wizard reports which packages are missing; install them:

[Image 2: Nagios&Cacti]

Newer versions of Cacti run into a MySQL time zone permission problem, which is the error shown in the image above. It needs to be fixed:

mysql> GRANT SELECT ON mysql.time_zone_name TO cacti@'127.0.0.1';
mysql_tzinfo_to_sql /usr/share/zoneinfo/ | mysql -u root -p mysql

After that, click Next and the installation completes.

[Image 3: Nagios&Cacti]

After that, configure SNMP for monitoring and graphing.

Useful links:
http://exchange.nagios.org
http://forums.cacti.net
