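Installing Nagios on Ubuntu 14.04 & Debian 8.9
Nagios is a monitoring system that tracks system status and network information. It can monitor specified local or remote hosts and services, and it sends notifications when something goes wrong.
Environment: Ubuntu 14.04, using the latest releases of Nagios and the Nagios plugins at the time of writing.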
wget https://assets.nagios.com/downloads/nagioscore/releases/nagios-4.3.4.tar.gz
wget https://nagios-plugins.org/download/nagios-plugins-2.2.1.tar.gz
wget https://nchc.dl.sourceforge.net/project/pnp4nagios/PNP-0.6/pnp4nagios-0.6.26.tar.gz
wget https://github.com/NagiosEnterprises/nrpe/releases/download/nrpe-3.2.1/nrpe-3.2.1.tar.gz
1. Compile and install Nagios
Adding the Nagios User and Group:
useradd nagios
groupadd nagcmd
usermod -a -G nagcmd nagios
usermod -a -G nagios,nagcmd www-data
sudo apt-get install wget build-essential apache2 php5 php5-gd libgd-dev unzip
tar -zxvf nagios-4.3.4.tar.gz
cd nagios-4.3.4
./configure --with-command-group=nagcmd --with-mail=/usr/bin/sendmail --with-httpd-conf=/etc/apache2/
make all
make install
make install-init
make install-config
make install-commandmode
make install-webconf
cp -R contrib/eventhandlers/ /usr/local/nagios/libexec/
chown -R nagios:nagios /usr/local/nagios/libexec/eventhandlers
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
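The verification command above ends with a summary of warnings and errors. As a rough sketch of how that summary could be checked from a script, the Python below extracts the totals from captured output. It assumes the summary lines look like "Total Warnings: N" and "Total Errors: N"; the sample text and the function names are made up for illustration.

```python
import re


def parse_verify_summary(output: str) -> dict:
    """Extract the warning and error totals from captured `nagios -v` output."""
    totals = {}
    for label in ("Warnings", "Errors"):
        match = re.search(r"Total %s:\s*(\d+)" % label, output)
        if match:
            totals[label.lower()] = int(match.group(1))
    return totals


def config_is_clean(output: str) -> bool:
    """True only when an error total was found and it is zero."""
    return parse_verify_summary(output).get("errors") == 0


# Hypothetical sample of the last lines printed by the verification run.
SAMPLE = "Checking misc settings...\n\nTotal Warnings: 1\nTotal Errors:   0\n"

if __name__ == "__main__":
    print(parse_verify_summary(SAMPLE))
    print(config_is_clean(SAMPLE))
```

Treating missing totals as "not clean" is a deliberate choice here: if the binary failed before printing a summary, the configuration should not be considered verified.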
Modify the Apache2 configuration:
sudo a2enmod rewrite cgi
cp /etc/apache2/nagios.conf /etc/apache2/conf-enabled/
vi /etc/apache2/apache2.conf
Add the following line:
ServerName localhost
Restart Apache2:
service apache2 restart
Modify the Nagios init script:
sudo vi /etc/init.d/nagios (add the following:)
DESC="Nagios"
NAME=nagios
DAEMON=/usr/local/nagios/bin/$NAME
DAEMON_ARGS="-d /usr/local/nagios/etc/nagios.cfg"
PIDFILE=/usr/local/nagios/var/$NAME.lock
Add a default user for web interface access (set the login account and password):
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
2. Install the Nagios plugins:
tar -zxvf nagios-plugins-2.2.1.tar.gz
cd nagios-plugins-2.2.1
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make
make install
Enable Nagios at boot:
sudo update-rc.d nagios defaults
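3. Compile and install NRPE
NRPE is a Nagios add-on that runs on the monitored servers and reports their local state to the Nagios monitoring server, for example CPU load, memory usage and disk usage. NRPE can be thought of as the Nagios client for Linux.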
apt-get install openssl libssl-dev
tar -zxvf nrpe-3.2.1.tar.gz
cd nrpe-3.2.1
./configure --with-ssl=/usr/bin/openssl --with-ssl-lib=/usr/lib/x86_64-linux-gnu --enable-command-args
or
./configure
make all
make install-plugin
make install-daemon
make install-daemon-config
make install-config
make install-inetd
make install-init
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
/usr/local/nagios/libexec/check_nrpe -H localhost
Edit commands.cfg (add the following):
vi /usr/local/nagios/etc/objects/commands.cfg
define command{
command_name check_nrpe
command_line /usr/local/nagios/libexec/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
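A successful reply from the check_nrpe test above (it normally prints the NRPE version) shows that check_nrpe can reach the NRPE daemon. Every other monitored host should be verified with check_nrpe in the same way.
Add the following to /etc/rc.local (vi /etc/rc.local) to start NRPE at boot: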
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
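Monitoring network traffic with check_iftraffic
Install SNMP:
sudo apt-get install snmpd snmp snmp-mibs-downloader
1. In /etc/snmp/snmpd.conf, change the existing "agentAddress udp:127.0.0.1:161" to the following (172.16.126.137 is this machine's IP, i.e. the IP of the host that the monitoring server will monitor):
agentAddress 127.0.0.1,172.16.126.137
2. Add the following line:
access MyROSystem "" any noauth exact all none none
3. Remove the "-V systemonly" parameter from the existing "rocommunity public default -V systemonly", so that it becomes:
rocommunity public default
4. Remove the leading "#" from "#trap2sink localhost public" and "#informsink localhost public", so that they become:
trap2sink localhost public
informsink localhost public
5. Restart the SNMP service:
/etc/init.d/snmpd restart
6. Verify that SNMP returns data: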
snmpwalk -v 2c -c public 192.168.1.9
Add the following to nrpe.cfg:
command[check_iftraffic]=/usr/local/nagios/libexec/check_iftraffic -i eth0 -w5 -c 10 -b 10 -u m
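To make the thresholds in the command above easier to reason about, here is a small Python sketch of how a traffic check could map utilisation onto the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). It assumes that -w and -c are percentages of the bandwidth given with -b, and that -u m means megabits. This is an illustration only, not the actual check_iftraffic code, and the function names are hypothetical.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def utilisation_percent(bits_per_second: float, bandwidth_mbit: float) -> float:
    """Link utilisation as a percentage of the configured bandwidth (in Mbit/s)."""
    return 100.0 * bits_per_second / (bandwidth_mbit * 1_000_000)


def classify(percent: float, warn: float, crit: float) -> int:
    """Map a utilisation percentage onto a Nagios status code."""
    if percent < 0:
        return UNKNOWN  # a negative rate means the measurement is unusable
    if percent >= crit:
        return CRITICAL
    if percent >= warn:
        return WARNING
    return OK


if __name__ == "__main__":
    # With -w 5 -c 10 -b 10 -u m, 500 kbit/s on a 10 Mbit link is 5 percent.
    usage = utilisation_percent(500_000, 10)
    print(usage, classify(usage, warn=5, crit=10))
```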
Service definition:
define service{
use generic-service
host_name localhost
service_description iftraffic
check_command check_nrpe!check_iftraffic
}
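The check_command value "check_nrpe!check_iftraffic" names the command and passes one argument, which Nagios substitutes for $ARG1$ in the command_line defined earlier. The following Python sketch imitates that substitution to show what ends up being executed. It is a simplified model that handles only $HOSTADDRESS$ and $ARGn$; the function name is hypothetical.

```python
def expand_check_command(check_command: str, command_line: str, host_address: str) -> str:
    """Expand $HOSTADDRESS$ and $ARGn$ in a command_line, Nagios style."""
    # "name!arg1!arg2" -> the command name followed by positional arguments.
    _name, *args = check_command.split("!")
    macros = {"$HOSTADDRESS$": host_address}
    for index, arg in enumerate(args, start=1):
        macros["$ARG%d$" % index] = arg
    for macro, value in macros.items():
        command_line = command_line.replace(macro, value)
    return command_line


if __name__ == "__main__":
    line = "/usr/local/nagios/libexec/check_nrpe -H $HOSTADDRESS$ -c $ARG1$"
    print(expand_check_command("check_nrpe!check_iftraffic", line, "127.0.0.1"))
```

For the localhost service above, the result is a check_nrpe call that asks the NRPE daemon to run the check_iftraffic entry from nrpe.cfg.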
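If the check reports "unable to read output", look for the traffic_ifeth* file under /tmp on the monitored host and change its owner to nagios.
4. Install pnp4nagios
pnp4nagios is a Nagios add-on that presents the aggregated data stored in RRD (round-robin) databases to the user as graphs.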
apt-get install rrdtool librrds-perl
tar zxvf pnp4nagios-0.6.26.tar.gz
cd pnp4nagios-0.6.26
./configure --prefix=/usr/local/pnp4nagios
make all
make install
make install-webconf
make install-config
make install-init
Configure pnp4nagios:
cd /usr/local/pnp4nagios/etc/
mv misccommands.cfg-sample misccommands.cfg
mv nagios.cfg-sample nagios.cfg
mv rra.cfg-sample rra.cfg
cd pages/
mv web_traffic.cfg-sample web_traffic.cfg
Configure the pnp4nagios web pages:
cp /etc/httpd/conf.d/pnp4nagios.conf /etc/apache2/conf-enabled/
cd /usr/local/pnp4nagios/share/
mv install.php install.php.bak
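To debug process_perfdata, enable debug mode by editing /usr/local/pnp4nagios/etc/process_perfdata.cfg:
LOG_LEVEL = 2
Errors are then written to /usr/local/pnp4nagios/var/perfdata.log, which makes it easier to find the cause.
Synchronous Mode appears to be buggy with Nagios 4.x: the log keeps reporting that the Nagios environment cannot be found, even though, according to the official documentation, setting enable_environment_macros=1 should make the Nagios environment available.
Configuring Bulk Mode with NPCD:
Edit nagios.cfg:
vi /usr/local/nagios/etc/nagios.cfg
Change the following from 0 to 1: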
process_performance_data=1
enable_environment_macros=1
Add the following:
# service performance data
#
service_perfdata_file=/usr/local/pnp4nagios/var/service-perfdata
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=process-service-perfdata-file
#
# host performance data #
#
host_perfdata_file=/usr/local/pnp4nagios/var/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=process-host-perfdata-file
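With the templates above, each line of the perfdata files is a tab-separated list of KEY::value fields. The Python sketch below shows one way such a line could be split back into fields, plus a simplified reader for the "label=value;warn;crit;min;max" metrics. The sample values are invented, and the metric parser ignores quoted labels that contain spaces.

```python
def parse_perfdata_line(line: str) -> dict:
    """Split one bulk-mode perfdata line into its KEY::value fields."""
    record = {}
    for field in line.rstrip("\n").split("\t"):
        key, sep, value = field.partition("::")
        if sep:
            record[key] = value
    return record


def parse_metrics(perfdata: str) -> dict:
    """Return {label: value-with-unit} for each metric (simplified)."""
    metrics = {}
    for item in perfdata.split():
        label, sep, rest = item.partition("=")
        if sep:
            metrics[label] = rest.split(";")[0]
    return metrics


# Hypothetical line in the service_perfdata_file_template format (shortened).
SAMPLE = (
    "DATATYPE::SERVICEPERFDATA\tTIMET::1506225859\tHOSTNAME::nagios-server\t"
    "SERVICEDESC::PING\tSERVICEPERFDATA::rta=0.05ms;100.000;500.000;0; pl=0%;20;60;;\n"
)

if __name__ == "__main__":
    record = parse_perfdata_line(SAMPLE)
    print(record["HOSTNAME"], record["SERVICEDESC"])
    print(parse_metrics(record["SERVICEPERFDATA"]))
```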
Edit commands.cfg:
vi /usr/local/nagios/etc/objects/commands.cfg
Comment out the existing 'process-host-perfdata' and 'process-service-perfdata' definitions, then add the following:
# Bulk with NPCD mode
#
define command {
command_name process-service-perfdata-file
command_line /bin/mv /usr/local/pnp4nagios/var/service-perfdata /usr/local/pnp4nagios/var/spool/service-perfdata.$TIMET$
}
define command {
command_name process-host-perfdata-file
command_line /bin/mv /usr/local/pnp4nagios/var/host-perfdata /usr/local/pnp4nagios/var/spool/host-perfdata.$TIMET$
}
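The two commands above simply move the accumulated perfdata file into the spool directory under a name that ends in the current timestamp, where NPCD picks it up. As an illustration of that hand-off, here is a Python sketch of the same move. The function names are hypothetical and the demo works in a throwaway directory rather than /usr/local/pnp4nagios.

```python
import os
import shutil
import tempfile
import time


def rotate_to_spool(perfdata_file, spool_dir, timet=None):
    """Move perfdata_file to spool_dir/<basename>.<timestamp>; None if nothing to move."""
    if not os.path.exists(perfdata_file):
        return None
    if timet is None:
        timet = int(time.time())  # plays the role of $TIMET$
    target = os.path.join(spool_dir, "%s.%d" % (os.path.basename(perfdata_file), timet))
    shutil.move(perfdata_file, target)
    return target


def demo():
    """Run one rotation in a temporary directory and report the outcome."""
    with tempfile.TemporaryDirectory() as root:
        spool = os.path.join(root, "spool")
        os.mkdir(spool)
        source = os.path.join(root, "service-perfdata")
        with open(source, "w") as handle:
            handle.write("DATATYPE::SERVICEPERFDATA\n")
        target = rotate_to_spool(source, spool, timet=1506225859)
        return os.path.basename(target), os.path.exists(source), os.listdir(spool)


if __name__ == "__main__":
    print(demo())
```

Because the file is moved rather than copied, Nagios starts a fresh perfdata file on its next write, and each spool file is processed once.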
Add the following templates to templates.cfg:
vi /usr/local/nagios/etc/objects/templates.cfg
define host {
name hosts-pnp
register 0
action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=_HOST_
}
define service {
name services-pnp
register 0
action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
}
Edit hosts.cfg and services.cfg so that the hosts and services also use the hosts-pnp and services-pnp templates:
define host{
use linux-server,hosts-pnp ; Name of host template to use
; This host definition will inherit all variables that are defined
; in (or inherited by) the linux-server host template definition.
host_name nagios-server
alias nagios
address 192.168.20.132
}
# Define a service to "ping" the local machine
define service{
use local-service,services-pnp ; Name of service template to use
host_name nagios-server
service_description PING
check_command check_ping!100.0,20%!500.0,60%
}
# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
#< 10% free space on partition.
define service{
use local-service,services-pnp ; Name of service template to use
host_name nagios-server
service_description Root Partition
check_command check_local_disk!20%!10%!/
}
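The definitions above list two templates in "use". When several templates are listed, Nagios gives precedence to the first one, and directives set on the object itself override anything inherited. The Python sketch below models that merge. The template contents (max_check_attempts, notes) are invented for illustration, and the model assumes that name and register are not inherited.

```python
NOT_INHERITED = {"name", "register", "use"}


def effective_directives(obj: dict, templates: dict) -> dict:
    """Merge inherited directives: the object wins, then templates left to right."""
    merged = {}
    parents = [p.strip() for p in obj.get("use", "").split(",") if p.strip()]
    # Apply the last template first so that earlier templates overwrite it.
    for parent in reversed(parents):
        merged.update(effective_directives(templates[parent], templates))
    merged.update(obj)
    return {key: value for key, value in merged.items() if key not in NOT_INHERITED}


# Hypothetical template contents, for illustration only.
TEMPLATES = {
    "local-service": {"name": "local-service", "register": "0",
                      "max_check_attempts": "4", "notes": "from local-service"},
    "services-pnp": {"name": "services-pnp", "register": "0",
                     "action_url": "/pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$",
                     "notes": "from services-pnp"},
}
SERVICE = {"use": "local-service,services-pnp", "host_name": "nagios-server",
           "service_description": "PING"}

if __name__ == "__main__":
    for key, value in sorted(effective_directives(SERVICE, TEMPLATES).items()):
        print(key, "=", value)
```

In this model the service gets its action_url from services-pnp, since local-service does not define one, while the conflicting notes value comes from local-service because it is listed first.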
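Restart Nagios (and make sure the npcd daemon installed by make install-init is running); the graphs then start to appear.
The default templates are used in most cases; custom templates can be written for specific needs.
Nagios web interface: Visual Shell
The web interface that ships with Nagios is a set of CGI programs and is currently available only in English. If you change its layout by hand, or localize the core CGI programs (written in C), the changes have to be redone on every version update, which is tedious work. To get around this, we can read the status information of the Nagios monitoring system, that is the var/status.dat file, and redesign the display based on the field values it contains.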
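To give an idea of what reading var/status.dat involves, here is a minimal Python sketch that parses blocks of the form "blockname { key=value ... }" into dictionaries. It assumes one key=value pair per line and a closing brace on its own line; the sample content is invented and much shorter than a real status file.

```python
def parse_status_dat(text: str) -> list:
    """Parse status.dat-style text into a list of dicts, one per block."""
    blocks = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("{"):
            # Remember the block type, e.g. "hoststatus" or "servicestatus".
            current = {"_block": line[:-1].strip()}
        elif line == "}":
            if current is not None:
                blocks.append(current)
            current = None
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key] = value
    return blocks


# Hypothetical, heavily shortened sample.
SAMPLE = """
hoststatus {
\thost_name=nagios-server
\tcurrent_state=0
\t}

servicestatus {
\thost_name=nagios-server
\tservice_description=PING
\tcurrent_state=2
\tplugin_output=PING CRITICAL - Packet loss = 100%
\t}
"""

if __name__ == "__main__":
    for block in parse_status_dat(SAMPLE):
        print(block["_block"], block.get("host_name"), block.get("current_state"))
```

Splitting on the first "=" only keeps values such as plugin output intact even when they contain further "=" characters.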
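Nagios Visual Shell (V-Shell) is a lightweight open-source Nagios web front end written in PHP; the XHTML+CSS it outputs passes W3C validation. Personally, I find V-Shell's color scheme mediocre and tiring to look at after a while; the visual design needs work in future versions.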
apt-get install php-apc
V-Shell takes advantage of the Alternative PHP Cache system (APC), which is of particular importance if you are managing a large installation.
You may need to increase your PHP 'memory_limit' setting in your system's php.ini file if you're using APC. For large systems try a setting as high as 256-512MB.
wget http://assets.nagios.com/downloads/exchange/nagiosvshell/vshell.tar.gz
tar zxf vshell.tar.gz
cd vshell
Edit install.php:
define('TARGETDIR',"/usr/local/vshell");
define('APACHECONF',"/etc/httpd/conf.d"); (centos)
define('APACHECONF',"/etc/apache2/conf.d"); (ubuntu)
chmod +x install.php
./install.php
Nginx reverse proxy (port mapping):
location /nagios {
# root html;
# index index.html index.htm;
proxy_pass http://172.16.126.137:8081;
proxy_redirect off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
Alert configuration
vi contacts.cfg
define contact{
contact_name nagiosadmin ; Short name of user
use generic-contact ; Inherit default values from generic-contact template (defined above)
alias Nagios Admin ; Full name of user
email [email protected] ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
address4 CuiSongBiao
}
vi commands.cfg
# 'notify-host-by-email' command definition
define command{
command_name notify-host-by-email
command_line /usr/local/nagios/script/sendmail.py -t $CONTACTEMAIL$ -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" -m "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n"
}
# 'notify-service-by-email' command definition
define command{
command_name notify-service-by-email
command_line /usr/local/nagios/script/sendmail.py -t $CONTACTEMAIL$ -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" -m "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$"
}
cd /usr/local/nagios/script
vi sendmail.py
#!/usr/bin/env python
# coding=utf-8
# Script is Sendmail
# Version 1.0.5
import getopt
import smtplib
import sys


def usage():
    print("""sendmail is a send mail Plugins
Usage:
sendmail [-h|--help][-t|--to][-s|--subject][-m|--message]
Options:
--help|-h)
print sendmail help.
--to|-t)
Sets sendmail to email.
--subject|-s)
Sets the mail subject.
--message|-m)
Sets the mail body
Example:
./sendmail.py -t '[email protected]' -s 'hello' -m 'hello ,this is send mail test!'""")
    sys.exit(3)


try:
    options, args = getopt.getopt(sys.argv[1:], "ht:s:m:", ["help", "to=", "subject=", "message="])
except getopt.GetoptError:
    usage()

TO, SUBJECT, MESSAGE = [], "", ""
for name, value in options:
    if name in ("-h", "--help"):
        usage()
    if name in ("-t", "--to"):
        # recipients, separated by commas
        TO = value.split(",")
    if name in ("-s", "--subject"):
        SUBJECT = value
    if name in ("-m", "--message"):
        # Nagios passes line breaks as a literal "\n"; turn them into real newlines
        MESSAGE = "\n".join(value.split("\\n"))

# SMTP host
HOST = "smtp.126.com"
# SMTP port
PORT = 465
# FROM mail user
USER = 'ops'
# FROM mail password
PASSWD = 'xxx123'
# FROM email
FROM = "[email protected]"

try:
    BODY = "\r\n".join((
        "From: %s" % FROM,
        "To: %s" % ", ".join(TO),
        "Subject: %s" % SUBJECT,
        "",
        MESSAGE))
    smtp = smtplib.SMTP_SSL(HOST, PORT)
    smtp.login(USER, PASSWD)
    smtp.sendmail(FROM, TO, BODY)
    smtp.quit()
except Exception as e:
    print(str(e))
    print("please look help")
    print("./sendmail.py -h")
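The script assembles the mail headers by hand. As an alternative sketch, the standard library's email.message.EmailMessage can build the same message and takes care of header formatting. This is Python 3 only and is not part of the original script; the addresses used below are placeholders and the helper names are hypothetical.

```python
from email.message import EmailMessage


def unescape_newlines(text: str) -> str:
    """Turn the literal backslash-n sequences passed by Nagios into real newlines."""
    return text.replace("\\n", "\n")


def build_message(sender: str, recipients: list, subject: str, body: str) -> EmailMessage:
    """Build a plain-text notification mail without sending it."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = ", ".join(recipients)
    message["Subject"] = subject
    message.set_content(unescape_newlines(body))
    return message


if __name__ == "__main__":
    # Placeholder addresses, for illustration only.
    mail = build_message(
        "nagios@example.com",
        ["admin@example.com", "ops@example.com"],
        "** PROBLEM Host Alert: nagios-server is DOWN **",
        "***** Nagios *****\\n\\nHost: nagios-server\\nState: DOWN\\n",
    )
    print(mail)
```

A message built this way could then be handed to smtplib's send_message method on an authenticated SMTP_SSL connection.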
2.1 Apply for a WeChat enterprise account:
CorpID = 'wwfd5cccccc6dbe60'
AgentId = '1000002'
Secret = 'FBdnNGxxxxxxxxEAk4M8UJpma3d23P0'
2.2 Set up the contact
vi contacts.cfg
define contact{
contact_name nagiosadmin ; Short name of user
use generic-contact ; Inherit default values from generic-contact template (defined above)
alias Nagios Admin ; Full name of user
email [email protected] ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
address4 CuiSongBiao
}
2.3 Set up the commands
vi commands.cfg
# 'notify-host-by-weixin' command definition
define command{
command_name notify-host-by-weixin
command_line /usr/local/nagios/script/weixin.py $CONTACTADDRESS4$ "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n"
}
# 'notify-service-by-weixin' command definition
define command{
command_name notify-service-by-weixin
command_line /usr/local/nagios/script/weixin.py $CONTACTADDRESS4$ "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$"
}
2.4 Set up the template
vi templates.cfg
define contact{
name generic-contact ; The name of this contact template
service_notification_period 24x7 ; service notifications can be sent anytime
host_notification_period 24x7 ; host notifications can be sent anytime
service_notification_options w,u,c,r,f,s ; send notifications for all service states, flapping events, and scheduled downtime events
host_notification_options d,u,r,f,s ; send notifications for all host states, flapping events, and scheduled downtime events
service_notification_commands notify-service-by-email,notify-service-by-weixin ; send service notifications via email
host_notification_commands notify-host-by-email,notify-host-by-weixin ; send host notifications via email
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
}
2.5 WeChat enterprise account alert script
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Python 2 script (it relies on urllib2)
import urllib2
import json
import sys

reload(sys)
sys.setdefaultencoding('utf-8')


def gettoken(corpid, corpsecret):
    gettoken_url = 'https://qyapi.weixin.qq.com/cgi-bin/gettoken?corpid=' + corpid + '&corpsecret=' + corpsecret
    try:
        token_file = urllib2.urlopen(gettoken_url)
    except urllib2.HTTPError as e:
        print e.code
        print e.read().decode("utf8")
        sys.exit()
    token_data = token_file.read().decode('utf-8')
    token_json = json.loads(token_data)
    token = token_json['access_token']
    return token


def senddata(access_token, user, party, agent, subject, content):
    send_url = 'https://qyapi.weixin.qq.com/cgi-bin/message/send?access_token=' + access_token
    send_values = "{\"touser\":\"" + user + "\",\"toparty\":\"" + party + "\",\"totag\":\"\",\"msgtype\":\"text\",\"agentid\":\"" + agent + "\",\"text\":{\"content\":\"" + subject + content + "\"},\"safe\":\"0\"}"
    send_request = urllib2.Request(send_url, send_values)
    response = json.loads(urllib2.urlopen(send_request).read())
    print "send succeeded!!"
    print str(response)


if __name__ == '__main__':
    user = str(sys.argv[1])  # parameter 1: account of the recipient; the user must follow the enterprise account and have messaging permission for it
    # user = "nagios"
    party = str('2')  # parameter 2: ID of the group to send to; it must have permission on the enterprise account
    agent = str('1000002')  # parameter 3: AgentId of the application in the enterprise account
    subject = str(sys.argv[2])  # parameter 4: title (part of the message content)
    content = str(sys.argv[3])  # parameter 5: the message text
    # content = "Hello this is test !!"
    corpid = 'wwfdxxxxxxdbe60'  # CorpID identifies the enterprise account
    corpsecret = 'FBdnNGgfmp5xxxxxxxUJpma3d23P0'  # corpsecret is the Secret, the credential key of the management group
    try:
        accesstoken = gettoken(corpid, corpsecret)
        senddata(accesstoken, user, party, agent, subject, content)
    except Exception, e:
        print "error"
        print str(e) + "Error Please Check \"corpid\" or \"corpsecret\" Config"
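The script builds its JSON request body by string concatenation, which yields invalid JSON as soon as the subject or plugin output contains a double quote. A possible alternative, sketched here in Python 3, is to let json.dumps do the escaping. The field names are taken from the script above; the function name is hypothetical, and converting the literal "\n" sequences first is an assumption about how the text should be displayed.

```python
import json


def build_payload(user: str, party: str, agent: str, subject: str, content: str) -> str:
    """Serialize a text message body with proper JSON escaping."""
    # Nagios passes line breaks as a literal backslash-n; make them real newlines
    # before serializing, otherwise json.dumps would escape the backslash itself.
    text = (subject + content).replace("\\n", "\n")
    payload = {
        "touser": user,
        "toparty": party,
        "totag": "",
        "msgtype": "text",
        "agentid": agent,
        "text": {"content": text},
        "safe": "0",
    }
    return json.dumps(payload, ensure_ascii=False)


if __name__ == "__main__":
    print(build_payload("CuiSongBiao", "2", "1000002",
                        '** PROBLEM Service Alert: "PING" is CRITICAL **',
                        "\\nState: CRITICAL\\n"))
```

The returned string would still need to be encoded as UTF-8 bytes before being posted to the send URL.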