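Why Ubuntu? Because it is convenient: the packages are ready-made, and it would be a waste not to use them.
First install Ubuntu Server and bring the installed software up to date: sudo apt-get update && sudo apt-get upgrade
Reference (an Ask Ubuntu answer, not official Ubuntu documentation): http://askubuntu.com/questions/145518/how-do-i-install-nagios
Most of this write-up follows the answer above; I have just added notes in plain language so it is easier to follow.
Installation: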
sudo apt-get install -y nagios3
It will go through, and ask you about what mail server you want to use:
Partway through the installation a prompt appears; choose OK.
Pick one based upon your needs.
I chose the second option, "Internet Site".
It will then ask you about the domain name you want to have email sent from. Again, fill that out based upon your needs.
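It will ask you what password you want to use - put in a secure password. This is for the admin account nagiosadmin.
Enter the password for the Nagios admin account (this is the web login for nagiosadmin, not the system root password).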
And then you'll need to verify your password.
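You can then open the web interface in a browser by entering the server's IP address or hostname followed by /nagios3; you will be prompted for a username and password.
The default username is nagiosadmin.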
Once the install is all done, you can head over to localhost/nagios3 (or whatever the IP address/domain name of the server you installed it on is) and you'll be asked to enter your password:
Once you've done that, you're in!
Nagios automatically adds in the 'localhost' to the config, and does load, current users, disk space, http and ssh checks.
Now there is one more thing we need to do before nagios is all ready - we need to have it accept external commands so we can acknowledge problems, add comments, etc.
Follow the steps below to finish configuring the Ubuntu server.
To do that, we need to edit a few files. Start by opening /etc/nagios3/nagios.cfg with the following command:
sudo nano /etc/nagios3/nagios.cfg
Search for check_external_commands, and turn the check_external_commands=0 into check_external_commands=1.
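If you would rather not edit the file by hand, the same change can be scripted. The following is only a minimal sketch, assuming GNU sed (as shipped with Ubuntu). The function name enable_external_commands is made up for illustration, and the demo runs on a scratch file rather than on the live /etc/nagios3/nagios.cfg.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): flip check_external_commands
# from 0 to 1 in the config file given as $1.
enable_external_commands() {
    cfg="$1"
    # Keep a backup next to the file, then rewrite the setting in place.
    cp "$cfg" "$cfg.bak" &&
    sed -i 's/^check_external_commands=0/check_external_commands=1/' "$cfg"
}

# Try it on a scratch file first.
demo_cfg="$(mktemp)"
printf 'log_file=/var/log/nagios3/nagios.log\ncheck_external_commands=0\n' > "$demo_cfg"
enable_external_commands "$demo_cfg"
grep '^check_external_commands=' "$demo_cfg"
```

Against the real file the function would have to run as root, since /etc/nagios3/nagios.cfg is not writable by ordinary users.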
Now, restart apache by running
sudo service apache2 restart
Not done yet! We need to edit /etc/group. There should be a line like this in there:
nagios:x:114
Change it to
nagios:x:114:www-data
Save and close this file.
Now, we need to change the permissions on /var/lib/nagios3/rw with:
sudo chmod g+x /var/lib/nagios3/rw
And then (because of how permissions work) we need to edit the permissions of the directory above that with:
sudo chmod g+x /var/lib/nagios3
Now, restart nagios with:
sudo service nagios3 restart
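Configuration
I installed Ubuntu Server on my own machine and got Nagios configured on it successfully. That is about half the job; the rest still has to be configured:
In my setup I monitor the Nagios host itself, another Ubuntu desktop machine, and my laptop.
Below are the configuration instructions I found online. They are very detailed, and I followed them step by step: https://help.ubuntu.com/10.04/serverguide/nagios.html
Step 1: install the required packages on the Nagios server.
My Nagios server is the Ubuntu Server machine, IP 192.168.40.60: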
sudo apt-get install nagios3 nagios-nrpe-plugin
Partway through the installation you will be prompted for a password; this is the password used to log in to Nagios.
PS: install Apache2 first:
sudo apt-get install apache2
Command to (re)start Apache:
sudo /etc/init.d/apache2 restart
You can create new users or change the default Nagios password. The procedure is as follows:
The user's credentials are stored in /etc/nagios3/htpasswd.users.
To change the nagiosadmin password, or add additional users to the Nagios CGI scripts, use the htpasswd that is part of the apache2-utils package.
For example, to change the password for the nagiosadmin user enter:
sudo htpasswd /etc/nagios3/htpasswd.users nagiosadmin
To add a user:
sudo htpasswd /etc/nagios3/htpasswd.users steve
Then install nagios-nrpe-server on the second machine.
My second machine runs the desktop edition of Ubuntu (IP 192.168.40.20):
sudo apt-get install nagios-nrpe-server
Configuration Overview
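There are a couple of directories containing Nagios configuration and check files.
/etc/nagios3: configuration files for the Nagios daemon, CGI files, hosts, etc.
/etc/nagios-plugins: configuration files for the service checks.
/etc/nagios: on the remote host, the nagios-nrpe-server configuration files.
/usr/lib/nagios/plugins/: where the check binaries are stored. To see the options of a check, run it with -h.
For example: /usr/lib/nagios/plugins/check_dhcp -h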
There are a plethora of checks Nagios can be configured to execute for any given host. For this example Nagios will be configured to check disk space, DNS, and a MySQL hostgroup. The DNS check will be on server02, and the MySQL hostgroup will include both server01 and server02.
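Additionally, there are some terms that once explained will hopefully make understanding Nagios configuration easier:
Host: a server, workstation, network device, etc. that is being monitored.
Host Group: a group of similar hosts, for example all web servers or all file servers.
Service: the service being monitored on the host, such as HTTP, DNS, or NFS.
Service Group: a way to group multiple services together.
Contact: the person to be notified when an event takes place.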
By default Nagios is configured to check HTTP, disk space, SSH, current users, processes, and load on the localhost. Nagios will also ping check the gateway.
Large Nagios installations can be quite complex to configure. It is usually best to start small, one or two hosts, get things configured the way you like then expand.
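The section above describes what the Nagios directories and files are for. Now for the actual configuration: adding the servers to be monitored to the Nagios configuration directory.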
Configuration
On the Nagios server, copy /etc/nagios3/conf.d/localhost_nagios2.cfg to /etc/nagios3/conf.d/server02.cfg.
Here server02.cfg stands for the name of the server you want to monitor. For example, to add my Ubuntu desktop I run:
sudo cp /etc/nagios3/conf.d/localhost_nagios2.cfg /etc/nagios3/conf.d/ubuntu.cfg
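After copying, the file needs to be edited. I have just discovered a handy editor for this: nano.
For example, to edit /etc/nagios3/conf.d/ubuntu.cfg I run:
sudo nano /etc/nagios3/conf.d/ubuntu.cfg
When done, press Ctrl+X, then Y, then Enter to save and exit. It is very convenient. Edit the file so that the host and DNS service definitions read: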
define host{
use generic-host ; Name of host template to use
host_name ubuntu
alias ubuntu
address 192.168.40.20
}
# check DNS service.
define service {
use generic-service
host_name ubuntu
service_description DNS
check_command check_dns!192.168.40.20
}
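A host block like the one above can also be generated rather than typed. This is a minimal sketch, not something from the Ubuntu guide: make_host_cfg is a hypothetical helper that prints a host definition for a given name and address, using the generic-host template already referenced above.

```shell
#!/bin/sh
# Hypothetical helper: print a Nagios host definition.
#   $1 = host name (also used as the alias)
#   $2 = IP address
make_host_cfg() {
    name="$1"
    addr="$2"
    cat <<EOF
define host{
        use                     generic-host
        host_name               $name
        alias                   $name
        address                 $addr
        }
EOF
}

# Print the definition for the Ubuntu desktop used in this write-up.
make_host_cfg ubuntu 192.168.40.20
```

The output could then be redirected into a file under /etc/nagios3/conf.d/ (with root privileges), followed by the service definitions you want for that host.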
Once the configuration is done, restart the Nagios service:
sudo /etc/init.d/nagios3 restart
Now add a service definition for the MySQL check by adding the following to
/etc/nagios3/conf.d/services_nagios2.cfg
# check MySQL servers.
define service {
hostgroup_name mysql-servers
service_description MySQL
check_command check_mysql_cmdlinecred!nagios!secret!$HOSTADDRESS$
use generic-service
notification_interval 0 ; set > 0 if you want to be renotified
}
A mysql-servers hostgroup now needs to be defined. Edit
/etc/nagios3/conf.d/hostgroups_nagios2.cfg
adding
# MySQL hostgroup.
define hostgroup {
hostgroup_name mysql-servers
alias MySQL servers
members localhost, server02
}
The Nagios check needs to authenticate to MySQL. To add a nagios user to MySQL enter
mysql -u root -p -e "create user nagios identified by 'secret';"
Restart nagios to start checking the MySQL servers
sudo /etc/init.d/nagios3 restart
Once that is all in place, set up the second machine, the Ubuntu desktop.
Lastly configure NRPE to check the disk space on ubuntu
On server01 add the service check to
/etc/nagios3/conf.d/ubuntu.cfg
# NRPE disk check.
define service {
use generic-service
host_name ubuntu
service_description nrpe-disk
check_command check_nrpe_1arg!check_all_disks!192.168.40.20
}
Then, on the second machine (the Ubuntu desktop), edit /etc/nagios/nrpe.cfg
and add the line
allowed_hosts=192.168.40.60
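This is the IP address of the Nagios server.
Then add the following line as well. It can go anywhere in the file; next to the other command[...] definitions is a natural place: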
command[check_all_disks]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -e
Finally, restart the NRPE service:
sudo /etc/init.d/nagios-nrpe-server restart
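The allowed_hosts edit above can be scripted too. The sketch below assumes GNU sed and that nrpe.cfg may already contain an allowed_hosts line, in which case it is replaced rather than duplicated. set_allowed_hosts is a hypothetical helper name, and the demo works on a scratch file instead of the real /etc/nagios/nrpe.cfg.

```shell
#!/bin/sh
# Hypothetical helper: make sure the NRPE config in $1 allows the
# Nagios server whose IP address is given in $2.
set_allowed_hosts() {
    cfg="$1"
    ip="$2"
    if grep -q '^allowed_hosts=' "$cfg"; then
        # A line already exists: replace its value.
        sed -i "s/^allowed_hosts=.*/allowed_hosts=$ip/" "$cfg"
    else
        # No such line yet: append one.
        printf 'allowed_hosts=%s\n' "$ip" >> "$cfg"
    fi
}

# Demo on a scratch file that mimics a small nrpe.cfg.
demo="$(mktemp)"
printf 'server_port=5666\nallowed_hosts=127.0.0.1\n' > "$demo"
set_allowed_hosts "$demo" 192.168.40.60
cat "$demo"
```

After restarting NRPE, one way to confirm the link from the Nagios server is to run the check by hand: /usr/lib/nagios/plugins/check_nrpe -H 192.168.40.20 -c check_all_disks (this assumes the nagios-nrpe-plugin package installed earlier put check_nrpe in that directory).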
The three steps above are all done on the Ubuntu desktop.
Now restart the Nagios service on the Nagios server:
sudo /etc/init.d/nagios3 restart
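With that, the configuration is basically complete, and you can log in to the web interface to see the result.
Note: how do I add a host?
First add the name and IP address of the host you want to monitor to /etc/hosts on the Nagios server.
Then edit the configuration files on the Nagios server so that the host is actually monitored:
Go to /etc/nagios3/conf.d
Create a new file with any name you like. For example, to monitor a machine called fedora, create fedora.cfg
To save time, you can copy one of the existing cfg files and change the details in it.
This is the configuration file for my Windows machine; you can copy it and adapt it: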
# A simple configuration file for monitoring the local host
# This can serve as an example for configuring other servers;
# Custom services specific to this host are added here, but services
# defined in nagios2-common_services.cfg may also apply.
#
define host{
use generic-host ; Name of host template to use
host_name windows ; the hostname must first be added to /etc/hosts
alias windows
address 192.168.40.2 ; the IP address of the server you are adding
}
# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
use generic-service ; Name of service template to use
host_name windows
service_description Disk Space
check_command check_all_disks!20%!10%
}
# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.
define service{
use generic-service ; Name of service template to use
host_name windows
service_description Current Users
check_command check_users!20!50
}
# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
use generic-service ; Name of service template to use
host_name windows
service_description Total Processes
check_command check_procs!250!400
}
# Define a service to check the load on the local machine.
define service{
use generic-service ; Name of service template to use
host_name windows
service_description Current Load
check_command check_load!5.0!4.0!3.0!10.0!6.0!4.0
}
# NRPE disk check.
define service {
use generic-service
host_name windows
service_description nrpe-disk
check_command check_nrpe_1arg!check_all_disks!192.168.40.2
}
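The copy-and-edit approach from the note on adding a host can be sketched as a small script. clone_host_cfg is a hypothetical helper: it reads an existing host file and prints a copy in which every host_name, alias, and address line carries the new values. It rewrites whole lines, so trailing comments on those lines are dropped, and it assumes the new name and IP contain no "/" or "&" characters. Other occurrences of the old IP (for example inside a check_command) are left alone and would need editing by hand.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of the host file $1 with the host name
# set to $2 and the address set to $3.
clone_host_cfg() {
    src="$1"
    new_name="$2"
    new_ip="$3"
    sed -e "s/^\([[:space:]]*host_name[[:space:]]\{1,\}\).*/\1$new_name/" \
        -e "s/^\([[:space:]]*alias[[:space:]]\{1,\}\).*/\1$new_name/" \
        -e "s/^\([[:space:]]*address[[:space:]]\{1,\}\).*/\1$new_ip/" \
        "$src"
}

# Demo: a cut-down template standing in for windows.cfg.
template="$(mktemp)"
cat > "$template" <<'EOF'
define host{
use generic-host
host_name windows
alias windows
address 192.168.40.2
}
define service{
use generic-service
host_name windows
service_description Current Users
check_command check_users!20!50
}
EOF

# The name fedora comes from the note above; 192.168.40.30 is an assumed example address.
clone_host_cfg "$template" fedora 192.168.40.30
```

On the real server the output would be written to /etc/nagios3/conf.d/fedora.cfg as root, followed by a Nagios restart.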