Configuring Nagios on Ubuntu

Why Ubuntu? It's convenient: there is a ready-made package, and it would be a waste not to use it.

First, install Ubuntu Server and bring its packages up to date:

sudo apt-get update && sudo apt-get upgrade

Reference (Ask Ubuntu): http://askubuntu.com/questions/145518/how-do-i-install-nagios

Most of this write-up follows the page linked above; I have just put it in plainer language so it is easier to follow, haha.

Installation:

 

sudo apt-get install -y nagios3

It will go through, and ask you about what mail server you want to use:

Partway through the installation a prompt appears; choose OK.

Pick one based upon your needs.

I chose the second option, "Internet Site".

It will then ask you about the domain name you want to have email sent from. Again, fill that out based upon your needs.

It will ask you what password you want to use - put in a secure password. This is for the admin account nagiosadmin.

Enter the password for the Nagios admin account.

And then you'll need to verify your password.

 

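You can then open the web interface in a browser (I used IE): enter the server's IP or hostname in the address bar, and you will be prompted for a username and password.

The default username is nagiosadmin.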

Once the install is all done, you can head over to localhost/nagios3 (or whatever the IP address/domain name of the server you installed it on is) and you'll be asked to enter your password:

Once you've done that, you're in!

 

Nagios automatically adds in the 'localhost' to the config, and does load, current users, disk space, http and ssh checks.

Now there is one more thing we need to do before nagios is all ready - we need to have it accept external commands so we can acknowledge problems, add comments, etc.

 

Follow the steps below to make a few configuration changes on your Ubuntu server.

To do that, we need to edit a few files. Start by opening /etc/nagios3/nagios.cfg with the following command:

sudo nano /etc/nagios3/nagios.cfg

Search for check_external_commands, and turn check_external_commands=0 into check_external_commands=1.
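If you would rather not do this edit by hand in nano, the same change can be scripted. This is only a minimal sketch: enable_external_commands is a helper name I made up (it is not a Nagios tool), and it assumes GNU sed, which is what Ubuntu ships.

```shell
# Sketch: switch check_external_commands from 0 to 1 in a Nagios main config.
# enable_external_commands is a hypothetical helper, not part of Nagios.
enable_external_commands() {
    cfg="$1"
    # Keep a backup next to the original, then rewrite the setting in place.
    cp "$cfg" "$cfg.bak" &&
    sed -i 's/^check_external_commands=0$/check_external_commands=1/' "$cfg"
}

# Example, from a root shell (for instance after "sudo -s"):
#   enable_external_commands /etc/nagios3/nagios.cfg
#   grep '^check_external_commands' /etc/nagios3/nagios.cfg
```

The function only touches a line that reads exactly check_external_commands=0, so running it a second time changes nothing.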

Now, restart apache by running

sudo service apache2 restart

Not done yet! We need to edit /etc/group. There should be a line like this in there:

nagios:x:114

Change it to

nagios:x:114:www-data

Save and close this file.
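As an alternative to editing /etc/group by hand, usermod can add www-data to the nagios group. Either way it is worth checking the result. The check below is a rough sketch; group_has_member is a hypothetical helper name, and its optional third argument (a path to a group-format file) exists only so it can be tried on a copy.

```shell
# Sketch: report whether a user is listed as a member of a group.
# group_has_member is a hypothetical helper; the file defaults to /etc/group.
group_has_member() {
    group="$1"; user="$2"; file="${3:-/etc/group}"
    # Each line is name:password:GID:comma-separated-members.
    awk -F: -v g="$group" -v u="$user" '
        $1 == g {
            n = split($4, members, ",")
            for (i = 1; i <= n; i++) if (members[i] == u) found = 1
        }
        END { exit found ? 0 : 1 }
    ' "$file"
}

# Example:
#   sudo usermod -a -G nagios www-data     # instead of editing /etc/group
#   group_has_member nagios www-data && echo "www-data is in the nagios group"
```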

Now we need to change the permissions on the /var/lib/nagios3/rw directory with:

sudo chmod g+x /var/lib/nagios3/rw

And then (because of how permissions work) we need to edit the permissions of the directory above that with:

sudo chmod g+x /var/lib/nagios3

Now, restart nagios with:

sudo service nagios3 restart

Configuration

 

I installed Ubuntu Server on my local machine and got Nagios running on it (screenshot below). That is about half the job; the rest is configuring everything else:

 

 

In the screenshot above I am monitoring the Nagios host itself, another Ubuntu desktop, and my laptop.

 

The configuration below follows a very detailed guide I found online; this is the one I worked from: https://help.ubuntu.com/10.04/serverguide/nagios.html

 

Step 1: install the required packages on the Nagios server.

In my case this is the Ubuntu server, IP 192.168.40.60.

 

sudo apt-get install nagios3 nagios-nrpe-plugin

Partway through the installation you will be asked for a password; this is the password used to log in to Nagios.

 

PS: install Apache2 first.

sudo apt-get install apache2

The command to (re)start Apache:

sudo /etc/init.d/apache2 restart

 

You can add new users or change the default Nagios password as follows:

The user's credentials are stored in /etc/nagios3/htpasswd.users

To change the nagiosadmin password, or add additional users to the Nagios CGI scripts, use the htpasswd that is part of the apache2-utils package.

For example, to change the password for the nagiosadmin user enter:

sudo htpasswd /etc/nagios3/htpasswd.users nagiosadmin

To add a user:

sudo htpasswd /etc/nagios3/htpasswd.users steve

Next, install nagios-nrpe-server on the second server.

My second server runs the desktop edition of Ubuntu.

sudo apt-get install nagios-nrpe-server

 

Configuration Overview

There are a couple of directories containing Nagios configuration and check files.

  • /etc/nagios3: contains configuration files for the operation of the nagios daemon, CGI files, hosts, etc.
  • /etc/nagios-plugins: houses configuration files for the service checks.
  • /etc/nagios: on the remote host contains the nagios-nrpe-server configuration files.
  • /usr/lib/nagios/plugins/: where the check binaries are stored. To see the options of a check use the -h option.

    For example: /usr/lib/nagios/plugins/check_dhcp -h

There are a plethora of checks Nagios can be configured to execute for any given host. For this example Nagios will be configured to check disk space, DNS, and a MySQL hostgroup. The DNS check will be on server02, and the MySQL hostgroup will include both server01 and server02.

Additionally, there are some terms that once explained will hopefully make understanding Nagios configuration easier:

  • Host: a server, workstation, network device, etc that is being monitored.
  • Host Group: a group of similar hosts. For example, you could group all web servers, file servers, etc.
  • Service: the service being monitored on the host. Such as HTTP, DNS, NFS, etc.
  • Service Group: allows you to group multiple services together. This is useful for grouping multiple HTTP services, for example.
  • Contact: person to be notified when an event takes place. Nagios can be configured to send emails, SMS messages, etc.

By default Nagios is configured to check HTTP, disk space, SSH, current users, processes, and load on the localhost. Nagios will also ping check the gateway.

Large Nagios installations can be quite complex to configure. It is usually best to start small, one or two hosts, get things configured the way you like then expand.

 

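The section above explains what each of the Nagios directories is for. Now for the actual configuration: adding the servers to be monitored to the Nagios configuration directory.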

 

Configuration

 

On the Nagios server, copy /etc/nagios3/conf.d/localhost_nagios2.cfg to /etc/nagios3/conf.d/server02.cfg

 

Here server02.cfg stands for the name of the server you want to monitor. For example, to add my Ubuntu desktop I ran:

 

sudo cp /etc/nagios3/conf.d/localhost_nagios2.cfg /etc/nagios3/conf.d/ubuntu.cfg

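After copying, the file needs to be edited. I have just discovered a new way to edit files: nano.

For example, to edit /etc/nagios3/conf.d/ubuntu.cfg I can run

sudo nano /etc/nagios3/conf.d/ubuntu.cfg

When finished, press Ctrl+X, then Y, then Enter. Very convenient.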

define host{

use generic-host ; Name of host template to use

host_name ubuntu

alias ubuntu

address 192.168.40.20

}

 

# check DNS service.

define service {

use generic-service

host_name ubuntu

service_description DNS

check_command check_dns!192.168.40.20

}

 

Once the configuration is done, restart the Nagios service:

sudo /etc/init.d/nagios3 restart
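Since every monitored machine gets the same pair of definitions with only the name and address changed, the host block and DNS check above could also be generated rather than typed. The sketch below assumes nothing beyond the generic-host and generic-service templates already used above; make_host_cfg is a helper name I invented for illustration.

```shell
# Sketch: print a host definition plus a DNS service check for one machine.
# make_host_cfg is a hypothetical helper; generic-host and generic-service
# are the templates referenced in the definitions above.
make_host_cfg() {
    name="$1"; addr="$2"
    cat <<EOF
define host{
        use        generic-host
        host_name  $name
        alias      $name
        address    $addr
}

# check DNS service.
define service {
        use                  generic-service
        host_name            $name
        service_description  DNS
        check_command        check_dns!$addr
}
EOF
}

# Example:
#   make_host_cfg ubuntu 192.168.40.20 | sudo tee /etc/nagios3/conf.d/ubuntu.cfg
#   sudo nagios3 -v /etc/nagios3/nagios.cfg   # verify the config before restarting
```

Running the nagios3 -v check before a restart catches typos in the definitions while the old configuration is still running.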

 

Now add a service definition for the MySQL check by adding the following to

/etc/nagios3/conf.d/services_nagios2.cfg

# check MySQL servers.

define service {

hostgroup_name mysql-servers

service_description MySQL

check_command check_mysql_cmdlinecred!nagios!secret!$HOSTADDRESS$

use generic-service

notification_interval 0 ; set > 0 if you want to be renotified

}

The mysql-servers hostgroup now needs to be defined. Edit

/etc/nagios3/conf.d/hostgroups_nagios2.cfg 

adding

# MySQL hostgroup.

define hostgroup {

hostgroup_name mysql-servers

alias MySQL servers

members localhost, server02

}

The Nagios check needs to authenticate to MySQL. To add a nagios user to MySQL enter

mysql -u root -p -e "create user nagios identified by 'secret';"

Restart nagios to start checking the MySQL servers

sudo /etc/init.d/nagios3 restart

With all of that done, we need to set up the second server, the Ubuntu desktop.

Lastly configure NRPE to check the disk space on ubuntu

On server01 add the service check to 

/etc/nagios3/conf.d/ubuntu.cfg

# NRPE disk check.

define service {

use generic-service

host_name ubuntu

service_description nrpe-disk

check_command check_nrpe_1arg!check_all_disks!192.168.40.20

}

Then, on the second server (the Ubuntu desktop), edit /etc/nagios/nrpe.cfg

and add this line:

allowed_hosts=192.168.40.60

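This is the IP address of the Nagios server.

Then add the following line as well (I was not sure where it should go; the Ubuntu guide places it in the command definition area of the file):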

command[check_all_disks]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -e

Finally, restart the NRPE service:

sudo /etc/init.d/nagios-nrpe-server restart
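The two nrpe.cfg changes can also be applied in one repeatable step. The following is a sketch under a few assumptions: configure_nrpe is a hypothetical helper name, it replaces any existing allowed_hosts line instead of adding a second one, it appends the command line at the end of the file, and it relies on GNU sed as shipped with Ubuntu.

```shell
# Sketch: apply the allowed_hosts and check_all_disks settings to an nrpe.cfg.
# configure_nrpe is a hypothetical helper, not part of the NRPE package.
configure_nrpe() {
    cfg="$1"; nagios_ip="$2"
    # Replace an existing allowed_hosts line, or append one if there is none.
    if grep -q '^allowed_hosts=' "$cfg"; then
        sed -i "s/^allowed_hosts=.*/allowed_hosts=$nagios_ip/" "$cfg"
    else
        echo "allowed_hosts=$nagios_ip" >> "$cfg"
    fi
    # Append the disk check command only if it is not already defined.
    grep -q '^command\[check_all_disks\]=' "$cfg" ||
        echo 'command[check_all_disks]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -e' >> "$cfg"
}

# Example, from a root shell on the monitored machine:
#   configure_nrpe /etc/nagios/nrpe.cfg 192.168.40.60
#   /etc/init.d/nagios-nrpe-server restart
# Then, from the Nagios server, the check can be tried by hand:
#   /usr/lib/nagios/plugins/check_nrpe -H 192.168.40.20 -c check_all_disks
```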

The three steps above are all done on the Ubuntu desktop.

Then restart the Nagios service on the Nagios server:

sudo /etc/init.d/nagios3 restart

That basically completes the configuration; you can now log in to the web interface and take a look.

 

Note: how do I add a host?

First, add the name and IP address of the host you want to monitor to /etc/hosts on the Nagios server.
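The /etc/hosts step is easy to repeat by accident, so here is a small sketch that adds an entry only when the name is not already there. add_hosts_entry is a hypothetical helper name, and the optional third argument (a path to a hosts-format file) is there so it can be tried on a copy first.

```shell
# Sketch: add "IP name" to a hosts file unless the name is already present.
# add_hosts_entry is a hypothetical helper; the file defaults to /etc/hosts.
add_hosts_entry() {
    ip="$1"; name="$2"; file="${3:-/etc/hosts}"
    # Look for the name as a whole field after the address, skipping comments.
    if awk -v n="$name" '
            $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit found ? 0 : 1 }' "$file"; then
        return 0
    fi
    printf '%s\t%s\n' "$ip" "$name" >> "$file"
}

# Example, from a root shell on the Nagios server:
#   add_hosts_entry 192.168.40.2 windows
```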

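Then edit the files on the Nagios server to set up the monitoring.

Go to /etc/nagios3/conf.d

Create a new file there with any name you like. For example, to monitor a machine called fedora, I would create fedora.cfg

For convenience, you can simply copy one of the existing cfg files and change the details inside.

This is the configuration file for my Windows machine; you can copy it in and adapt it: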

# A simple configuration file for monitoring the local host

# This can serve as an example for configuring other servers;

# Custom services specific to this host are added here, but services

# defined in nagios2-common_services.cfg may also apply.

#

 

define host{

use generic-host ; Name of host template to use

host_name windows ; the hostname must first be added to /etc/hosts

alias windows

address 192.168.40.2 ; the IP of the server you are adding

}

 

# Define a service to check the disk space of the root partition

# on the local machine. Warning if < 20% free, critical if

# < 10% free space on partition.

 

define service{

use generic-service ; Name of service template to use

host_name windows

service_description Disk Space

check_command check_all_disks!20%!10%

}

 

 

 

# Define a service to check the number of currently logged in

# users on the local machine. Warning if > 20 users, critical

# if > 50 users.

 

define service{

use generic-service ; Name of service template to use

host_name windows

service_description Current Users

check_command check_users!20!50

}

 

 

# Define a service to check the number of currently running procs

# on the local machine. Warning if > 250 processes, critical if

# > 400 processes.


 

define service{

use generic-service ; Name of service template to use

host_name windows

service_description Total Processes

check_command check_procs!250!400

}

 

 

 

# Define a service to check the load on the local machine.

 

define service{

use generic-service ; Name of service template to use

host_name windows

service_description Current Load

check_command check_load!5.0!4.0!3.0!10.0!6.0!4.0

}

 

# NRPE disk check.

define service {

use generic-service

host_name windows

service_description nrpe-disk

check_command check_nrpe_1arg!check_all_disks!192.168.40.2

}

 

 

 
