solaris10下nagios监控客户端错误解决

今天发现有一台数据库上的nrpe出现异常,大量的报警邮件,检查/var/adm/message发现如下错误:
Jan 25 16:27:31 dbbak inetd[16966]: [ID 702911 daemon.error] Failed to set credentials for the inetd_start method of instance svc:/
network/nrpe/tcp:default (chdir: No such file or directory)


执行
#/usr/local/nagios/libexec/check_nrpe -H localhost
CHECK_NRPE: Received 0 bytes from daemon.  Check the remote server logs for error messages.

google了一把共有两篇有用的文章:
http://forums.meulie.net/viewtopic.php?t=1892

http://www.utahsysadmin.com/2008/03/14/configuring-nagios-plugins-nrpe-on-solaris-10/

根据第一篇需要给nagios用户创建一个目录。
以前我给创建的nagios用户为:(为了安全起见)
nagios:x:103:102::/var/run/nagios:/bin/false
修改为:
nagios:x:103:102::/export/home/nagios:/bin/false

重新启动nrpe:
svcadm restart  svc:/network/nrpe/tcp:default

nrpe恢复正常。

第二篇是一个详细的nrpe在solaris上的配置过程,全文如下:
<a title="Permanent" Link: Configuring Nagios Plugins & NRPE on Solaris 10" href="http://www.utahsysadmin.com/2008/03/14/configuring-nagios-plugins-nrpe-on-solaris-10/" rel=bookmark>Configuring Nagios Plugins & NRPE on Solaris 10

Here’s a step by step installation of the Nagios plugin NRPE for Solaris 10 x86 (as the remote host):

useradd -c “nagios system user” -d /usr/local/nagios -m nagios
chown nagios:nagios /usr/local/nagios/
cd /usr/local/src # or wherever you like to put source code
wget http://internap.dl.sourceforge.net/sourceforge/nagios/nrpe-2.12.tar.gz
wget http://internap.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.11.tar.gz
gunzip nagios-plugins-1.4.11.tar.gz
tar -xvf nagios-plugins-1.4.11.tar
gunzip nrpe-2.12.tar.gz
tar -xvf nrpe-2.12.tar

First we’ll compile the nagios plugins:

cd nagios-plugins-1.4.11
./configure
make
make install
chown -R nagios:nagios /usr/local/nagios/libexec
cd ..

Run a quick check to make sure the plugins are working:

/usr/local/nagios/libexec/check_disk -w 10 -c 5 -p /

Next, we’ll compile NRPE. Normally at this point we would just run `cd nrpe-2.12; ./configure`. Unfortunately, the configure script can not find the SSH headers and libraries on Solaris 10. You get errors like this:

checking for SSL headers… configure: error: Cannot find ssl headers

checking for SSL libraries… configure: error: Cannot find ssl libraries

The answer to this is, of course, to tell configure where to find them:

cd nrpe-2.12
./configure �Cwith-ssl=/usr/sfw/ �Cwith-ssl-lib=/usr/sfw/lib/

Currently there is a bug in 2.12 that it assumes that all systems have 2 syslog facilities that Solaris doesn’t have, so if you try and compile it generates the following errors:

nrpe.c: In function `get_log_facility’:
nrpe.c:617: error: `LOG_AUTHPRIV’ undeclared (first use in this function)
nrpe.c:617: error: (Each undeclared identifier is reported only once
nrpe.c:617: error: for each function it appears in.)
nrpe.c:619: error: `LOG_FTP’ undeclared (first use in this function)
*** Error code 1
make: Fatal error: Command failed for target `nrpe’
Current working directory /usr/local/src/nrpe-2.12/src
*** Error code 1
make: Fatal error: Command failed for target `all’

Unfortunately, the fix at this time is to comment out the code that calls these two facilities, lines 616-619, in src/nrpe.c:

/*else if(!strcmp(varvalue,”authpriv”))
log_facility=LOG_AUTHPRIV;
else if(!strcmp(varvalue,”ftp”))
log_facility=LOG_FTP;*/

UPDATE: You no longer need to comment out these lines, just replace them with the following:

else if(!strcmp(varvalue,”authpriv”))
log_facility=LOG_AUTH;
else if(!strcmp(varvalue,”ftp”))
log_facility=LOG_DAEMON;

Now it will compile:

# make all
cd ./src/; make ; cd ..
gcc -g -O2 -I/usr/sfw//include/openssl -I/usr/sfw//include -DHAVE_CONFIG_H -o nrpe nrpe.c utils.c -L/usr/sfw/lib/ -lssl -lcrypto -lnsl -lsocket ./snprintf.o
gcc -g -O2 -I/usr/sfw//include/openssl -I/usr/sfw//include -DHAVE_CONFIG_H -o check_nrpe check_nrpe.c utils.c -L/usr/sfw/lib/ -lssl -lcrypto -lnsl -lsocket

*** Compile finished ***

Next install the new binaries:

# make install
cd ./src/ && make install
make install-plugin
.././install-sh -c -m 775 -o nagios -g nagios -d /usr/local/nagios/libexec
.././install-sh -c -m 775 -o nagios -g nagios check_nrpe /usr/local/nagios/libexec
make install-daemon
.././install-sh -c -m 775 -o nagios -g nagios -d /usr/local/nagios/bin
.././install-sh -c -m 775 -o nagios -g nagios nrpe /usr/local/nagios/bin

Optionally, if you want to use the sample config file run (Recommended if you don’t already have a standard config):

# make install-daemon-config
./install-sh -c -m 775 -o nagios -g nagios -d /usr/local/nagios/etc
./install-sh -c -m 644 -o nagios -g nagios sample-config/nrpe.cfg /usr/local/nagios/etc

Modify the nrpe.cfg file with your settings:

vi /usr/local/nagios/etc/nrpe.cfg

With Solaris 10, we don’t use either inetd or xinetd, but SMF. Thankfully, we can convert inetd entires into the SMF repository with the inetconv command. So first, add the following entry to /etc/services:

nrpe 5666/tcp # NRPE

Then add the following line to the end of /etc/inet/inetd.conf:

nrpe stream tcp nowait nagios /usr/sfw/sbin/tcpd /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -i

Next, we need to convert it to SMF:

# inetconv
nrpe -> /var/svc/manifest/network/nrpe-tcp.xml
Importing nrpe-tcp.xml …Done
# inetconv -e
svc:/network/nrpe/tcp:default enabled

Check to make sure it went online:

# svcs svc:/network/nrpe/tcp:default
STATE STIME FMRI
online 15:53:39 svc:/network/nrpe/tcp:default
# netstat -a | grep nrpe
*.nrpe *.* 0 0 49152 0 LISTEN

Check the default installed parameters:

# inetadm -l svc:/network/nrpe/tcp:default
SCOPE NAME=VALUE
name=”nrpe”
endpoint_type=”stream”
proto=”tcp”
isrpc=FALSE
wait=FALSE
exec=”/usr/sfw/sbin/tcpd -c /usr/local/nagios/etc/nrpe.cfg -i”
arg0=”/usr/local/nagios/bin/nrpe”
user=”nagios”
default bind_addr=”"
default bind_fail_max=-1
default bind_fail_interval=-1
default max_con_rate=-1
default max_copies=-1
default con_rate_offline=-1
default failrate_cnt=40
default failrate_interval=60
default inherit_env=TRUE
default tcp_trace=FALSE
default tcp_wrappers=FALSE
default connection_backlog=10

Change it so that it uses tcp_wrappers:

# inetadm -m svc:/network/nrpe/tcp:default tcp_wrappers=TRUE

And check to make sure it took effect:

# inetadm -l svc:/network/nrpe/tcp:default
SCOPE NAME=VALUE
name=”nrpe”
endpoint_type=”stream”
proto=”tcp”
isrpc=FALSE
wait=FALSE
exec=”/usr/sfw/sbin/tcpd -c /usr/local/nagios/etc/nrpe.cfg -i”
arg0=”/usr/local/nagios/bin/nrpe”
user=”nagios”
default bind_addr=”"
default bind_fail_max=-1
default bind_fail_interval=-1
default max_con_rate=-1
default max_copies=-1
default con_rate_offline=-1
default failrate_cnt=40
default failrate_interval=60
default inherit_env=TRUE
default tcp_trace=FALSE
tcp_wrappers=TRUE
default connection_backlog=10

Modify your hosts.allow and hosts.deny to only allow your nagios server access to the NRPE port. Note that tcpd always looks at hosts.allow first, so even though we specify that everyone is rejected in the hosts.deny file, the ip addresses specified in hots.allow are allowed.
/etc/hosts.allow:

nrpe: LOCAL, 10.0.0.45

/etc/hosts.deny:

nrpe: ALL

Finally, check to make sure you have everything installed correctly (should return version information):

/usr/local/nagios/libexec/check_nrpe -H localhost
NRPE v2.12

Optionally, modify any firewalls between your nagios server and the remote host to allow port 5666.
Don’t forget to configure your nagios server to check your new service.

你可能感兴趣的:(客户端,message,NetWork,default,的)