9. Monitoring an Oracle database
1. On the monitoring host 10.100.10.11, add a check_nrpe command
# vi /usr/local/nagios/etc/objects/commands.cfg
define command {
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
2. On the monitoring host 10.100.10.11, create a file oracle6.cfg that defines the Oracle host and the services to monitor.
# cd /usr/local/nagios/etc/objects
# vi oracle6.cfg
define host {
    use                     linux-server
    host_name               ledbbackup01
    alias                   Oracle 10g
    address                 10.100.10.6
}
define service {
    use                     generic-service
    host_name               ledbbackup01
    service_description     TNS Check
    check_command           check_nrpe!check_oracle_tns
}
define service {
    use                     generic-service
    host_name               ledbbackup01
    service_description     DB Check
    check_command           check_nrpe!check_oracle_db
}
define service {
    use                     generic-service
    host_name               ledbbackup01
    service_description     Login Check
    check_command           check_nrpe!check_oracle_login
}
define service {
    use                     generic-service
    host_name               ledbbackup01
    service_description     Cache Check
    check_command           check_nrpe!check_oracle_cache
}
define service {
    use                     generic-service
    host_name               ledbbackup01
    service_description     Tablespace Check
    check_command           check_nrpe!check_oracle_tablespace
}
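To make the link between the command definition and the check_command lines concrete, here is a small sketch of how the pieces combine: the text after "!" becomes $ARG1$, and the host's address becomes $HOSTADDRESS$. The value of USER1 is an assumption (the usual default in resource.cfg for a /usr/local/nagios install), and expand_check_command is a hypothetical helper, not part of Nagios.

```shell
#!/bin/sh
# Sketch: print the command line Nagios would build for a check_nrpe service.
# USER1 is an ASSUMPTION: the typical $USER1$ value from resource.cfg.
USER1=${USER1:-/usr/local/nagios/libexec}

# expand_check_command HOST_ADDRESS CHECK_COMMAND  (hypothetical helper)
expand_check_command() {
    host_address=$1            # the host's "address" directive
    check_command=$2           # e.g. check_nrpe!check_oracle_tns
    arg1=${check_command#*!}   # everything after the first "!" is $ARG1$
    printf '%s/check_nrpe -H %s -c %s\n' "$USER1" "$host_address" "$arg1"
}

expand_check_command 10.100.10.6 'check_nrpe!check_oracle_tns'
```

Running the printed command by hand on the monitoring host is a quick way to test a single service before Nagios schedules it.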
3. Add this file to the main Nagios configuration
# cd /usr/local/nagios/etc
# vi nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/oracle6.cfg
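Editing nagios.cfg by hand works, but the same step can be scripted so that running it twice does not add a duplicate line. The following is only a sketch: add_cfg_file is a hypothetical helper, and the path of the nagios binary in the usage comment assumes a source install under /usr/local/nagios.

```shell
#!/bin/sh
# Sketch: append a cfg_file line to nagios.cfg only if it is not there yet.

# add_cfg_file NAGIOS_CFG OBJECT_FILE  (hypothetical helper)
add_cfg_file() {
    nagios_cfg=$1
    object_file=$2
    line="cfg_file=$object_file"
    # -x whole-line match, -F fixed string, -q quiet
    if ! grep -qxF "$line" "$nagios_cfg"; then
        printf '%s\n' "$line" >> "$nagios_cfg"
    fi
}

# Usage on the monitoring host (binary path is an ASSUMPTION):
#   add_cfg_file /usr/local/nagios/etc/nagios.cfg \
#                /usr/local/nagios/etc/objects/oracle6.cfg
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
```

The nagios -v call runs the built-in configuration check, which is worth doing before every restart.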
4. Install NRPE on the Oracle server. The steps are the same as described earlier, so they are not repeated here.
Then add the monitoring commands:
# vi /usr/local/nagios/etc/nrpe.cfg
command[check_oracle_tns]=/usr/local/nagios/libexec/check_oracle --tns legold01
command[check_oracle_db]=/usr/local/nagios/libexec/check_oracle --db legold01
command[check_oracle_login]=/usr/local/nagios/libexec/check_oracle --login legold01
command[check_oracle_cache]=/usr/local/nagios/libexec/check_oracle --cache legold01 nagios 123.com 80 90
command[check_oracle_tablespace]=/usr/local/nagios/libexec/check_oracle --tablespace legold01 nagios 123.com USERS 90 80
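For the exact argument format, see the help output of check_oracle. For --cache and --tablespace the two trailing numbers are the critical threshold followed by the warning threshold, which is why their order differs between the two lines: a cache hit ratio alarms when it falls, tablespace usage when it rises.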
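The names in square brackets here must match the names after "!" in oracle6.cfg on the monitoring host, otherwise NRPE has nothing to run for that service. A small sketch for listing the names defined in an nrpe.cfg so they can be compared by eye; list_nrpe_commands is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: print the NAME of every "command[NAME]=..." line in an nrpe.cfg.

# list_nrpe_commands NRPE_CFG  (hypothetical helper)
list_nrpe_commands() {
    # Commented-out lines do not start with "command[" and are skipped.
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Usage on the Oracle server:
#   list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg
```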
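5. On the Oracle server, edit the check_oracle plugin and set $ORACLE_HOME and $PATH explicitly near the top of the script. NRPE does not start the plugin from a login shell, so these variables may otherwise be unset.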
export ORACLE_HOME=/home/oracle/app/product/11.2.0/db_1
export PATH=$PATH:$ORACLE_HOME/bin
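Before relying on these two lines, it can help to confirm that the ORACLE_HOME value really contains the client tools. As far as I know, check_oracle shells out to tnsping and sqlplus, so the sketch below checks for those two; oracle_tools_ok is a hypothetical helper and the tool list is an assumption based on that understanding.

```shell
#!/bin/sh
# Sketch: verify that an ORACLE_HOME contains the tools check_oracle is
# ASSUMED to call (tnsping for --tns, sqlplus for the login-based checks).

# oracle_tools_ok ORACLE_HOME  (hypothetical helper)
oracle_tools_ok() {
    oracle_home=$1
    for tool in sqlplus tnsping; do
        if [ ! -x "$oracle_home/bin/$tool" ]; then
            echo "missing or not executable: $oracle_home/bin/$tool" >&2
            return 1
        fi
    done
    return 0
}

# Usage on the Oracle server:
#   oracle_tools_ok /home/oracle/app/product/11.2.0/db_1 && echo "ORACLE_HOME looks usable"
```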
The NRPE service can now be started:
# service nrped start
Also remember to restart the Nagios service on the monitoring host 10.100.10.11:
# service nagios restart
6. Now open the web interface in a browser:
http://10.100.10.11/nagios
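The checks reported errors. A closer look at the Nagios log showed that the check_nrpe plugin was missing on the monitoring host 10.100.10.11, so the following steps are also needed: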
7. Copy check_nrpe from the Oracle server 10.100.10.6 to 10.100.10.11
On 10.100.10.11:
# cd /usr/local/nagios/libexec
# ./check_nrpe
./check_nrpe: error while loading shared libraries: libssl.so.6: cannot open shared object file: No such file or directory
The libssl.so.6 library is missing on 10.100.10.11. To fix this, locate it on 10.100.10.6 and copy it over.
On 10.100.10.6:
# find / -name libssl.so.6
/lib64/libssl.so.6
/lib/libssl.so.6
8. On 10.100.10.11:
# cd /usr/local/nagios/libexec
# ./check_nrpe
./check_nrpe: error while loading shared libraries: libcrypto.so.6: cannot open shared object file: No such file or directory
Another library is missing. No need to worry: go back to 10.100.10.6 and copy libcrypto.so.6 as well. Persistence pays off!
On 10.100.10.6:
# find / -name libcrypto.so.6
/lib64/libcrypto.so.6
/lib/libcrypto.so.6
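The copy commands themselves are not shown in these notes. The sketch below is one way it could be done, under several assumptions that are mine and not from the original: both hosts have the same architecture, root can log in to 10.100.10.6 over SSH, and the 64-bit copies under /lib64 are the right ones for this check_nrpe binary. It also uses ldd to list every unresolved library at once instead of discovering them one error at a time. missing_libs and fetch_lib are hypothetical helpers.

```shell
#!/bin/sh
# Sketch: find unresolved shared libraries for a binary and copy them over.
# SRC_HOST, LIB_DIR and COPY_CMD are ASSUMPTIONS; adjust them to your hosts.
COPY_CMD=${COPY_CMD:-scp}
SRC_HOST=${SRC_HOST:-root@10.100.10.6}
LIB_DIR=${LIB_DIR:-/lib64}

# missing_libs BINARY: print libraries the dynamic loader cannot resolve
missing_libs() {
    ldd "$1" 2>/dev/null | awk '/not found/ { print $1 }'
}

# fetch_lib LIBNAME: copy one library from SRC_HOST into the same directory
fetch_lib() {
    $COPY_CMD "$SRC_HOST:$LIB_DIR/$1" "$LIB_DIR/$1"
}

# Usage on 10.100.10.11:
#   for lib in $(missing_libs /usr/local/nagios/libexec/check_nrpe); do
#       fetch_lib "$lib"
#   done
```

Copying libraries between machines is a shortcut; building NRPE on the monitoring host itself would link check_nrpe against the OpenSSL version already installed there.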
9. Back on 10.100.10.11, try again:
# ./check_nrpe
Incorrect command line arguments supplied
NRPE Plugin for Nagios
Version: 2.13
Last Modified: 11-11-2011
License: GPL v2 with exemptions (-l for more info)
SSL/TLS Available: Anonymous DH Mode, OpenSSL 0.9.6 or higher required
Usage: check_nrpe -H <host> [-n] [-u] [-p <port>] [-t <timeout>] [-c <command>] [-a <arglist...>]
Options:
 -n         = Do no use SSL
 -u         = Make socket timeouts return an UNKNOWN state instead of CRITICAL
 <host>     = The address of the host running the NRPE daemon
 [port]     = The port on which the daemon is running (default=5666)
 [timeout]  = Number of seconds before connection times out (default=10)
 [command]  = The name of the command that the remote daemon should run
 [arglist]  = Optional arguments that should be passed to the command.  Multiple
              arguments should be separated by a space.  If provided, this must be
              the last option supplied on the command line.
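This output shows that check_nrpe now runs. That took some effort! Querying the NRPE daemon on the Oracle server returns its version: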
# ./check_nrpe -H 10.100.10.6 -p 5666
NRPE v2.13
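With the connection working, each of the five Oracle commands can be exercised from the monitoring host before Nagios is restarted. The sketch below loops over the command names from nrpe.cfg and translates the plugin exit code into the standard Nagios state (0 OK, 1 WARNING, 2 CRITICAL, anything else UNKNOWN). run_oracle_checks and state_name are hypothetical helpers; the check_nrpe path is the one used in this article.

```shell
#!/bin/sh
# Sketch: run every Oracle check through check_nrpe and show its state.
CHECK_NRPE=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}

# state_name EXIT_CODE: map a plugin exit code to a Nagios state name
state_name() {
    case $1 in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# run_oracle_checks HOST: call each command defined in nrpe.cfg once
run_oracle_checks() {
    host=$1
    for cmd in check_oracle_tns check_oracle_db check_oracle_login \
               check_oracle_cache check_oracle_tablespace; do
        # Capture output and exit status without aborting under "set -e".
        if output=$("$CHECK_NRPE" -H "$host" -c "$cmd"); then
            status=0
        else
            status=$?
        fi
        printf '%s [%s]: %s\n' "$cmd" "$(state_name "$status")" "$output"
    done
}

# Usage on 10.100.10.11:
#   run_oracle_checks 10.100.10.6
```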
10. Restart the Nagios service:
# service nagios restart
Open the web interface again to check the result:
http://10.100.10.11/nagios
That's it for now; I'm keeping these notes as a reference for later.
May the days stay peaceful.
(*^__^*) Hehe...