Nagios MySQL monitoring error "NRPE: Unable to read output": a detailed troubleshooting walkthrough

Preface: the Nagios web interface showed the following error for the MySQL service check:

Warning:NRPE: Unable to read output

 

 

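1. Check from the Nagios monitoring server

1.1 Call the check remotely with check_nrpe

On the Nagios server (the monitoring side), running check_nrpe against the MySQL status check returns the same error: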

[root@mysqlvm2 ~]# /usr/lib/nagios/plugins/check_nrpe  -H192.xx.180.xx -c check_mysql_status

NRPE: Unable to read output

You have new mail in /var/spool/mail/root

[root@mysqlvm2 ~]#

 

1.2 Try a different check

Still on the Nagios server, another check such as check_users works normally:

[root@mysqlvm2 ~]# /usr/lib/nagios/plugins/check_nrpe  -H192.xx.180.xx -c check_users

USERS OK - 2 users currently logged in |users=2;8;15;0

[root@mysqlvm2 ~]#

 

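This shows that the Nagios/NRPE pipeline itself works: other checks such as check_users return results, and only the MySQL check fails. The next step is to look at the MySQL server to find where the problem lies.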
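The comparison in 1.1 and 1.2 can be turned into a small triage helper. The sketch below assumes that "NRPE: Unable to read output" means the NRPE daemon answered but the command it ran printed nothing on standard output; classify_nrpe_result and its labels are hypothetical names made up for illustration, not part of NRPE.

```shell
#!/bin/sh
# classify_nrpe_result: hypothetical helper, not part of NRPE.
# Takes the text printed by check_nrpe and names the layer that most likely failed.
classify_nrpe_result() {
    case "$1" in
        "NRPE: Unable to read output"*)
            # The daemon replied, but the configured command gave it no stdout.
            echo "agent-ok-command-silent" ;;
        "")
            # check_nrpe itself printed nothing at all.
            echo "no-reply" ;;
        *)
            # Any other text is whatever the plugin (or check_nrpe) reported.
            echo "command-produced-output" ;;
    esac
}

# Possible use against a real host (HOST is a placeholder):
#   for c in check_users check_mysql_status; do
#       out=$(/usr/lib/nagios/plugins/check_nrpe -H "$HOST" -c "$c")
#       printf '%s: %s\n' "$c" "$(classify_nrpe_result "$out")"
#   done
```

If one check lands in "agent-ok-command-silent" while the others produce output, the network path and the NRPE daemon are probably fine and the investigation can move to that one command on the monitored host.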

 

 

2. Check on the monitored MySQL server

2.1 Calling check_nrpe locally returns the same error:

[root@mysqldb ~]# /usr/lib/nagios/plugins/check_nrpe -Hlocalhost -c check_mysql_status

NRPE: Unable to read output

[root@mysqldb ~]#

 

Next, run the check_mysql_status command from /etc/nagios/nrpe.cfg on its own.

First use cat and grep to find the command line for check_mysql_status:

[root@mysqldb ~]# cat /etc/nagios/nrpe.cfg |grep check_mysql_status

command[check_mysql_status]=/usr/bin/sudo  /usr/lib/nagios/plugins/check_mysql -unagios -P3306 -s /usr/local/mysql/mysql.sock -Hlocalhost --password='nagiosq@0512' -d test 

Running that command by hand as root works and prints normal output:

[root@mysqldb ~]# /usr/bin/sudo  /usr/lib/nagios/plugins/check_mysql -unagios -P3306 -s /usr/local/mysql/mysql.sock -Hlocalhost --password='nagiosq@0512' -d test

Uptime: 1122870  Threads: 108  Questions: 11559152  Slow queries: 1278  Opens: 3190  Flush tables: 1  Open tables: 395  Queries per second avg: 10.294|Connections=844188c;;; Open_files=49;;; Open_tables=395;;; Qcache_free_memory=209024;;; Qcache_hits=51724c;;; Qcache_inserts=73877c;;; Qcache_lowmem_prunes=5599c;;; Qcache_not_cached=2572345c;;; Qcache_queries_in_cache=1985;;; Queries=11559153c;;; Questions=10724833c;;; Table_locks_waited=0c;;; Threads_connected=107;;; Threads_running=2;;; Uptime=1122870c;;;

[root@mysqldb ~]#

This shows that the check_mysql plugin itself works correctly.
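Instead of copying the command line by eye, it can be pulled out of the configuration file directly. This is a minimal sketch that assumes definitions have the form command[name]=... at the start of a line; nrpe_command is a hypothetical helper name.

```shell
#!/bin/sh
# nrpe_command: hypothetical helper that prints the command line defined for
# a given name in an nrpe.cfg-style file (lines of the form command[name]=...).
nrpe_command() {
    name=$1
    cfg=$2
    prefix="command[$name]="
    # Literal prefix match at column 1, so commented-out definitions are skipped;
    # everything after the prefix is the command line.
    awk -v p="$prefix" 'index($0, p) == 1 { print substr($0, length(p) + 1) }' "$cfg"
}

# Possible use on the monitored host:
#   nrpe_command check_mysql_status /etc/nagios/nrpe.cfg
```

Note that running the extracted command from an interactive root shell is not the same environment the NRPE daemon uses: the daemon runs it as a different user and without a terminal, so a command can succeed here and still fail under NRPE.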

 

2.2 Check the execute permission on check_mysql

[root@mysqldb ~]# ll /usr/lib/nagios/plugins/check_mysql

-rwxrwxr-x. 1 root root 168272 7月   8 14:54 /usr/lib/nagios/plugins/check_mysql

[root@mysqldb ~]#

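The trailing x in the mode string shows that other users have execute permission. Switch to the nagios account with su and see whether it can run the plugin: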

[root@mysqldb ~]# su - nagios

-bash-4.1$ /usr/lib/nagios/plugins/check_mysql -unagios -P3306 -s /usr/local/mysql/mysql.sock -Hlocalhost --password='nagiosq@0512' -d test

Uptime: 1124403  Threads: 106  Questions: 11586454  Slow queries: 1278  Opens: 3190  Flush tables: 1  Open tables: 395  Queries per second avg: 10.304|Connections=846235c;;; Open_files=49;;; Open_tables=395;;; Qcache_free_memory=211696;;; Qcache_hits=51786c;;; Qcache_inserts=73915c;;; Qcache_lowmem_prunes=5732c;;; Qcache_not_cached=2578541c;;; Qcache_queries_in_cache=1890;;; Queries=11586455c;;; Questions=10750088c;;; Table_locks_waited=0c;;; Threads_connected=105;;; Threads_running=2;;; Uptime=1124403c;;;

-bash-4.1$

This shows that the nagios account can also run check_mysql, so neither the plugin path nor its execute permission is the problem.
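The permission reading in 2.2 can be expressed as a small test. The sketch below assumes the nagios account is neither the file owner nor a member of the file's group, so the "other" permission class is the one that applies; others_can_execute is a hypothetical helper name.

```shell
#!/bin/sh
# others_can_execute: hypothetical helper. Given the mode string printed by
# `ls -l` (for example "-rwxrwxr-x."), succeed if the "other" class may execute.
others_can_execute() {
    mode=$1
    # Character 10 of the mode string is the execute slot for "other":
    # x = executable, t = executable with the sticky bit set.
    flag=$(printf '%s' "$mode" | cut -c10)
    case "$flag" in
        x|t) return 0 ;;
        *)   return 1 ;;
    esac
}

# Possible use:
#   mode=$(ls -l /usr/lib/nagios/plugins/check_mysql | cut -d' ' -f1)
#   others_can_execute "$mode" && echo "nagios can execute the plugin"
```

A more direct alternative, where sudo is available to root, is to ask the kernel as the target user: sudo -u nagios test -x /usr/lib/nagios/plugins/check_mysql.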

 

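2.3 Check the sudo permissions for nagios

Nagios runs every check_xxx plugin remotely as the nagios account. My NRPE client, however, was installed as root, so all the check_xxx plugins are owned by root, and the check_mysql_status command in nrpe.cfg is wrapped in /usr/bin/sudo. When NRPE runs that command as nagios there is no terminal and nobody to type a password, so sudo presumably refuses to run check_mysql and nothing reaches standard output. To address this, edit the sudo configuration: comment out Defaults    requiretty and add the line nagios ALL=(ALL) NOPASSWD:/usr/lib/nagios/plugins/check_mysql.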

 

vim  /etc/sudoers


# Commented out so that sudo no longer requires a controlling terminal (tty)

#Defaults    requiretty 

 

# Allow the nagios user to run check_mysql through sudo without a password.

nagios ALL=(ALL) NOPASSWD:/usr/lib/nagios/plugins/check_mysql

 

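After the change, save and quit vim with :wq! (the forced write is needed because /etc/sudoers is read-only), then rerun the local check_nrpe test. It is back to normal: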

[root@mysqldb ~]# /usr/lib/nagios/plugins/check_nrpe -Hlocalhost -c check_mysql_status

Uptime: 1123659  Threads: 110  Questions: 11573270  Slow queries: 1278  Opens: 3190  Flush tables: 1  Open tables: 395  Queries per second avg: 10.299|Connections=845248c;;; Open_files=49;;; Open_tables=395;;; Qcache_free_memory=227704;;; Qcache_hits=51751c;;; Qcache_inserts=73892c;;; Qcache_lowmem_prunes=5656c;;; Qcache_not_cached=2575554c;;; Qcache_queries_in_cache=1943;;; Queries=11573271c;;; Questions=10737891c;;; Table_locks_waited=0c;;; Threads_connected=109;;; Threads_running=2;;; Uptime=1123659c;;;

 

Then run check_nrpe again from the Nagios server. It also works normally:

[root@mysqlvm2 ~]# /usr/lib/nagios/plugins/check_nrpe  -H192.xx.180.xx -c check_mysql_status

Uptime: 1123673  Threads: 110  Questions: 11573464  Slow queries: 1278  Opens: 3190  Flush tables: 1  Open tables: 395  Queries per second avg: 10.299|Connections=845264c;;; Open_files=49;;; Open_tables=395;;; Qcache_free_memory=227704;;; Qcache_hits=51751c;;; Qcache_inserts=73892c;;; Qcache_lowmem_prunes=5656c;;; Qcache_not_cached=2575596c;;; Qcache_queries_in_cache=1943;;; Queries=11573465c;;; Questions=10738069c;;; Table_locks_waited=0c;;; Threads_connected=109;;; Threads_running=2;;; Uptime=1123673c;;;

[root@mysqlvm2 ~]#
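A syntax error in sudoers can lock out sudo entirely, so it may be worth generating and validating the rule from 2.3 before installing it. This is a sketch under assumptions: sudoers_rule is a hypothetical helper, the drop-in file name nagios-check is made up, and it presumes /etc/sudoers includes the /etc/sudoers.d directory.

```shell
#!/bin/sh
# sudoers_rule: hypothetical helper that prints a NOPASSWD rule for one user
# and one command, in the same form as the line added in section 2.3.
sudoers_rule() {
    user=$1
    cmd=$2
    case "$cmd" in
        /*) ;;   # sudoers expects fully qualified command paths
        *)  echo "command must be an absolute path" >&2; return 1 ;;
    esac
    printf '%s ALL=(ALL) NOPASSWD: %s\n' "$user" "$cmd"
}

# Possible use, as root (file name and sudoers.d support are assumptions):
#   tmp=$(mktemp)
#   sudoers_rule nagios /usr/lib/nagios/plugins/check_mysql > "$tmp"
#   visudo -cf "$tmp" && install -m 0440 "$tmp" /etc/sudoers.d/nagios-check
#   sudo -l -U nagios    # list what nagios is now allowed to run
```

A narrower alternative to commenting out requiretty for everyone is a per-user default such as Defaults:nagios !requiretty, which lifts the terminal requirement for the nagios account only.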

 

2.4 Back in the Nagios web interface, the MySQL service check has recovered, as shown in the screenshot:
[Image 1: screenshot of the Nagios monitoring page]


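3. Some other possible causes

  Many things can trigger the "NRPE: Unable to read output" error. A Google search turned up these other cases:

  (1) The client configuration file /etc/nagios/nrpe.cfg does not list the Nagios server's IP address, for example allowed_hosts=127.0.0.1,<server IP> with the server IP after 127.0.0.1 missing or mistyped.

  (2) The plugin files on the client are not readable and executable by the nagios user. If nagios lacks permission, grant execute (x) permission.

  (3) The command paths in nrpe.cfg are wrong. Hosts can have both RPM and source installations, and the paths differ: a source install of the Nagios client puts the plugin at /usr/local/nagios/libexec/check_mysql, while the RPM package puts it at /usr/lib/nagios/plugins/check_mysql.

  (4) The client configuration file defines the same command name twice, for example /etc/nagios/nrpe.cfg contains these two check_zombie_procs definitions: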

command[check_zombie_procs]=/usr/lib/nagios/plugins/check_procs -w 10 -c 15 -s Z
command[check_zombie_procs]=/usr/lib/nagios/plugins/check_iostat -w

        In that case NRPE reportedly returns "NRPE: Unable to read output", because the two definitions conflict and it is unclear which one should run.
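Cause (4) is easy to check mechanically. The sketch below lists command names that are defined more than once, assuming active definitions start at column 1 in the form command[name]=...; duplicate_nrpe_commands is a hypothetical helper name.

```shell
#!/bin/sh
# duplicate_nrpe_commands: hypothetical helper that prints every command name
# defined more than once in an nrpe.cfg-style file, one name per line.
duplicate_nrpe_commands() {
    # Keep only active "command[name]=" lines, extract the name,
    # then let uniq -d report each name that appears at least twice.
    sed -n 's/^command\[\([^]]*\)\]=.*/\1/p' "$1" | sort | uniq -d
}

# Possible use:
#   duplicate_nrpe_commands /etc/nagios/nrpe.cfg
```

An empty result means no duplicate names were found in that file; definitions pulled in from other included files would need to be checked separately.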

 

 Source: http://blog.itpub.net/blog/post/id/1217246/

 

Reference article: http://blog.csdn.net/kakane/article/details/9615795
