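I recently tinkered with ndoutils, hoping to load all of my Nagios data into MySQL. That way the data from several Nagios hosts can be stored centrally in one database and pulled out later for comparison and analysis.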

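Installation went smoothly. I built from source, and after make it was just a matter of copying the config files and binaries into the right directories. Plenty of similar tutorials can be found online.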

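When I finally ran it, though, only part of the Nagios data made it into the database. Many tables were empty, including the ones for configuration data, and the tables for real-time host status and service status were incomplete as well. I spent a long time on the configuration, repeatedly comparing it against other people's setups and the official one without finding anything wrong. The debug file had little to offer either: just the MySQL statements being executed.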

Eventually I found the following line in /var/log/messages, followed that hint to look for a fix, and finally solved the problem.

Here is the analysis:

ndo2db: Warning: Retrying message send. This can occur because you have too few messages allowed or too few total bytes allowed in message queues. You are currently using 64 of 16 messages and 65536 of 65536 bytes in the queue. See README for kernel tuning options.
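To see at a glance which limit is being hit, the numbers can be pulled out of that warning with a small filter. This is only a sketch: `parse_ndo_warning` is a name I made up (it is not part of ndoutils), and the log path in the usage comment assumes syslog writes to /var/log/messages as it did on my machine.

```shell
#!/bin/sh
# parse_ndo_warning: hypothetical helper, not shipped with ndoutils.
# Reads log lines on stdin and, for every ndo2db queue warning, prints
#   <messages used> <messages allowed> <bytes used> <bytes allowed>
# The pattern accepts both "messages" and the README's spelling "mesages".
parse_ndo_warning() {
    sed -n 's/.*using \([0-9][0-9]*\) of \([0-9][0-9]*\) mess*ages and \([0-9][0-9]*\) of \([0-9][0-9]*\) bytes.*/\1 \2 \3 \4/p'
}

# Usage (log path is an assumption; adjust it to your syslog setup):
#   grep 'ndo2db: Warning: Retrying message send' /var/log/messages | parse_ndo_warning
```

If the first pair or the second pair shows the used value at or above the allowed value, that is the limit to look at in the README's tuning notes.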

The warning above appears in the system log, and a web search turns up matching solutions.

For example: http://bbs.ixdba.net/viewthread.php?tid=633


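In fact, the very end of the README in the source directory already covers this. During installation I rushed to start everything right after configuring it and did not notice that kernel parameters need to be tuned.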

The README says the following:

************************
TUNING KERNEL PARAMETERS
************************

NDOUTILS uses a single message queue to communicate between the broker
module and the NDO2DB daemon. Depending on the operating system, there
may be parameters that need to be tuned in order for this communication
to work correctly. The discussion below applies specifically to Linux,
but may apply generally to other Unices as well.

There are three Linux kernel parameters that determine the resources
provided to the messaging subsystem:
    * kernel.msgmax is the maximum size of a single message in a
        message queue
    * kernel.msgmni is the maximum number of messages allowed in any
        one message queue
    * kernel.msgmnb is the total number of bytes allow in all messages
        in any one message queue

To see the current values for any of these parameters, cat
/proc/sys/kernel/msg{max|mni|mnb}.

In order for NDOUTILS to work at all, kernel.msgmax must be greater than
the size of the queue_msg struct (currently 1026 bytes). Most Linux
distributions set kernel.msgmax to a default of 65536.

If there are insufficient resources for sending messages between the
broker and the daemon, you will see an entry similar to the following
in your logs. (This is logged via the syslog facility, using the level
LOG_ERR and the default facility.)

    ndo2db: Warning: Retrying message send. This can occur because
    you have too few messages allowed or too few total bytes
    allowed in message queues. You are currently using 16 of 16
    mesages and 65536 of 65536 bytes in the queue.  See README for
    kernel tuning options.

If you see this entry, the message will likely eventually be sent,
but retrying uses system resources, and there is the possibility that
more messages will queued than can be handled, causing the broker
module to stall.

If you are close to or have exceeded the number of messages, you may
need to increase kernel.msgmni. If you are close to or have exceeded
the number of bytes in the queue, you may need to increase
kernel.msgmnb. In some cases you may need to increase both.

A conservative approach would be to double the necessary value, stop
and restart both the NDO2DB daemon and Nagios Core, and watch for any
further messages. Note that if NDO2DB is started after Nagios Core,
you may see the warning above as the broker module first attempts to
flush its backlog of messages.

To increase a value, echo the value to /proc/sys/kernel/msgmni or
/proc/sys/kernel/msgmnb as appropriate.

For example, to increase the number of messages allowed in the queue
to 32, use the command 'echo 32 > /proc/sys/kernel/msgmni' (without
the quotes).

Once you have determine the correct parameters, you can make them
permanent by editing /etc/sysctl.conf. Add or update the line of
the form 'kernel.msg{mni|mnb} = ' with the value(s) determined
above. The next time the system is booted, the values of the
parameters in /etc/sysctl.conf will be loaded.
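
Before changing anything, it helps to check what the kernel is currently using. The README's `cat /proc/sys/kernel/msg{max|mni|mnb}` can be wrapped in a short loop like the sketch below. `show_msg_limits` is my own name, and the optional directory argument exists only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# show_msg_limits: hypothetical helper that prints the three message queue
# limits named in the README. The directory defaults to /proc/sys/kernel.
show_msg_limits() {
    proc_dir="${1:-/proc/sys/kernel}"
    for name in msgmax msgmni msgmnb; do
        if [ -r "$proc_dir/$name" ]; then
            printf 'kernel.%s = %s\n' "$name" "$(cat "$proc_dir/$name")"
        else
            printf 'kernel.%s = (not readable)\n' "$name"
        fi
    done
}

show_msg_limits

# A temporary change, lost at reboot, is made as root in the way the README
# describes, for example:
#   echo 32 > /proc/sys/kernel/msgmni
```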


Solution:

Add the following to /etc/sysctl.conf:

# Controls the default maximum size of a message queue
kernel.msgmnb = 131072000

# Controls the maximum size of a message, in bytes
kernel.msgmax = 131072000

#kernel.msgmni is the maximum number of messages allowed in any one message queue
kernel.msgmni = 65536000
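
The README notes that values in /etc/sysctl.conf are loaded at the next boot. To avoid waiting for a reboot, the file can be reloaded with `sysctl -p` as root and then compared against the running kernel. The sketch below does the comparison. `conf_value` is a helper I made up for illustration, and the init script names in the last comment are assumptions about a typical install rather than something the README specifies.

```shell
#!/bin/sh
# conf_value: hypothetical helper. Prints the last value assigned to a key
# in a sysctl.conf-style file; comment lines starting with '#' are skipped.
# Note: dots in the key are treated as regex wildcards, which is good
# enough for a quick check like this one.
conf_value() {
    key="$1"
    file="${2:-/etc/sysctl.conf}"
    sed -n "s/^[[:space:]]*$key[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*/\1/p" "$file" | tail -n 1
}

# As root, load the file now instead of waiting for a reboot:
#   sysctl -p

# Compare what the file asks for with what the kernel is actually using.
if [ -r /etc/sysctl.conf ]; then
    for name in msgmnb msgmax msgmni; do
        wanted=$(conf_value "kernel.$name")
        running=$(cat "/proc/sys/kernel/$name" 2>/dev/null)
        printf 'kernel.%s: file=%s running=%s\n' \
            "$name" "${wanted:-unset}" "${running:-unknown}"
    done
fi

# Then stop and restart both ndo2db and Nagios, as the README advises.
# Script names below are assumptions and vary by install:
#   /etc/init.d/ndo2db restart && /etc/init.d/nagios restart
```

If a "running" value does not match the "file" value after `sysctl -p`, the kernel did not accept that setting as written, and the errors printed by `sysctl -p` are the place to look.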