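While using the Percona Zabbix MySQL template plugin I ran into a few problems, which I am recording here. If I hit more later, I will add them as well. A poor pen beats a good memory, and that really is true.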
Error while debugging:
[root@db_master_2 zabbix_agentd.d]# /usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg
ERROR: Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)
[root@db_master_2 zabbix_agentd.d]#
Solution:
Create a symlink with ln -s /usr/local/mysql/mysql.sock /var/lib/mysql/mysql.sock. After creating the symlink, zabbix_agentd has to be restarted for it to take effect.
[root@db_m2_slave1 ~]# mkdir -p /var/lib/mysql/
[root@db_m2_slave1 ~]# ln -s /usr/local/mysql/mysql.sock /var/lib/mysql/mysql.sock
[root@db_m2_slave1 ~]# killall zabbix_agentd
[root@db_m2_slave1 ~]# /usr/sbin/zabbix_agentd -c /etc/zabbix/zabbix_agentd.conf
[root@db_m2_slave1 ~]#
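The symlink step above can be wrapped so that it is safe to re-run. This is only a minimal sketch: link_mysql_socket is a hypothetical helper name (not part of the Percona scripts), and the paths you pass in are assumed to match your own installation.

```shell
# link_mysql_socket REAL_SOCK EXPECTED_SOCK
# Hypothetical helper: create EXPECTED_SOCK as a symlink to REAL_SOCK,
# creating the parent directory first.
# Does nothing if EXPECTED_SOCK already exists (file, socket or symlink).
link_mysql_socket() {
    local real="$1" expected="$2"
    if [ -e "$expected" ] || [ -L "$expected" ]; then
        echo "already present: $expected"
        return 0
    fi
    mkdir -p "$(dirname "$expected")" || return 1
    ln -s "$real" "$expected"
}
```

On this host the call would be link_mysql_socket /usr/local/mysql/mysql.sock /var/lib/mysql/mysql.sock, followed by the zabbix_agentd restart shown above.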
Symptom: the agent can get the data, but the server cannot:
(1) On the agentd side, the MySQL server returns data:
[root@db_m2_slave1 ~]# /usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg
gg:6
[root@db_m2_slave1 ~]#
(2) zabbix-server cannot get the data:
[root@zabbix_serv_121_12 scripts]# /usr/local/zabbix/bin/zabbix_get -s 192.161.3.72 -p10050 -k "MySQL.Threads-connected"
ERROR: run the command manually to investigate the problem: /usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg
[root@zabbix_serv_121_12 scripts]#
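So where is the problem? To answer that, start from how zabbix-server and zabbix-agentd work together. The server asks the agent for the key MySQL.Threads-connected, and the agent resolves that key through the UserParameter definitions in /etc/zabbix/zabbix_agentd.d/userparameter_percona_mysql.conf. So the next step is to look up how userparameter_percona_mysql.conf obtains the value for this key.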
[root@db_m2_slave1 ~]# more /etc/zabbix/zabbix_agentd.d/userparameter_percona_mysql.conf |grep MySQL.Threads-connected
UserParameter=MySQL.Threads-connected,/var/lib/zabbix/percona/scripts/get_mysql_stats_wrapper.sh iu
[root@db_m2_slave1 ~]#
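The lookup above can be done with an exact match on the key, so that the dot in the key name is not treated as a grep wildcard. A minimal sketch; userparam_cmd is a hypothetical helper name, and it assumes each definition sits on one line in the form UserParameter=key,command.

```shell
# userparam_cmd KEY CONF_FILE
# Print the command configured for KEY in a UserParameter file.
userparam_cmd() {
    local key="$1" conf="$2"
    # Lines look like: UserParameter=<key>,<command>
    # index(...) == 1 means the line starts with the literal prefix,
    # so the key is compared as plain text, not as a regular expression.
    awk -v k="UserParameter=$key," \
        'index($0, k) == 1 { print substr($0, length(k) + 1) }' "$conf"
}
```

For example, userparam_cmd MySQL.Threads-connected /etc/zabbix/zabbix_agentd.d/userparameter_percona_mysql.conf prints the wrapper command. Running that command as the account the agent uses (assumed here to be zabbix, for instance via sudo -u zabbix) reproduces more closely what the agent sees than running it as root.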
Then run that command by hand and look at the output. Sure enough, it fails and returns no value:
[root@db_m2_slave1 ~]# sh /var/lib/zabbix/percona/scripts/get_mysql_stats_wrapper.sh kt
ERROR: run the command manually to investigate the problem: /usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg
Tracing the script with bash -x shows that the error is raised after the test "+ '[' -e /tmp/localhost-mysql_zabbix_stats.txt ']'", as shown below:
[root@db_m2_slave1 ~]# bash -x /var/lib/zabbix/percona/scripts/get_mysql_stats_wrapper.sh kt
+ echo ''
+ ITEM=kt
+ HOST=localhost
++ dirname /var/lib/zabbix/percona/scripts/get_mysql_stats_wrapper.sh
+ DIR=/var/lib/zabbix/percona/scripts
+ CMD='/usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg'
+ CACHEFILE=/tmp/localhost-mysql_zabbix_stats.txt
+ '[' kt = running-slave ']'
+ '[' -e /tmp/localhost-mysql_zabbix_stats.txt ']'
+ /usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg
+ '[' -e /tmp/localhost-mysql_zabbix_stats.txt ']'
+ echo 'ERROR: run the command manually to investigate the problem: /usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg'
ERROR: run the command manually to investigate the problem: /usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg
So check whether that file exists. Sure enough, it does not. Create an empty file and give ownership to the zabbix account:
[root@db_m2_slave1 ~]# vim /tmp/localhost-mysql_zabbix_stats.txt
[root@db_m2_slave1 ~]# chown -R zabbix:zabbix /tmp/localhost-mysql_zabbix_stats.txt
[root@db_m2_slave1 ~]#
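The existence and ownership check above can be scripted so it is easy to repeat on other hosts. A minimal sketch; cache_owned_by is a hypothetical helper name, and the expected owner (zabbix) is an assumption about which account the agent runs as.

```shell
# cache_owned_by FILE USER
# Succeeds only if FILE exists and is owned by USER; otherwise prints why not.
cache_owned_by() {
    local file="$1" user="$2" owner
    [ -e "$file" ] || { echo "missing: $file"; return 1; }
    owner=$(stat -c %U "$file")   # GNU stat, the same tool the wrapper uses
    [ "$owner" = "$user" ] || { echo "owner is $owner, expected $user"; return 1; }
}
```

On the agent host this would be called as cache_owned_by /tmp/localhost-mysql_zabbix_stats.txt zabbix. A non-zero exit status means the file still needs to be created or chown-ed as shown above.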
Then run the command under bash -x again:
[root@db_m2_slave1 ~]# bash -x /var/lib/zabbix/percona/scripts/get_mysql_stats_wrapper.sh kt
+ echo ''
+ ITEM=kt
+ HOST=localhost
++ dirname /var/lib/zabbix/percona/scripts/get_mysql_stats_wrapper.sh
+ DIR=/var/lib/zabbix/percona/scripts
+ CMD='/usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg'
+ CACHEFILE=/tmp/localhost-mysql_zabbix_stats.txt
+ '[' kt = running-slave ']'
+ '[' -e /tmp/localhost-mysql_zabbix_stats.txt ']'
++ stat -c %Y /tmp/localhost-mysql_zabbix_stats.txt
+ TIMEFLM=1463491777
++ date +%s
+ TIMENOW=1463491788
++ expr 1463491788 - 1463491777
+ '[' 11 -gt 300 ']'
+ '[' -e /tmp/localhost-mysql_zabbix_stats.txt ']'
+ cat /tmp/localhost-mysql_zabbix_stats.txt
+ sed 's/ /\n/g; s/-1/0/g'
+ grep kt
+ awk -F: '{print $2}'
[root@db_m2_slave1 ~]#
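In the trace above the cache file is 11 seconds old (1463491788 - 1463491777), which is below the 300-second threshold, so the wrapper reads the existing file instead of re-running the PHP script. The same age check can be sketched as two small functions. The names cache_age and cache_is_stale are hypothetical, and treating a missing file as stale is a choice made here to mirror the wrapper's else branch.

```shell
# cache_age FILE: print the number of seconds since FILE was last modified.
cache_age() {
    local mtime now
    mtime=$(stat -c %Y "$1") || return 1   # GNU stat: mtime as epoch seconds
    now=$(date +%s)
    echo $(( now - mtime ))
}

# cache_is_stale FILE [MAX_AGE]
# True if FILE is older than MAX_AGE seconds (default 300, as in the wrapper)
# or if FILE does not exist at all.
cache_is_stale() {
    local age
    age=$(cache_age "$1" 2>/dev/null) || return 0   # missing file: needs a refresh
    [ "$age" -gt "${2:-300}" ]
}
```

Running cache_age /tmp/localhost-mysql_zabbix_stats.txt right after creating the file by hand gives a small number, which explains why the first run after the fix can still return an empty value until the cache is refreshed.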
Then verify from zabbix-server:
[root@zabbix_serv_121_12 scripts]# /usr/local/zabbix/bin/zabbix_get -s 192.161.3.72 -p10050 -k "MySQL.Threads-connected"
9
[root@zabbix_serv_121_12 scripts]#
The monitoring graph suddenly broke off and stopped showing data. Checking from zabbix-server gives an error:
[root@zabbix_serv_121_12 scripts]# /usr/local/zabbix/bin/zabbix_get -s 192.161.3.72 -p10050 -k "MySQL.Threads-connected"
rm: cannot remove `/tmp/localhost-mysql_cacti_stats.txt': Operation not permitted
8
[root@zabbix_serv_121_12 scripts]#
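Going to the agent to fix the ownership, the file turns out not to be found (note that the command below uses a relative path from the zabbix_agentd.d directory, not /tmp):
[root@db_m2_slave1 zabbix_agentd.d]# chown -R zabbix:zabbix localhost-mysql_cacti_stats.txt
chown: cannot access `localhost-mysql_cacti_stats.txt': No such file or directory
[root@db_m2_slave1 zabbix_agentd.d]#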
Why would the file go missing? Reading the wrapper shell script shows an rm -f $CACHEFILE step, and CACHEFILE is defined as CACHEFILE="/tmp/$HOST-mysql_cacti_stats.txt". In other words, the file is removed here. So I can try replacing the rm with a command that empties the file, echo "" > $CACHEFILE, and see what happens. The modified script:
# Added by the author. Note: HOST is only assigned two lines further down, so unless
# HOST is already set in the environment this appends to /tmp/-mysql_cacti_stats.txt.
echo "" >> /tmp/$HOST-mysql_cacti_stats.txt
ITEM=$1
HOST=localhost
DIR=`dirname $0`
CMD="/usr/bin/php -q $DIR/ss_get_mysql_stats.php --host $HOST --items gg"
#CACHEFILE="/tmp/zabbix/$HOST-mysql_cacti_stats.txt:3317"
CACHEFILE="/tmp/$HOST-mysql_cacti_stats.txt"

if [ "$ITEM" = "running-slave" ]; then
    # Check for running slave
    #RES=`HOME=~zabbix mysql -e 'SHOW SLAVE STATUS\G' | egrep '(Slave_IO_Running|Slave_SQL_Running):' | awk -F: '{print $2}' | tr '\n' ','`
    RES=`/usr/local/mysql/bin/mysql -e 'SHOW SLAVE STATUS\G' | egrep '(Slave_IO_Running|Slave_SQL_Running):' | awk -F: '{print $2}' | tr '\n' ','`
    if [ "$RES" = " Yes, Yes," ]; then
        echo 1
    else
        echo 0
    fi
    exit
elif [ -e $CACHEFILE ]; then
    # Check and run the script
    #TIMEFLM=`stat -c %Y /tmp/zabbix/$HOST-mysql_cacti_stats.txt:3317`
    TIMEFLM=`stat -c %Y /tmp/$HOST-mysql_cacti_stats.txt`
    TIMENOW=`date +%s`
    if [ `expr $TIMENOW - $TIMEFLM` -gt 300 ]; then
        # Originally: rm -f $CACHEFILE. It can also simply be commented out
        # without adding the echo "" > $CACHEFILE below.
        echo "" > $CACHEFILE
        $CMD 2>&1 > /dev/null
    fi
else
    $CMD 2>&1 > /dev/null
fi

# Parse cache file
if [ -e $CACHEFILE ]; then
    cat $CACHEFILE | sed 's/ /\n/g; s/-1/0/g'| grep $ITEM | awk -F: '{print $2}'
else
    echo "ERROR: run the command manually to investigate the problem: $CMD"
fi
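The rm most likely fails with "Operation not permitted" because /tmp normally has the sticky bit set, so a user may only delete files it owns. That is an inference, not something the output above states. Emptying the file in place avoids the delete, although it still needs write permission on the file itself. A minimal sketch of that idea; truncate_cache is a hypothetical helper name.

```shell
# truncate_cache FILE
# Empty FILE in place. The inode, owner and mode stay the same, unlike
# rm followed by re-creation, which produces a new file owned by whoever
# ran the command.
truncate_cache() {
    local file="$1"
    if [ -e "$file" ] && [ ! -w "$file" ]; then
        echo "not writable by $(id -un): $file" >&2
        return 1
    fi
    : > "$file"   # zero bytes; echo "" > FILE would leave a single newline
}
```

If the helper reports that the file is not writable, the ownership still has to be fixed once as root (chown zabbix:zabbix on the full /tmp path) before the agent can refresh the cache on its own.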
Then restart agentd and check again from zabbix-server. Values are now returned, as shown below:
[root@zabbix_serv_121_12 scripts]# /usr/local/zabbix/bin/zabbix_get -s 192.161.3.72 -p10050 -k "MySQL.innodb-transactions"
1131684198
[root@zabbix_serv_121_12 scripts]# /usr/local/zabbix/bin/zabbix_get -s 192.161.3.72 -p10050 -k "MySQL.Threads-connected"
4
[root@zabbix_serv_121_12 scripts]#
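When a key still comes back empty, the parse step at the end of the wrapper can be tried offline against a copy of the cache file. The sketch below assumes the cache holds space-separated key:value pairs (as the gg:6 output earlier suggests). parse_item is a hypothetical helper, and it deliberately differs from the wrapper in two ways: it matches the key exactly instead of using an unanchored grep, and it maps only a value of exactly -1 to 0 instead of rewriting every "-1" substring.

```shell
# parse_item ITEM FILE
# Print the value stored for ITEM in a cache file whose content is
# space-separated "key:value" pairs, e.g. "gg:6 kt:9".
parse_item() {
    local item="$1" file="$2"
    tr ' ' '\n' < "$file" |
        awk -F: -v k="$item" '$1 == k { print ($2 == "-1" ? 0 : $2) }'
}
```

For example, parse_item kt /tmp/localhost-mysql_cacti_stats.txt prints the value behind the kt key if the cache has been populated, and prints nothing if the file is empty or the key is absent.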