Monitoring Redis and Redis Cluster in Production with Zabbix

2018-10-31 15:25:46
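The company recently redesigned its website and added Redis servers, and management asked for Redis monitoring to be tested. I took a Redis monitoring script found online, made a few small changes, and ran into no problems during testing. For production, adjust the script to the parameters you actually need to monitor and add matching triggers.
Redis ships with the redis-cli client, whose info command reports how the server is running. A shell script can wrap that command, and Zabbix can call the script to monitor Redis.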

I. Using the info command
Use the info command to get the current state of Redis.
Command format:

redis-cli -h [hostname] -p [port] -a [password] info [section]
redis-cli -h [hostname] -p [port] -c [command]

1. Query server information

redis-cli -h 127.0.0.1 -p 6379 -a 'password' info server

2. Query client connections

redis-cli -h 127.0.0.1 -p 6379 -a 'password' info clients

3. Query memory usage

redis-cli -h 127.0.0.1 -p 6379 -a 'password' info memory

4. Query CPU usage

redis-cli -h 127.0.0.1 -p 6379 -a 'password' info cpu

5. Query the Redis cluster (the -c option enables cluster mode)

redis-cli -h 127.0.0.1 -p 6379 -c cluster nodes
redis-cli -h 127.0.0.1 -p 6379 -c info server
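
Each line of the info output has the form name:value, and redis-cli ends every line with a carriage return followed by a newline. The following is a minimal sketch, not part of the original scripts, of pulling a single field out of that output. The function name info_field and the sample values are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one "name:value" field read from stdin.
# Matching the whole field name keeps used_memory apart from used_memory_rss,
# and the trailing carriage return is removed from the value.
info_field() {
    awk -F':' -v key="$1" '$1 == key { sub(/\r$/, "", $2); print $2 }'
}

# Sample text shaped like `info memory` output (values invented for the demo).
printf '# Memory\r\nused_memory:1024\r\nused_memory_rss:2048\r\n' | info_field used_memory
```

Against a live server the same function would be fed from the real command, for example redis-cli -h 127.0.0.1 -p 6379 info memory | info_field used_memory.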

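II. Creating the Redis monitoring script
1. Write the monitoring script (the first script is taken from the web; the second is adapted to my own environment).
First script, from the web: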

vim /etc/zabbix/zabbix_agentd.d/redis_status.sh

#!/bin/bash
REDISCLI="/usr/local/bin/redis-cli"
HOST="127.0.0.1"
PORT=6379
PASS="password"

# One argument: print a single field from the matching INFO section.
if [[ $# == 1 ]]; then
    case $1 in
        version)
            result=$("$REDISCLI" -h "$HOST" -a "$PASS" -p "$PORT" info server | grep -w "redis_version" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        uptime)
            result=$("$REDISCLI" -h "$HOST" -a "$PASS" -p "$PORT" info server | grep -w "uptime_in_seconds" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        connected_clients)
            result=$("$REDISCLI" -h "$HOST" -a "$PASS" -p "$PORT" info clients | grep -w "connected_clients" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        blocked_clients)
            result=$("$REDISCLI" -h "$HOST" -a "$PASS" -p "$PORT" info clients | grep -w "blocked_clients" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        used_memory)
            result=$("$REDISCLI" -h "$HOST" -a "$PASS" -p "$PORT" info memory | grep -w "used_memory" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        used_memory_rss)
            result=$("$REDISCLI" -h "$HOST" -a "$PASS" -p "$PORT" info memory | grep -w "used_memory_rss" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        used_memory_peak)
            result=$("$REDISCLI" -h "$HOST" -a "$PASS" -p "$PORT" info memory | grep -w "used_memory_peak" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        used_memory_lua)
            result=$("$REDISCLI" -h "$HOST" -a "$PASS" -p "$PORT" info memory | grep -w "used_memory_lua" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        used_cpu_sys)
            result=$("$REDISCLI" -h "$HOST" -a "$PASS" -p "$PORT" info cpu | grep -w "used_cpu_sys" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        used_cpu_user)
            result=$("$REDISCLI" -h "$HOST" -a "$PASS" -p "$PORT" info cpu | grep -w "used_cpu_user" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        used_cpu_sys_children)
            result=$("$REDISCLI" -h "$HOST" -a "$PASS" -p "$PORT" info cpu | grep -w "used_cpu_sys_children" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        used_cpu_user_children)
            result=$("$REDISCLI" -h "$HOST" -a "$PASS" -p "$PORT" info cpu | grep -w "used_cpu_user_children" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        # The three status checks print 1 when the status is "ok" and 0 otherwise.
        rdb_last_bgsave_status)
            result=$("$REDISCLI" -h "$HOST" -a "$PASS" -p "$PORT" info Persistence | grep -w "rdb_last_bgsave_status" | awk -F':' '{print $2}' | grep -c ok)
            echo "$result"
            ;;
        aof_last_bgrewrite_status)
            result=$("$REDISCLI" -h "$HOST" -a "$PASS" -p "$PORT" info Persistence | grep -w "aof_last_bgrewrite_status" | awk -F':' '{print $2}' | grep -c ok)
            echo "$result"
            ;;
        aof_last_write_status)
            result=$("$REDISCLI" -h "$HOST" -a "$PASS" -p "$PORT" info Persistence | grep -w "aof_last_write_status" | awk -F':' '{print $2}' | grep -c ok)
            echo "$result"
            ;;
        *)
            echo -e "\033[33mUsage: $0 {version|uptime|connected_clients|blocked_clients|used_memory|used_memory_rss|used_memory_peak|used_memory_lua|used_cpu_sys|used_cpu_user|used_cpu_sys_children|used_cpu_user_children|rdb_last_bgsave_status|aof_last_bgrewrite_status|aof_last_write_status}\033[0m"
            ;;
    esac
# Two arguments: a database name (for example db0) and one of its keyspace statistics.
elif [[ $# == 2 ]]; then
    case $2 in
        keys)
            result=$("$REDISCLI" -h "$HOST" -a "$PASS" -p "$PORT" info | grep -w "$1" | grep -w "keys" | awk -F'=|,' '{print $2}')
            echo "$result"
            ;;
        expires)
            result=$("$REDISCLI" -h "$HOST" -a "$PASS" -p "$PORT" info | grep -w "$1" | grep -w "keys" | awk -F'=|,' '{print $4}')
            echo "$result"
            ;;
        avg_ttl)
            result=$("$REDISCLI" -h "$HOST" -a "$PASS" -p "$PORT" info | grep -w "$1" | grep -w "avg_ttl" | awk -F'=|,' '{print $6}')
            echo "$result"
            ;;
        *)
            echo -e "\033[33mUsage: $0 {db0 keys|db0 expires|db0 avg_ttl}\033[0m"
            ;;
    esac
fi
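
The two-argument branch of the script reads a keyspace line such as db0:keys=12,expires=3,avg_ttl=4500 and picks values by position. As a hedged alternative sketch, the statistic can be looked up by name instead, so the result does not depend on the order of the pairs. The function name keyspace_field and the sample line are assumptions made for this example, not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper: read INFO output on stdin and print one statistic
# (keys, expires or avg_ttl) for the given database, e.g. db0.
keyspace_field() {
    awk -F':' -v db="$1" -v stat="$2" '
        $1 == db {
            sub(/\r$/, "", $2)              # drop the trailing carriage return
            n = split($2, pairs, ",")       # keys=12 / expires=3 / avg_ttl=4500
            for (i = 1; i <= n; i++) {
                split(pairs[i], kv, "=")
                if (kv[1] == stat) print kv[2]
            }
        }'
}

# Sample keyspace line (values invented for the demo).
printf 'db0:keys=12,expires=3,avg_ttl=4500\r\n' | keyspace_field db0 expires
```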

Second script, adapted to my own environment:

vim /etc/zabbix/zabbix_agentd.d/redis_status.sh

#!/bin/bash
REDISCLI="/usr/bin/redis-cli"
HOST="127.0.0.1"
PORT=6379

# One argument: print a single field from the `cluster info` output.
if [[ $# == 1 ]]; then
    case $1 in
        # Prints 1 when the cluster state is "ok" and 0 otherwise.
        cluster_state)
            result=$("$REDISCLI" -h "$HOST" -p "$PORT" -c cluster info | grep -w "cluster_state" | awk -F':' '{print $2}' | grep -c ok)
            echo "$result"
            ;;
        cluster_slots_assigned)
            result=$("$REDISCLI" -h "$HOST" -p "$PORT" -c cluster info | grep -w "cluster_slots_assigned" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        cluster_slots_ok)
            result=$("$REDISCLI" -h "$HOST" -p "$PORT" -c cluster info | grep -w "cluster_slots_ok" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        cluster_slots_pfail)
            result=$("$REDISCLI" -h "$HOST" -p "$PORT" -c cluster info | grep -w "cluster_slots_pfail" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        cluster_slots_fail)
            result=$("$REDISCLI" -h "$HOST" -p "$PORT" -c cluster info | grep -w "cluster_slots_fail" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        cluster_known_nodes)
            result=$("$REDISCLI" -h "$HOST" -p "$PORT" -c cluster info | grep -w "cluster_known_nodes" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        cluster_size)
            result=$("$REDISCLI" -h "$HOST" -p "$PORT" -c cluster info | grep -w "cluster_size" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        cluster_current_epoch)
            result=$("$REDISCLI" -h "$HOST" -p "$PORT" -c cluster info | grep -w "cluster_current_epoch" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        cluster_my_epoch)
            result=$("$REDISCLI" -h "$HOST" -p "$PORT" -c cluster info | grep -w "cluster_my_epoch" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        cluster_stats_messages_ping_sent)
            result=$("$REDISCLI" -h "$HOST" -p "$PORT" -c cluster info | grep -w "cluster_stats_messages_ping_sent" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        cluster_stats_messages_pong_sent)
            result=$("$REDISCLI" -h "$HOST" -p "$PORT" -c cluster info | grep -w "cluster_stats_messages_pong_sent" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        cluster_stats_messages_sent)
            result=$("$REDISCLI" -h "$HOST" -p "$PORT" -c cluster info | grep -w "cluster_stats_messages_sent" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        cluster_stats_messages_ping_received)
            result=$("$REDISCLI" -h "$HOST" -p "$PORT" -c cluster info | grep -w "cluster_stats_messages_ping_received" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        cluster_stats_messages_pong_received)
            result=$("$REDISCLI" -h "$HOST" -p "$PORT" -c cluster info | grep -w "cluster_stats_messages_pong_received" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        cluster_stats_messages_received)
            result=$("$REDISCLI" -h "$HOST" -p "$PORT" -c cluster info | grep -w "cluster_stats_messages_received" | awk -F':' '{print $2}')
            echo "$result"
            ;;
        *)
            echo -e "\033[33mUsage: $0 {cluster_state|cluster_slots_assigned|cluster_slots_ok|cluster_slots_pfail|cluster_slots_fail|cluster_known_nodes|cluster_size|cluster_current_epoch|cluster_my_epoch|cluster_stats_messages_ping_sent|cluster_stats_messages_pong_sent|cluster_stats_messages_sent|cluster_stats_messages_ping_received|cluster_stats_messages_pong_received|cluster_stats_messages_received}\033[0m"
            ;;
    esac
fi
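
The cluster_state item turns the textual state into a number so that a Zabbix trigger can compare it with 0. A small sketch of that conversion on its own is shown here, assuming the cluster info output arrives on stdin. The function name cluster_state_flag and the sample lines are illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: read `cluster info` output on stdin and print
# 1 when cluster_state is ok, 0 in every other case (including empty input,
# which is what the pipeline sees when Redis cannot be reached).
cluster_state_flag() {
    awk -F':' '
        $1 == "cluster_state" { sub(/\r$/, "", $2); state = $2 }
        END { flag = (state == "ok") ? 1 : 0; print flag }'
}

# Sample output (values invented for the demo).
printf 'cluster_state:ok\r\ncluster_slots_assigned:16384\r\n' | cluster_state_flag
```

Printing 0 for missing input mirrors the grep -c ok used in the script, which also reports 0 when nothing matches.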

Next, continue with the remaining steps:
2. Make the script executable

chmod +x /etc/zabbix/zabbix_agentd.d/redis_status.sh

3. Test the script
Check the number of Redis client connections:

/etc/zabbix/zabbix_agentd.d/redis_status.sh connected_clients

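III. Creating the Redis monitoring configuration file
1. Write the Redis monitoring configuration file. The script path in the UserParameter line must point to wherever redis_status.sh is actually stored (the example uses /data/zabbix/scripts/, while the script above was created under /etc/zabbix/zabbix_agentd.d/), and the key name must match the key used by the items. The template imported later uses Redis.Info[...] (with two parameters for the db0 statistics, which requires passing $1 $2) and Redis.Status, so rename either the key or the items so that they agree.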

vim /etc/zabbix/zabbix_agentd.d/redis.conf

UserParameter=Redis.status[*],/data/zabbix/scripts/redis_status.sh $1
UserParameter=Redisfile,redis-cli -h 127.0.0.1 -p 6379 -c cluster nodes | awk -F ',' '{print $2}' | grep -c 'fail'
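
The Redisfile parameter counts lines of cluster nodes output whose text after the first comma contains "fail", which also matches the tentative flag fail? that a node carries while it is only suspected of failing. The sketch below is one possible stricter variant, not the author's method: it inspects the flags column (the third space-separated field) and counts only the exact flag fail. The function name count_failed_nodes and the sample node lines are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `cluster nodes` output on stdin and count the
# nodes whose flags column (third field) contains the exact flag "fail".
# The tentative "fail?" flag is deliberately not counted.
count_failed_nodes() {
    awk '{
            n = split($3, flags, ",")
            for (i = 1; i <= n; i++)
                if (flags[i] == "fail") { failed++; break }
         }
         END { print failed + 0 }'
}

# Sample node lines (IDs and addresses invented for the demo).
printf '%s\n' \
    'a1 10.0.0.1:6379@16379 myself,master - 0 0 1 connected 0-5460' \
    'b2 10.0.0.2:6379@16379 master,fail - 1 2 2 disconnected 5461-10922' \
    'c3 10.0.0.3:6379@16379 slave,fail? a1 0 3 1 connected' | count_failed_nodes
```

If logic like this were saved as a script, the UserParameter line could call that script instead of the inline pipeline. Whether suspected failures should be counted as well is a monitoring decision, so the original one-liner may be the better fit in some setups.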

2. Restart zabbix-agent (replace <PID> with the process ID of zabbix_agentd shown by ps)

ps -ef | grep zabbix_agentd
kill -9 <PID>
/data/zabbix/sbin/zabbix_agentd -c /data/zabbix/etc/zabbix_agentd.conf

Or:

systemctl restart zabbix-agent

3. Test from the Zabbix server (the key passed with -k must be one defined by a UserParameter on the agent)

zabbix_get -s 192.168.2.235 -p 10050 -k "Redis.Info[used_cpu_user]"
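
When a key is not defined on the agent, zabbix_get prints an error string such as ZBX_NOTSUPPORTED instead of a value, and a numeric item cannot store that. As a small hedged sketch, a returned value can be checked for being a plain number before the item is trusted. The function name is_number is hypothetical, and the two values in the loop stand in for real zabbix_get output.

```shell
#!/bin/sh
# Hypothetical check: succeed only when the argument looks like a plain
# non-negative integer or decimal number.
is_number() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+(\.[0-9]+)?$'
}

# Stand-in values: one valid reading and one error string.
for value in 0.25 ZBX_NOTSUPPORTED; do
    if is_number "$value"; then
        echo "$value: numeric"
    else
        echo "$value: not numeric"
    fi
done
```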

IV. Creating and importing the monitoring template

1. Create the monitoring template file
The contents of redis-template.xml are as follows:



(The XML markup of redis-template.xml is not preserved in this copy; only its values remain. They describe the following, and the complete file is available from the download links below.)

Export version: 2.0, dated 2014-08-07T10:04:35Z
Template: RedisMontior, in group Templates
Trigger: "Redis is down", expression {RedisMontior:Redis.Status.last(0)}=0, priority 5 (Disaster)

Graphs (each 900 x 200, all items from host RedisMontior):
Redis Client: Redis.Info[blocked_clients], Redis.Info[connected_clients]
Redis CPU: Redis.Info[used_cpu_sys], Redis.Info[used_cpu_user], Redis.Info[used_cpu_sys_children], Redis.Info[used_cpu_user_children]
Redis DbKeys: Redis.Info[db0,avg_ttl], Redis.Info[db0,expires], Redis.Info[db0,keys]
Redis Memory: Redis.Info[used_memory], Redis.Info[used_memory_lua], Redis.Info[used_memory_peak], Redis.Info[used_memory_rss]
Redis WriteStatus: Redis.Info[aof_last_bgrewrite_status], Redis.Info[rdb_last_bgsave_status], Redis.Info[aof_last_write_status]





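Template download link:
https://pan.baidu.com/s/1Iqqr_ad1V_Vzt9H4DyDoeA
Password: m15h
2. Import the monitoring template
Go to Configuration, then Templates, then Import.

Click "Choose File", locate redis-template.xml, and import it.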

V. Adding the monitoring template to a host


VI. Monitoring results



Script and template download link:
https://pan.baidu.com/s/1UnS4e4FFv1rZW-ctdrdQHA
Password: 1glw

If you have any questions about the above, feel free to get in touch.
QQ: 936172842

Reposted from: https://blog.51cto.com/13120271/2317181
