序
本文主要研究一下redis的slowlog与latency monitor
slowlog
redis在2.2.12版本引入了slowlog,用于记录超过指定执行时间的命令,这个执行时间不包括诸如与客户端通信的IO操作耗时,是实实在在的命令执行的耗时。主要有如下操作:
查看slowlog的数量
127.0.0.1:6379> slowlog len
(integer) 1024
查看slowlog的执行耗时阈值
127.0.0.1:6379> config get slowlog-log-slower-than
1) "slowlog-log-slower-than"
2) "1000"
设置slowlog的执行耗时阈值
127.0.0.1:6379> config set slowlog-log-slower-than 1000
OK
查看slowlog保存数量的阈值
127.0.0.1:6379> config get slowlog-max-len
1) "slowlog-max-len"
2) "1024"
设置slowlog保存数量的阈值
127.0.0.1:6379> config set slowlog-max-len 1024
OK
查询slowlog
127.0.0.1:6379> slowlog get 1
1) 1) (integer) 76016
2) (integer) 1537250266
3) (integer) 48296
4) 1) "COMMAND"
第一行是命令id,第二行是timestamp,第三行是执行耗时,第四行是命令及参数
清除slowlog记录
127.0.0.1:6379> slowlog reset
OK
latency monitor
redis在2.8.13版本引入了latency monitoring,这里主要是监控latency spikes(延时毛刺
)。它基于事件机制进行监控,command事件是监控命令执行latency,fast-command事件是监控时间复杂度为O(1)及O(logN)命令的latency,fork事件则监控redis执行系统调用fork(2)的latency。主要有如下操作:
设置/开启latency monitor
127.0.0.1:6379> config set latency-monitor-threshold 100
OK
读取latency monitor配置
127.0.0.1:6379> config get latency-monitor-threshold
1) "latency-monitor-threshold"
2) "100"
获取最近的latency
127.0.0.1:6379> debug sleep 1
OK
(1.01s)
127.0.0.1:6379> debug sleep .25
OK
127.0.0.1:6379> latency latest
1) 1) "command"
2) (integer) 1537268070
3) (integer) 250
4) (integer) 1010
返回事件名、发生的时间戳、最近的延时(毫秒
)、最大的延时(毫秒
)
查看某一事件的延时历史
127.0.0.1:6379> latency history command
1) 1) (integer) 1537268064
2) (integer) 1010
2) 1) (integer) 1537268070
2) (integer) 250
查看事件延时图
127.0.0.1:6379> latency reset command
(integer) 0
127.0.0.1:6379> debug sleep .1
OK
127.0.0.1:6379> debug sleep .2
OK
127.0.0.1:6379> debug sleep .3
OK
127.0.0.1:6379> debug sleep .4
OK
127.0.0.1:6379> debug sleep .5
OK
(0.50s)
127.0.0.1:6379> latency graph
(error) ERR syntax error
127.0.0.1:6379> latency graph command
command - high 500 ms, low 100 ms (all time high 500 ms)
--------------------------------------------------------------------------------
_#
_||
_|||
_||||
22117
4062s
ssss
重置/清空事件数据
127.0.0.1:6379> latency reset command
(integer) 1
127.0.0.1:6379> latency history command
(empty list or set)
127.0.0.1:6379> latency latest
(empty list or set)
诊断建议
127.0.0.1:6379> latency doctor
Dave, I have observed latency spikes in this Redis instance. You don't mind talking about it, do you Dave?
1. command: 6 latency spikes (average 257ms, mean deviation 142ms, period 3.83 sec). Worst all time event 500ms.
I have a few advices for you:
- Check your Slow Log to understand what are the commands you are running which are too slow to execute. Please check http://redis.io/commands/slowlog for more information.
- Deleting, expiring or evicting (because of maxmemory policy) large objects is a blocking operation. If you have very large objects that are often deleted, expired, or evicted, try to fragment those objects into multiple smaller objects.
小结
- redis的slowlog在2.2.12版本引入,latency monitor在2.8.13版本引入
- slowlog仅仅是记录纯命令的执行耗时,不包括与客户端的IO交互及redis的fork等耗时
- latency monitor监控的latency spikes则范围广一点,不仅包括命令执行,也包括fork(2)系统调用,key过期等操作的耗时
doc
- slowlog
- latency-monitor
- latency-monitor中文版