Redis Cache Analysis: Big Key Analysis

Once application nodes write cached or persistent data into Redis, most DBA or DevOps workflows inevitably involve Redis cache analysis, which mainly consists of big key and hot key analysis.

Analysis with redis-cli --bigkeys

redis-cli -h ${HowUger_redis_addr} -p ${HowUger_redis_port} -a ${HowUger_redis_auth} --bigkeys

# Scanning the entire keyspace to find biggest keys as well as
# average sizes per key type.  You can use -i 0.1 to sleep 0.1 sec
# per 100 SCAN commands (not usually needed).
[00.00%] Biggest zset   found so far 'testzset' with 129 members
[00.00%] Biggest hash   found so far 'h2' with 513 fields
[00.00%] Biggest set    found so far 'si1' with 5 members
[00.00%] Biggest hash   found so far 'h4' with 514 fields
[00.00%] Biggest string found so far 'key' with 9 bytes
-------- summary -------
Sampled 9 keys in the keyspace!
Total key length in bytes is 27 (avg len 3.00)
Biggest string found 'key' has 9 bytes
Biggest    set found 'si1' has 5 members
Biggest   hash found 'h4' has 514 fields
Biggest   zset found 'testzset' has 129 members
1 strings with 9 bytes (11.11% of keys, avg size 9.00)
0 lists with 0 items (00.00% of keys, avg size 0.00)
2 sets with 8 members (22.22% of keys, avg size 4.00)
4 hashs with 1541 fields (44.44% of keys, avg size 385.25)
2 zsets with 132 members (22.22% of keys, avg size 66.00)
0 streams with 0 entries (00.00% of keys, avg size 0.00)

A quick summary:

  1. --bigkeys walks the entire keyspace with SCAN, which is incremental, so there is no need to worry about it blocking Redis; on a busy instance you can additionally throttle it with -i (e.g. -i 0.1).
  2. --bigkeys is mainly suited to analysis scenarios where string is the dominant storage type.
  3. For list, set, zset, hash, and the other Redis data types, --bigkeys measures size by element count rather than bytes, so for non-string keys its conclusions are fuzzy and of little practical value; see the sketch after this list for a byte-accurate alternative.
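A minimal sketch of both points, assuming a local instance on the default port and the key h4 from the sample output above (MEMORY USAGE requires Redis 4.0+):

# Throttled scan: sleep 0.1 s per 100 SCAN commands to ease pressure on a busy instance
redis-cli -h 127.0.0.1 -p 6379 --bigkeys -i 0.1

# Byte-accurate size of a single key; SAMPLES 0 examines every element
# instead of sampling, at a proportionally higher cost
redis-cli -h 127.0.0.1 -p 6379 MEMORY USAGE h4 SAMPLES 0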

Serialization analysis with debug object key

HowUger_redis:orz> hmset myhash k1 v1 k2 v2 k3 v3
OK
HowUger_redis:orz> debug object myhash
Value at:0x7f005c6920a0 refcount:1 encoding:ziplist serializedlength:36 lru:3341677 lru_seconds_idle:2

The basic output fields of debug object key are:

  • Value at: memory address of the key's value
  • refcount: reference count
  • encoding: internal encoding type
  • serializedlength: length of the value after serialization, in bytes
  • lru_seconds_idle: idle time in seconds

The official documentation advises against calling debug object from clients; in its own words:

DEBUG OBJECT is a debugging command that should not be used by clients.

A few caveats:

  1. serializedlength is the key's length after serialization, not its true in-memory size, just as an array's length after json_encode differs from its real footprint in memory.
  2. serializedlength may apply compression to strings; a string with a very high compression ratio will look deceptively small when keys are compared by this number (see the sketch after this list).
  3. Computing serializedlength is relatively expensive, so use it sparingly and with care. (Most cloud vendors' DCS services disable client calls to debug object outright.)
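A minimal sketch of the compression pitfall, assuming a local instance with the default RDB compression enabled and Redis 4.0+ for MEMORY USAGE; the key name bigstr is made up for this demo:

# A 10 KB string of one repeated character compresses extremely well
redis-cli set bigstr "$(head -c 10000 /dev/zero | tr '\0' 'a')"

# Actual allocated memory: roughly 10 KB plus allocator overhead
redis-cli MEMORY USAGE bigstr

# serializedlength reflects the LZF-compressed RDB form and is far smaller
redis-cli debug object bigstr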

RDB file analysis with rdbtools

rdb -c memory /tmp/HowUger_redis_dump.rdb

database,type,key,size_in_bytes,encoding,num_elements,len_largest_element,expiry
0,hash,data:index_flow_yingshi,10492,hashtable,1,8992,2019-01-14T08:20:10.236000
0,hash,data:index_movie,22068,hashtable,7,2896,2019-01-14T07:29:19.685000
0,string,block:index_module_novel,8296,string,7694,7694,2019-01-13T00:27:46.128000
0,string,block:index_bottom_baike_aikan,8296,string,7632,7632,2019-01-14T02:27:11.850000
0,string,block:index_bottom_tools,5224,string,4549,4549,2019-01-13T01:02:09.171000
0,string,block:index_module_travel,7272,string,6408,6408,2019-01-13T00:43:39.478000
...
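For a large dump the full CSV gets unwieldy; rdbtools can filter while parsing. A sketch, assuming the same dump path as above:

# Only keys occupying at least 128 bytes, written to a CSV report
rdb -c memory /tmp/HowUger_redis_dump.rdb --bytes 128 -f /tmp/memory.csv

# Only the 10 largest keys in the dump
rdb -c memory /tmp/HowUger_redis_dump.rdb --largest 10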

Summary notes:

  1. Requires Python 2.4 or later.
  2. Distributed as a git repository (https://github.com/sripathikrishnan/redis-rdb-tools).
  3. Works offline, on the rdb dump file.
  4. Ships more than the rdb command: there are also redis-memory-for-key, the RdbParser API, and so on; see the sketch after this list.
  5. The optional dependency python-lzf speeds up parsing considerably.
  6. Features include rdb file analysis, JSON or CSV reports, single-key memory analysis, an rdb parser, and more; see the git repo for the full list.
  7. It parses the keys and values in the rdb file and back-calculates each kv pair's size in memory, accounting for the data type, the length of the key itself, memory allocation, and other factors. The resulting sizes are not exact, but they are entirely sufficient for big key statistics.
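A sketch of installation plus the single-key tool mentioned in item 4, reusing the connection variables from the bigkeys example; redis-memory-for-key queries a live instance, so no rdb file is needed:

# Install rdbtools together with the optional python-lzf accelerator
pip install rdbtools python-lzf

# Memory analysis of one live key (myhash from the debug object example)
redis-memory-for-key -s ${HowUger_redis_addr} -p ${HowUger_redis_port} -a ${HowUger_redis_auth} myhash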
