I wrote this article while taking stock of resources for the upcoming Double 11 (Singles' Day) promotion. During the review I found that our department's Redis deployment, 720 GB in total, was nearly full. The usual reflex would be to contact procurement and add more memory, but curiosity about what the Redis cluster actually stores made me analyze it first. That analysis became this article, and I hope it gives anyone who wants a closer look at Redis memory usage a starting point.
There are two open-source tools on GitHub for analyzing Redis rdb files: redis-rdb-tools and rdr. redis-rdb-tools can list every key along with the space it occupies; rdr can list every key but does not compute per-key memory usage, though it additionally provides a graphical interface.
Installation (redis-rdb-tools)
Rdbtools is a parser for Redis' dump.rdb files.
The parser generates events similar to an XML SAX parser, and is very efficient memory-wise.
In addition, rdbtools provides utilities to:
1. Generate a memory report of your data across all databases and keys
2. Convert dump files to JSON
3. Compare two dump files using standard diff tools
Pre-requisites:
1. python-lzf is optional but highly recommended to speed up parsing.
2. redis-py is optional and only needed to run test cases.
To install from PyPI (recommended):
1. pip install rdbtools python-lzf
To install from source:
1. git clone https://github.com/sripathikrishnan/redis-rdb-tools
2. cd redis-rdb-tools
3. sudo python setup.py install
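Once the install finishes, a quick sanity check is to run the rdb command with its help flag (assuming pip put the rdb entry point on your PATH; the exact option list varies between rdbtools versions):
> rdb --help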
Feature analysis (redis-rdb-tools)
Generate all keys/values
> rdb --command json /var/redis/6379/dump.rdb
[{
"user003":{"fname":"Ron","sname":"Bumquist"},
"lizards":["Bush anole","Jackson's chameleon","Komodo dragon","Ground agama","Bearded dragon"],
"user001":{"fname":"Raoul","sname":"Duke"},
"user002":{"fname":"Gonzo","sname":"Dr"},
"user_list":["user003","user002","user001"]},{
"baloon":{"helium":"birthdays","medical":"angioplasty","weather":"meteorology"},
"armadillo":["chacoan naked-tailed","giant","Andean hairy","nine-banded","pink fairy"],
"aroma":{"pungent":"vinegar","putrid":"rotten eggs","floral":"roses"}
}]
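The JSON dump walks every database in the file. If only a subset of keys is interesting, rdbtools also accepts a key filter; per the rdbtools documentation this is a regular expression passed via --key (adjust the pattern to your own naming scheme):
> rdb --command json --key "user.*" /var/redis/6379/dump.rdb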
Generate all keys/values along with the memory each occupies
> rdb -c memory /var/redis/6379/dump.rdb --bytes 128 -f memory.csv
> cat memory.csv
database,type,key,size_in_bytes,encoding,num_elements,len_largest_element
0,list,lizards,241,quicklist,5,19
0,list,user_list,190,quicklist,3,7
2,hash,baloon,138,ziplist,3,11
2,list,armadillo,231,quicklist,5,20
2,hash,aroma,129,ziplist,3,11
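Because the report is plain CSV, standard Unix tools are enough to surface the biggest keys. For example, sorting on the size_in_bytes column (field 4) after skipping the header line:
> tail -n +2 memory.csv | sort -t, -k4 -n -r | head -20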
Installation (rdr)
Download: http://ohjx11q65.bkt.clouddn.com/rdr
Grant execute permission: $ chmod a+x ./rdr
Feature analysis (rdr)
$ ./rdr keys example.rdb
portfolio:stock_follower_count:ZH314136
portfolio:stock_follower_count:ZH654106
portfolio:stock_follower:ZH617824
portfolio:stock_follower_count:ZH001019
portfolio:stock_follower_count:ZH346349
portfolio:stock_follower_count:ZH951803
portfolio:stock_follower:ZH924804
portfolio:stock_follower_count:INS104806
Screenshot of the rdr tool in use
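The screenshot above is rdr's built-in web dashboard. Going by the rdr project's README (the show subcommand and the -p port flag are taken from there, not verified against this particular binary), the dashboard is started against one or more rdb files and then viewed in a browser:
$ ./rdr show -p 8080 example.rdb
Then open http://localhost:8080.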
Locating hash-type keys
from redis import Redis
import threading

# one host:port entry per cluster node; SCAN only walks a single node,
# so every node has to be scanned separately
conn_list = [
    "1.1.1.1:2222",
    "2.2.2.2:222",
]

def scan_key(conn):
    host, port = conn.split(":")
    print(host, port)
    redis_client = Redis(host=host, port=int(port))
    cursor = 0
    while True:
        # scan for keys matching the prefix, e.g. fuck_redis, 400 keys per iteration
        scan_result = redis_client.scan(cursor, "fuck_redis*", 400)
        cursor = scan_result[0]
        key_list = scan_result[1]
        if len(key_list) > 0:
            for key in key_list:
                # redis_client.delete(key)
                print(key, redis_client.ttl(key))
        if cursor == 0:
            # a cursor of 0 means the whole keyspace has been traversed
            break

# start one scanning thread per node
for conn in conn_list:
    new_thread = threading.Thread(target=scan_key, args=(conn,))
    new_thread.start()
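The script deliberately uses SCAN with a cursor and a count hint rather than KEYS: each call only touches a bounded slice of the keyspace, so the node is never blocked for long. And because SCAN works per node rather than cluster-wide, every node in conn_list gets its own connection and its own scanning thread.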
Deleting hash-type keys
from rediscluster import StrictRedisCluster

nodes = [
    {"host": "1.1.1.1", "port": "6451"}
]

# redis 3.0 cluster, so deletion goes through the cluster client
redis_cluster = StrictRedisCluster(startup_nodes=nodes, decode_responses=True)

# hash keys to be deleted
key_list = ["xxx__ooo_0"]

# walk every key slated for deletion and remove it gently
for hash_key in key_list:
    cursor = 0
    count = 0
    while True:
        # gentle deletion: HSCAN the hash roughly 200 fields at a time
        scan_result = redis_cluster.hscan(hash_key, cursor, count=200)
        cursor = scan_result[0]
        result = scan_result[1]
        for key in result:
            redis_cluster.hdel(hash_key, key)
            count += 1
            if count % 200 == 0:
                # print progress every 200 deleted fields
                print(hash_key, key)
        if cursor == 0:
            # all fields removed; now drop the emptied key itself
            redis_cluster.delete(hash_key)
            break
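Deleting a huge hash with a single DEL blocks the Redis event loop for the whole deletion and can stall other clients, so the script walks the hash with HSCAN and removes roughly 200 fields per round to spread the cost out. On Redis 4.0 and later, UNLINK is a simpler alternative that frees the key in a background thread, but the cluster here is 3.0, hence the progressive approach.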
Only by digging into the details can a problem be solved at its root. Scaling up without any analysis merely papers over the underlying issues, and that is not an acceptable approach.