1. First, its default hash algorithm is MurmurHash 2.0.
Readers who want to learn more about this algorithm can head over to: http://murmurhash.googlepages.com/
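2. Next, let's look at how it is actually implemented.
Since we want horizontal partitioning (sharding), we need multiple redis server instances as well as a hashing rule. These two requirements apply not only to databases: other NoSQL stores, and search engines that shard their data, work the same way.
2.1 Let me first describe it in plain language. The sharding rule is based on the key you pass in. The key is run through the hash algorithm to get a hash value, that hash value is used to pick the matching redis server out of the shard collection, and the request is then sent to that server. The more involved parts are how this collection is built and how the server for a given hash value is found. Let's let the code speak: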
Here is the code that initializes the shard collection:
Java code
private void initialize(List<S> shards) {
    // Sorted map from a virtual node's hash to the shard that owns it
    nodes = new TreeMap<Long, S>();
    for (int i = 0; i != shards.size(); ++i) {
        final S shardInfo = shards.get(i);
        if (shardInfo.getName() == null) {
            // Unnamed shard: virtual node keys are derived from the shard's index
            for (int n = 0; n < 160 * shardInfo.getWeight(); n++) {
                nodes.put(this.algo.hash("SHARD-" + i + "-NODE-" + n), shardInfo);
            }
        } else {
            // Named shard: virtual node keys are derived from the name and weight
            for (int n = 0; n < 160 * shardInfo.getWeight(); n++) {
                nodes.put(this.algo.hash(shardInfo.getName() + "*" + shardInfo.getWeight() + n), shardInfo);
            }
        }
        resources.put(shardInfo, shardInfo.createResource());
    }
}
Next is the code that finds the shard for a given key:
Java code
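public S getShardInfo(byte[] key) {
    // All virtual nodes whose hash is >= the key's hash
    SortedMap<Long, S> tail = nodes.tailMap(algo.hash(key));
    if (tail.size() == 0) {
        // No such node: wrap around to the first node in the map
        return nodes.get(nodes.firstKey());
    }
    return tail.get(tail.firstKey());
}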
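(1) As we can see, by default each key in the shard collection is the hash of ("SHARD-" + i + "-NODE-" + n), where i is the shard's index and n runs from 0 to 160 * weight - 1 (0 to 159 for a weight of 1). Each shard therefore appears in the collection as 160 * weight virtual nodes, which should be there to spread keys more evenly across the shards (if any expert knows better, please enlighten me ^^).
(2) The lookup part may look a little odd, but it is easy to follow once explained. The key is the byte array of the key you passed in (UTF-8 encoded). It is hashed first, and the hash is then used in a tailMap operation on the shard collection. tailMap returns the (sorted) part of the map whose keys are greater than or equal to the given value. Still not clear? See the code -_-!!!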
Java code
TreeMap<Long, String> treeMap = new TreeMap<Long, String>();
treeMap.put(1L, "A");
treeMap.put(2L, "B");
treeMap.put(4L, "E");
treeMap.put(3L, "C");
treeMap.put(5L, "F");
System.out.println(treeMap.tailMap(3L));
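The output is: {3=C, 4=E, 5=F}
The first value of that result is then returned. If the result is empty (the key's hash is greater than every node key), the first node of the whole map is returned instead, so the lookup wraps around.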
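The two steps above (building the collection with 160 virtual nodes per shard, then the tailMap-style lookup) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `hash64` is a stand-in built on MD5 so that the example needs nothing beyond the standard library, whereas Jedis uses MurmurHash by default, so the shard chosen for a given key will not match Jedis. The names `build_ring`, `shard_for_hash` and `get_shard` are made up for this sketch, and the weight is assumed to be 1.

```python
import bisect
import hashlib
from collections import Counter

VIRTUAL_NODES = 160  # per shard, i.e. the 160 * weight loop with weight assumed to be 1


def hash64(data: bytes) -> int:
    # Stand-in hash (assumption): first 8 bytes of MD5 as an unsigned integer.
    # Jedis uses MurmurHash by default; MD5 only keeps this sketch stdlib-only.
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big")


def build_ring(shards):
    """Return (sorted_keys, ring) with VIRTUAL_NODES entries per shard."""
    ring = {}
    for i, shard in enumerate(shards):
        for n in range(VIRTUAL_NODES):
            ring[hash64(f"SHARD-{i}-NODE-{n}".encode("utf-8"))] = shard
    return sorted(ring), ring


def shard_for_hash(sorted_keys, ring, h):
    """Same idea as tailMap(h).firstKey(), wrapping around when the tail is empty."""
    idx = bisect.bisect_left(sorted_keys, h)  # index of the first node key >= h
    if idx == len(sorted_keys):               # empty tail: fall back to the first node
        idx = 0
    return ring[sorted_keys[idx]]


def get_shard(sorted_keys, ring, key: str):
    return shard_for_hash(sorted_keys, ring, hash64(key.encode("utf-8")))


shards = ["localhost:10000", "localhost:10001", "localhost:10002", "localhost:10003"]
sorted_keys, ring = build_ring(shards)

# Count how 10000 sample keys are distributed over the four shards.
counts = Counter(get_shard(sorted_keys, ring, f"key-{k}") for k in range(10000))
for shard in shards:
    print(shard, counts[shard])
```

Keeping the node keys in a sorted list and using `bisect_left` gives the same "first key greater than or equal to the hash" result as `tailMap`, without re-sorting on every lookup. Lowering `VIRTUAL_NODES` to 1 and rerunning is a quick way to see how much less even the counts tend to become.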
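OK, with the principle understood, a Python version is not hard to write. First we need to install two libraries, pyhash and redis (both can be installed with easy_install).
The implementation is as follows:
Python code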
import redis, pyhash

hasher = pyhash.murmur2_x64_64a()


class ShardJedis(object):
    '''
    A sharded redis client ported from jedis (the redis Java client).
    '''

    def __init__(self, redisServerStrs):
        # Map each virtual node's hash to its "host:port" string,
        # 160 virtual nodes per server.
        self.nodesMap = {}
        for i in range(len(redisServerStrs)):
            for n in range(160):
                hashKey = "SHARD-" + str(i) + "-NODE-" + str(n)
                self.nodesMap[self.getHash(hashKey)] = redisServerStrs[i]

    def changePyLong2JavaLong(self, pyLong):
        '''
        change py long value to java long
        '''
        if pyLong > 9223372036854775807:  # max long value in java
            pyLong = pyLong - 2**64
        return pyLong

    def getHash(self, key):
        '''
        get hash value by murmur2_x64_64a
        '''
        hashCode = hasher(key, seed=0x1234ABCD)
        return self.changePyLong2JavaLong(hashCode)

    def getShardInfo(self, key):
        hashKey = self.getHash(key)
        nodeKeys = sorted(self.nodesMap.keys())
        # Default to the first node, so a hash greater than every node key wraps around.
        resultKey = nodeKeys[0]
        for nodeKey in nodeKeys:
            if nodeKey >= hashKey:
                resultKey = nodeKey
                break
        return self.nodesMap.get(resultKey)

    def getRedis(self, key):
        '''
        get redis client by key
        '''
        redisInfo = self.getShardInfo(key)
        redisInfos = redisInfo.split(":")
        print(redisInfo)
        # The port is the second element of the split list, not the second character of the string.
        return redis.StrictRedis(host=redisInfos[0], port=int(redisInfos[1]))
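The conversion to a Java long matters because a Java TreeMap with Long keys orders them as signed 64-bit values, while the code above assumes the Python hash comes back as an unsigned integer. Comparing in unsigned order could therefore pick a different node than the Java client would. Below is a minimal sketch of the conversion and of how the two orderings differ. The function names `to_java_long` and `to_java_long_struct` are made up for this illustration, and the sample values are arbitrary.

```python
import struct


def to_java_long(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as a signed 64-bit (Java long) value."""
    if not 0 <= value < 2**64:
        raise ValueError("expected an unsigned 64-bit value")
    return value - 2**64 if value >= 2**63 else value


def to_java_long_struct(value: int) -> int:
    """Same reinterpretation, done by packing as unsigned and unpacking as signed."""
    return struct.unpack("<q", struct.pack("<Q", value))[0]


hashes = [1, 2**63, 2**64 - 1]            # arbitrary sample values
unsigned_order = sorted(hashes)            # how plain Python ints compare
signed_order = sorted(hashes, key=to_java_long)  # how Java longs would compare

print(unsigned_order)
print(signed_order)
```

In signed order the two large values become negative and sort before 1, so the "first key greater than or equal to the hash" can be a different node than in unsigned order.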
Here is the test code:
Python code
def testGetRedis(self):
    serverInfos = ["localhost:10000", "localhost:10001", "localhost:10002", "localhost:10003"]
    # The class defined above is ShardJedis
    shardJedis = ShardJedis(serverInfos)
    # Named "client" so that it does not shadow the redis module
    client = shardJedis.getRedis("aa")
    print(client)
This is a Python 2 implementation and its functionality is fairly simple. If you spot any problems, corrections are welcome!