原创作品,转载请标明:http://blog.csdn.net/Xiejingfa/article/details/51072326
在前面的一篇文章【Redis源码剖析】 - Redis内置数据结构之双向链表中,我们介绍了Redis封装的一种“传统”双向链表list,分别使用prev、next指针来指向当前节点的前一个节点和下一个节点。这种数据结构需要花费额外的空间存储两个指针,空间利用率不高。比如,考虑比较极端的情况,如果每个节点保存的只是一个char类型的字符,在32位机上每个节点需要占用9个字节,而其中只有1个字节用来存储数据,其余8个字节需要用来存储指针。为了解决这个问题,Redis提供了一种称作ziplist的数据结构,称作压缩列表。ziplist使用一串字符串来实现双向链表结构。与list结构相比,ziplist可以减少存储空间(主要是节省了链表指针的存储空间开销),ziplist在实现上并不存储上指向上一个链表节点和指向下一个链表节点的指针,而转为存储上一个节点长度和当前节点长度从而在大多数情况下节省很多空间开销。
ziplist结构主要定义在ziplist.h和ziplist.c两个文件中,下面我们就来看看ziplist在Redis中的实现。
ziplist的存储结构如下图:
上图中各个域的作用如下:
为了快速取出ziplist中各个域的数据,ziplist.c中定义了下面的宏操作:
/* Utility macros */
/* 下面是一些可以用于直接定位的工具宏 */
/* 访问ziplist的zlbytes字段 */
#define ZIPLIST_BYTES(zl) (*((uint32_t*)(zl)))
/* 访问ziplist的zltail字段*/
#define ZIPLIST_TAIL_OFFSET(zl) (*((uint32_t*)((zl)+sizeof(uint32_t))))
/* 获取ziplist的zllen字段 */
#define ZIPLIST_LENGTH(zl) (*((uint16_t*)((zl)+sizeof(uint32_t)*2)))
/* ziplist头部长度: 4字节的zlbytes + 4字节的zltail + 2字节的zllen */
#define ZIPLIST_HEADER_SIZE (sizeof(uint32_t)*2+sizeof(uint16_t))
/* 获取ziplist的第一个节点的首地址 */
#define ZIPLIST_ENTRY_HEAD(zl) ((zl)+ZIPLIST_HEADER_SIZE)
/* 获取ziplist的最后一个节点的首地址 */
#define ZIPLIST_ENTRY_TAIL(zl) ((zl)+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl)))
/* 获取ziplist的结尾符 */
#define ZIPLIST_ENTRY_END(zl) ((zl)+intrev32ifbe(ZIPLIST_BYTES(zl))-1)
前面我们看到,一个ziplist可以包含多个节点(在ziplist称作entry),每个节点可以保存一个整数或一个字符数组(或称作字符串,但这里并不是C风格字符串,因为其结尾不含‘\0’。为了方面后面我们统一称为字符串)。
节点的定义如下:
/* 压缩链表结构体 */
typedef struct zlentry {
// prevrawlen为上一个链表节点占用的长度
// prevrawlensize为存储上一个链表节点的长度数值所需要的字节数
unsigned int prevrawlensize, prevrawlen;
// len为当前链表节点占用的长度
// lensize为存储当前链表节点长度数值所需要的字节数
unsigned int lensize, len;
// 当前链表节点的头部大小(prevrawlensize + lensize),即非数据域的大小
unsigned int headersize;
// 编码方式
unsigned char encoding;
// 压缩链表以字符串的形式保存,该指针指向当前节点起始位置
unsigned char *p;
} zlentry;
如果光看zlentry结构体的定义,我们很难想象每个节点在ziplist是如何存储的,这里我们暂且将zlentry抛在一边,先来看看节点在ziplist的存储结构。如下图:
ziplist中的节点包括三部分信息:
(1)、第一部分存储的是上一个链表节点的长度prev_entry_length,利用这个值就可以通过指针运算直接跳转到前一个节点。
prev_entry_length域,也就是上一个链表节点的长度占用的字节数根据编码类型而定,可能是1个字节或者5个字节:
(2)第二部分存储的是编码类型encoding和当前节点的长度cur_entry_length。其中encoding域占用2个bit,这样encoding就有四种不同的取值,分别是00、01、10、11。redis规定如下:
具体如下:
对于字符串编码,规则如下:
编码 | 编码长度 | 含义 |
---|---|---|
00pppppp | 1 byte | 表示长度小于等于63(只有后六位存放字符串长度,2^6 - 1 = 63)字节的字符串,后6位用于存储字符串长度。 |
01pppppp,qqqqqqqq | 2 bytes | 表示长度小于等于16383(2^14 - 1)字节的字符串,后14用于存储字符串长度 |
10______,qqqqqqqq,rrrrrrrr,ssssssss,tttttttt | 5 bytes | 表示长度大于等于16384字节的字符串,前1个字节的后6位无意义,后4个字节用来存储字符串长度 |
对于整形编码,规则如下:
编码 | 编码长度 | 含义 |
---|---|---|
11000000 | 1 byte | int16_t整型 |
11010000 | 1 byte | int32_t整型 |
11100000 | 1 byte | int64_t整型 |
11110000 | 1 byte | 24bit有符号整数 |
11111110 | 1 byte | 8bit有符号整型 |
1111xxxx | 1 byte | 4bit无符号整型,表示[0,12]范围的数 |
(3)、第三部分为数据data域,即该节点真正存放的数据,其类型由第二部分的encoding和cur_entry_length决定。
介绍了ziplist节点的存储结构,我们可以看到zlentry结构和节点在ziplist的真实的存储结构并不是一一对应的。为了便于理解我们可以将第一部分prev_entry_length域看做对prevrawlensize、prevrawlen字段的抽象,将第二部分cur_entry_length域看做是对lensize、len字段的抽象。另外,我们经常需要跳过节点的header部分(第一部分和第二部分)读取节点真正存储的数据,所以zlentry结构定义了headersize字段记录节点头部长度。我们看看ziplist怎么从一段字符数组转换为zlentry结构:
/* 将p指向的内容解析为一个链表节点zlentry结构并返回 */
static zlentry zipEntry(unsigned char *p) {
zlentry e;
// 求出上一个节点的长度prevrawlen和存储该数值所占用的字节数prevrawlensize
ZIP_DECODE_PREVLEN(p, e.prevrawlensize, e.prevrawlen);
// 求出当前节点的长度len和存储该长度值所占用的字节数lensize,以及编码方式encoding
ZIP_DECODE_LENGTH(p + e.prevrawlensize, e.encoding, e.lensize, e.len);
e.headersize = e.prevrawlensize + e.lensize;
e.p = p;
return e;
}
到这里我们可以看到,整个ziplist的存储结构是这样的:
当一个新的节点插入链表时,如果原节点的prevlen不足以保存新节点的长度,那么就需要对原节点的空间进行扩展,也就是从1个字节扩展到5个字节。特别是这种扩展操作又可能导致下一个节点需要扩展……如此反复,这种情况在多个连续节点的长度都接近254(上一节点长度小于254只要1个字节保存即可)的时候很可能发生。
__ziplistCascadeUpdate就是用来处理这种级联扩展操作。
另外,还可能出现相反的情况:因为插入节点的长度比较小而引起连续的缩小操作。但是,为了避免出现“扩展-缩小-扩展-缩小”这种“抖动”情况反复出现,redis对这种因插入节点的长度较小而引起的缩小操作采取“不处理”的策略,也就是任由prevlen比所需的长度长。
该函数返回更新后的ziplist,参数p指向需要扩展prevlensize的节点首地址。该函数具体如下:
static unsigned char *__ziplistCascadeUpdate(unsigned char *zl, unsigned char *p) {
size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), rawlen, rawlensize;
size_t offset, noffset, extra;
unsigned char *np;
zlentry cur, next;
// 从p指向的节点开始遍历到ziplist列表尾部
while (p[0] != ZIP_END) {
// 取得当前节点
cur = zipEntry(p);
// 当前节点的占用的字节数
rawlen = cur.headersize + cur.len;
// 存储rawlen所需要的字节数
rawlensize = zipPrevEncodeLength(NULL,rawlen);
/* Abort if there is no next entry. */
// 如果到达表尾,直接退出
if (p[rawlen] == ZIP_END) break;
// 获得下一个节点
next = zipEntry(p+rawlen);
/* Abort when "prevlen" has not changed. */
// 如果下一个节点的prevlen等于当前节点的rawlen,则此后的节点都无需调整,直接退出
if (next.prevrawlen == rawlen) break;
// 下一个节点的长度空间不足,需要进行扩展操作
if (next.prevrawlensize < rawlensize) {
/* The "prevlen" field of "next" needs more bytes to hold * the raw length of "cur". */
// 下面的ziplistResize发生了空间的重新分配,所以需要记录p对于zl的偏移量
offset = p-zl;
// 求出需要扩展的字节数
extra = rawlensize-next.prevrawlensize;
zl = ziplistResize(zl,curlen+extra);
// ziplistResize发生了空间的重新分配,这里重新获取p指针
p = zl+offset;
/* Current pointer and offset for next element. */
// 新的下一个节点的首地址
np = p+rawlen;
noffset = np-zl;
/* Update tail offset when next element is not the tail element. */
// zl+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))为最后一个节点的首地址
// 如果下一个节点不是最后一个节点,发生扩展操作需要更新最后一个节点的偏移量
if ((zl+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))) != np) {
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+extra);
}
/* Move the tail to the back. */
// 这里将原节点的下一个节点的数据区域到ziplist尾部的全部数据向后偏移,空余出rawlensize个字节
// 用来存放上一个节点的长度
memmove(np+rawlensize,
np+next.prevrawlensize,
curlen-noffset-next.prevrawlensize-1);
// 空余出来的rawlensize个字节用来存储上一个节点的长度
zipPrevEncodeLength(np,rawlen);
/* Advance the cursor */
// 指向下一个节点
p += rawlen;
// 更新当前节点的长度
curlen += extra;
} else {
// 如果下一节点的长度空间有冗余,则不进行压缩以防止“抖动”现象。
if (next.prevrawlensize > rawlensize) {
/* This would result in shrinking, which we want to avoid. * So, set "rawlen" in the available bytes. */
zipPrevEncodeLengthForceLarge(p+rawlen,rawlen);
} else {
zipPrevEncodeLength(p+rawlen,rawlen);
}
/* Stop here, as the raw length of "next" has not changed. */
break;
}
}
return zl;
}
删除节点后也需要考虑级联扩展的情况。
/* Delete "num" entries, starting at "p". Returns pointer to the ziplist. */
/* 从p指针开始删除num个节点*/
static unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsigned int num) {
unsigned int i, totlen, deleted = 0;
size_t offset;
int nextdiff = 0;
zlentry first, tail;
// 需要删除的首个节点
first = zipEntry(p);
for (i = 0; p[0] != ZIP_END && i < num; i++) {
// 偏移到下一个节点
p += zipRawEntryLength(p);
// 统计待删除节点数量
deleted++;
}
// 得到待删除节点的字节总数
totlen = p-first.p;
if (totlen > 0) {
// 注意此时p指向的是待删除节点后第一个不被删除的节点
if (p[0] != ZIP_END) {
/* Storing `prevrawlen` in this entry may increase or decrease the
* number of bytes required compare to the current `prevrawlen`.
* There always is room to store this, because it was previously
* stored by an entry that is now being deleted. */
// 计算待删除的第一个节点first的prevrawlensize与p节点prevrawlensize的差值
nextdiff = zipPrevLenByteDiff(p,first.prevrawlen);
// 根据nextdiff的值对p进行前移或后移操作,用来保存frist节点上一个节点的长度,即first.prevrawlen
p -= nextdiff;
// 删除后first节点前一个节点的下一个节点就是p节点,更新p节点的prevrawlen数值
zipPrevEncodeLength(p,first.prevrawlen);
/* Update offset for tail */
// 更新最后一个节点的偏移量
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))-totlen);
/* When the tail contains more than one entry, we need to take
* "nextdiff" in account as well. Otherwise, a change in the
* size of prevlen doesn't have an effect on the *tail* offset. */
tail = zipEntry(p);
// 如果p节点不是尾节点,尾节点的偏移量还需要加上nextdiff值
if (p[tail.headersize+tail.len] != ZIP_END) {
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
}
/* Move tail to the front of the ziplist */
// 删除first.p到p节点之间的节点,其实就是简单的数据移动操作
// 这里为什么要减1呢,因为zlend不需要处理,在后面的ziplistResize中重新设置了zlend
memmove(first.p,p,
intrev32ifbe(ZIPLIST_BYTES(zl))-(p-zl)-1);
} else {
/* The entire tail was deleted. No need to move memory. */
// 如果已经删除到zlend,值最后一个节点就是first节点的前一个节点,需要更新其偏移量
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe((first.p-zl)-first.prevrawlen);
}
/* Resize and update length */
// 重新调整ziplist大小,更新其长度
offset = first.p-zl;
zl = ziplistResize(zl, intrev32ifbe(ZIPLIST_BYTES(zl))-totlen+nextdiff);
ZIPLIST_INCR_LENGTH(zl,-deleted);
p = zl+offset;
/* When nextdiff != 0, the raw length of the next entry has changed, so
* we need to cascade the update throughout the ziplist */
// 如果nextdiff的值不为0,说明p节点的长度发生改变,需要执行级联更新操作
if (nextdiff != 0)
zl = __ziplistCascadeUpdate(zl,p);
}
return zl;
}
/* 在p节点前插入一个新节点。各参数的含义如下: zl:ziplist首地址 p:插入位置 s:待插入字符串的首地址 slen:带插入字符串长度 */
static unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {
size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), reqlen;
unsigned int prevlensize, prevlen = 0;
size_t offset;
int nextdiff = 0;
unsigned char encoding = 0;
long long value = 123456789; /* initialized to avoid warning. Using a value that is easy to see if for some reason we use it uninitialized. */
zlentry tail;
/* Find out prevlen for the entry that is inserted. */
if (p[0] != ZIP_END) {
// 如果p节点后面还有节点,取出p节点前一个节点的长度信息和存储该长度值所需要的字节数信息
ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
} else {
// 如果p节点为ziplist结束标识,则取出尾节点,即最后一个节点
unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl);
if (ptail[0] != ZIP_END) {
prevlen = zipRawEntryLength(ptail);
}
}
/* See if the entry can be encoded */
// 尝试看能否将s保存为整数,如果可以则返回1,且value和encoding分别保存新值和编码信息
if (zipTryEncoding(s,slen,&value,&encoding)) {
/* 'encoding' is set to the appropriate integer encoding */
// 如果s可以保存为整数,则进一步计算保存该数值所需要的字节数
reqlen = zipIntSize(encoding);
} else {
/* 'encoding' is untouched, however zipEncodeLength will use the * string length to figure out how to encode it. */
// 如果s不能保存为整数,则直接使用其字符串长度
reqlen = slen;
}
/* We need space for both the length of the previous entry and * the length of the payload. */
// 计算编码prevlen所需要的字节数,prevlen用于保存前一个节点的长度
reqlen += zipPrevEncodeLength(NULL,prevlen);
// 计算编码slen所需要的长度
reqlen += zipEncodeLength(NULL,encoding,slen);
/* When the insert position is not equal to the tail, we need to * make sure that the next entry can hold this entry's length in * its prevlen field. */
// 当插入的位置不是ziplist尾部时,需要确保下一个节点(即p节点)的prevlen能够用来保存即将插入节点的长度
// 这里计算两者差值
nextdiff = (p[0] != ZIP_END) ? zipPrevLenByteDiff(p,reqlen) : 0;
/* Store offset because a realloc may change the address of zl. */
// ziplistResize操作会重新分配空间,需要事前记录p节点偏移量
offset = p-zl;
zl = ziplistResize(zl,curlen+reqlen+nextdiff);
// 重新取得p节点
p = zl+offset;
/* Apply memory move when necessary and update tail offset. */
if (p[0] != ZIP_END) {
/* Subtract one because of the ZIP_END bytes */
/* 将原来 p-nextdiff 开始的数据全部后移,中间出现reqlen个字节保存即将插入的数据 主要需要考虑一下几种情况: nextdiff == 0:p节点中用来存储原先前一个节点长度信息的数据区域正好保存待插入节点的长度 nextdiff == 4:原先p节点只需要1个字节来存储上一个节点的长度,现在需要5个字节。那就将p-4后面的数据偏移到p+reqlen nextdiff == -4:原先p节点需要5个字节来存储上一个节点的长度,现在只需要1个字节。那就将p+4后面的数据偏移到p+reqlen */
memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff);
/* Encode this entry's raw length in the next entry. */
// 为p节点的prevlen设置新值,即待插入节点的长度
zipPrevEncodeLength(p+reqlen,reqlen);
/* Update offset for tail */
// 更新尾节点偏移量
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+reqlen);
/* When the tail contains more than one entry, we need to take * "nextdiff" in account as well. Otherwise, a change in the * size of prevlen doesn't have an effect on the *tail* offset. */
tail = zipEntry(p+reqlen);
// 同样,如果p节点不是尾节点,尾节点的偏移量还需要加上nextdiff值
if (p[reqlen+tail.headersize+tail.len] != ZIP_END) {
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
}
} else {
/* This element will be the new tail. */
// 如果p节点指向zlend,更新zltail值,待添加节点为尾部节点
ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(p-zl);
}
/* When nextdiff != 0, the raw length of the next entry has changed, so * we need to cascade the update throughout the ziplist */
// 同样,如果nextdiff的值不为0,说明原节点节点(此时的首地址为p+reqlen)的长度发生改变,需要执行级联更新操作
if (nextdiff != 0) {
offset = p-zl;
zl = __ziplistCascadeUpdate(zl,p+reqlen);
p = zl+offset;
}
// 下面才是真正执行插入操作
/* Write the entry */
// 填写上一节点的长度
p += zipPrevEncodeLength(p,prevlen);
// 填写当前节点的长度
p += zipEncodeLength(p,encoding,slen);
// 根据编码方式执行相应的插入操作
if (ZIP_IS_STR(encoding)) {
memcpy(p,s,slen);
} else {
zipSaveInteger(p,value,encoding);
}
// 长度加1
ZIPLIST_INCR_LENGTH(zl,1);
return zl;
}
ziplist最关键的操作就是上面介绍的三个函数,这三个函数的重点其实就是“空间的再分配 + memmove操作维护后续节点的链接性”。剩余的一些我们就不一一分析了。我把注释版的代码贴在本文后面供大家参考。
这里总结一下ziplist的优缺点:
下面是完整的注释版代码(ziplist.c):
/* The ziplist is a specially encoded dually linked list that is designed * to be very memory efficient. It stores both strings and integer values, * where integers are encoded as actual integers instead of a series of * characters. It allows push and pop operations on either side of the list * in O(1) time. However, because every operation requires a reallocation of * the memory used by the ziplist, the actual complexity is related to the * amount of memory used by the ziplist. * ziplist是一个经过特殊编码的双向链接表,其内存操作非常高效。ziplist可以存放字符串和整型,并 * 且它在头部和尾部支持O(1)的push和pop操作。但是每次操作涉及内存的重新分配释放,所以ziplist得 * 实际复杂度与其使用的内存空间相关 * * ---------------------------------------------------------------------------- * * ziplist的整体存储结构如下: * ZIPLIST OVERALL LAYOUT: * The general layout of the ziplist is as follows: * <zlbytes><zltail><zllen><entry><entry><zlend> * * <zlbytes> is an unsigned integer to hold the number of bytes that the * ziplist occupies. This value needs to be stored to be able to resize the * entire structure without the need to traverse it first. * * zlbytes是一个4字节无符号整形,存储的是整个ziplist占用的字节数。它主要用于重新分配内存时使用, * 这样就不必遍历整个列表以确定其长度。 * * <zltail> is the offset to the last entry in the list. This allows a pop * operation on the far side of the list without the need for full traversal. * zltail是一个4字节无符号整形,存储的是链表最后一个节点的偏移值,即链表开头地址 + zltail的值为 * 最后一个节点的起始地址。这样对链表尾部执行pop操作时就无需遍历链表(以找到最后一个节点) * * <zllen> is the number of entries.When this value is larger than 2**16-2, * we need to traverse the entire list to know how many items it holds. * zllen是一个2字节无符号整形,存储的是链表中的节点总数。当这个值超过2^16-2时就需要遍历整个链表 * 来获取链表的节点总数 * * <zlend> is a single byte special value, equal to 255, which indicates the * end of the list. * zlend是一个链表尾部的占位符,表示链表结束。它占用1个字节,值为255。 * * * 链表节点存储结构 * ZIPLIST ENTRIES: * Every entry in the ziplist is prefixed by a header that contains two pieces * of information. First, the length of the previous entry is stored to be * able to traverse the list from back to front. Second, the encoding with an * optional string length of the entry itself is stored. * * ziplist中每个节点的头部包含两部分的信息,第一个是上一个节点占用的长度,这样就可以从后往前遍历整个 * 列表。第二个是编码类型和当前链表节点占用的长度。 * * 也就是说ziplist的节点结构为: * <上一个链表结点占用的长度><编码方式 & 当前链表结点占用的长度><当前结点数据> * * * The length of the previous entry is encoded in the following way: * If this length is smaller than 254 bytes, it will only consume a single * byte that takes the length as value. When the length is greater than or * equal to 254, it will consume 5 bytes. The first byte is set to 254 to * indicate a larger value is following. The remaining 4 bytes take the * length of the previous entry as value. * * 上一个节点占用的长度按照如下的方式组织: * (1)、当长度值小于254时使用1个字节存储,该字节存储的数值就是上一个节点的长度值。 * (2)、当长度值大于或等于254时使用5个字节存储,第1个字节的数值为254,表示上一个节点的长度值大于等于254 * 接下来的4个字节才是真正的长度 * * The other header field of the entry itself depends on the contents of the * entry. When the entry is a string, the first 2 bits of this header will hold * the type of encoding used to store the length of the string, followed by the * actual length of the string. When the entry is an integer the first 2 bits * are both set to 1. The following 2 bits are used to specify what kind of * integer will be stored after this header. An overview of the different * types and encodings is as follows: * * 链表节点头部的第二部分内容取决于链表本身存储的内容。 * 当节点存储的是一个字符串,该部分的前2位有00、01、10共3中不同的类型 * 当节点存储的是一个整数,该部分的前2为都被设置为1(即为11),接下来的2位代表实际存储的是什么类型的整型数值 * * * |00pppppp| - 1 byte * String value with length less than or equal to 63 bytes (6 bits). * 长度小于等于63(只有后六位存放字符串长度,2^6 - 1 = 63)字节的字符串,后6位用于存储字符串长度。 * |01pppppp|qqqqqqqq| - 2 bytes * String value with length less than or equal to 16383 bytes (14 bits). * 长度小于等于16383(2^14 - 1)字节的字符串,后14用于存储字符串长度 * |10______|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt| - 5 bytes * String value with length greater than or equal to 16384 bytes. * 长度大于等于16384字节的字符串,前1个字节的后6位无意义,后4个字节用来存储字符串长度 * |11000000| - 1 byte * Integer encoded as int16_t (2 bytes). * int16_t整型 * |11010000| - 1 byte * Integer encoded as int32_t (4 bytes). * int32_t整型 * |11100000| - 1 byte * Integer encoded as int64_t (8 bytes). * int64_t整型 * |11110000| - 1 byte * Integer encoded as 24 bit signed (3 bytes). * 24bit有符号整数 * |11111110| - 1 byte * Integer encoded as 8 bit signed (1 byte). * 8bit有符号整型 * |1111xxxx| - (with xxxx between 0000 and 1101) immediate 4 bit integer. * Unsigned integer from 0 to 12. The encoded value is actually from * 1 to 13 because 0000 and 1111 can not be used, so 1 should be * subtracted from the encoded 4 bit value to obtain the right value. * 4bit无符号整数,表示从[0,12]范围的数 * |11111111| - End of ziplist. * * All the integers are represented in little endian byte order. * 所有整数用小端模式表示 * * ---------------------------------------------------------------------------- * * Copyright (c) 2009-2012, Pieter Noordhuis <pcnoordhuis at gmail dot com> * Copyright (c) 2009-2012, Salvatore Sanfilippo <antirez at gmail dot com> * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: * * * Redistributions of source code must retain the above copyright notice, * this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * Neither the name of Redis nor the names of its contributors may be used * to endorse or promote products derived from this software without * specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <limits.h>
#include "zmalloc.h"
#include "util.h"
#include "ziplist.h"
#include "endianconv.h"
#include "redisassert.h"
#define ZIP_END 255 // ziplist结束标识
#define ZIP_BIGLEN 254 // ziplist节点头部第一部分需要使用到的开始标识,当上一个节点长度值大于或等于254时使用5个字节存储
/* Different encoding/length possibilities */
/* 所有编码方式汇总 */
#define ZIP_STR_MASK 0xc0 // 字符串编码 < 0xc0 (1100,0000)
#define ZIP_INT_MASK 0x30 // |11000000]
#define ZIP_STR_06B (0 << 6) // |00pppppp|
#define ZIP_STR_14B (1 << 6) // |01pppppp|qqqqqqqq|
#define ZIP_STR_32B (2 << 6) // |10______|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt|
#define ZIP_INT_16B (0xc0 | 0<<4) // |11000000| - int16_t整形类型 (2 bytes).
#define ZIP_INT_32B (0xc0 | 1<<4) // |11010000| - int32_t整形类型 (4 bytes).
#define ZIP_INT_64B (0xc0 | 2<<4) // |11100000| - int64_t整形类型t (8 bytes).
#define ZIP_INT_24B (0xc0 | 3<<4) // |11110000| - 24bit有符号整数 (3 bytes).
#define ZIP_INT_8B 0xfe // |11111110| - 8bit有符号整形 (1 byte).
/* 4 bit integer immediate encoding */
#define ZIP_INT_IMM_MASK 0x0f
#define ZIP_INT_IMM_MIN 0xf1 /* 11110001 */
#define ZIP_INT_IMM_MAX 0xfd /* 11111101 */
#define ZIP_INT_IMM_VAL(v) (v & ZIP_INT_IMM_MASK)
#define INT24_MAX 0x7fffff
#define INT24_MIN (-INT24_MAX - 1)
/* Macro to determine type */
/* 判断是否是字符串编码 */
#define ZIP_IS_STR(enc) (((enc) & ZIP_STR_MASK) < ZIP_STR_MASK)
/* Utility macros */
/* 下面是一些可以用于直接定位的工具宏 */
/* 访问ziplist的zlbytes字段 */
#define ZIPLIST_BYTES(zl) (*((uint32_t*)(zl)))
/* 访问ziplist的zltail字段*/
#define ZIPLIST_TAIL_OFFSET(zl) (*((uint32_t*)((zl)+sizeof(uint32_t))))
/* 获取ziplist的zllen字段 */
#define ZIPLIST_LENGTH(zl) (*((uint16_t*)((zl)+sizeof(uint32_t)*2)))
/* ziplist头部长度: 4字节的zlbytes + 4字节的zltail + 2字节的zllen */
#define ZIPLIST_HEADER_SIZE (sizeof(uint32_t)*2+sizeof(uint16_t))
/* 获取ziplist的第一个节点 */
#define ZIPLIST_ENTRY_HEAD(zl) ((zl)+ZIPLIST_HEADER_SIZE)
/* 获取ziplist的最后一个节点 */
#define ZIPLIST_ENTRY_TAIL(zl) ((zl)+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl)))
/* 获取ziplist的结尾符 */
#define ZIPLIST_ENTRY_END(zl) ((zl)+intrev32ifbe(ZIPLIST_BYTES(zl))-1)
/* We know a positive increment can only be 1 because entries can only be * pushed one at a time. */
/* 更新ziplist长度,如果incr为正数则只能取1,因为每次只能push一个节点 */
#define ZIPLIST_INCR_LENGTH(zl,incr) { \
if (ZIPLIST_LENGTH(zl) < UINT16_MAX) \
ZIPLIST_LENGTH(zl) = intrev16ifbe(intrev16ifbe(ZIPLIST_LENGTH(zl))+incr); \
}
/* 压缩链表结构体 */
typedef struct zlentry {
// prevrawlen为上一个链表节点占用的长度
// prevrawlensize为存储上一个链表节点的长度数值所需要的字节数
unsigned int prevrawlensize, prevrawlen;
// len为当前链表节点占用的长度
// lensize为存储当前链表节点长度数值所需要的字节数
unsigned int lensize, len;
// 当前链表节点的头部大小(prevrawlensize + lensize),即非数据域的大小
unsigned int headersize;
// 编码方式
unsigned char encoding;
// 压缩链表以字符串的形式保存,该指针指向当前节点起始位置
unsigned char *p;
} zlentry;
/* Extract the encoding from the byte pointed by 'ptr' and set it into * 'encoding'. */
/* 从ptr指向的字符串中提取出编码方式,可以以此判断是否为字符串编码 */
#define ZIP_ENTRY_ENCODING(ptr, encoding) do { \
(encoding) = (ptr[0]); \
if ((encoding) < ZIP_STR_MASK) (encoding) &= ZIP_STR_MASK; \
} while(0)
/* Return bytes needed to store integer encoded by 'encoding' */
/* 返回指定整型编码方式所占用的字节长度 */
static unsigned int zipIntSize(unsigned char encoding) {
switch(encoding) {
case ZIP_INT_8B: return 1; // 1byte
case ZIP_INT_16B: return 2; // 2bytes
case ZIP_INT_24B: return 3; // 3bytes
case ZIP_INT_32B: return 4; // 4bytes
case ZIP_INT_64B: return 8; // 8bytes
default: return 0; /* 4 bit immediate */
}
assert(NULL);
return 0;
}
/* Encode the length 'rawlen' writing it in 'p'. If p is NULL it just returns * the amount of bytes required to encode such a length. */
/* 将编码方式encoding和数据长度rawlen进行编码并写入p指向的缓冲区中,返回保存该编码所占用的字节数 */
static unsigned int zipEncodeLength(unsigned char *p, unsigned char encoding, unsigned int rawlen) {
unsigned char len = 1, buf[5];
if (ZIP_IS_STR(encoding)) {
// 处理字符串编码
/* Although encoding is given it may not be set for strings, * so we determine it here using the raw length. */
if (rawlen <= 0x3f) {
// 字符串长度小于等于63,为00编码类型
if (!p) return len;
buf[0] = ZIP_STR_06B | rawlen;
} else if (rawlen <= 0x3fff) {
// 字符串长度小于等于16383(2^14 - 1),为01编码类型
len += 1;
if (!p) return len;
buf[0] = ZIP_STR_14B | ((rawlen >> 8) & 0x3f);
buf[1] = rawlen & 0xff;
} else {
// 长度大于等于16384,为10编码类型
len += 4;
if (!p) return len;
buf[0] = ZIP_STR_32B;
buf[1] = (rawlen >> 24) & 0xff;
buf[2] = (rawlen >> 16) & 0xff;
buf[3] = (rawlen >> 8) & 0xff;
buf[4] = rawlen & 0xff;
}
} else {
// 处理整形编码
/* Implies integer encoding, so length is always 1. */
if (!p) return len;
buf[0] = encoding;
}
/* Store this length at p */
memcpy(p,buf,len);
return len;
}
/* Decode the length encoded in 'ptr'. The 'encoding' variable will hold the * entries encoding, the 'lensize' variable will hold the number of bytes * required to encode the entries length, and the 'len' variable will hold the * entries length. */
/* 从ptr指向的字符串中取出链表节点的编码、保存节点长度所需要的字节数、节点长度,并分别保存在 * encoding、lensize、len3个变量中。 */
#define ZIP_DECODE_LENGTH(ptr, encoding, lensize, len) do { \
ZIP_ENTRY_ENCODING((ptr), (encoding)); \
if ((encoding) < ZIP_STR_MASK) {
// 处理三种不同的字符串编码 \
if ((encoding) == ZIP_STR_06B) {
// 00编码方式,占用1个字节,该字节的后6位为字符串长度 \
(lensize) = 1; \
(len) = (ptr)[0] & 0x3f; // 通过bit操作获取字符串长度 \
} else if ((encoding) == ZIP_STR_14B) {
// 01编码方式,占用2个字节 \
(lensize) = 2;
/* 这里详细解释一下怎么通过位操作取得字符串长度(下面的操作类似,不再解释) 01编码方式,占用2个字节,第一个字节的后6bit和后一个字节的8bit表示字符串 长度。(ptr)[0] & 0x3f操作获取第一个字节的后6bit,然后左移8位再与第二个 字节相或就是字符串的长度 */ \
(len) = (((ptr)[0] & 0x3f) << 8) | (ptr)[1]; \
} else if (encoding == ZIP_STR_32B) {
// 10编码方式,占用5个字节 \
(lensize) = 5; \
(len) = ((ptr)[1] << 24) | \
((ptr)[2] << 16) | \
((ptr)[3] << 8) | \
((ptr)[4]); \
} else { \
assert(NULL); \
} \
} else {
// 处理整形编码 \
(lensize) = 1; \
(len) = zipIntSize(encoding); \
} \
} while(0);
/* Encode the length of the previous entry and write it to "p". Return the * number of bytes needed to encode this length if "p" is NULL. */
/* 将链表中上一个节点的长度值编码放入p指针指向的缓冲区中,返回编码后所占有的字节数 */
static unsigned int zipPrevEncodeLength(unsigned char *p, unsigned int len) {
if (p == NULL) {
return (len < ZIP_BIGLEN) ? 1 : sizeof(len)+1;
} else {
if (len < ZIP_BIGLEN) {
// 如果上一个节点的长度值小于254,只需要用一个字节表示即可
p[0] = len;
return 1;
} else {
// 当长度值大于或等于254时使用5个字节存储,第1个字节的数值为254,
// 表示上一个节点的长度值大于等于254,接下来的4个字节才是真正的长度
p[0] = ZIP_BIGLEN;
memcpy(p+1,&len,sizeof(len));
memrev32ifbe(p+1);
return 1+sizeof(len);
}
}
}
/* Encode the length of the previous entry and write it to "p". This only * uses the larger encoding (required in __ziplistCascadeUpdate). */
/* 该函数与zipPrevEncodeLength功能相似,只不过它只处理len >= 254的情形 */
static void zipPrevEncodeLengthForceLarge(unsigned char *p, unsigned int len) {
if (p == NULL) return;
p[0] = ZIP_BIGLEN;
memcpy(p+1,&len,sizeof(len));
memrev32ifbe(p+1);
}
/* Decode the number of bytes required to store the length of the previous * element, from the perspective of the entry pointed to by 'ptr'. */
/* 获取存储上一个节点长度值所占用的字节数。这个很简单,只要看每个节点第一个字节的数据即可。 具体来说如果第一字节数值小于254,则只需要1个字节即可存储上一个节点的长度,否则需要5个 字节(第1个字节的数值为254,接下来的4个字节才是真正的长度) */
#define ZIP_DECODE_PREVLENSIZE(ptr, prevlensize) do { \
if ((ptr)[0] < ZIP_BIGLEN) { \
(prevlensize) = 1; \
} else { \
(prevlensize) = 5; \
} \
} while(0);
/* Decode the length of the previous element, from the perspective of the entry * pointed to by 'ptr'. */
/* 获取上一个节点的长度 */
#define ZIP_DECODE_PREVLEN(ptr, prevlensize, prevlen) do { \
// 使用ZIP_DECODE_PREVLENSIZE获取保存上一个节点长度值所需要的字节数
ZIP_DECODE_PREVLENSIZE(ptr, prevlensize); \
if ((prevlensize) == 1) { \
// 上一个节点的长度值小于254,第一个字节的内容就是长度值
(prevlen) = (ptr)[0]; \
} else if ((prevlensize) == 5) { \
assert(sizeof((prevlensize)) == 4); \
// 上一个节点的长度值大于等于254,第2-5个字节为长度值
memcpy(&(prevlen), ((char*)(ptr)) + 1, 4); \
memrev32ifbe(&prevlen); \
} \
} while(0);
/* Return the difference in number of bytes needed to store the length of the * previous element 'len', in the entry pointed to by 'p'. */
/* 计算存储数值len所占用的字节数与当前节点头部保存上一个节点长度所占用字节数的差值 */
static int zipPrevLenByteDiff(unsigned char *p, unsigned int len) {
unsigned int prevlensize;
// 获取保存上一个节点长度值所需要的字节数,并保存在prevlensize中
ZIP_DECODE_PREVLENSIZE(p, prevlensize);
// zipPrevEncodeLength(NULL, len)为存储len占用的字节数
return zipPrevEncodeLength(NULL, len) - prevlensize;
}
/* Return the total number of bytes used by the entry pointed to by 'p'. */
/* 计算一个节点所占用的总字节数 */
static unsigned int zipRawEntryLength(unsigned char *p) {
unsigned int prevlensize, encoding, lensize, len;
// 求出存储上一个节点长度值所占用的字节数
ZIP_DECODE_PREVLENSIZE(p, prevlensize);
// 求出存储当前节点长度所占用的字节数和当前节点数据域的字节数
ZIP_DECODE_LENGTH(p + prevlensize, encoding, lensize, len);
return prevlensize + lensize + len;
}
/* Check if string pointed to by 'entry' can be encoded as an integer. * Stores the integer value in 'v' and its encoding in 'encoding'. */
/* 判断entry指向的内容是否可以编码为一个整型数据,并把该数值存放在v中,把其编码方式存放在encoding中 */
static int zipTryEncoding(unsigned char *entry, unsigned int entrylen, long long *v, unsigned char *encoding) {
long long value;
if (entrylen >= 32 || entrylen == 0) return 0;
/* string2ll定义在util.h中,它的作用是将一个字符串转换为一个long long类型整数值。如果成功返回1,失败返回0 */
if (string2ll((char*)entry,entrylen,&value)) {
/* Great, the string can be encoded. Check what's the smallest * of our encoding types that can hold this value. */
// 下面的操作根据整型值确定编码方式
if (value >= 0 && value <= 12) {
*encoding = ZIP_INT_IMM_MIN+value;
} else if (value >= INT8_MIN && value <= INT8_MAX) {
*encoding = ZIP_INT_8B;
} else if (value >= INT16_MIN && value <= INT16_MAX) {
*encoding = ZIP_INT_16B;
} else if (value >= INT24_MIN && value <= INT24_MAX) {
*encoding = ZIP_INT_24B;
} else if (value >= INT32_MIN && value <= INT32_MAX) {
*encoding = ZIP_INT_32B;
} else {
*encoding = ZIP_INT_64B;
}
*v = value;
return 1;
}
return 0;
}
/* Store integer 'value' at 'p', encoded as 'encoding' */
/* 将给定整数存入指针p指向的缓冲区中,不同大小范围的整数采取不同长度存储的方式来极大减小了小数的空间使用,这里的encoding就确定了整数范围。 */
static void zipSaveInteger(unsigned char *p, int64_t value, unsigned char encoding) {
int16_t i16;
int32_t i32;
int64_t i64;
/* 下面memrev16ifbe、memrev32ifbe、memrev64ifbe函数定义在endianconv.h中,将大端模式的数转化为小端模式 */
if (encoding == ZIP_INT_8B) {
((int8_t*)p)[0] = (int8_t)value;
} else if (encoding == ZIP_INT_16B) {
i16 = value;
memcpy(p,&i16,sizeof(i16));
memrev16ifbe(p);
} else if (encoding == ZIP_INT_24B) {
i32 = value<<8;
memrev32ifbe(&i32);
memcpy(p,((uint8_t*)&i32)+1,sizeof(i32)-sizeof(uint8_t));
} else if (encoding == ZIP_INT_32B) {
i32 = value;
memcpy(p,&i32,sizeof(i32));
memrev32ifbe(p);
} else if (encoding == ZIP_INT_64B) {
i64 = value;
memcpy(p,&i64,sizeof(i64));
memrev64ifbe(p);
} else if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) {
/* Nothing to do, the value is stored in the encoding itself. */
} else {
assert(NULL);
}
}
/* Read integer encoded as 'encoding' from 'p' */
/* 按encoding指定的方式从p中读取一个整型数值 */
static int64_t zipLoadInteger(unsigned char *p, unsigned char encoding) {
int16_t i16;
int32_t i32;
int64_t i64, ret = 0;
if (encoding == ZIP_INT_8B) {
ret = ((int8_t*)p)[0];
} else if (encoding == ZIP_INT_16B) {
memcpy(&i16,p,sizeof(i16));
memrev16ifbe(&i16);
ret = i16;
} else if (encoding == ZIP_INT_32B) {
memcpy(&i32,p,sizeof(i32));
memrev32ifbe(&i32);
ret = i32;
} else if (encoding == ZIP_INT_24B) {
i32 = 0;
memcpy(((uint8_t*)&i32)+1,p,sizeof(i32)-sizeof(uint8_t));
memrev32ifbe(&i32);
ret = i32>>8;
} else if (encoding == ZIP_INT_64B) {
memcpy(&i64,p,sizeof(i64));
memrev64ifbe(&i64);
ret = i64;
} else if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) {
ret = (encoding & ZIP_INT_IMM_MASK)-1;
} else {
assert(NULL);
}
return ret;
}
/* Return a struct with all information about an entry. */
/* 将p指向的内容解析为一个链表节点zlentry结构并返回 */
static zlentry zipEntry(unsigned char *p) {
zlentry e;
// 求出上一个节点的长度prevrawlen和存储该数值所占用的字节数prevrawlensize
ZIP_DECODE_PREVLEN(p, e.prevrawlensize, e.prevrawlen);
// 求出当前节点的长度len和存储该长度值所占用的字节数lensize,以及编码方式encoding
ZIP_DECODE_LENGTH(p + e.prevrawlensize, e.encoding, e.lensize, e.len);
e.headersize = e.prevrawlensize + e.lensize;
e.p = p;
return e;
}
/* Create a new empty ziplist. */
/* 创建一个空的ziplist */
unsigned char *ziplistNew(void) {
// ZIPLIST_HEADER_SIZE为ziplist的头部长度,+1指加上一个字节的结尾符
unsigned int bytes = ZIPLIST_HEADER_SIZE+1;
unsigned char *zl = zmalloc(bytes);
// 设置ziplist头部的各个属性:zlbytes、zltail、zllen、结尾符 */
ZIPLIST_BYTES(zl) = intrev32ifbe(bytes);
ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(ZIPLIST_HEADER_SIZE);
ZIPLIST_LENGTH(zl) = 0;
zl[bytes-1] = ZIP_END;
return zl;
}
/* Resize the ziplist. */
/* 重新调整ziplist的大小 */
static unsigned char *ziplistResize(unsigned char *zl, unsigned int len) {
zl = zrealloc(zl,len);
// 更新zlbytes值
ZIPLIST_BYTES(zl) = intrev32ifbe(len);
zl[len-1] = ZIP_END;
return zl;
}
/* When an entry is inserted, we need to set the prevlen field of the next * entry to equal the length of the inserted entry. It can occur that this * length cannot be encoded in 1 byte and the next entry needs to be grow * a bit larger to hold the 5-byte encoded prevlen. This can be done for free, * because this only happens when an entry is already being inserted (which * causes a realloc and memmove). However, encoding the prevlen may require * that this entry is grown as well. This effect may cascade throughout * the ziplist when there are consecutive entries with a size close to * ZIP_BIGLEN, so we need to check that the prevlen can be encoded in every * consecutive entry. * * Note that this effect can also happen in reverse, where the bytes required * to encode the prevlen field can shrink. This effect is deliberately ignored, * because it can cause a "flapping" effect where a chain prevlen fields is * first grown and then shrunk again after consecutive inserts. Rather, the * field is allowed to stay larger than necessary, because a large prevlen * field implies the ziplist is holding large entries anyway. * * The pointer "p" points to the first entry that does NOT need to be * updated, i.e. consecutive fields MAY need an update. */
/* 级联更新ziplist 当一个新的节点插入链表时,如果原节点的prevlen不足以保存新节点的长度,那么就需要对原节点得空间进行扩展, 也就是从1个字节扩展到5个字节。特别是这种扩展操作又可能导致下一个节点需要扩展......这种情况在多个连续节点 的长度都接近254(上一节点长度小于254只要1个字节保存即可)的时候很可能发生。 __ziplistCascadeUpdate就是用来处理这种级联扩展操作 另外,还可能出现相反的情况:因为插入节点的长度比较小而引起连续的缩小操作。但是,为了避免出现“扩展-缩小-扩展-缩小” 这种“抖动”情况反复出现,redis对这种因插入节点的长度较小而引起的缩小操作采取“不处理”的策略,也就是任由prevlen比 所需的长度长 该函数返回更新后的ziplist,参数p指向需要扩展prevlensize的节点首地址 */
static unsigned char *__ziplistCascadeUpdate(unsigned char *zl, unsigned char *p) {
size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), rawlen, rawlensize;
size_t offset, noffset, extra;
unsigned char *np;
zlentry cur, next;
// 从p指向的节点开始遍历到ziplist列表尾部
while (p[0] != ZIP_END) {
// 取得当前节点
cur = zipEntry(p);
// 当前节点的占用的字节数
rawlen = cur.headersize + cur.len;
// 存储rawlen所需要的字节数
rawlensize = zipPrevEncodeLength(NULL,rawlen);
/* Abort if there is no next entry. */
// 如果到达表尾,直接退出
if (p[rawlen] == ZIP_END) break;
// 获得下一个节点
next = zipEntry(p+rawlen);
/* Abort when "prevlen" has not changed. */
// 如果下一个节点的prevlen等于当前节点的rawlen,则此后的节点都无需调整,直接退出
if (next.prevrawlen == rawlen) break;
// 下一个节点的长度空间不足,需要进行扩展操作
if (next.prevrawlensize < rawlensize) {
/* The "prevlen" field of "next" needs more bytes to hold * the raw length of "cur". */
// 下面的ziplistResize发生了空间的重新分配,所以需要记录p对于zl的偏移量
offset = p-zl;
// 求出需要扩展的字节数
extra = rawlensize-next.prevrawlensize;
zl = ziplistResize(zl,curlen+extra);
// ziplistResize发生了空间的重新分配,这里重新获取p指针
p = zl+offset;
/* Current pointer and offset for next element. */
// 新的下一个节点的首地址
np = p+rawlen;
noffset = np-zl;
/* Update tail offset when next element is not the tail element. */
// zl+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))为最后一个节点的首地址
// 如果下一个节点不是最后一个节点,发生扩展操作需要更新最后一个节点的偏移量
if ((zl+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))) != np) {
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+extra);
}
/* Move the tail to the back. */
// 这里将原节点的下一个节点的数据区域到ziplist尾部的全部数据向后偏移,空余出rawlensize个字节
// 用来存放上一个节点的长度
memmove(np+rawlensize,
np+next.prevrawlensize,
curlen-noffset-next.prevrawlensize-1);
// 空余出来的rawlensize个字节用来存储上一个节点的长度
zipPrevEncodeLength(np,rawlen);
/* Advance the cursor */
// 指向下一个节点
p += rawlen;
// 更新当前节点的长度
curlen += extra;
} else {
// 如果下一节点的长度空间有冗余,则不进行压缩以防止“抖动”现象。
if (next.prevrawlensize > rawlensize) {
/* This would result in shrinking, which we want to avoid. * So, set "rawlen" in the available bytes. */
zipPrevEncodeLengthForceLarge(p+rawlen,rawlen);
} else {
zipPrevEncodeLength(p+rawlen,rawlen);
}
/* Stop here, as the raw length of "next" has not changed. */
break;
}
}
return zl;
}
/* Delete "num" entries, starting at "p". Returns pointer to the ziplist. */
/* 从p指针开始删除num个节点*/
static unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsigned int num) {
unsigned int i, totlen, deleted = 0;
size_t offset;
int nextdiff = 0;
zlentry first, tail;
// 需要删除的首个节点
first = zipEntry(p);
for (i = 0; p[0] != ZIP_END && i < num; i++) {
// 偏移到下一个节点
p += zipRawEntryLength(p);
// 统计待删除节点数量
deleted++;
}
// 得到待删除节点的字节总数
totlen = p-first.p;
if (totlen > 0) {
// 注意此时p指向的是待删除节点后第一个不被删除的节点
if (p[0] != ZIP_END) {
/* Storing `prevrawlen` in this entry may increase or decrease the * number of bytes required compare to the current `prevrawlen`. * There always is room to store this, because it was previously * stored by an entry that is now being deleted. */
// 计算待删除的第一个节点first的prevrawlensize与p节点prevrawlensize的差值
nextdiff = zipPrevLenByteDiff(p,first.prevrawlen);
// 根据nextdiff的值对p进行前移或后移操作,用来保存frist节点上一个节点的长度,即first.prevrawlen
p -= nextdiff;
// 删除后first节点前一个节点的下一个节点就是p节点,更新p节点的prevrawlen数值
zipPrevEncodeLength(p,first.prevrawlen);
/* Update offset for tail */
// 更新最后一个节点的偏移量
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))-totlen);
/* When the tail contains more than one entry, we need to take * "nextdiff" in account as well. Otherwise, a change in the * size of prevlen doesn't have an effect on the *tail* offset. */
tail = zipEntry(p);
// 如果p节点不是尾节点,尾节点的偏移量还需要加上nextdiff值
if (p[tail.headersize+tail.len] != ZIP_END) {
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
}
/* Move tail to the front of the ziplist */
// 删除first.p到p节点之间的节点,其实就是简单的数据移动操作
// 这里为什么要减1呢,因为zlend不需要处理,在后面的ziplistResize中重新设置了zlend
memmove(first.p,p,
intrev32ifbe(ZIPLIST_BYTES(zl))-(p-zl)-1);
} else {
/* The entire tail was deleted. No need to move memory. */
// 如果已经删除到zlend,值最后一个节点就是first节点的前一个节点,需要更新其偏移量
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe((first.p-zl)-first.prevrawlen);
}
/* Resize and update length */
// 重新调整ziplist大小,更新其长度
offset = first.p-zl;
zl = ziplistResize(zl, intrev32ifbe(ZIPLIST_BYTES(zl))-totlen+nextdiff);
ZIPLIST_INCR_LENGTH(zl,-deleted);
p = zl+offset;
/* When nextdiff != 0, the raw length of the next entry has changed, so * we need to cascade the update throughout the ziplist */
// 如果nextdiff的值不为0,说明p节点的长度发生改变,需要执行级联更新操作
if (nextdiff != 0)
zl = __ziplistCascadeUpdate(zl,p);
}
return zl;
}
/* Insert item at "p". */
/* 在p节点前插入一个新节点。各参数的含义如下: zl:ziplist首地址 p:插入位置 s:待插入字符串的首地址 slen:带插入字符串长度 */
static unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {
size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), reqlen;
unsigned int prevlensize, prevlen = 0;
size_t offset;
int nextdiff = 0;
unsigned char encoding = 0;
long long value = 123456789; /* initialized to avoid warning. Using a value that is easy to see if for some reason we use it uninitialized. */
zlentry tail;
/* Find out prevlen for the entry that is inserted. */
if (p[0] != ZIP_END) {
// 如果p节点后面还有节点,取出p节点前一个节点的长度信息和存储该长度值所需要的字节数信息
ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
} else {
// 如果p节点为ziplist结束标识,则取出尾节点,即最后一个节点
unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl);
if (ptail[0] != ZIP_END) {
prevlen = zipRawEntryLength(ptail);
}
}
/* See if the entry can be encoded */
// 尝试看能否将s保存为整数,如果可以则返回1,且value和encoding分别保存新值和编码信息
if (zipTryEncoding(s,slen,&value,&encoding)) {
/* 'encoding' is set to the appropriate integer encoding */
// 如果s可以保存为整数,则进一步计算保存该数值所需要的字节数
reqlen = zipIntSize(encoding);
} else {
/* 'encoding' is untouched, however zipEncodeLength will use the * string length to figure out how to encode it. */
// 如果s不能保存为整数,则直接使用其字符串长度
reqlen = slen;
}
/* We need space for both the length of the previous entry and * the length of the payload. */
// 计算编码prevlen所需要的字节数,prevlen用于保存前一个节点的长度
reqlen += zipPrevEncodeLength(NULL,prevlen);
// 计算编码slen所需要的长度
reqlen += zipEncodeLength(NULL,encoding,slen);
/* When the insert position is not equal to the tail, we need to * make sure that the next entry can hold this entry's length in * its prevlen field. */
// 当插入的位置不是ziplist尾部时,需要确保下一个节点(即p节点)的prevlen能够用来保存即将插入节点的长度
// 这里计算两者差值
nextdiff = (p[0] != ZIP_END) ? zipPrevLenByteDiff(p,reqlen) : 0;
/* Store offset because a realloc may change the address of zl. */
// ziplistResize操作会重新分配空间,需要事前记录p节点偏移量
offset = p-zl;
zl = ziplistResize(zl,curlen+reqlen+nextdiff);
// 重新取得p节点
p = zl+offset;
/* Apply memory move when necessary and update tail offset. */
if (p[0] != ZIP_END) {
/* Subtract one because of the ZIP_END bytes */
/* 将原来 p-nextdiff 开始的数据全部后移,中间出现reqlen个字节保存即将插入的数据 主要需要考虑一下几种情况: nextdiff == 0:p节点中用来存储原先前一个节点长度信息的数据区域正好保存待插入节点的长度 nextdiff == 4:原先p节点只需要1个字节来存储上一个节点的长度,现在需要5个字节。那就将p-4后面的数据偏移到p+reqlen nextdiff == -4:原先p节点需要5个字节来存储上一个节点的长度,现在只需要1个字节。那就将p+4后面的数据偏移到p+reqlen */
memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff);
/* Encode this entry's raw length in the next entry. */
// 为p节点的prevlen设置新值,即待插入节点的长度
zipPrevEncodeLength(p+reqlen,reqlen);
/* Update offset for tail */
// 更新尾节点偏移量
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+reqlen);
/* When the tail contains more than one entry, we need to take * "nextdiff" in account as well. Otherwise, a change in the * size of prevlen doesn't have an effect on the *tail* offset. */
tail = zipEntry(p+reqlen);
// 同样,如果p节点不是尾节点,尾节点的偏移量还需要加上nextdiff值
if (p[reqlen+tail.headersize+tail.len] != ZIP_END) {
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
}
} else {
/* This element will be the new tail. */
// 如果p节点指向zlend,更新zltail值,待添加节点为尾部节点
ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(p-zl);
}
/* When nextdiff != 0, the raw length of the next entry has changed, so * we need to cascade the update throughout the ziplist */
// 同样,如果nextdiff的值不为0,说明原节点节点(此时的首地址为p+reqlen)的长度发生改变,需要执行级联更新操作
if (nextdiff != 0) {
offset = p-zl;
zl = __ziplistCascadeUpdate(zl,p+reqlen);
p = zl+offset;
}
// 下面才是真正执行插入操作
/* Write the entry */
// 填写上一节点的长度
p += zipPrevEncodeLength(p,prevlen);
// 填写当前节点的长度
p += zipEncodeLength(p,encoding,slen);
// 根据编码方式执行相应的插入操作
if (ZIP_IS_STR(encoding)) {
memcpy(p,s,slen);
} else {
zipSaveInteger(p,value,encoding);
}
// 长度加1
ZIPLIST_INCR_LENGTH(zl,1);
return zl;
}
/* 往ziplist的头部或尾部插入一个节点,底层通过__ziplistInsert实现,有了上面的分析,该函数的操作就很简单 */
unsigned char *ziplistPush(unsigned char *zl, unsigned char *s, unsigned int slen, int where) {
unsigned char *p;
p = (where == ZIPLIST_HEAD) ? ZIPLIST_ENTRY_HEAD(zl) : ZIPLIST_ENTRY_END(zl);
return __ziplistInsert(zl,p,s,slen);
}
/* Returns an offset to use for iterating with ziplistNext. When the given * index is negative, the list is traversed back to front. When the list * doesn't contain an element at the provided index, NULL is returned. */
/* 根据索引了获取ziplist节点,支持从前往后的正向索引值和从后往前的反向索引值。 如果index处有节点,则返回指向该节点的指针,否则返回NULL */
unsigned char *ziplistIndex(unsigned char *zl, int index) {
unsigned char *p;
unsigned int prevlensize, prevlen = 0;
if (index < 0) {
// 如果index为负数,则从后往前算,第一个节点的索引值为-1
index = (-index)-1;
// 获取ziplist的最后一个节点
p = ZIPLIST_ENTRY_TAIL(zl);
if (p[0] != ZIP_END) {
// 下面从后往前查找目标节点
ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
while (prevlen > 0 && index--) {
// 找到前一个节点
p -= prevlen;
ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
}
}
} else {
// 如果index为正数,则从前往后算,第一个节点的索引值为0
p = ZIPLIST_ENTRY_HEAD(zl);
// 从前往后查找目标节点
while (p[0] != ZIP_END && index--) {
p += zipRawEntryLength(p);
}
}
return (p[0] == ZIP_END || index > 0) ? NULL : p;
}
/* Return pointer to next entry in ziplist. * * zl is the pointer to the ziplist * p is the pointer to the current element * * The element after 'p' is returned, otherwise NULL if we are at the end. */
/* 获取ziplist中p节点的下一个节点,其实就是指针操作 */
unsigned char *ziplistNext(unsigned char *zl, unsigned char *p) {
((void) zl);
/* "p" could be equal to ZIP_END, caused by ziplistDelete, * and we should return NULL. Otherwise, we should return NULL * when the *next* element is ZIP_END (there is no next entry). */
if (p[0] == ZIP_END) {
return NULL;
}
p += zipRawEntryLength(p);
// 注意这里需要两次判断p[0] == ZIP_END
if (p[0] == ZIP_END) {
return NULL;
}
return p;
}
/* Return pointer to previous entry in ziplist. */
/* 获取ziplist中p节点的前一个节点 */
unsigned char *ziplistPrev(unsigned char *zl, unsigned char *p) {
unsigned int prevlensize, prevlen = 0;
/* Iterating backwards from ZIP_END should return the tail. When "p" is * equal to the first element of the list, we're already at the head, * and should return NULL. */
if (p[0] == ZIP_END) {
// 如果指针p指向ziplist的结束符,则前一个节点就是尾节点,通过ZIPLIST_ENTRY_TAIL直接获取
p = ZIPLIST_ENTRY_TAIL(zl);
return (p[0] == ZIP_END) ? NULL : p;
} else if (p == ZIPLIST_ENTRY_HEAD(zl)) {
// 如果当前节点是ziplist第一个节点,则前一个节点为NULL
return NULL;
} else {
// 获取前一个节点的长度
ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
assert(prevlen > 0);
// 找到前一个节点的首地址
return p-prevlen;
}
}
/* Get entry pointed to by 'p' and store in either '*sstr' or 'sval' depending * on the encoding of the entry. '*sstr' is always set to NULL to be able * to find out whether the string pointer or the integer value was set. * Return 0 if 'p' points to the end of the ziplist, 1 otherwise. */
/* 获取p指针指向的当前节点的值,如果p指向的节点是合法节点返回1,否则返回0。 另外节点的值可能是整型数值或字符串,如果是整形则保存在sval中,如果是字符串则保存在sstr中 */
unsigned int ziplistGet(unsigned char *p, unsigned char **sstr, unsigned int *slen, long long *sval) {
zlentry entry;
// 判断p指向节点是否合法
if (p == NULL || p[0] == ZIP_END) return 0;
if (sstr) *sstr = NULL;
// 取得当前节点
entry = zipEntry(p);
if (ZIP_IS_STR(entry.encoding)) {
// 如果当前节点是字符串编码,赋值给sstr
if (sstr) {
*slen = entry.len;
*sstr = p+entry.headersize;
}
} else {
// 当前节点是整型编码,赋值给sval
if (sval) {
*sval = zipLoadInteger(p+entry.headersize,entry.encoding);
}
}
return 1;
}
/* Insert an entry at "p". */
/* 在p指针指向的位置插入一个节点,底层通过__ziplistInsert实现 */
unsigned char *ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {
return __ziplistInsert(zl,p,s,slen);
}
/* Delete a single entry from the ziplist, pointed to by *p. * Also update *p in place, to be able to iterate over the * ziplist, while deleting entries. */
/* 删除p指针指向的节点,操作成功后p指向被删除节点下一个节点 */
unsigned char *ziplistDelete(unsigned char *zl, unsigned char **p) {
// 记录p指针的偏移量
size_t offset = *p-zl;
zl = __ziplistDelete(zl,*p,1);
/* Store pointer to current element in p, because ziplistDelete will * do a realloc which might result in a different "zl"-pointer. * When the delete direction is back to front, we might delete the last * entry and end up with "p" pointing to ZIP_END, so check this. */
// 重新获得p指针,此时p指向被删除节点的下一个节点
*p = zl+offset;
return zl;
}
/* Delete a range of entries from the ziplist. */
/* 删除连续的一批节点,其中index是第一个删除的节点索引,num是删除的节点个数。 该函数其实就是ziplistIndex和__ziplistDelete的封装 */
unsigned char *ziplistDeleteRange(unsigned char *zl, unsigned int index, unsigned int num) {
// 找到索引为index的节点
unsigned char *p = ziplistIndex(zl,index);
// 调用__ziplistDelete进行删除
return (p == NULL) ? zl : __ziplistDelete(zl,p,num);
}
/* Compare entry pointer to by 'p' with 'sstr' of length 'slen'. */
/* Return 1 if equal. */
/* 将p指针指向的节点的值与sstr对应的值作比较,如果两则相等返回1,否则返回0 */
unsigned int ziplistCompare(unsigned char *p, unsigned char *sstr, unsigned int slen) {
zlentry entry;
unsigned char sencoding;
long long zval, sval;
if (p[0] == ZIP_END) return 0;
// 获得p指针指向的当前节点,根据该节点的编码方式进行字符串 or 整型的比较
entry = zipEntry(p);
if (ZIP_IS_STR(entry.encoding)) {
// 对于字符串只需要进行简单的字符串比较即可
/* Raw compare */
if (entry.len == slen) {
return memcmp(p+entry.headersize,sstr,slen) == 0;
} else {
return 0;
}
} else {
/* Try to compare encoded values. Don't compare encoding because * different implementations may encoded integers differently. */
// 如果是整型编码,先尝试按照p节点的编码方式对sstr进行编码,然后与p节点保存的整型数值进行比较
if (zipTryEncoding(sstr,slen,&sval,&sencoding)) {
zval = zipLoadInteger(p+entry.headersize,entry.encoding);
return zval == sval;
}
}
return 0;
}
/* Find pointer to the entry equal to the specified entry. Skip 'skip' entries * between every comparison. Returns NULL when the field could not be found. */
/* 在ziplist查找包含给定数据的节点,可以通过参数skip指定跳过的节点数 */
unsigned char *ziplistFind(unsigned char *p, unsigned char *vstr, unsigned int vlen, unsigned int skip) {
int skipcnt = 0;
unsigned char vencoding = 0;
long long vll = 0;
while (p[0] != ZIP_END) {
unsigned int prevlensize, encoding, lensize, len;
unsigned char *q;
ZIP_DECODE_PREVLENSIZE(p, prevlensize);
ZIP_DECODE_LENGTH(p + prevlensize, encoding, lensize, len);
q = p + prevlensize + lensize;
if (skipcnt == 0) {
/* Compare current entry with specified entry */
if (ZIP_IS_STR(encoding)) {
if (len == vlen && memcmp(q, vstr, vlen) == 0) {
return p;
}
} else {
/* Find out if the searched field can be encoded. Note that * we do it only the first time, once done vencoding is set * to non-zero and vll is set to the integer value. */
if (vencoding == 0) {
if (!zipTryEncoding(vstr, vlen, &vll, &vencoding)) {
/* If the entry can't be encoded we set it to * UCHAR_MAX so that we don't retry again the next * time. */
vencoding = UCHAR_MAX;
}
/* Must be non-zero by now */
assert(vencoding);
}
/* Compare current entry with specified entry, do it only * if vencoding != UCHAR_MAX because if there is no encoding * possible for the field it can't be a valid integer. */
if (vencoding != UCHAR_MAX) {
long long ll = zipLoadInteger(q, encoding);
if (ll == vll) {
return p;
}
}
}
/* Reset skip count */
skipcnt = skip;
} else {
/* Skip entry */
skipcnt--;
}
/* Move to next entry */
p = q + len;
}
return NULL;
}
/* Return length of ziplist. */
/* 获取ziplist链表中元素的个数 */
unsigned int ziplistLen(unsigned char *zl) {
/* 前面我们说过,ziplist的头部字段zllen是一个2字节无符号整形,存储的是链表中的节点总数。 当这个值超过2^16-2时就需要遍历整个链表来获取链表的节点总数 */
unsigned int len = 0;
if (intrev16ifbe(ZIPLIST_LENGTH(zl)) < UINT16_MAX) {
// 链表长度小于2^16-2,直接读取zllen字段
len = intrev16ifbe(ZIPLIST_LENGTH(zl));
} else {
// 链表长度大于2^16-2,需要遍历整个链表
unsigned char *p = zl+ZIPLIST_HEADER_SIZE;
while (*p != ZIP_END) {
p += zipRawEntryLength(p);
len++;
}
/* Re-store length if small enough */
if (len < UINT16_MAX) ZIPLIST_LENGTH(zl) = intrev16ifbe(len);
}
return len;
}
/* Return ziplist blob size in bytes. */
/* 获取整个ziplist占用的字节数,这个信息保存在ziplist头部,直接获取即可 */
size_t ziplistBlobLen(unsigned char *zl) {
return intrev32ifbe(ZIPLIST_BYTES(zl));
}
/* 格式化输出,打印出整个ziplist的信息 */
void ziplistRepr(unsigned char *zl) {
unsigned char *p;
int index = 0;
zlentry entry;
printf(
"{total bytes %d} "
"{length %u}\n"
"{tail offset %u}\n",
intrev32ifbe(ZIPLIST_BYTES(zl)),
intrev16ifbe(ZIPLIST_LENGTH(zl)),
intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl)));
p = ZIPLIST_ENTRY_HEAD(zl);
while(*p != ZIP_END) {
entry = zipEntry(p);
printf(
"{"
"addr 0x%08lx, "
"index %2d, "
"offset %5ld, "
"rl: %5u, "
"hs %2u, "
"pl: %5u, "
"pls: %2u, "
"payload %5u"
"} ",
(long unsigned)p,
index,
(unsigned long) (p-zl),
entry.headersize+entry.len,
entry.headersize,
entry.prevrawlen,
entry.prevrawlensize,
entry.len);
p += entry.headersize;
if (ZIP_IS_STR(entry.encoding)) {
if (entry.len > 40) {
if (fwrite(p,40,1,stdout) == 0) perror("fwrite");
printf("...");
} else {
if (entry.len &&
fwrite(p,entry.len,1,stdout) == 0) perror("fwrite");
}
} else {
printf("%lld", (long long) zipLoadInteger(p,entry.encoding));
}
printf("\n");
p += entry.len;
index++;
}
printf("{end}\n\n");
}
/* 下面是一些测试代码 */
#ifdef ZIPLIST_TEST_MAIN
#include <sys/time.h>
#include "adlist.h"
#include "sds.h"
#define debug(f, ...) { if (DEBUG) printf(f, __VA_ARGS__); }
unsigned char *createList() {
unsigned char *zl = ziplistNew();
zl = ziplistPush(zl, (unsigned char*)"foo", 3, ZIPLIST_TAIL);
zl = ziplistPush(zl, (unsigned char*)"quux", 4, ZIPLIST_TAIL);
zl = ziplistPush(zl, (unsigned char*)"hello", 5, ZIPLIST_HEAD);
zl = ziplistPush(zl, (unsigned char*)"1024", 4, ZIPLIST_TAIL);
return zl;
}
unsigned char *createIntList() {
unsigned char *zl = ziplistNew();
char buf[32];
sprintf(buf, "100");
zl = ziplistPush(zl, (unsigned char*)buf, strlen(buf), ZIPLIST_TAIL);
sprintf(buf, "128000");
zl = ziplistPush(zl, (unsigned char*)buf, strlen(buf), ZIPLIST_TAIL);
sprintf(buf, "-100");
zl = ziplistPush(zl, (unsigned char*)buf, strlen(buf), ZIPLIST_HEAD);
sprintf(buf, "4294967296");
zl = ziplistPush(zl, (unsigned char*)buf, strlen(buf), ZIPLIST_HEAD);
sprintf(buf, "non integer");
zl = ziplistPush(zl, (unsigned char*)buf, strlen(buf), ZIPLIST_TAIL);
sprintf(buf, "much much longer non integer");
zl = ziplistPush(zl, (unsigned char*)buf, strlen(buf), ZIPLIST_TAIL);
return zl;
}
long long usec(void) {
struct timeval tv;
gettimeofday(&tv,NULL);
return (((long long)tv.tv_sec)*1000000)+tv.tv_usec;
}
void stress(int pos, int num, int maxsize, int dnum) {
int i,j,k;
unsigned char *zl;
char posstr[2][5] = { "HEAD", "TAIL" };
long long start;
for (i = 0; i < maxsize; i+=dnum) {
zl = ziplistNew();
for (j = 0; j < i; j++) {
zl = ziplistPush(zl,(unsigned char*)"quux",4,ZIPLIST_TAIL);
}
/* Do num times a push+pop from pos */
start = usec();
for (k = 0; k < num; k++) {
zl = ziplistPush(zl,(unsigned char*)"quux",4,pos);
zl = ziplistDeleteRange(zl,0,1);
}
printf("List size: %8d, bytes: %8d, %dx push+pop (%s): %6lld usec\n",
i,intrev32ifbe(ZIPLIST_BYTES(zl)),num,posstr[pos],usec()-start);
zfree(zl);
}
}
void pop(unsigned char *zl, int where) {
unsigned char *p, *vstr;
unsigned int vlen;
long long vlong;
p = ziplistIndex(zl,where == ZIPLIST_HEAD ? 0 : -1);
if (ziplistGet(p,&vstr,&vlen,&vlong)) {
if (where == ZIPLIST_HEAD)
printf("Pop head: ");
else
printf("Pop tail: ");
if (vstr)
if (vlen && fwrite(vstr,vlen,1,stdout) == 0) perror("fwrite");
else
printf("%lld", vlong);
printf("\n");
ziplistDeleteRange(zl,-1,1);
} else {
printf("ERROR: Could not pop\n");
exit(1);
}
}
int randstring(char *target, unsigned int min, unsigned int max) {
int p = 0;
int len = min+rand()%(max-min+1);
int minval, maxval;
switch(rand() % 3) {
case 0:
minval = 0;
maxval = 255;
break;
case 1:
minval = 48;
maxval = 122;
break;
case 2:
minval = 48;
maxval = 52;
break;
default:
assert(NULL);
}
while(p < len)
target[p++] = minval+rand()%(maxval-minval+1);
return len;
}
void verify(unsigned char *zl, zlentry *e) {
int i;
int len = ziplistLen(zl);
zlentry _e;
for (i = 0; i < len; i++) {
memset(&e[i], 0, sizeof(zlentry));
e[i] = zipEntry(ziplistIndex(zl, i));
memset(&_e, 0, sizeof(zlentry));
_e = zipEntry(ziplistIndex(zl, -len+i));
assert(memcmp(&e[i], &_e, sizeof(zlentry)) == 0);
}
}
int main(int argc, char **argv) {
unsigned char *zl, *p;
unsigned char *entry;
unsigned int elen;
long long value;
/* If an argument is given, use it as the random seed. */
if (argc == 2)
srand(atoi(argv[1]));
zl = createIntList();
ziplistRepr(zl);
zl = createList();
ziplistRepr(zl);
pop(zl,ZIPLIST_TAIL);
ziplistRepr(zl);
pop(zl,ZIPLIST_HEAD);
ziplistRepr(zl);
pop(zl,ZIPLIST_TAIL);
ziplistRepr(zl);
pop(zl,ZIPLIST_TAIL);
ziplistRepr(zl);
printf("Get element at index 3:\n");
{
zl = createList();
p = ziplistIndex(zl, 3);
if (!ziplistGet(p, &entry, &elen, &value)) {
printf("ERROR: Could not access index 3\n");
return 1;
}
if (entry) {
if (elen && fwrite(entry,elen,1,stdout) == 0) perror("fwrite");
printf("\n");
} else {
printf("%lld\n", value);
}
printf("\n");
}
printf("Get element at index 4 (out of range):\n");
{
zl = createList();
p = ziplistIndex(zl, 4);
if (p == NULL) {
printf("No entry\n");
} else {
printf("ERROR: Out of range index should return NULL, returned offset: %ld\n", p-zl);
return 1;
}
printf("\n");
}
printf("Get element at index -1 (last element):\n");
{
zl = createList();
p = ziplistIndex(zl, -1);
if (!ziplistGet(p, &entry, &elen, &value)) {
printf("ERROR: Could not access index -1\n");
return 1;
}
if (entry) {
if (elen && fwrite(entry,elen,1,stdout) == 0) perror("fwrite");
printf("\n");
} else {
printf("%lld\n", value);
}
printf("\n");
}
printf("Get element at index -4 (first element):\n");
{
zl = createList();
p = ziplistIndex(zl, -4);
if (!ziplistGet(p, &entry, &elen, &value)) {
printf("ERROR: Could not access index -4\n");
return 1;
}
if (entry) {
if (elen && fwrite(entry,elen,1,stdout) == 0) perror("fwrite");
printf("\n");
} else {
printf("%lld\n", value);
}
printf("\n");
}
printf("Get element at index -5 (reverse out of range):\n");
{
zl = createList();
p = ziplistIndex(zl, -5);
if (p == NULL) {
printf("No entry\n");
} else {
printf("ERROR: Out of range index should return NULL, returned offset: %ld\n", p-zl);
return 1;
}
printf("\n");
}
printf("Iterate list from 0 to end:\n");
{
zl = createList();
p = ziplistIndex(zl, 0);
while (ziplistGet(p, &entry, &elen, &value)) {
printf("Entry: ");
if (entry) {
if (elen && fwrite(entry,elen,1,stdout) == 0) perror("fwrite");
} else {
printf("%lld", value);
}
p = ziplistNext(zl,p);
printf("\n");
}
printf("\n");
}
printf("Iterate list from 1 to end:\n");
{
zl = createList();
p = ziplistIndex(zl, 1);
while (ziplistGet(p, &entry, &elen, &value)) {
printf("Entry: ");
if (entry) {
if (elen && fwrite(entry,elen,1,stdout) == 0) perror("fwrite");
} else {
printf("%lld", value);
}
p = ziplistNext(zl,p);
printf("\n");
}
printf("\n");
}
printf("Iterate list from 2 to end:\n");
{
zl = createList();
p = ziplistIndex(zl, 2);
while (ziplistGet(p, &entry, &elen, &value)) {
printf("Entry: ");
if (entry) {
if (elen && fwrite(entry,elen,1,stdout) == 0) perror("fwrite");
} else {
printf("%lld", value);
}
p = ziplistNext(zl,p);
printf("\n");
}
printf("\n");
}
printf("Iterate starting out of range:\n");
{
zl = createList();
p = ziplistIndex(zl, 4);
if (!ziplistGet(p, &entry, &elen, &value)) {
printf("No entry\n");
} else {
printf("ERROR\n");
}
printf("\n");
}
printf("Iterate from back to front:\n");
{
zl = createList();
p = ziplistIndex(zl, -1);
while (ziplistGet(p, &entry, &elen, &value)) {
printf("Entry: ");
if (entry) {
if (elen && fwrite(entry,elen,1,stdout) == 0) perror("fwrite");
} else {
printf("%lld", value);
}
p = ziplistPrev(zl,p);
printf("\n");
}
printf("\n");
}
printf("Iterate from back to front, deleting all items:\n");
{
zl = createList();
p = ziplistIndex(zl, -1);
while (ziplistGet(p, &entry, &elen, &value)) {
printf("Entry: ");
if (entry) {
if (elen && fwrite(entry,elen,1,stdout) == 0) perror("fwrite");
} else {
printf("%lld", value);
}
zl = ziplistDelete(zl,&p);
p = ziplistPrev(zl,p);
printf("\n");
}
printf("\n");
}
printf("Delete inclusive range 0,0:\n");
{
zl = createList();
zl = ziplistDeleteRange(zl, 0, 1);
ziplistRepr(zl);
}
printf("Delete inclusive range 0,1:\n");
{
zl = createList();
zl = ziplistDeleteRange(zl, 0, 2);
ziplistRepr(zl);
}
printf("Delete inclusive range 1,2:\n");
{
zl = createList();
zl = ziplistDeleteRange(zl, 1, 2);
ziplistRepr(zl);
}
printf("Delete with start index out of range:\n");
{
zl = createList();
zl = ziplistDeleteRange(zl, 5, 1);
ziplistRepr(zl);
}
printf("Delete with num overflow:\n");
{
zl = createList();
zl = ziplistDeleteRange(zl, 1, 5);
ziplistRepr(zl);
}
printf("Delete foo while iterating:\n");
{
zl = createList();
p = ziplistIndex(zl,0);
while (ziplistGet(p,&entry,&elen,&value)) {
if (entry && strncmp("foo",(char*)entry,elen) == 0) {
printf("Delete foo\n");
zl = ziplistDelete(zl,&p);
} else {
printf("Entry: ");
if (entry) {
if (elen && fwrite(entry,elen,1,stdout) == 0)
perror("fwrite");
} else {
printf("%lld",value);
}
p = ziplistNext(zl,p);
printf("\n");
}
}
printf("\n");
ziplistRepr(zl);
}
printf("Regression test for >255 byte strings:\n");
{
char v1[257],v2[257];
memset(v1,'x',256);
memset(v2,'y',256);
zl = ziplistNew();
zl = ziplistPush(zl,(unsigned char*)v1,strlen(v1),ZIPLIST_TAIL);
zl = ziplistPush(zl,(unsigned char*)v2,strlen(v2),ZIPLIST_TAIL);
/* Pop values again and compare their value. */
p = ziplistIndex(zl,0);
assert(ziplistGet(p,&entry,&elen,&value));
assert(strncmp(v1,(char*)entry,elen) == 0);
p = ziplistIndex(zl,1);
assert(ziplistGet(p,&entry,&elen,&value));
assert(strncmp(v2,(char*)entry,elen) == 0);
printf("SUCCESS\n\n");
}
printf("Regression test deleting next to last entries:\n");
{
char v[3][257];
zlentry e[3];
int i;
for (i = 0; i < (sizeof(v)/sizeof(v[0])); i++) {
memset(v[i], 'a' + i, sizeof(v[0]));
}
v[0][256] = '\0';
v[1][ 1] = '\0';
v[2][256] = '\0';
zl = ziplistNew();
for (i = 0; i < (sizeof(v)/sizeof(v[0])); i++) {
zl = ziplistPush(zl, (unsigned char *) v[i], strlen(v[i]), ZIPLIST_TAIL);
}
verify(zl, e);
assert(e[0].prevrawlensize == 1);
assert(e[1].prevrawlensize == 5);
assert(e[2].prevrawlensize == 1);
/* Deleting entry 1 will increase `prevrawlensize` for entry 2 */
unsigned char *p = e[1].p;
zl = ziplistDelete(zl, &p);
verify(zl, e);
assert(e[0].prevrawlensize == 1);
assert(e[1].prevrawlensize == 5);
printf("SUCCESS\n\n");
}
printf("Create long list and check indices:\n");
{
zl = ziplistNew();
char buf[32];
int i,len;
for (i = 0; i < 1000; i++) {
len = sprintf(buf,"%d",i);
zl = ziplistPush(zl,(unsigned char*)buf,len,ZIPLIST_TAIL);
}
for (i = 0; i < 1000; i++) {
p = ziplistIndex(zl,i);
assert(ziplistGet(p,NULL,NULL,&value));
assert(i == value);
p = ziplistIndex(zl,-i-1);
assert(ziplistGet(p,NULL,NULL,&value));
assert(999-i == value);
}
printf("SUCCESS\n\n");
}
printf("Compare strings with ziplist entries:\n");
{
zl = createList();
p = ziplistIndex(zl,0);
if (!ziplistCompare(p,(unsigned char*)"hello",5)) {
printf("ERROR: not \"hello\"\n");
return 1;
}
if (ziplistCompare(p,(unsigned char*)"hella",5)) {
printf("ERROR: \"hella\"\n");
return 1;
}
p = ziplistIndex(zl,3);
if (!ziplistCompare(p,(unsigned char*)"1024",4)) {
printf("ERROR: not \"1024\"\n");
return 1;
}
if (ziplistCompare(p,(unsigned char*)"1025",4)) {
printf("ERROR: \"1025\"\n");
return 1;
}
printf("SUCCESS\n\n");
}
printf("Stress with random payloads of different encoding:\n");
{
int i,j,len,where;
unsigned char *p;
char buf[1024];
int buflen;
list *ref;
listNode *refnode;
/* Hold temp vars from ziplist */
unsigned char *sstr;
unsigned int slen;
long long sval;
for (i = 0; i < 20000; i++) {
zl = ziplistNew();
ref = listCreate();
listSetFreeMethod(ref,sdsfree);
len = rand() % 256;
/* Create lists */
for (j = 0; j < len; j++) {
where = (rand() & 1) ? ZIPLIST_HEAD : ZIPLIST_TAIL;
if (rand() % 2) {
buflen = randstring(buf,1,sizeof(buf)-1);
} else {
switch(rand() % 3) {
case 0:
buflen = sprintf(buf,"%lld",(0LL + rand()) >> 20);
break;
case 1:
buflen = sprintf(buf,"%lld",(0LL + rand()));
break;
case 2:
buflen = sprintf(buf,"%lld",(0LL + rand()) << 20);
break;
default:
assert(NULL);
}
}
/* Add to ziplist */
zl = ziplistPush(zl, (unsigned char*)buf, buflen, where);
/* Add to reference list */
if (where == ZIPLIST_HEAD) {
listAddNodeHead(ref,sdsnewlen(buf, buflen));
} else if (where == ZIPLIST_TAIL) {
listAddNodeTail(ref,sdsnewlen(buf, buflen));
} else {
assert(NULL);
}
}
assert(listLength(ref) == ziplistLen(zl));
for (j = 0; j < len; j++) {
/* Naive way to get elements, but similar to the stresser * executed from the Tcl test suite. */
p = ziplistIndex(zl,j);
refnode = listIndex(ref,j);
assert(ziplistGet(p,&sstr,&slen,&sval));
if (sstr == NULL) {
buflen = sprintf(buf,"%lld",sval);
} else {
buflen = slen;
memcpy(buf,sstr,buflen);
buf[buflen] = '\0';
}
assert(memcmp(buf,listNodeValue(refnode),buflen) == 0);
}
zfree(zl);
listRelease(ref);
}
printf("SUCCESS\n\n");
}
printf("Stress with variable ziplist size:\n");
{
stress(ZIPLIST_HEAD,100000,16384,256);
stress(ZIPLIST_TAIL,100000,16384,256);
}
return 0;
}
#endif