压缩列表(ziplist)时列表键和哈希键的底层实现之一。压缩列表时redis为了节约内存而开发的,是由一系列特殊编码的连续内存块组成的顺序型(sequential)数据结构。一个压缩列表可以包含任意多个节点(entry),每个节点可以保存一个字节数组或者一个整数值。
接下来给大家展示以下压缩列表的具体结构:
在表中列出了空的压缩列表和非空的压缩列表的具体是如何存储的。
下面我们来看redis是如何对压缩列表定义的:
/* The ziplist is a specially encoded dually linked list that is designed * to be very memory efficient. * * Ziplist 是为了尽可能地节约内存而设计的特殊编码双端链表。 * * It stores both strings and integer values, * where integers are encoded as actual integers instead of a series of * characters. * * Ziplist可以存储字符串值和整数值, * 其中,整数值被保存为整数,而不是字符数组。 * * It allows push and pop operations on either side of the list * in O(1) time. However, because every operation requires a reallocation of * the memory used by the ziplist, the actual complexity is related to the * amount of memory used by the ziplist. * * Ziplist允许在列表的两端进行o(1)复杂度的push和pop操作。 * 但是,因为这些操作都需要对整个Ziplist进行内存分配 * 所以实际的复杂度和ziplist占用的内存大小有关。 * ---------------------------------------------------------------------------- * * ZIPLIST OVERALL LAYOUT: * Ziplist整体布局: * * The general layout of the ziplist is as follows: * 以下是ziplist的一般布局: * * <zlbytes><zltail><zllen><entry><entry><zlend> * * <zlbytes> is an unsigned integer to hold the number of bytes that the * ziplist occupies. This value needs to be stored to be able to resize the * entire structure without the need to traverse it first. * * <zlbytes>是一个无符号整数,保存着ziplist使用的内存数量。 * 通过这个值,程序员可以直接对ziplist的内存大小进行调整 * * <zltail> is the offset to the last entry in the list. This allows a pop * operation on the far side of the list without the need for full traversal. * * <zltail>保存着到达列表中最后一个节点的偏移量 * 这个偏移量使得对表尾的pop操作可以在无需遍历整个列表的情况下进行。 * * <zllen> is the number of entries.When this value is larger than 2**16-2, * we need to traverse the entire list to know how many items it holds. * * <zllen>保存着列表中的节点的数量, * 当zllen保存的值大于65536时, * 程序需要遍历整个列表才能知道列表实际包含了多少个节点 * * <zlend> is a single byte special value, equal to 255, which indicates the * end of the list. * * <zlend>的长度为1字节,值为255,标识列表的末尾 * * ZIPLIST ENTRIES: * ZIPLIST 节点: * * Every entry in the ziplist is prefixed by a header that contains two pieces * of information. First, the length of the previous entry is stored to be * able to traverse the list from back to front. Second, the encoding with an * optional string length of the entry itself is stored. * * 每个ziplist节点的前面都带有一个header,这个header包含两部分信息: * (1)前置节点的长度,在程序从后向前遍历时使用, * (2)当前节点所保存的值的类型和长度 * * The length of the previous entry is encoded in the following way: * If this length is smaller than 254 bytes, it will only consume a single * byte that takes the length as value. When the length is greater than or * equal to 254, it will consume 5 bytes. The first byte is set to 254 to * indicate a larger value is following. The remaining 4 bytes take the * length of the previous entry as value. * * 编码前置节点的长度的方法如下: * (1)如果前置节点的长度小于254字节,那么程序将使用1个字节来保存这个长度 * (2)如果前置节点的长度大于等于254字节,那么程序将使用5个字节来保存这个长度值: * (a)第一个字节的值被设为254,用于标识这是一个5字节长的长度值 * (b)之后的4个字节则用于保存前置节点的实际长度。 * * The other header field of the entry itself depends on the contents of the * entry. When the entry is a string, the first 2 bits of this header will hold * the type of encoding used to store the length of the string, followed by the * actual length of the string. When the entry is an integer the first 2 bits * are both set to 1. The following 2 bits are used to specify what kind of * integer will be stored after this header. An overview of the different * types and encodings is as follows: * * header另一部分的内容和节点所保存的值有关。 * * (1)如果节点保存的是字符串值, * 那么这部分的header的头2个位保存编码字符串长度所使用的类型, * 而之后跟着的内容则是字符串的实际长度 * * |00pppppp| - 1 byte * String value with length less than or equal to 63 bytes (6 bits). * 字符串的长度小于或等于63字节。 * |01pppppp|qqqqqqqq| - 2 bytes * String value with length less than or equal to 16383 bytes (14 bits). * 字符串的长度小于或等于16383字节 * |10______|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt| - 5 bytes * String value with length greater than or equal to 16384 bytes. * 字符串的长度大于或等于16384个字节。 * * (2)如果节点保存的是整数, * 那么这部分header的头2位都被设置为1, * 而之后跟着的2位则用于标识所保存的整数的类型。 * * |11000000| - 1 byte * Integer encoded as int16_t (2 bytes). * 节点的值为int16_t类型的整数,长度为2个字节 * |11010000| - 1 byte * Integer encoded as int32_t (4 bytes). * 节点的值为int32_t类型的整数,长度为4个字节 * |11100000| - 1 byte * Integer encoded as int64_t (8 bytes). * 节点的值为int64_t类型的整数,长度为8个字节 * |11110000| - 1 byte * Integer encoded as 24 bit signed (3 bytes). * 节点的值为24位(3字节)长的整数 * |11111110| - 1 byte * Integer encoded as 8 bit signed (1 byte). * 节点的值为8位(1字节)长的整数 * |1111xxxx| - (with xxxx between 0000 and 1101) immediate 4 bit integer. * Unsigned integer from 0 to 12. The encoded value is actually from * 1 to 13 because 0000 and 1111 can not be used, so 1 should be * subtracted from the encoded 4 bit value to obtain the right value. * 节点的值介于0至12之间的无符号数, * 因为0000和1111都不能使用,所以位的实际值将是1至13。 * 程序在取得这4位的值之后,还需要减去1,才能计算出正确的值。 * 比如说,如果位的值位为0001 = 1,那么程序返回的值将是1- 1=0。 * |11111111| - End of ziplist. * ziplist的结尾标志。 * All the integers are represented in little endian byte order. * * 所有的整数都表示为小端字节序。 * ---------------------------------------------------------------------------- * * Copyright (c) 2009-2012, Pieter Noordhuis <pcnoordhuis at gmail dot com> * Copyright (c) 2009-2012, Salvatore Sanfilippo <antirez at gmail dot com> * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: * * * Redistributions of source code must retain the above copyright notice, * this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * Neither the name of Redis nor the names of its contributors may be used * to endorse or promote products derived from this software without * specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <stdint.h> #include <limits.h> #include "zmalloc.h" #include "util.h" #include "ziplist.h" #include "endianconv.h" #include "redisassert.h" /* * ziplist末端标识符,以及5字节标识符 * * * */ #define ZIP_END 255 #define ZIP_BIGLEN 254 /* Different encoding/length possibilities */ /* * 字符串编码和整数编码的掩码 * * */ #define ZIP_STR_MASK 0xc0 #define ZIP_INT_MASK 0x30 /* * 字符串编码类型 * * */ //编码长度1字节,长度小于或等于63字节的字节数组 #define ZIP_STR_06B (0 << 6) //编码长度为2字节,长度小于等于16383字节的字节数组 #define ZIP_STR_14B (1 << 6) //编码长度为5字节,长度小于等于4294967295的字节数组 #define ZIP_STR_32B (2 << 6) /* * 整数编码类型 * * * */ //编码长度1字节,int16_t类型的整数 #define ZIP_INT_16B (0xc0 | 0<<4) //编码长度1字节,int32_t类型的整数 #define ZIP_INT_32B (0xc0 | 1<<4) //编码长度1字节,int64_t类型的整数 #define ZIP_INT_64B (0xc0 | 2<<4) //编码长度1字节,24位有符号的整数 #define ZIP_INT_24B (0xc0 | 3<<4) //8位有符号整数 #define ZIP_INT_8B 0xfe /* 4 bit integer immediate encoding */ /* * 4位整数编码的掩码和类型 * * */ #define ZIP_INT_IMM_MASK 0x0f #define ZIP_INT_IMM_MIN 0xf1 /* 11110001 */ #define ZIP_INT_IMM_MAX 0xfd /* 11111101 */ #define ZIP_INT_IMM_VAL(v) (v & ZIP_INT_IMM_MASK) /* * 24位整数的最大值和最小值 * */ #define INT24_MAX 0x7fffff #define INT24_MIN (-INT24_MAX - 1) /* Macro to determine type */ /* * 查看给定编码enc是否字符串编码 * * */ #define ZIP_IS_STR(enc) (((enc) & ZIP_STR_MASK) < ZIP_STR_MASK) /* Utility macros */ /* * ziplist属性宏 * * */ //定位到ziplist的bytes属性,该属性记录了整个ziplist所占用的内存字节数 //用于取出bytes属性的现有值,或者bytes属性赋予新值 #define ZIPLIST_BYTES(zl) (*((uint32_t*)(zl))) //定位到ziplist的offset属性,该属性记录了到达表尾节点的偏移量 //用于取出offset属性的现有值,或者为offset属性赋予新值 #define ZIPLIST_TAIL_OFFSET(zl) (*((uint32_t*)((zl)+sizeof(uint32_t)))) //定位到ziplist的length属性,该属性记录了ziplist包含的节点数量 //用于取出length属性的现有值,或者为length属性赋予新值 #define ZIPLIST_LENGTH(zl) (*((uint16_t*)((zl)+sizeof(uint32_t)*2))) //返回ziplist表头的大小 #define ZIPLIST_HEADER_SIZE (sizeof(uint32_t)*2+sizeof(uint16_t)) //返回指向ziplist第一个节点(的起始位置)的指针 #define ZIPLIST_ENTRY_HEAD(zl) ((zl)+ZIPLIST_HEADER_SIZE) //返回指向ziplist的最后一个节点(的起始位置)的指针 #define ZIPLIST_ENTRY_TAIL(zl) ((zl)+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))) //返回指向ziplist末端ZIP_END(的起始位置)的指针 #define ZIPLIST_ENTRY_END(zl) ((zl)+intrev32ifbe(ZIPLIST_BYTES(zl))-1)
下面我们来看压缩列表节点的定义:
typedef struct zlentry { //prevrawlen:前置节点的长度 //prevrawlensize:编码prevrawlen所需的字节大小 unsigned int prevrawlensize, prevrawlen; //len:当前节点的长度 //lensize:编码len所需的字节大小 unsigned int lensize, len; //当前节点header的大小 //等于prevrawlensize + lensize unsigned int headersize; //当前节点所使用的编码类型 unsigned char encoding; //指向当前节点的指针 unsigned char *p; } zlentry; /* Extract the encoding from the byte pointed by 'ptr' and set it into * 'encoding'. */ /* * 从ptr中取出节点值的编码类型,并将它保存在encoding变量中。 * * */ #define ZIP_ENTRY_ENCODING(ptr, encoding) do { \ (encoding) = (ptr[0]); \ if ((encoding) < ZIP_STR_MASK) (encoding) &= ZIP_STR_MASK; \ } while(0) /* * 增加ziplist的节点数 * * * */ #define ZIPLIST_INCR_LENGTH(zl,incr) { \ if (ZIPLIST_LENGTH(zl) < UINT16_MAX) \ ZIPLIST_LENGTH(zl) = intrev16ifbe(intrev16ifbe(ZIPLIST_LENGTH(zl))+incr); \ }上述就是redis对于压缩列表节点的定义。
下面我们来看redis对于一下接口的具体实现:
/* Return bytes needed to store integer encoded by 'encoding' */ /* * 返回保存encoding编码的值所需要的字节数量 * * */ static unsigned int zipIntSize(unsigned char encoding) { switch(encoding) { case ZIP_INT_8B: return 1; case ZIP_INT_16B: return 2; case ZIP_INT_24B: return 3; case ZIP_INT_32B: return 4; case ZIP_INT_64B: return 8; default: return 0; /* 4 bit immediate */ } assert(NULL); return 0; } /* Encode the length 'rawlen' writing it in 'p'. If p is NULL it just returns * the amount of bytes required to encode such a length. */ /* * 编码节点的长度值为l,并将它写入到p中,然后返回编码l所需的字节数量。 * * 如果p为NULL,那么仅返回编码l所需的字节数量,不进行写入 * * */ static unsigned int zipEncodeLength(unsigned char *p, unsigned char encoding, unsigned int rawlen) { unsigned char len = 1, buf[5]; //判断encoding是否为字符串编码 if (ZIP_IS_STR(encoding)) { /* Although encoding is given it may not be set for strings, * so we determine it here using the raw length. */ if (rawlen <= 0x3f) { //rawlen 长度小于等于63字节的字节数组 //编码长度为1个字节 if (!p) return len; //程序执行到这一步时,说明p不为空,要将编码节点的长度值写入p中 //ZIP_STR_06B == 00bbbbbb buf[0] = ZIP_STR_06B | rawlen; } else if (rawlen <= 0x3fff) { //rawlen长度小于等于16383字节的字节数组 //编码长度为2个字节 len += 1; if (!p) return len; buf[0] = ZIP_STR_14B | ((rawlen >> 8) & 0x3f); buf[1] = rawlen & 0xff; } else { //否则,长度小于等于4294967295的字节数组 //编码长度为5个字节 len += 4; if (!p) return len; buf[0] = ZIP_STR_32B; buf[1] = (rawlen >> 24) & 0xff; buf[2] = (rawlen >> 16) & 0xff; buf[3] = (rawlen >> 8) & 0xff; buf[4] = rawlen & 0xff; } } else { //编码整数 /* Implies integer encoding, so length is always 1. */ if (!p) return len; buf[0] = encoding; } /* Store this length at p */ //将编码的长度写入p memcpy(p,buf,len); //返回编码所需的字节数 return len; } /* Decode the length encoded in 'ptr'. The 'encoding' variable will hold the * entries encoding, the 'lensize' variable will hold the number of bytes * required to encode the entries length, and the 'len' variable will hold the * entries length. * * 解码ptr指针,取出列表节点的相关信息,并将它们保存在以下变量中: * * -encoding 保存节点值的编码类型 * * -lensize 保存编码节点长度所需的字节数 * * -len 保存节点所需的长度 * * */ #define ZIP_DECODE_LENGTH(ptr, encoding, lensize, len) do { \ //取出节点值的编码类型,并保存到encoding变量中 ZIP_ENTRY_ENCODING((ptr), (encoding)); \ if ((encoding) < ZIP_STR_MASK) { \ //字符串编码 if ((encoding) == ZIP_STR_06B) { \ //如果编码为00bbbbbb //编码节点长度所需要的字节数为1 (lensize) = 1; \ //保存节点所需的长度 (len) = (ptr)[0] & 0x3f; \ } else if ((encoding) == ZIP_STR_14B) { \ //如果编码为01bbbbbb //编码节点长度所需要的字节数为2 (lensize) = 2; \ //保存节点所需的长度 (len) = (((ptr)[0] & 0x3f) << 8) | (ptr)[1]; \ } else if (encoding == ZIP_STR_32B) { \ //如果编码为10_ _ _ _ _ _ _ //编码节点长度所需要的字节数为5 (lensize) = 5; \ (len) = ((ptr)[1] << 24) | \ ((ptr)[2] << 16) | \ ((ptr)[3] << 8) | \ ((ptr)[4]); \ } else { \ //否则直接退出 assert(NULL); \ } \ } else { \ //编码整数 (lensize) = 1; \ //保存encoding编码的值所需的字节长度 (len) = zipIntSize(encoding); \ } \ } while(0); /* Encode the length of the previous entry and write it to "p". Return the * number of bytes needed to encode this length if "p" is NULL. * * 对前置节点的长度len进行编码,并将它写入到p中, * 然后返回编码len所需的字节数量 * * 如果p为空,那么不进行写入,仅返回编码len所需的字节数量 * * */ static unsigned int zipPrevEncodeLength(unsigned char *p, unsigned int len) { if (p == NULL) { //p为空时,仅返回编码len所需的字节数量 //如果len < 254时,编码len仅需要1个字节 return (len < ZIP_BIGLEN) ? 1 : sizeof(len)+1; } else { //当p不为空时 if (len < ZIP_BIGLEN) { //(1)如果len < 254时 p[0] = len; return 1; } else { //(2)如果len > 254时 //第一个字节被设置为254 p[0] = ZIP_BIGLEN; //将len的值拷贝到字符串从p+1开始的位置 memcpy(p+1,&len,sizeof(len)); //大小端进行转换 memrev32ifbe(p+1); //返回编码长度 return 1+sizeof(len); } } } /* Encode the length of the previous entry and write it to "p". This only * uses the larger encoding (required in __ziplistCascadeUpdate). * * 将原本只需要1个字节来保存的前置节点长度len编码至一个5字节长的header中 * * */ static void zipPrevEncodeLengthForceLarge(unsigned char *p, unsigned int len) { //如果p为空,则直接返回 if (p == NULL) return; //当p不为空时,首先将p[0]设置为254,用于标识5字节长度标识 p[0] = ZIP_BIGLEN; //内存拷贝,将len写入字符串p + 1的位置 memcpy(p+1,&len,sizeof(len)); //大小端的转换 memrev32ifbe(p+1); } /* Decode the number of bytes required to store the length of the previous * element, from the perspective of the entry pointed to by 'ptr'. * * 解码ptr指针,取出解码前置节点长度的字节数,并将它保存到prevlensize变量中。 * * */ #define ZIP_DECODE_PREVLENSIZE(ptr, prevlensize) do { \ if ((ptr)[0] < ZIP_BIGLEN) { \ //当ptr[0]小于254时,说明前置节点的字节数为1 (prevlensize) = 1; \ } else { \ //否则说明前置节点的字节数为5 (prevlensize) = 5; \ } \ } while(0); /* Decode the length of the previous element, from the perspective of the entry * pointed to by 'ptr'. * * 解码pre指针, * 取出编码前置节点长度所需的字节数,并将它保存到prevlensize变量中。 * * 然后根据prevlensize,从ptr中取出前置节点的长度值, * 并将这个长度保存到prevlen变量中。 * */ #define ZIP_DECODE_PREVLEN(ptr, prevlensize, prevlen) do { \ //获取编码前置节点长度所需的字节数,并将其保存在prevlensize中 ZIP_DECODE_PREVLENSIZE(ptr, prevlensize); \ if ((prevlensize) == 1) { \ //如果所需字节数为1,则说明其长度值小于254,仅用一个字节就可以编码 (prevlen) = (ptr)[0]; \ } else if ((prevlensize) == 5) { \ //当所需字节数为5时,则说明其长度大于等于254。 assert(sizeof((prevlensize)) == 4); \ //如果前置节点的字节数为4时,将ptr从下标为1到4的内容拷贝到prevlen内 memcpy(&(prevlen), ((char*)(ptr)) + 1, 4); \ //进行大小端转换 memrev32ifbe(&prevlen); \ } \ } while(0); /* Return the difference in number of bytes needed to store the length of the * previous element 'len', in the entry pointed to by 'p'. * * 计算编码新的前置节点长度len所需的字节数 * 减去编码p原来的前置节点长度所需的字节数之差 * */ static int zipPrevLenByteDiff(unsigned char *p, unsigned int len) { unsigned int prevlensize; //取出前置节点长度所需的字节数,并将它保存在prevlensize变量中 ZIP_DECODE_PREVLENSIZE(p, prevlensize); //计算编码len所需的字节数,然后进行减法 return zipPrevEncodeLength(NULL, len) - prevlensize; } /* Return the total number of bytes used by the entry pointed to by 'p'. * * 返回指针p所指向的节点占用的字节数总和 * * */ static unsigned int zipRawEntryLength(unsigned char *p) { unsigned int prevlensize, encoding, lensize, len; //解码p指针,取出编码前置节点长度所需的字节数,并将它保存到prevlensize中。 ZIP_DECODE_PREVLENSIZE(p, prevlensize); //encoding:用于保存当前节点的编码类型 //lensize:用于保存当前节点长度所需的字节数 //len:保存当前节点长度 ZIP_DECODE_LENGTH(p + prevlensize, encoding, lensize, len); //返回占用的总的字节总和 return prevlensize + lensize + len; } /* Check if string pointed to by 'entry' can be encoded as an integer. * Stores the integer value in 'v' and its encoding in 'encoding'. * * 检查entry中指定的字符串能否被编码为整数 * * 如果可以的话, * 将编码的整数保存在指针v的值中,并将编码方式保存在指针encoding的值中。 * * * * */ static int zipTryEncoding(unsigned char *entry, unsigned int entrylen, long long *v, unsigned char *encoding) { long long value; //忽略太长或太短的字符串 if (entrylen >= 32 || entrylen == 0) return 0; //将一个字符串转换为long long 类型 if (string2ll((char*)entry,entrylen,&value)) { /* Great, the string can be encoded. Check what's the smallest * of our encoding types that can hold this value. */ //当转换成功时, //以从大到小的顺序检查适合value的编码方式 if (value >= 0 && value <= 12) { *encoding = ZIP_INT_IMM_MIN+value; } else if (value >= INT8_MIN && value <= INT8_MAX) { *encoding = ZIP_INT_8B; } else if (value >= INT16_MIN && value <= INT16_MAX) { *encoding = ZIP_INT_16B; } else if (value >= INT24_MIN && value <= INT24_MAX) { *encoding = ZIP_INT_24B; } else if (value >= INT32_MIN && value <= INT32_MAX) { *encoding = ZIP_INT_32B; } else { *encoding = ZIP_INT_64B; } //使用指针记录value的值 *v = value; //返回转换成功的标识 return 1; } //转换失败 return 0; } /* Store integer 'value' at 'p', encoded as 'encoding' * * 以encoding指定的编码方式,将整数值value写入到p * * */ static void zipSaveInteger(unsigned char *p, int64_t value, unsigned char encoding) { int16_t i16; int32_t i32; int64_t i64; if (encoding == ZIP_INT_8B) { //当编码方式为8为有符号的整数时 //编码长度只需要一个字节, ((int8_t*)p)[0] = (int8_t)value; } else if (encoding == ZIP_INT_16B) { //当编码方式为int32_t类型的整数时, //编码长度为1个字节 //i16用于保存int16_t类型的整数 i16 = value; //内存拷贝,将i16的值拷贝到字符串指针p中 memcpy(p,&i16,sizeof(i16)); //进行大小端转换 memrev16ifbe(p); } else if (encoding == ZIP_INT_24B) { //当编码方式为24位有符号数时, i32 = value<<8; memrev32ifbe(&i32); memcpy(p,((uint8_t*)&i32)+1,sizeof(i32)-sizeof(uint8_t)); } else if (encoding == ZIP_INT_32B) { //当编码方式为int32_t类型的整数时 i32 = value; //内存拷贝 memcpy(p,&i32,sizeof(i32)); //大小端转换 memrev32ifbe(p); } else if (encoding == ZIP_INT_64B) { //当编码方式为int64_t类型时, i64 = value; //内存拷贝 memcpy(p,&i64,sizeof(i64)); //大小端转换 memrev64ifbe(p); } else if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) { /* Nothing to do, the value is stored in the encoding itself. */ //否则,什么也不做 } else { assert(NULL); } } /* Read integer encoded as 'encoding' from 'p' * * 以encoding指定的编码方式,读取并返回指针p中的整数值 * * */ static int64_t zipLoadInteger(unsigned char *p, unsigned char encoding) { int16_t i16; int32_t i32; int64_t i64, ret = 0; if (encoding == ZIP_INT_8B) { //当编码方式为8位的有符号整数时, //其编码长度为1个字节 //首先将指针p转化为int8_t的类型 //然后将第一个字节里的内容赋给ret ret = ((int8_t*)p)[0]; } else if (encoding == ZIP_INT_16B) { //当编码方式为int16_t类型时, //进行内存拷贝 memcpy(&i16,p,sizeof(i16)); //大小端转换 memrev16ifbe(&i16); ret = i16; } else if (encoding == ZIP_INT_32B) { memcpy(&i32,p,sizeof(i32)); memrev32ifbe(&i32); ret = i32; } else if (encoding == ZIP_INT_24B) { i32 = 0; memcpy(((uint8_t*)&i32)+1,p,sizeof(i32)-sizeof(uint8_t)); memrev32ifbe(&i32); ret = i32>>8; } else if (encoding == ZIP_INT_64B) { memcpy(&i64,p,sizeof(i64)); memrev64ifbe(&i64); ret = i64; } else if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) { ret = (encoding & ZIP_INT_IMM_MASK)-1; } else { assert(NULL); } return ret; } /* Return a struct with all information about an entry. * * 将p所指向的列表节点的信息全保存到zlentry中,并返回该zlentry * * */ static zlentry zipEntry(unsigned char *p) { zlentry e; //e.prevrawlensize保存着编码前一个节点的长度所需的字节数 //e.prevrawlen保存着前一个节点的长度 //调用ZIP_DECODE_PREVLEN 对p进行解码,取出编码前置节点长度所需的字节数和长度值分别保存在e.prevrawlensize和e.prevrawlen中 ZIP_DECODE_PREVLEN(p, e.prevrawlensize, e.prevrawlen); //调用ZIP_DECODE_LENGTH 对p进行解码, //e.encoding 保存节点值的编码类型 //e.lensize保存编码节点值长度所需的字节数 //e.len保存节点值的长度 ZIP_DECODE_LENGTH(p + e.prevrawlensize, e.encoding, e.lensize, e.len); //计算头节点的字节数 e.headersize = e.prevrawlensize + e.lensize; //记录指针 e.p = p; return e; }
下面我们看几个操作压缩列表的函数:
(1)创建一个空的新的ziplist:
/* Create a new empty ziplist. * *创建并返回一个新的ziplist * */ unsigned char *ziplistNew(void) { //ZIPLIST_HEADER_SIZE是ziplist表头的大小 //1 字节是表末端ZIP_END的大小 unsigned int bytes = ZIPLIST_HEADER_SIZE+1; //为表头和表尾分配空间 unsigned char *zl = zmalloc(bytes); //初始化表的属性 ZIPLIST_BYTES(zl) = intrev32ifbe(bytes); ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(ZIPLIST_HEADER_SIZE); ZIPLIST_LENGTH(zl) = 0; //设置表末端 zl[bytes-1] = ZIP_END; //返回空的新的ziplist return zl; }
/* Resize the ziplist. * * 调整ziplist的大小为len个字节 * *当ziplist原有的大小小于len时,扩展ziplist不会改变ziplist原有的元素 * */ static unsigned char *ziplistResize(unsigned char *zl, unsigned int len) { //用zrealloc扩展时不改变现有元素 zl = zrealloc(zl,len); //更新bytes属性 ZIPLIST_BYTES(zl) = intrev32ifbe(len); //重新设置表末端 zl[len-1] = ZIP_END; return zl; } /* When an entry is inserted, we need to set the prevlen field of the next * entry to equal the length of the inserted entry. It can occur that this * length cannot be encoded in 1 byte and the next entry needs to be grow * a bit larger to hold the 5-byte encoded prevlen. This can be done for free, * because this only happens when an entry is already being inserted (which * causes a realloc and memmove). However, encoding the prevlen may require * that this entry is grown as well. This effect may cascade throughout * the ziplist when there are consecutive entries with a size close to * ZIP_BIGLEN, so we need to check that the prevlen can be encoded in every * consecutive entry. * * 当一个新节点添加到某个节点之前的时候, * 如果原节点的header空间不足以保存新节点的长度, * 那么就需要对原节点的header空间进行扩展(从1字节扩展到5字节)。 * * 但是,当对原节点进行扩展之后,原节点的下一个节点的prevlen可能出现空间不足, * 这种情况在多个连续节点的长度都接近于ZIP_BIGLEN时可能发生。 * * 这个函数就用于检查并修复后续节点的空间问题 * * Note that this effect can also happen in reverse, where the bytes required * to encode the prevlen field can shrink. This effect is deliberately ignored, * because it can cause a "flapping" effect where a chain prevlen fields is * first grown and then shrunk again after consecutive inserts. Rather, the * field is allowed to stay larger than necessary, because a large prevlen * field implies the ziplist is holding large entries anyway. * * 反过来说, * 因为节点的长度变小引起的连续缩小也可能出现, * * The pointer "p" points to the first entry that does NOT need to be * updated, i.e. consecutive fields MAY need an update. */ static unsigned char *__ziplistCascadeUpdate(unsigned char *zl, unsigned char *p) { //curlen用于记录ziplist所占用的内存的字节数 size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), rawlen, rawlensize; size_t offset, noffset, extra; unsigned char *np; //定义ziplist结构体 zlentry cur, next; while (p[0] != ZIP_END) { //将p所指向的列表节点的信息全部保存到cur中 cur = zipEntry(p); //当前p节点的整个entry的字节数 rawlen = cur.headersize + cur.len; //调用函数zipPrevEncodeLength,由于第一个参数为NULL,则仅返回编码len所需的字节数量 //存储rawlen需要的字节数 rawlensize = zipPrevEncodeLength(NULL,rawlen); /* Abort if there is no next entry. */ //如果已经没有后续空间需要更新了,跳出 //到达表尾 if (p[rawlen] == ZIP_END) break; //取出后续节点的信息,保存到next结构中 next = zipEntry(p+rawlen); /* Abort when "prevlen" has not changed. */ //后续节点编码当前节点的空间已经足够,无需再进行任何处理,跳出 //可以证明,只要遇到一个空间足够的节点, //那么这个节点之后的所有节点的空间都是足够的 if (next.prevrawlen == rawlen) break; if (next.prevrawlensize < rawlensize) { /* The "prevlen" field of "next" needs more bytes to hold * the raw length of "cur". */ //执行到这里,表示next空间的大小不足以编码cur的长度 //所以程序需要对next节点的(header部分)空间进行扩展 //记录p的偏移量 offset = p-zl; //计算需要增加的字节数 extra = rawlensize-next.prevrawlensize; //调用ziplistResize函数,调整ziplist的大小为len字节 //当ziplist原有的大小小于len时,扩展ziplist不会改变ziplist原有的元素 zl = ziplistResize(zl,curlen+extra); p = zl+offset; /* Current pointer and offset for next element. */ //新的下一个节点的首地址 np = p+rawlen; //新节点的偏移量 noffset = np-zl; /* Update tail offset when next element is not the tail element. */ if ((zl+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))) != np) { //ZIPLIST_TAIL_OFFSET(zl):记录到达表尾节点的偏移量 //当 np不是尾节点时 //更新zl的尾节点的偏移量 ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+extra); } //当next节点不是表尾节点时,更新列表到表尾节点的偏移量 // //不用更新的情况(next为表尾节点): // // | | next | ==> | | new next | // ^ ^ // | | // tail tail // // 需要更新的情况(next不是表位节点): // // | next | | ==> | new next | | // ^ ^ // | | // old tail old tail // 更新之后: // | new next | | // ^ // | // new tail // /* Move the tail to the back. */ //np + rawlensize :新的下一个节点存储自身数据的首地址 //np + next.prevrawlensize :旧的下一个节点存储自身数据的首地址 //向后移动cur节点之后的数据,为cur的新header腾出空间 // //例: // | header | value | ==> | header | | value | ==> | header | value | // |<-->| // 为新header腾出的空间 memmove(np+rawlensize, np+next.prevrawlensize, curlen-noffset-next.prevrawlensize-1); //对前置节点p的长度进行编码写入到np中 zipPrevEncodeLength(np,rawlen); /* Advance the cursor */ //移动指针,处理下一个节点 p += rawlen; //更新ziplist所占用的字节数 curlen += extra; } else { //执行到这里说明next节点编码前置节点header空间有5字节 //而编码rawlen只需要1字节 //但是程序不会对next进行缩小 //所以这里只将rawlen写入5字节的header中算了 if (next.prevrawlensize > rawlensize) { /* This would result in shrinking, which we want to avoid. * So, set "rawlen" in the available bytes. */ zipPrevEncodeLengthForceLarge(p+rawlen,rawlen); } else { //运行到这正好说明cur节点的长度正好可以编码next节点的header中 zipPrevEncodeLength(p+rawlen,rawlen); } /* Stop here, as the raw length of "next" has not changed. */ //后续节点不用扩展 break; } } return zl; } /* Delete "num" entries, starting at "p". Returns pointer to the ziplist. * * 从位置p开始,连续删除num个节点 * * 函数返回值为处理删除操作之后的ziplist * * */ static unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsigned int num) { unsigned int i, totlen, deleted = 0; size_t offset; int nextdiff = 0; zlentry first, tail; //使用first记录节点p的所有信息 first = zipEntry(p); //计算被删除节点的总个数 for (i = 0; p[0] != ZIP_END && i < num; i++) { //zipRawEntryLength:用于计算节点p所占的节点数 p += zipRawEntryLength(p); deleted++; } //totlen用于记录所有被删除节点占用的内存字节数 totlen = p-first.p; if (totlen > 0) { if (p[0] != ZIP_END) { //执行到这里说明被删除的节点之后还有节点存在 /* Storing `prevrawlen` in this entry may increase or decrease the * number of bytes required compare to the current `prevrawlen`. * There always is room to store this, because it was previously * stored by an entry that is now being deleted. */ //因为位于被删除节点范围之后的第一个节点的header部分的大小 //可能容纳不了新的前置节点,所以需要计算新旧前置节点的字节数差 nextdiff = zipPrevLenByteDiff(p,first.prevrawlen); //将指针p后退nextdiff个字节,为新的header空出空间 p -= nextdiff; //将first前置节点的长度编码至p中 zipPrevEncodeLength(p,first.prevrawlen); /* Update offset for tail */ //更新到达表尾的偏移量 ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))-totlen); /* When the tail contains more than one entry, we need to take * "nextdiff" in account as well. Otherwise, a change in the * size of prevlen doesn't have an effect on the *tail* offset. */ //如果被删除节点之后,有多于一个节点 //那么程序需要将nextdiff记录的字节数也计算到表尾偏移量中 //这样才能让表尾的偏移量正确对齐表尾节点 tail = zipEntry(p); if (p[tail.headersize+tail.len] != ZIP_END) { ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff); } /* Move tail to the front of the ziplist */ //从表尾向表头移动数据,覆盖被删除节点的数据 memmove(first.p,p, intrev32ifbe(ZIPLIST_BYTES(zl))-(p-zl)-1); } else { //执行到这里时,说明被删除节点之后没有其他的节点 /* The entire tail was deleted. No need to move memory. */ //更新表尾节点的偏移量 ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe((first.p-zl)-first.prevrawlen); } //缩小并更新ziplist的长度 /* Resize and update length */ offset = first.p-zl; zl = ziplistResize(zl, intrev32ifbe(ZIPLIST_BYTES(zl))-totlen+nextdiff); ZIPLIST_INCR_LENGTH(zl,-deleted); p = zl+offset; /* When nextdiff != 0, the raw length of the next entry has changed, so * we need to cascade the update throughout the ziplist */ //如果p所指向的节点的大小已经变更,那么进行级联更新 //检查p之后的所有节点是否符合ziplist编码的的要求 if (nextdiff != 0) zl = __ziplistCascadeUpdate(zl,p); } return zl; } <span style="font-size:18px;"> 基于删除函数的封装: </span>
/* Delete a single entry from the ziplist, pointed to by *p. * Also update *p in place, to be able to iterate over the * ziplist, while deleting entries. * * 从zl中删除*p所指向的节点, * 并且原地更新*p所指向的位置,使得可以在迭代列表的过程中对节点进行删除 * * */ unsigned char *ziplistDelete(unsigned char *zl, unsigned char **p) { //因为在_ziplistDelete时会对zl进行内存的重分配 //而内存的重分配可能会改变zl的内存地址 //所以需要记录到达*p的偏移量 //这样可以在删除节点之后通过偏移量来将*p还原到正确的位置 size_t offset = *p-zl; //调用函数_ziplistDelete zl = __ziplistDelete(zl,*p,1); /* Store pointer to current element in p, because ziplistDelete will * do a realloc which might result in a different "zl"-pointer. * When the delete direction is back to front, we might delete the last * entry and end up with "p" pointing to ZIP_END, so check this. */ *p = zl+offset; return zl; } /* Delete a range of entries from the ziplist. * * 从index索引指定的节点开始,连续地从zl中删除num个节点 * * */ unsigned char *ziplistDeleteRange(unsigned char *zl, unsigned int index, unsigned int num) { //根据索引定位到节点 unsigned char *p = ziplistIndex(zl,index); //如果p == NULL 说明根据下标没有定位到, //如果不为空时,调用函数_ziplistDelete进行删除 return (p == NULL) ? zl : __ziplistDelete(zl,p,num); }
<span style="font-size:18px;">/* Insert item at "p". * * 根据指针p所指定的位置,将长度为slen的字符串s插入到zl中。 * * 函数返回值为完成插入操作之后的ziplist * * */ static unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) { //curlen记录ziplist占用内存总的字节数 size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), reqlen; unsigned int prevlensize, prevlen = 0; size_t offset; int nextdiff = 0; unsigned char encoding = 0; long long value = 123456789; /* initialized to avoid warning. Using a value that is easy to see if for some reason we use it uninitialized. */ zlentry tail; /* Find out prevlen for the entry that is inserted. */ if (p[0] != ZIP_END) { //如果p[0]不指向列表末端,说明列表非空,并且p正指向列表中的一个节点 //调用函数ZIP_DECODE_PREVLEN:解码p指针, //使得prevlensize用来保存p节点的前置节点长度所需的字节数 //prevlen用于保存前置节点的长度 ZIP_DECODE_PREVLEN(p, prevlensize, prevlen); } else { //如果p指向表尾末端,那么程序需要检查序列是否为: // (1)如果ptail也指向表尾节点ZIP_END,那么列表为空 // (2)如果列表不为空,那么ptail将指向列表的最后一个节点 unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl); if (ptail[0] != ZIP_END) { //如果列表不为空时: //prevlen用于记录指针ptail所指向的节点占用的字节数总和 prevlen = zipRawEntryLength(ptail); } } /* See if the entry can be encoded */ //调用函数zipTryEncoding:用于检查s指向的字符串能否被编码为整数 //如果可以的话,将编码后的整数保存在指针value中,并将编码方式保存在指针encoding的值中。 if (zipTryEncoding(s,slen,&value,&encoding)) { //当编码成功时,reqlen用于保存encoding编码的值所需的字节数量 /* 'encoding' is set to the appropriate integer encoding */ reqlen = zipIntSize(encoding); } else { //当编码失败时,reqlen用于保存字符串s的长度 /* 'encoding' is untouched, however zipEncodeLength will use the * string length to figure out how to encode it. */ reqlen = slen; } /* We need space for both the length of the previous entry and * the length of the payload. */ //调用函数zipPrevEncodeLength 仅返回编码前置节点长度Prevlen所需的字节数量 reqlen += zipPrevEncodeLength(NULL,prevlen); //编码当前节点所需的字节数量 reqlen += zipEncodeLength(NULL,encoding,slen); /* When the insert position is not equal to the tail, we need to * make sure that the next entry can hold this entry's length in * its prevlen field. */ //只要新节点不是被添加到列表末端 //那么程序就需要检查p所指向的节点(的header)能否编码新节点的长度。 //nextdiff保存了新旧编码之间的字节大小差,如果这个值大于0, //那么说明需要对p所指向的节点(的header)进行扩展 nextdiff = (p[0] != ZIP_END) ? zipPrevLenByteDiff(p,reqlen) : 0; /* Store offset because a realloc may change the address of zl. */ //因为重分配空间可能会改变zl的地址, //所以再分配之前,需要记录zl到p的偏移量,然后在分配之后依靠偏移量还原p offset = p-zl; //curlen:ziplist原来的长度 //reqlen:整个新节点的长度 //nextdiff:新节点的后继节点扩展header的长度(要么0字节,要么4字节) zl = ziplistResize(zl,curlen+reqlen+nextdiff); p = zl+offset; /* Apply memory move when necessary and update tail offset. */ if (p[0] != ZIP_END) { //当列表不为空时, /* Subtract one because of the ZIP_END bytes */ //新节点之后还有节点,因为新节点的加入,需要对这些原有的节点进行调整 //移动现有元素,为新元素的插入空间腾出位置 memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff); /* Encode this entry's raw length in the next entry. */ //将新节点的长度编码至后置节点 //p + reqlen 定位到后置节点 //reqlen是新节点的长度 zipPrevEncodeLength(p+reqlen,reqlen); /* Update offset for tail */ //更新表尾的偏移量,将新节点的长度也算上 ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+reqlen); /* When the tail contains more than one entry, we need to take * "nextdiff" in account as well. Otherwise, a change in the * size of prevlen doesn't have an effect on the *tail* offset. */ //如果新节点的后面多于一个节点 //那么程序需要将nextdiff记录的字节数也计算到表尾偏移量中 //这样才能让表尾偏移量正确对齐表尾节点 tail = zipEntry(p+reqlen); if (p[reqlen+tail.headersize+tail.len] != ZIP_END) { ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff); } } else { //新元素为表尾节点 /* This element will be the new tail. */ ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(p-zl); } /* When nextdiff != 0, the raw length of the next entry has changed, so * we need to cascade the update throughout the ziplist */ //当nextdiff != 0时,新节点的后继节点的(header部分)长度已经被改变, //所以需要级联地更新后续的节点 if (nextdiff != 0) { offset = p-zl; zl = __ziplistCascadeUpdate(zl,p+reqlen); p = zl+offset; } /* Write the entry */ //调用函数zipPrevEncodeLength:对前置节点的长度进行编码,并将它写入p中, //返回编码prevlen所需的字节数量 p += zipPrevEncodeLength(p,prevlen); //调用函数zipEncodeLength:编码节点长度值slen,并将它写入到p中,然后返回编码slen所需的字节数量 p += zipEncodeLength(p,encoding,slen); //如果是编码字符串 if (ZIP_IS_STR(encoding)) { //内存拷贝 memcpy(p,s,slen); } else { //否则,以encoding指定的编码方式,将整数值value写入到p zipSaveInteger(p,value,encoding); } //更新列表的节点数量计数器 ZIPLIST_INCR_LENGTH(zl,1); return zl; }</span>基于插入函数封装的一个函数:
/* * 将长度为slen的字符串s推入到zl中。 * * where 参数的值决定了推入方向: * -值为ZIPLIST_HEAD时,将新值推入到表头 * -否则,将新值推入到表末端 * * 函数的返回值:添加新值后的ziplist * */ unsigned char *ziplistPush(unsigned char *zl, unsigned char *s, unsigned int slen, int where) { unsigned char *p; //根据where的值,决定将值推入到表头还是表尾 p = (where == ZIPLIST_HEAD) ? ZIPLIST_ENTRY_HEAD(zl) : ZIPLIST_ENTRY_END(zl); //调用插入函数 return __ziplistInsert(zl,p,s,slen); }(3)压缩列表有关的查找(比如 返回下标对应的节点,返回前置节点,返回后置节点等 ):
/* Returns an offset to use for iterating with ziplistNext. When the given * index is negative, the list is traversed back to front. When the list * doesn't contain an element at the provided index, NULL is returned. * * 根据给定的索引,遍历列表,并返回索引指定节点的指针 * * 如果索引为正,那么从表头向表尾遍历 * 如果索引为负,那么从表尾向表头遍历 * 正数索引从0开始,负数索引从-1开始 * * 如果索引超过列表的节点数量,或者列表为空,那么返回NULL * * */ unsigned char *ziplistIndex(unsigned char *zl, int index) { unsigned char *p; unsigned int prevlensize, prevlen = 0; if (index < 0) { //当索引为负数时 index = (-index)-1; //定位表尾节点 p = ZIPLIST_ENTRY_TAIL(zl); if (p[0] != ZIP_END) { //如果列表不为空时, //调用函数ZIP_DECODE_PREVLEN:解码p指针 //prevlensize:保存编码前置节点长度所所需的字节数 //prevlen:保存前置节点的长度值 ZIP_DECODE_PREVLEN(p, prevlensize, prevlen); //从表尾向表头遍历 while (prevlen > 0 && index--) { p -= prevlen; ZIP_DECODE_PREVLEN(p, prevlensize, prevlen); } } } else { //当index > 0时 //定位表头节点 p = ZIPLIST_ENTRY_HEAD(zl); //从表头开始遍历节点 while (p[0] != ZIP_END && index--) { p += zipRawEntryLength(p); } } //返回结果 return (p[0] == ZIP_END || index > 0) ? NULL : p; } /* Return pointer to next entry in ziplist. * * zl is the pointer to the ziplist * p is the pointer to the current element * * The element after 'p' is returned, otherwise NULL if we are at the end. * * 返回p所指向节点的后置节点 * * 如果p为表末端,或者p已经是表尾节点,那么返回NULL * * */ unsigned char *ziplistNext(unsigned char *zl, unsigned char *p) { ((void) zl); /* "p" could be equal to ZIP_END, caused by ziplistDelete, * and we should return NULL. Otherwise, we should return NULL * when the *next* element is ZIP_END (there is no next entry). */ //如果p已经指向列表末端 if (p[0] == ZIP_END) { return NULL; } //指向p的后一个节点 p += zipRawEntryLength(p); if (p[0] == ZIP_END) { //p已经是表尾节点,没有后置节点 return NULL; } return p; } /* Return pointer to previous entry in ziplist. * * 返回p所指向节点的前置节点 * * 如果p指向为空列表,或者p已经指向表头节点,那么返回NULL * * */ unsigned char *ziplistPrev(unsigned char *zl, unsigned char *p) { unsigned int prevlensize, prevlen = 0; /* Iterating backwards from ZIP_END should return the tail. When "p" is * equal to the first element of the list, we're already at the head, * and should return NULL. */ //如果p指向表列表末端 if (p[0] == ZIP_END) { //定位表尾节点 p = ZIPLIST_ENTRY_TAIL(zl); //如果表尾节点也指向列表末端,那么列表为空,返回NULL, //否则,返回p return (p[0] == ZIP_END) ? NULL : p; } else if (p == ZIPLIST_ENTRY_HEAD(zl)) { //如果p指向表头节点,则其前置节点为NULL return NULL; } else { //当p既不是表头也不是表尾时, //调用ZIP_DECODE_PREVLEN函数:获得编码前置节点需要的字节数和前置节点的长度 ZIP_DECODE_PREVLEN(p, prevlensize, prevlen); //保证prevlen > 0 assert(prevlen > 0); //移动指针指向前一个节点 return p-prevlen; } }(4) 获取节点值:
<span style="font-size:18px;">/* Get entry pointed to by 'p' and store in either '*sstr' or 'sval' depending * on the encoding of the entry. '*sstr' is always set to NULL to be able * to find out whether the string pointer or the integer value was set. * Return 0 if 'p' points to the end of the ziplist, 1 otherwise. * * 取出p所指向节点的值: * * -如果节点保存的是字符串,那么将字符串指针保存到*sstr中,字符串长度保存到*slen * * -如果节点保存的是整数,那么将整数保存到*sval * * 程序可以通过检查*sstr是否为NULL来检查值是字符串还是整数 * * 提取成功返回1 * 如果p为空,或者p指向的是列表末端,那么返回0,提取值失败 * * */ unsigned int ziplistGet(unsigned char *p, unsigned char **sstr, unsigned int *slen, long long *sval) { zlentry entry; //如果p == NULL 或 列表为空时,返回0,提取失败 if (p == NULL || p[0] == ZIP_END) return 0; if (sstr) *sstr = NULL; //取出p所指向的节点的各项信息,并保存到结构entry中 entry = zipEntry(p); if (ZIP_IS_STR(entry.encoding)) { //当节点的值为字符串,将字符串长度保存到*slen,字符串保存到*sstr if (sstr) { //字符串长度 *slen = entry.len; //字符串内容 *sstr = p+entry.headersize; } } else { //节点的值为整数时, if (sval) { //调用函数zipLoadInteger,以encoding指定的编码方式,读取并返回指针p中的整数值 *sval = zipLoadInteger(p+entry.headersize,entry.encoding); } } return 1; } </span>
(5)节点值的比较:
/* Compare entry pointer to by 'p' with 'sstr' of length 'slen'. */ /* Return 1 if equal. * * 将p所指向的节点的值和sstr进行对比 * * 如果节点值和sstr的值相等,返回1,不相等返回0 * * */ unsigned int ziplistCompare(unsigned char *p, unsigned char *sstr, unsigned int slen) { zlentry entry; unsigned char sencoding; long long zval, sval; //如果列表为空,则返回0 if (p[0] == ZIP_END) return 0; //取出节点p所对应的信息保存于entry中 entry = zipEntry(p); if (ZIP_IS_STR(entry.encoding)) { //如果节点值为字符串,进行字符串对比 /* Raw compare */ if (entry.len == slen) { //如果节点p中保存的字符串长度 == slen //调用字符串对比函数 return memcmp(p+entry.headersize,sstr,slen) == 0; } else { //两长度不相等时,返回0 return 0; } } else { //如果节点值为整数时, /* Try to compare encoded values. Don't compare encoding because * different implementations may encoded integers differently. */ //调用函数zipEncoding:检查sstr中指向的字符串能否被编码为整数 //如果可以编码的话,将编码后的整数保存在sval中,将编码方式保存在sencoding中 if (zipTryEncoding(sstr,slen,&sval,&sencoding)) { //当编码成功时,调用函数获取p节点的整数值 zval = zipLoadInteger(p+entry.headersize,entry.encoding); return zval == sval; } } return 0; } /* Find pointer to the entry equal to the specified entry. Skip 'skip' entries * between every comparison. Returns NULL when the field could not be found. * * 寻找节点值和vstr相等的列表节点,并返回该节点的指针 * * 每次对比之前都跳过skip个节点 * * 如果找不到相应的节点,则返回NULL * * */ unsigned char *ziplistFind(unsigned char *p, unsigned char *vstr, unsigned int vlen, unsigned int skip) { int skipcnt = 0; unsigned char vencoding = 0; long long vll = 0; //只要没有到达列表末节点,就一直迭代 while (p[0] != ZIP_END) { unsigned int prevlensize, encoding, lensize, len; unsigned char *q; //获得编码前置节点长度所需的字节数 ZIP_DECODE_PREVLENSIZE(p, prevlensize); //调用函数获取列表节点的相关信息 //encoding:保存节点的编码类型 //lensize:保存编码节点长度所需的字节数 //len:保存节点的长度 ZIP_DECODE_LENGTH(p + prevlensize, encoding, lensize, len); q = p + prevlensize + lensize; if (skipcnt == 0) { /* Compare current entry with specified entry */ if (ZIP_IS_STR(encoding)) { //如果p中保存的是字符串, //对比字符串 if (len == vlen && memcmp(q, vstr, vlen) == 0) { return p; } } else { /* Find out if the searched field can be encoded. Note that * we do it only the first time, once done vencoding is set * to non-zero and vll is set to the integer value. */ //因为传入值有可能被编码了 //所以当第一次进行值对比时,程序会对传入值进行解码 //这个解码操作只会进行1次 if (vencoding == 0) { //调用函数zipTryEncoding:尝试将vstr编码为整数 if (!zipTryEncoding(vstr, vlen, &vll, &vencoding)) { /* If the entry can't be encoded we set it to * UCHAR_MAX so that we don't retry again the next * time. */ //当编码失败时, vencoding = UCHAR_MAX; } /* Must be non-zero by now */ assert(vencoding); } /* Compare current entry with specified entry, do it only * if vencoding != UCHAR_MAX because if there is no encoding * possible for the field it can't be a valid integer. */ if (vencoding != UCHAR_MAX) { //当程序执行到这时,说明vstr可以编码为整数。 //q代表节点p所占的字节数,将p根据编码转化为整数 long long ll = zipLoadInteger(q, encoding); if (ll == vll) { return p; } } } /* Reset skip count */ skipcnt = skip; } else { /* Skip entry */ skipcnt--; } /* Move to next entry */ //指针后移 p = q + len; } return NULL; } /* Return length of ziplist. * * 返回ziplist中的节点个数 * * */ unsigned int ziplistLen(unsigned char *zl) { unsigned int len = 0; //节点数小于< UINT16_MAX,直接返回其长度 if (intrev16ifbe(ZIPLIST_LENGTH(zl)) < UINT16_MAX) { len = intrev16ifbe(ZIPLIST_LENGTH(zl)); } else { //当节点数大于UINT16_MAX,需要遍历整个列表才能计算出节点 unsigned char *p = zl+ZIPLIST_HEADER_SIZE; while (*p != ZIP_END) { p += zipRawEntryLength(p); len++; } /* Re-store length if small enough */ if (len < UINT16_MAX) ZIPLIST_LENGTH(zl) = intrev16ifbe(len); } return len; } /* Return ziplist blob size in bytes. * * 返回整个ziplist占用的内存字节数 * */ size_t ziplistBlobLen(unsigned char *zl) { return intrev32ifbe(ZIPLIST_BYTES(zl)); }
上述则是所有redis对于压缩列表的实现!!!源代码我也做了比较清楚的说明,欢迎大家找茬!!!
看过源码的人应该知道,压缩列表在插入和删除时效率是比较低的,因为在压缩列表的节点的结构体中封装了一个叫prevlensize的成员,该成员记录的是编码前置节点所需的字节数,和prevlen记录前置节点的长度,在插入的过程中就会出现连锁更新的现象,删除也是如此,对于插入和删除的最坏的时间复杂度达到o(n^2),所以压缩列表不太适合数据量比较的时候,它适用于数据量不大的情况下。。。