An introduction to Redis data types and abstractions

Redis is not a plain key-value store; it is actually a data structures server that supports different kinds of values. In traditional key-value stores you associate string keys with string values, whereas in Redis the value is not limited to a simple string and can also hold more complex data structures. The following is the list of all the data structures supported by Redis, each of which is covered separately in this tutorial:

  • Binary-safe strings.
  • Lists: collections of string elements sorted according to the order of insertion. They are basically linked lists.
  • Sets: collections of unique, unsorted string elements.
  • Sorted sets: similar to Sets, but every string element is associated with a floating-point value called its score. The elements are always kept sorted by their score, so unlike Sets it is possible to retrieve a range of elements (for example, you may ask for the top 10 or the bottom 10).
  • Hashes: maps composed of fields associated with values. Both the field and the value are strings. This is very similar to Ruby or Python hashes.
  • Bit arrays (or simply bitmaps): it is possible, using special commands, to handle String values like an array of bits: you can set and clear individual bits, count all the bits set to 1, find the first set or unset bit, and so forth.
  • HyperLogLogs: a probabilistic data structure used to estimate the cardinality of a set. Don't be scared, it is simpler than it seems... See the HyperLogLog section later in this tutorial.
  • Streams: append-only collections of map-like entries that provide an abstract log data type. They are covered in depth in the Introduction to Redis Streams.

It is not always trivial to work out from the command reference how these data types work and which one to use for a given problem, so this document is a crash course in Redis data types and their most common patterns.

For all the examples we'll use redis-cli, a simple but handy command-line utility, to issue commands against the Redis server.

Redis keys

Redis keys are binary safe: you can use any binary sequence as a key, from a string like "foo" to the content of a JPEG file. The empty string is also a valid key.

A few other rules about keys:

  • Very long keys are not a good idea. For instance, a key of 1024 bytes is a bad idea not only memory-wise, but also because looking up the key in the dataset may require several costly key comparisons. Even when the task at hand is to check for the existence of a large value, hashing it (for example with SHA1) is a better idea, especially from the perspective of memory and bandwidth.
  • Very short keys are often not a good idea either. There is little point in writing "u1000flw" as a key if you can instead write "user:1000:followers". The latter is more readable, and the added space is minor compared to the space used by the key object itself and the value object. While short keys will obviously consume a bit less memory, your job is to find the right balance.
  • Try to stick with a schema. For instance "object-type:id" is a good idea, as in "user:1000". Dots or dashes are often used for multi-word fields, as in "comment:1234:reply.to" or "comment:1234:reply-to".
  • The maximum allowed key size is 512 MB.
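
The key rules above can be sketched in a few lines of Python. This is only an illustration: the helper names make_key and digest_key and the "seen" prefix are made up for the example and are not part of Redis or of any client library.

```python
import hashlib

def make_key(*parts):
    """Join key components with ':' following the "object-type:id" schema."""
    return ":".join(str(p) for p in parts)

def digest_key(prefix, payload: bytes):
    """Build a short, fixed-size key from a large value by hashing it with SHA1."""
    return make_key(prefix, hashlib.sha1(payload).hexdigest())

followers_key = make_key("user", 1000, "followers")
print(followers_key)  # user:1000:followers

big_value = b"x" * 100_000          # hypothetical large value
seen_key = digest_key("seen", big_value)
print(len(seen_key))  # 45: "seen:" plus 40 hex characters
```

The hashed key stays 45 bytes long no matter how large the original value is, which is where the memory and bandwidth savings come from.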

Redis Strings

The Redis String type is the simplest type of value you can associate with a Redis key. It is the only data type in Memcached, so it is also very natural for newcomers to use it in Redis.

Since Redis keys are strings, when we use the string type as a value too, we are mapping a string to another string. The string data type is useful for a number of use cases, like caching HTML fragments or pages.

Let's play a bit with the string type, using redis-cli (all the examples will be performed via redis-cli in this tutorial).

> set mykey somevalue
OK
> get mykey
"somevalue"

As you can see, the SET and GET commands are the way we set and retrieve a string value. Note that SET will replace any value already stored at the key if the key already exists, even if the key is associated with a non-string value. So SET performs an assignment.

Values can be strings (including binary data) of every kind; for instance, you can store a JPEG image inside a value. A value can't be bigger than 512 MB.

The SET command has interesting options, which are provided as additional arguments. For example, I may ask SET to fail if the key already exists (NX), or the opposite, to succeed only if the key already exists (XX):

> set mykey newval nx
(nil)
> set mykey newval xx
OK

Even though strings are the basic values of Redis, there are interesting operations you can perform with them. One of them is the atomic increment:

> set counter 100
OK
> incr counter
(integer) 101
> incr counter
(integer) 102
> incrby counter 50
(integer) 152

The INCR command parses the string value as an integer, increments it by one, and finally sets the obtained value as the new value. There are other similar commands like INCRBY, DECR and DECRBY. Internally it's always the same command, acting in a slightly different way.

What does it mean that INCR is atomic? It means that even multiple clients issuing INCR against the same key will never enter into a race condition. For instance, it will never happen that client 1 reads "10", client 2 reads "10" at the same time, both increment to 11, and both set the new value to 11. The final value will always be 12, because the read-increment-set operation is performed while no other client is executing a command.
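
As a rough analogy only, the guarantee can be modeled in plain Python with a lock around the read-increment-set sequence. The Counter class below is a hypothetical stand-in for a Redis key, not a Redis client, and says nothing about how Redis is implemented internally.

```python
import threading

class Counter:
    """Toy stand-in for a Redis key holding an integer (not a Redis client)."""

    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()

    def incr(self):
        # The whole read-increment-set sequence runs under the lock,
        # so no other thread can interleave with it.
        with self._lock:
            self.value += 1
            return self.value

counter = Counter(10)
clients = [threading.Thread(target=counter.incr) for _ in range(2)]
for t in clients:
    t.start()
for t in clients:
    t.join()
print(counter.value)  # 12
```

Both "clients" start from 10, yet the result is 12 rather than 11, because neither increment can observe a half-finished update from the other.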

There are a number of commands for operating on strings. For example the GETSET command sets a key to a new value, returning the old value as the result. You can use this command, for example, if you have a system that increments a Redis key using INCR every time your web site receives a new visitor. You may want to collect this information once every hour, without losing a single increment. You can GETSET the key, assigning it the new value of "0" and reading the old value back.

The ability to set or retrieve the value of multiple keys in a single command is also useful for reduced latency. For this reason there are the MSET and MGET commands:

> mset a 10 b 20 c 30
OK
> mget a b c
1) "10"
2) "20"
3) "30"

When MGET is used, Redis returns an array of values.

Altering and querying the key space

There are commands that are not defined on particular types, but are useful in order to interact with the space of keys, and thus, can be used with keys of any type.

For example the EXISTS command returns 1 or 0 to signal if a given key exists or not in the database, while the DEL command deletes a key and associated value, whatever the value is.

> set mykey hello
OK
> exists mykey
(integer) 1
> del mykey
(integer) 1
> exists mykey
(integer) 0

From the examples you can also see how DEL itself returns 1 or 0 depending on whether the key was removed (it existed) or not (there was no such key with that name).

There are many key space related commands, but the above two are the essential ones together with the TYPE command, which returns the kind of value stored at the specified key:

> set mykey x
OK
> type mykey
string
> del mykey
(integer) 1
> type mykey
none

Redis expires: keys with limited time to live

Before continuing with more complex data structures, we need to discuss another feature which works regardless of the value type, and is called Redis expires. Basically you can set a timeout for a key, which is a limited time to live. When the time to live elapses, the key is automatically destroyed, exactly as if the user called the DEL command with the key.

A few quick facts about Redis expires:

  • They can be set using either seconds or milliseconds precision.
  • However, the expire time resolution is always 1 millisecond.
  • Information about expires is replicated and persisted on disk, so time virtually passes even while your Redis server is stopped (this means that Redis saves the date at which a key will expire).
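
The last point, saving the date at which a key expires rather than a countdown, can be illustrated with a small in-memory model. ExpiringStore is a hypothetical class written for this sketch; it only imitates the behavior described above and is not how Redis is implemented.

```python
import time

class ExpiringStore:
    """Toy in-memory model of keys with a time to live (not Redis itself)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._data = {}      # key -> value
        self._deadline = {}  # key -> absolute expiry time, in seconds

    def set(self, key, value, ex=None):
        self._data[key] = value
        self._deadline.pop(key, None)  # a plain SET clears any previous expire
        if ex is not None:
            self._deadline[key] = self._clock() + ex

    def get(self, key):
        deadline = self._deadline.get(key)
        if deadline is not None and self._clock() >= deadline:
            # Drop the key lazily once its deadline has passed.
            self._data.pop(key, None)
            self._deadline.pop(key, None)
        return self._data.get(key)

now = [1000.0]                       # fake clock so the example runs instantly
store = ExpiringStore(clock=lambda: now[0])
store.set("key", "some-value", ex=5)
print(store.get("key"))  # some-value
now[0] += 6              # time passes whether or not anyone touches the store
print(store.get("key"))  # None
```

Because only the absolute deadline is stored, it makes no difference whether the store was running during those six seconds: the key is gone the next time it is looked at.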

Setting an expire is trivial:

> set key some-value
OK
> expire key 5
(integer) 1
> get key (immediately)
"some-value"
> get key (after some time)
(nil)

The key vanished between the two GET calls, since the second call was delayed by more than 5 seconds. In the example above we used EXPIRE to set the expire (it can also be used to set a different expire on a key that already has one, just as PERSIST can be used to remove the expire and make the key persistent forever). However, we can also create keys with expires using other Redis commands, for example using SET options:

> set key 100 ex 10
OK
> ttl key
(integer) 9

The example above sets a key with the string value 100, having an expire of ten seconds. Later the TTL command is called in order to check the remaining time to live for the key.

In order to set and check expires in milliseconds, check the PEXPIRE and the PTTL commands, and the full list of SET options.

Redis Lists

To explain the List data type it's better to start with a little bit of theory, as the term List is often used in an improper way by information technology folks. For instance "Python Lists" are not what the name may suggest (Linked Lists), but rather Arrays (the same data type is called Array in Ruby actually).

From a very general point of view a List is just a sequence of ordered elements: 10,20,1,2,3 is a list. But the properties of a List implemented using an Array are very different from the properties of a List implemented using a Linked List.

Redis lists are implemented via Linked Lists. This means that even if you have millions of elements inside a list, the operation of adding a new element at the head or at the tail of the list is performed in constant time. Adding a new element with the LPUSH command to the head of a list with ten elements takes the same time as adding an element to the head of a list with 10 million elements.

What's the downside? Accessing an element by index is very fast in lists implemented with an Array (constant time indexed access) and not so fast in lists implemented by linked lists (where the operation requires an amount of work proportional to the index of the accessed element).

Redis Lists are implemented with linked lists because for a database system it is crucial to be able to add elements to a very long list very quickly. Another strong advantage, as you'll see in a moment, is that Redis Lists can be kept at a constant length in constant time.

When fast access to the middle of a large collection of elements is important, there is a different data structure that can be used, called sorted sets. Sorted sets will be covered later in this tutorial.

First steps with Redis Lists

The LPUSH command adds a new element into a list, on the left (at the head), while the RPUSH command adds a new element into a list, on the right (at the tail). Finally the LRANGE command extracts ranges of elements from lists:

> rpush mylist A
(integer) 1
> rpush mylist B
(integer) 2
> lpush mylist first
(integer) 3
> lrange mylist 0 -1
1) "first"
2) "A"
3) "B"

Note that LRANGE takes two indexes: the first and the last element of the range to return. Both indexes can be negative, telling Redis to start counting from the end: -1 is the last element, -2 is the penultimate element of the list, and so forth.
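
To make the index rules concrete, here is a small Python function that applies the same inclusive, negative-aware indexing to an ordinary Python list. The function name lrange is chosen for the example only; it is a sketch of the semantics described above, not the Redis implementation.

```python
def lrange(items, start, stop):
    """Return items[start..stop] with both ends inclusive, LRANGE-style."""
    n = len(items)
    # Negative indexes count from the end of the list.
    if start < 0:
        start = max(n + start, 0)
    if stop < 0:
        stop = n + stop
    # An empty or out-of-range selection yields an empty list.
    if start > stop or start >= n:
        return []
    # Clamp the end of the range to the last element.
    return items[start:min(stop, n - 1) + 1]

mylist = ["first", "A", "B"]
print(lrange(mylist, 0, -1))   # ['first', 'A', 'B']
print(lrange(mylist, -2, -1))  # ['A', 'B']
print(lrange(mylist, 0, 0))    # ['first']
```

Unlike a Python slice, the stop index is included in the result, which is why "0 -1" means the whole list.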

As you can see RPUSH appended the elements on the right of the list, while the final LPUSH appended the element on the left.

Both commands are variadic commands, meaning that you are free to push multiple elements into a list in a single call:

> rpush mylist 1 2 3 4 5 "foo bar"
(integer) 9
> lrange mylist 0 -1
1) "first"
2) "A"
3) "B"
4) "1"
5) "2"
6) "3"
7) "4"
8) "5"
9) "foo bar"

An important operation defined on Redis lists is the ability to pop elements. Popping an element both retrieves it from the list and removes it from the list at the same time. You can pop elements from the left and from the right, just as you can push elements on both sides of the list:

> rpush mylist a b c
(integer) 3
> rpop mylist
"c"
> rpop mylist
"b"
> rpop mylist
"a"

We added three elements and popped three elements, so at the end of this sequence of commands the list is empty and there are no more elements to pop. If we try to pop yet another element, this is the result we get:

> rpop mylist
(nil)

Redis returned a NULL value to signal that there are no elements in the list.

Common use cases for lists

Lists are useful for a number of tasks. Two very representative use cases are the following:

  • Remember the latest updates posted by users into a social network.
  • Communication between processes, using a consumer-producer pattern where the producer pushes items into a list, and a consumer (usually a worker) consumes those items and executes actions. Redis has special list commands to make this use case both more reliable and efficient.

For example both the popular Ruby libraries resque and sidekiq use Redis lists under the hood in order to implement background jobs.

The popular Twitter social network keeps the latest tweets posted by users in Redis lists.

To describe a common use case step by step, imagine your home page shows the latest photos published in a photo sharing social network and you want to speed up access.

  • Every time a user posts a new photo, we add its ID into a list with LPUSH.
  • When users visit the home page, we use LRANGE 0 9 in order to get the latest 10 posted items.

Capped lists

In many use cases we just want to use lists to store the latest items, whatever they are: social network updates, logs, or anything else.

Redis allows us to use lists as a capped collection, only remembering the latest N items and discarding all the oldest items using the LTRIM command.

The LTRIM command is similar to LRANGE, but instead of displaying the specified range of elements it sets this range as the new list value. All the elements outside the given range are removed.

An example will make it clearer:

> rpush mylist 1 2 3 4 5
(integer) 5
> ltrim mylist 0 2
OK
> lrange mylist 0 -1
1) "1"
2) "2"
3) "3"

The above LTRIM command tells Redis to keep just the list elements from index 0 to 2; everything else is discarded. This allows for a very simple but useful pattern: doing a List push operation and a List trim operation together in order to add a new element and discard elements exceeding a limit:

LPUSH mylist <some element>
LTRIM mylist 0 999

The above combination adds a new element and keeps only the 1000 newest elements in the list. With LRANGE you can access the top items without any need to remember very old data.
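
The same push-then-trim behavior can be tried out without a server using Python's collections.deque with a maxlen, which is a convenient in-process analogy rather than anything Redis-specific. The post IDs below are made up for the example.

```python
from collections import deque

timeline = deque(maxlen=1000)  # keeps at most the 1000 newest items

for post_id in range(1, 1501):
    # appendleft plays the role of LPUSH; once the deque is full, the item
    # at the opposite end is dropped, which is the effect of LTRIM 0 999.
    timeline.appendleft(post_id)

print(len(timeline))       # 1000
print(list(timeline)[:3])  # [1500, 1499, 1498]
print(timeline[-1])        # 501
```

After 1500 pushes only posts 501 through 1500 remain, with the newest at the head, so reading the first few entries is the equivalent of an LRANGE over the top of the list.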

Note: while LRANGE is technically an O(N) command, accessing small ranges towards the head or the tail of the list is a constant time operation.

Blocking operations on lists

Lists have a special feature that makes them suitable for implementing queues, and in general as a building block for inter-process communication systems: blocking operations.

Imagine you want to push items into a list with one process, and use a different process in order to actually do some kind of work with those items. This is the usual producer / consumer setup, and can be implemented in the following simple way:

  • To push items into the list, producers call LPUSH.
  • To extract / process items from the list, consumers call RPOP.

However, it is possible that sometimes the list is empty and there is nothing to process, so RPOP just returns NULL. In this case a consumer is forced to wait some time and retry with RPOP. This is called polling, and is not a good idea in this context because it has several drawbacks:

  1. Forces Redis and clients to process useless commands (all the requests made while the list is empty get no actual work done; they just return NULL).
  2. Adds a delay to the processing of items, since after a worker receives a NULL, it waits some time. To make the delay smaller, we could wait less between calls to RPOP, at the cost of amplifying problem number 1, i.e. more useless calls to Redis.

So Redis implements commands called BRPOP and BLPOP which are versions of RPOP and LPOP able to block if the list is empty: they'll return to the caller only when a new element is added to the list, or when a user-specified timeout is reached.

This is an example of a BRPOP call we could use in the worker:

> brpop tasks 5
1) "tasks"
2) "do_something"

It means: "wait for elements in the list tasks, but return if after 5 seconds no element is available".

Note that you can use 0 as timeout to wait for elements forever, and you can also specify multiple lists and not just one, in order to wait on multiple lists at the same time, and get notified when the first list receives an element.

A few things to note about BRPOP:

  1. Clients are served in an ordered way: the first client that blocked waiting for a list is served first when an element is pushed by some other client, and so forth.
  2. The return value is different from that of RPOP: it is a two-element array that also includes the name of the key, because BRPOP and BLPOP are able to block waiting for elements from multiple lists.
  3. If the timeout is reached, NULL is returned.
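
The difference between polling and blocking can be felt with Python's standard queue module, used here as an in-process stand-in for a Redis list. The helper blocking_pop is hypothetical and only imitates the shape of a BRPOP reply; note that, unlike BRPOP, a timeout of 0 in this sketch does not mean "wait forever".

```python
import queue
import threading

tasks = queue.Queue()  # in-process stand-in for the "tasks" list

def blocking_pop(q, name, timeout):
    """Wait up to `timeout` seconds for an item; return (name, item) or None."""
    try:
        return (name, q.get(timeout=timeout))
    except queue.Empty:
        return None

# Nothing has been pushed yet, so the call gives up after the timeout.
print(blocking_pop(tasks, "tasks", 0.1))  # None

# A producer pushes an item shortly after the consumer starts waiting.
threading.Timer(0.05, tasks.put, args=("do_something",)).start()
print(blocking_pop(tasks, "tasks", 5))    # ('tasks', 'do_something')
```

The second call returns as soon as the item arrives instead of sleeping for a fixed interval, so the consumer issues no useless requests and adds no polling delay.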

There are more things you should know about lists and blocking ops. We suggest that you read more on the following:

  • It is possible to build safer queues or rotating queues using LMOVE.
  • There is also a blocking variant of the command, called BLMOVE.

Automatic creation and removal of keys

So far in our examples we never had to create empty lists before pushing elements, or remove empty lists when they no longer had elements inside. It is Redis' responsibility to delete keys when lists are left empty, and to create an empty list if the key does not exist and we are trying to add elements to it, for example with LPUSH.

This is not specific to lists; it applies to all the Redis data types composed of multiple elements: Streams, Sets, Sorted Sets and Hashes.

Basically we can summarize the behavior with three rules:

  1. When we add an element to an aggregate data type, if the target key does not exist, an empty aggregate data type is created before adding the element.
  2. When we remove elements from an aggregate data type, if the value remains empty, the key is automatically destroyed. The Stream data type is the only exception to this rule.
  3. Calling a read-only command such as LLEN (which returns the length of the list), or a write command that removes elements, on an empty key always produces the same result as if the key were holding an empty aggregate of the type the command expects to find.

Example of rule 1:

> del mylist
(integer) 1
> lpush mylist 1 2 3
(integer) 3

However we can't perform operations against the wrong type if the key exists:

> set foo bar
OK
> lpush foo 1 2 3
(error) WRONGTYPE Operation against a key holding the wrong kind of value
> type foo
string

Example of rule 2:

> lpush mylist 1 2 3
(integer) 3
> exists mylist
(integer) 1
> lpop mylist
"3"
> lpop mylist
"2"
> lpop mylist
"1"
> exists mylist
(integer) 0

The key no longer exists after all the elements are popped.

Example of rule 3:

> del mylist
(integer) 0
> llen mylist
(integer) 0
> lpop mylist
(nil)
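
The three rules can also be captured in a toy model, which may help when reasoning about code that relies on them. ListStore below is a hypothetical class written for this sketch; it mirrors the list examples above and makes no claim about Redis internals.

```python
class ListStore:
    """Toy model of how list keys appear and disappear (not Redis itself)."""

    def __init__(self):
        self._lists = {}

    def lpush(self, key, *values):
        # Rule 1: a missing key is created as an empty list before adding.
        items = self._lists.setdefault(key, [])
        for v in values:
            items.insert(0, v)
        return len(items)

    def lpop(self, key):
        items = self._lists.get(key)
        if not items:
            return None  # Rule 3: a missing key behaves like an empty list.
        value = items.pop(0)
        if not items:
            del self._lists[key]  # Rule 2: an emptied list removes its key.
        return value

    def llen(self, key):
        return len(self._lists.get(key, []))  # Rule 3 again.

    def exists(self, key):
        return key in self._lists

store = ListStore()
print(store.lpush("mylist", 1, 2, 3))  # 3
print(store.lpop("mylist"), store.lpop("mylist"), store.lpop("mylist"))  # 3 2 1
print(store.exists("mylist"))  # False
print(store.llen("mylist"), store.lpop("mylist"))  # 0 None
```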

Redis Hashes

Redis hashes look exactly how one might expect a "hash" to look, with field-value pairs:

> hmset user:1000 username antirez birthyear 1977 verified 1
OK
> hget user:1000 username
"antirez"
> hget user:1000 birthyear
"1977"
> hgetall user:1000
1) "username"
2) "antirez"
3) "birthyear"
4) "1977"
5) "verified"
6) "1"

While hashes are handy for representing objects, the number of fields you can put inside a hash has no practical limit (other than available memory), so you can use hashes in many different ways inside your application.

The command HMSET sets multiple fields of the hash, while HGET retrieves a single field. HMGET is similar to HGET but returns an array of values:

> hmget user:1000 username birthyear no-such-field
1) "antirez"
2) "1977"
3) (nil)

There are commands that are able to perform operations on individual fields as well, like HINCRBY:

> hincrby user:1000 birthyear 10
(integer) 1987
> hincrby user:1000 birthyear 10
(integer) 1997

You can find the full list of hash commands in the documentation.

It is worth noting that small hashes (i.e., a few elements with small values) are encoded in a special way in memory that makes them very memory efficient.

Redis Sets

Redis Sets are unordered collections of strings. The SADD command adds new elements to a set. It's also possible to do a number of other operations against sets like testing if a given element already exists, performing the intersection, union or difference between multiple sets, and so forth.

> sadd myset 1 2 3
(integer) 3
> smembers myset
1) "3"
2) "1"
3) "2"

Here I've added three elements to my set and told Redis to return all the elements. As you can see they are not sorted -- Redis is free to return the elements in any order at every call, since there is no contract with the user about element ordering.

Redis has commands to test for membership. For example, checking if an element exists:

> sismember myset 3
(integer) 1
> sismember myset 30
(integer) 0

"3" is a member of the set, while "30" is not.

Sets are good for expressing relations between objects. For instance we can easily use sets in order to implement tags.

A simple way to model this problem is to have a set for every object we want to tag. The set contains the IDs of the tags associated with the object.

One illustration is tagging news articles. If article ID 1000 is tagged with tags 1, 2, 5 and 77, a set can associate these tag IDs with the news item:

> sadd news:1000:tags 1 2 5 77
(integer) 4

We may also want the inverse relation: the list of all the news items tagged with a given tag:

> sadd tag:1:news 1000
(integer) 1
> sadd tag:2:news 1000
(integer) 1
> sadd tag:5:news 1000
(integer) 1
> sadd tag:77:news 1000
(integer) 1

Getting all the tags for a given object is trivial:

> smembers news:1000:tags
1) "5"
2) "1"
3) "77"
4) "2"

Note: in the example we assume you have another data structure, for example a Redis hash, which maps tag IDs to tag names.

There are other non-trivial operations that are still easy to implement using the right Redis commands. For instance, we may want a list of all the objects that carry the tags 1, 2, 10, and 27 together. We can do this using the SINTER command, which performs the intersection between different sets. We can use:

> sinter tag:1:news tag:2:news tag:10:news tag:27:news
... results here ...

In addition to intersections you can also compute unions and differences, extract a random element, and so forth.
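
Python's built-in sets support the same algebra, so the tag queries can be prototyped locally. The news IDs assigned to each tag below are invented for the example; only the tag numbers come from the text above.

```python
# Tag ID -> set of news IDs carrying that tag (the tag:<id>:news sets).
# The memberships are hypothetical sample data.
tag_news = {
    1: {1000, 1001, 1002},
    2: {1000, 1002},
    10: {1000, 1002, 1003},
    27: {1000, 1002, 1004},
}

# Like SINTER: items present in every one of the given sets.
tagged_with_all = set.intersection(*(tag_news[t] for t in (1, 2, 10, 27)))
print(sorted(tagged_with_all))  # [1000, 1002]

# Like SUNION and SDIFF on two of the sets.
print(sorted(tag_news[1] | tag_news[2]))  # [1000, 1001, 1002]
print(sorted(tag_news[1] - tag_news[2]))  # [1001]
```

The results are sorted only to make the printed output stable; like Redis sets, Python sets make no promise about element order.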

The command to extract an element is called SPOP, and is handy to model certain problems. For example in order to implement a web-based poker game, you may want to represent your deck with a set. Imagine we use a one-char prefix for (C)lubs, (D)iamonds, (H)earts, (S)pades:

>  sadd deck C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 CJ CQ CK
   D1 D2 D3 D4 D5 D6 D7 D8 D9 D10 DJ DQ DK H1 H2 H3
   H4 H5 H6 H7 H8 H9 H10 HJ HQ HK S1 S2 S3 S4 S5 S6
   S7 S8 S9 S10 SJ SQ SK
(integer) 52

Now we want to provide each player with 5 cards. The SPOP command removes a random element, returning it to the client, so it is the perfect operation in this case.

However if we call it against our deck directly, in the next play of the game we'll need to populate the deck of cards again, which may not be ideal. So to start, we can make a copy of the set stored in the deck key into the game:1:deck key.

This is accomplished using SUNIONSTORE, which normally performs the union between multiple sets, and stores the result into another set. However, since the union of a single set is itself, I can copy my deck with:

> sunionstore game:1:deck deck
(integer) 52

Now I'm ready to provide the first player with five cards:

> spop game:1:deck
"C6"
> spop game:1:deck
"CQ"
> spop game:1:deck
"D1"
> spop game:1:deck
"CJ"
> spop game:1:deck
"SJ"

One pair of jacks, not great...

This is a good time to introduce the set command that provides the number of elements inside a set. This is often called the cardinality of a set in the context of set theory, so the Redis command is called SCARD.

现在是介绍集合命令的好时机,该命令能提供集合内部元素的数量。在集合理论的上下文中,通常被称为集合的基数,因此Redis的命令被称为SCARD。

> scard game:1:deck
(integer) 47

The math works: 52 - 5 = 47.

计算没有问题:52 - 5 = 47。
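The card-dealing flow above can be modeled without a server. The following is a minimal sketch in plain Python, not Redis client code: a built-in `set` stands in for the Redis set, copying it plays the role of SUNIONSTORE, and the helper names `new_deck` and `deal` are hypothetical, introduced only for illustration.

```python
import random

SUITS = "CDHS"  # (C)lubs, (D)iamonds, (H)earts, (S)pades
RANKS = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]


def new_deck():
    """Build the 52-card deck as a set, mirroring the SADD call."""
    return {suit + rank for suit in SUITS for rank in RANKS}


def deal(game_deck, count, rng=random):
    """Remove and return `count` random cards (hypothetical helper, mimics repeated SPOP)."""
    hand = rng.sample(sorted(game_deck), count)
    game_deck.difference_update(hand)  # dealt cards leave the per-game deck
    return hand


deck = new_deck()
game_deck = set(deck)      # per-game copy, like SUNIONSTORE game:1:deck deck
hand = deal(game_deck, 5)  # five random cards for the first player

print(hand)
print(len(game_deck))      # remaining cards, like SCARD game:1:deck
```

Because only the copy is mutated, the master deck stays intact for the next game, which is the reason for the copy step in the text.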

When you need to just get random elements without removing them from the set, there is the SRANDMEMBER command suitable for the task. It also features the ability to return both repeating and non-repeating elements.

当你仅需要获取随机元素,但是不把它们从集合中移除,有一个SRANDMEMBER命令很适合这个工作。它也具有返回重复和不重复的元素的能力。

Redis Sorted sets

Sorted sets are a data type which is similar to a mix between a Set and a Hash. Like sets, sorted sets are composed of unique, non-repeating string elements, so in some sense a sorted set is a set as well.

有序集合是一个类似于集合和哈希混合的数据结构。像集合,有序集合由唯一的,不重复的字符串元素组成,所以在某种意义上一个有序集合也是一个集合。

However while elements inside sets are not ordered, every element in a sorted set is associated with a floating point value, called the score (this is why the type is also similar to a hash, since every element is mapped to a value).

然而,与集合中的元素没有排序相比,有序集合中的每个元素关联一个被称为分数(score)的浮点值(这也是为什么这种类型也类似于哈希,因为每个元素是一个映射一个值)。

Moreover, elements in a sorted set are taken in order (they are not ordered on request; order is a property of the data structure used to represent sorted sets). They are ordered according to the following rule:

此外,有序集合中的元素是按顺序取的(因此它们不是按请求排序的,顺序是用于表示有序集合数据结构的特性)。它们是按照下面的规则排序的:

  • If A and B are two elements with a different score, then A > B if A.score is > B.score.
  • 如果A和B是两个拥有不同分数的元素,那么如果A.score > B.score,则A > B。
  • If A and B have exactly the same score, then A > B if the A string is lexicographically greater than the B string. A and B strings can't be equal since sorted sets only have unique elements.
  • 如果A和B恰巧拥有相同的分数,那么如果A的字典序大于B的字典序,则A>B。A和B字符串不能相等,因为有序集合只有唯一的元素。
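The two ordering rules can be sketched as a single sort key: compare by score first, then by the raw bytes of the member. This is an illustrative Python model under the assumption that members are UTF-8 strings; the function name `sorted_set_order` is hypothetical and is not part of any Redis API.

```python
def sorted_set_order(members):
    """Return member names in sorted-set order.

    `members` maps each unique member string to its float score.
    Ties on score are broken by comparing the raw bytes of the member.
    """
    return sorted(members, key=lambda m: (members[m], m.encode("utf-8")))


hackers = {
    "Alan Kay": 1940,
    "Sophie Wilson": 1957,
    "Richard Stallman": 1953,
    "Anita Borg": 1949,
    "Yukihiro Matsumoto": 1965,
    "Hedy Lamarr": 1914,
    "Claude Shannon": 1916,
    "Linus Torvalds": 1969,
    "Alan Turing": 1912,
}

by_year = sorted_set_order(hackers)

# With identical scores, only the byte-wise comparison of members matters.
same_score = sorted_set_order({name: 0 for name in hackers})

print(by_year)
print(same_score)
```

Since members are unique, the pair (score, member) never ties, so the order is always fully determined.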

Let's start with a simple example, adding a few selected hackers' names as sorted set elements, with their year of birth as the "score".

让我们以一个简单的例子开始,添加一些选定的黑客名字到有序集合中,并以他们的出生年份作为“分数(score)”。

> zadd hackers 1940 "Alan Kay"
(integer) 1
> zadd hackers 1957 "Sophie Wilson"
(integer) 1
> zadd hackers 1953 "Richard Stallman"
(integer) 1
> zadd hackers 1949 "Anita Borg"
(integer) 1
> zadd hackers 1965 "Yukihiro Matsumoto"
(integer) 1
> zadd hackers 1914 "Hedy Lamarr"
(integer) 1
> zadd hackers 1916 "Claude Shannon"
(integer) 1
> zadd hackers 1969 "Linus Torvalds"
(integer) 1
> zadd hackers 1912 "Alan Turing"
(integer) 1

As you can see ZADD is similar to SADD, but takes one additional argument (placed before the element to be added) which is the score. ZADD is also variadic, so you are free to specify multiple score-value pairs, even if this is not used in the example above.

如你所见,ZADD类似于SADD,但是携带了一个额外的参数(放在要添加元素的前面),分数。ZADD也是一个可变参数命令,所以你可以自由的指定多分数-值对,甚至上面的例子中没有使用。

With sorted sets it is trivial to return a list of hackers sorted by their birth year because actually they are already sorted.

对于有序集合来说,返回一个按照黑客出生年份排序的列表是轻而易举的,因为实际上它们已经是排过序的

Implementation note: Sorted sets are implemented via a dual-ported data structure containing both a skip list and a hash table, so every time we add an element Redis performs an O(log(N)) operation. That's good, but when we ask for sorted elements Redis does not have to do any work at all, it's already all sorted:

实现注意点:有序集合是通过一个同时包含跳跃表和哈希表的双重数据结构实现的,因此每当我们添加一个元素时,Redis执行一个O(log(N))的操作。这很好,而且当我们请求有序的元素时,Redis完全不需要再做任何工作,它们已经是排过序的:

> zrange hackers 0 -1
1) "Alan Turing"
2) "Hedy Lamarr"
3) "Claude Shannon"
4) "Alan Kay"
5) "Anita Borg"
6) "Richard Stallman"
7) "Sophie Wilson"
8) "Yukihiro Matsumoto"
9) "Linus Torvalds"

Note: 0 and -1 mean from element index 0 to the last element (-1 works here just as it does in the case of the LRANGE command).

注意:0和-1意味着从索引为0的元素到最后一个元素(-1在这儿的作用和LRANGE命令的例子中作用一样)

What if I want to order them the opposite way, youngest to oldest? Use ZREVRANGE instead of ZRANGE:

如果我想要以相反的顺序排列呢,从最年轻的到最年长的?使用ZREVRANGE取代ZRANGE:

> zrevrange hackers 0 -1
1) "Linus Torvalds"
2) "Yukihiro Matsumoto"
3) "Sophie Wilson"
4) "Richard Stallman"
5) "Anita Borg"
6) "Alan Kay"
7) "Claude Shannon"
8) "Hedy Lamarr"
9) "Alan Turing"

It is possible to return scores as well, using the WITHSCORES argument:

它也可以返回分数(score),使用WITHSCORES参数:

> zrange hackers 0 -1 withscores
1) "Alan Turing"
2) "1912"
3) "Hedy Lamarr"
4) "1914"
5) "Claude Shannon"
6) "1916"
7) "Alan Kay"
8) "1940"
9) "Anita Borg"
10) "1949"
11) "Richard Stallman"
12) "1953"
13) "Sophie Wilson"
14) "1957"
15) "Yukihiro Matsumoto"
16) "1965"
17) "Linus Torvalds"
18) "1969"

Operating on ranges

Sorted sets are more powerful than this. They can operate on ranges. Let's get all the individuals that were born up to 1950 inclusive. We use the ZRANGEBYSCORE command to do it:

有序集合比这个更强大。它们可以按区间操作。让我们获取所有的1950年及以前出生的人。我们使用ZRANGEBYSCORE命令来做这个:

> zrangebyscore hackers -inf 1950
1) "Alan Turing"
2) "Hedy Lamarr"
3) "Claude Shannon"
4) "Alan Kay"
5) "Anita Borg"

We asked Redis to return all the elements with a score between negative infinity and 1950 (both extremes are included).

我们请求Redis以返回分数从负无穷到1950间所有的元素(包含两个极端)。

It's also possible to remove ranges of elements. Let's remove all the hackers born between 1940 and 1960 from the sorted set:

按照范围删除元素也是可以的。让我们从有序集合中移除出生于1940年到1960年的所有黑客。

> zremrangebyscore hackers 1940 1960
(integer) 4

ZREMRANGEBYSCORE is perhaps not the best command name, but it can be very useful, and returns the number of removed elements.

ZREMRANGEBYSCORE可能不是最好的命令名称,但是它非常有用的,并且返回被删除的元素的数量。

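Another extremely useful operation defined for sorted set elements is the get-rank operation. It is possible to ask for the position of an element in the ordered set of elements. Ranks are zero-based, and the example below assumes the sorted set still holds all nine hackers added earlier, that is, before the ZREMRANGEBYSCORE call above removed four of them.

另外一个为有序集合元素定义的非常有用的操作是获取排名(get-rank)操作。它可以查询元素在有序集合中的位置。排名从0开始,下面的例子假定有序集合仍然包含之前添加的全部九位黑客,也就是执行上面的ZREMRANGEBYSCORE删除其中四位之前的状态。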

> zrank hackers "Anita Borg"
(integer) 4

The ZREVRANK command is also available to get the rank with the elements sorted in descending order.

ZREVRANK命令也可以获取排名,不过它是按元素降序排列来计算的。

Lexicographical scores

With recent versions of Redis 2.8, a new feature was introduced that allows getting ranges lexicographically, assuming elements in a sorted set are all inserted with the same identical score (elements are compared with the C memcmp function, so it is guaranteed that there is no collation, and every Redis instance will reply with the same output).

在Redis 2.8的较新版本中,引入了一个新的特性,允许按字典序获取范围,前提是有序集合中的所有元素都以相同的分数插入(元素使用C语言的memcmp函数进行比较,因此可以保证不涉及任何排序规则(collation),每个Redis实例都会回复同样的输出)。

The main commands to operate with lexicographical ranges are ZRANGEBYLEX, ZREVRANGEBYLEX, ZREMRANGEBYLEX and ZLEXCOUNT.

使用字典序范围操作的主要命令是:ZRANGEBYLEX, ZREVRANGEBYLEX, ZREMRANGEBYLEX 和 ZLEXCOUNT

For example, let's add again our list of famous hackers, but this time use a score of zero for all the elements:

例如,让我们再次向我们的列表中添加著名的黑客,但是这一次我们对所有的元素都使用0分.

> zadd hackers 0 "Alan Kay" 0 "Sophie Wilson" 0 "Richard Stallman" 0
  "Anita Borg" 0 "Yukihiro Matsumoto" 0 "Hedy Lamarr" 0 "Claude Shannon"
  0 "Linus Torvalds" 0 "Alan Turing"

Because of the sorted sets ordering rules, they are already sorted lexicographically:

因为有序集合按照规则排序,它们已经按照字典序排序过了:

> zrange hackers 0 -1
1) "Alan Kay"
2) "Alan Turing"
3) "Anita Borg"
4) "Claude Shannon"
5) "Hedy Lamarr"
6) "Linus Torvalds"
7) "Richard Stallman"
8) "Sophie Wilson"
9) "Yukihiro Matsumoto"

Using ZRANGEBYLEX we can ask for lexicographical ranges:

使用ZRANGEBYLEX,我们可以使用字典序范围查询:

> zrangebylex hackers [B [P
1) "Claude Shannon"
2) "Hedy Lamarr"
3) "Linus Torvalds"

Ranges can be inclusive or exclusive, depending on the first character of each bound: [ makes the bound inclusive and ( makes it exclusive. The special strings + and - stand for positive and negative infinity respectively. See the documentation for more information.

范围可以是包含的或者排除的,取决于每个边界的第一个字符:[表示包含,(表示排除。特殊字符串+和-分别表示正无穷和负无穷。查看文档以获取更多信息。
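To make the bound syntax concrete, here is a small Python model of a lexicographical range query over members that all share one score. It is a sketch of the semantics only, assuming UTF-8 members and byte-wise comparison; `range_by_lex` is a hypothetical name, and the real command does not scan every member the way this filter does.

```python
def range_by_lex(members, low, high):
    """Return members between `low` and `high` in byte order.

    Bounds start with "[" (inclusive) or "(" (exclusive);
    "-" and "+" stand for negative and positive infinity.
    """
    def above_low(member):
        if low == "-":
            return True
        if low == "+":
            return False
        bound = low[1:].encode("utf-8")
        return member >= bound if low[0] == "[" else member > bound

    def below_high(member):
        if high == "+":
            return True
        if high == "-":
            return False
        bound = high[1:].encode("utf-8")
        return member <= bound if high[0] == "[" else member < bound

    encoded = sorted(name.encode("utf-8") for name in members)
    return [m.decode("utf-8") for m in encoded if above_low(m) and below_high(m)]


names = [
    "Alan Kay", "Sophie Wilson", "Richard Stallman", "Anita Borg",
    "Yukihiro Matsumoto", "Hedy Lamarr", "Claude Shannon",
    "Linus Torvalds", "Alan Turing",
]

b_to_p = range_by_lex(names, "[B", "[P")
everything = range_by_lex(names, "-", "+")
after_kay = range_by_lex(names, "(Alan Kay", "[Alan Turing")

print(b_to_p)
```

Here `b_to_p` corresponds to the `zrangebylex hackers [B [P` call in the text, and `after_kay` shows an exclusive lower bound skipping the member it names.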

This feature is important because it allows us to use sorted sets as a generic index. For example, if you want to index elements by a 128-bit unsigned integer argument, all you need to do is add elements into a sorted set with the same score (for example 0) but with a 16-byte prefix consisting of the 128-bit number in big endian. Since numbers in big endian, when ordered lexicographically (in raw byte order), are also ordered numerically, you can ask for ranges in the 128-bit space and get each element's value by discarding the prefix.

这个特性是重要的,因为它允许我们使用有序集合作为通用索引。例如,如果你想要通过128位无符号整型参数索引元素,你需要的仅仅是使用相同的分数(比如0)将元素添加到一个有序集合中,并在元素前加上由该128位数字的大端序表示组成的16字节前缀。因为大端序的数字按字典序(按原始字节顺序)排序时实际上也是按照数值排序的,你可以在128位空间内按照范围查找,并在获取元素的值时丢弃前缀。
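The big-endian prefix trick can be checked with a few lines of Python. This is a minimal sketch assuming the member is laid out as 16 prefix bytes followed by the payload; the helper names `index_member` and `split_member` are hypothetical.

```python
PREFIX_BYTES = 16  # 128 bits


def index_member(number, payload):
    """Build a sorted-set member: 16-byte big-endian prefix plus payload bytes."""
    return number.to_bytes(PREFIX_BYTES, "big") + payload


def split_member(member):
    """Recover the 128-bit number and the payload, discarding the prefix."""
    number = int.from_bytes(member[:PREFIX_BYTES], "big")
    return number, member[PREFIX_BYTES:]


numbers = [2**100, 5, 2**64 + 1, 255, 256]
members = [index_member(n, b"item-%d" % i) for i, n in enumerate(numbers)]

# Byte-wise (lexicographical) order of the members...
lex_order = [split_member(m)[0] for m in sorted(members)]

# ...matches plain numeric order of the indexed numbers.
print(lex_order == sorted(numbers))
```

A little-endian prefix would not have this property, since the least significant byte would be compared first.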

If you want to see the feature in the context of a more serious demo, check the Redis autocomplete demo.

如果你想在一个严谨的demo上下文中查看这个特性,请检查Redis autocomplete demo。

Updating the score: leader boards

Just a final note about sorted sets before switching to the next topic. Sorted sets' scores can be updated at any time. Just calling ZADD against an element already included in the sorted set will update its score (and position) with O(log(N)) time complexity. As such, sorted sets are suitable when there are tons of updates.

在切换到下一个话题前,关于有序集合还有最后一个注意事项。有序集合的分数(score)可以在任何时候更新。仅需要对已经存在于有序集合中的元素上调用ZADD命令,就会以O(log(N))的时间复杂度更新它的分数(和位置)。同样,有序集合适用于需要大量更新的情况。

Because of this characteristic a common use case is leader boards. The typical application is a Facebook game where you combine the ability to take users sorted by their high score, plus the get-rank operation, in order to show the top-N users, and the user rank in the leader board (e.g., "you are the #4932 best score here").

这个特性有一个常见的用例是排行榜。典型的应用是Facebook的游戏排行,你可以结合这个能力根据用户的高分进行排序,加上获取排名(get-rank)操作,将top-N用户以及用户排名显示在排行榜上(例如,“你是第4932名最高分用户”)。

Bitmaps

Bitmaps are not an actual data type, but a set of bit-oriented operations defined on the String type. Since strings are binary safe blobs and their maximum length is 512 MB, they are suitable to set up to 2^32 different bits.

位图(Bitmap)并不是一个真实的数据类型,而是一个定义在字符串类型上的一系列按位操作。因为字符串是二进制安全blob,最大长度为512MB,它们适合设置最多2^32个不同的位。

Bit operations are divided into two groups: constant-time single bit operations, like setting a bit to 1 or 0, or getting its value, and operations on groups of bits, for example counting the number of set bits in a given range of bits (e.g., population counting).

按位操作分为两组:固定时间的单个位操作,像设置一个位为1或0,或者获取它的值,以及对一组位进行的操作,例如,统计给定范围内被设置为1的位的数量(即population counting)。

One of the biggest advantages of bitmaps is that they often provide extreme space savings when storing information. For example, in a system where different users are represented by incremental user IDs, it is possible to remember a single bit of information (for example, whether a user wants to receive a newsletter) for 4 billion users using just 512 MB of memory.

位图一个最大的优势是:当储存信息时它们能极大的节省存储空间。例如,在一个系统中,使用递增的用户ID代表用户,可能会仅仅使用512MB的内存记住40亿用户的一位的信息(例如,用户是否想要接收信件)。
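The figures above follow from simple arithmetic, sketched here in Python. The helper `bitmap_bytes` is hypothetical and assumes one bit per user ID, with IDs starting at 0.

```python
MAX_STRING_BYTES = 512 * 1024 * 1024  # maximum string length: 512 MB


def bitmap_bytes(flag_count):
    """Bytes needed to hold `flag_count` one-bit flags, rounded up to whole bytes."""
    return (flag_count + 7) // 8


# Eight bits per byte: a 512 MB string addresses 2^32 distinct bits.
addressable_bits = MAX_STRING_BYTES * 8

print(addressable_bits)
print(bitmap_bytes(addressable_bits) == MAX_STRING_BYTES)
```

So roughly 4.29 billion one-bit flags fit in a single maximum-size string.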

Bits are set and retrieved using the SETBIT and GETBIT commands:

使用SETBIT 和 GETBIT 设置和获取位(bit)信息。

> setbit key 10 1
(integer) 0
> getbit key 10
(integer) 1
> getbit key 11
(integer) 0

The SETBIT command takes, after the key name, the bit number as its first argument and the value to set the bit to (1 or 0) as its second argument. It returns the value the bit held before the call. The command automatically enlarges the string if the addressed bit is outside the current string length.

SETBIT命令在键名之后携带的第一个参数是位号,第二个参数是设置给位的值,即1或0。它返回该位在调用之前的值。如果被寻址的位超出当前字符串的长度的话,该命令会自动扩大字符串。

GETBIT just returns the value of the bit at the specified index. Out of range bits (addressing a bit that is outside the length of the string stored into the target key) are always considered to be zero.

GETBIT命令仅返回给定位号的值。超出位的长度(在目标键所存储的字符串的长度之外寻址)将总是被认作是0。

There are three commands operating on groups of bits:

操作位组的三个命令:

  1. BITOP performs bit-wise operations between different strings. The provided operations are AND, OR, XOR and NOT.
     BITOP在不同的字符串之间执行按位操作。提供的操作有AND,OR,XOR和NOT。
  2. BITCOUNT performs population counting, reporting the number of bits set to 1.
     BITCOUNT执行置位计数,报告被设置为1的位的数量。
  3. BITPOS finds the first bit having the specified value of 0 or 1.
     BITPOS查找第一个具有指定值(0或1)的位。

Both BITPOS and BITCOUNT are able to operate with byte ranges of the string, instead of running for the whole length of the string. The following is a trivial example of BITCOUNT call:

BITPOS 和 BITCOUNT都可以只对字符串的某个字节范围进行操作,而不必处理整个字符串。下面是一个BITCOUNT调用的小例子:

> setbit key 0 1
(integer) 0
> setbit key 100 1
(integer) 0
> bitcount key
(integer) 2

Common use cases for bitmaps are:

位图的常见用例是:

  • Real time analytics of all kinds.
  • 实时统计所有种类。
  • Storing space efficient but high performance boolean information associated with object IDs.
  • 与对象ID有关的布尔信息的高效存储但是高性能。

For example imagine you want to know the longest streak of daily visits of your web site users. You start counting days starting from zero, that is the day you made your web site public, and set a bit with SETBIT every time the user visits the web site. As a bit index you simply take the current unix time, subtract the initial offset, and divide by the number of seconds in a day (normally, 3600*24).

例如,假设你想知道你的网站用户连续每日访问的最长天数。你从0开始对天数计数,第0天就是你网站公开发布的那天,并且每当用户访问网站时使用SETBIT设置一个位。作为位索引,你只需简单的将当前的unix时间减去初始偏移量,然后除以一天的秒数(通常是3600*24)。
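The bit index computation described above can be sketched as follows. The function name `day_index` and the sample launch timestamp are assumptions made for illustration; the result is the offset you would pass to SETBIT.

```python
SECONDS_PER_DAY = 3600 * 24


def day_index(now, launch):
    """Days elapsed since the site went public, usable as a bit offset.

    Both arguments are unix timestamps in seconds.
    """
    return (now - launch) // SECONDS_PER_DAY


launch = 1_500_000_000  # hypothetical launch time

same_day = day_index(launch + 86_399, launch)   # last second of day 0
next_day = day_index(launch + 86_400, launch)   # first second of day 1
print(same_day, next_day)
```

Integer division matters here: every visit within the same 24-hour window maps to the same bit, so repeated visits in one day set the bit only once.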

This way for each user you have a small string containing the visit information for each day. With BITCOUNT it is possible to easily get the number of days a given user visited the web site, while with a few BITPOS calls, or simply fetching and analyzing the bitmap client-side, it is possible to easily compute the longest streak.

通过这种方式,对于每个用户你都有一个小的字符串包含每天的访问信息。使用BITCOUNT可以很容易的获得一个给定用户访问网站的天数,而只需要几次BITPOS调用,或者简单的在客户端获取并分析位图,就可以容易的计算出最长的连续访问天数。
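The client-side variant can be sketched in Python, assuming the bitmap has already been fetched as raw bytes (for example with GET). Redis numbers bits starting from the most significant bit of the first byte, so the scan below walks each byte from bit 7 down to bit 0. The function names are hypothetical.

```python
def longest_streak(bitmap):
    """Length of the longest run of consecutive 1 bits in `bitmap` (bytes)."""
    best = current = 0
    for byte in bitmap:
        for shift in range(7, -1, -1):  # bit offset 0 is the most significant bit
            if (byte >> shift) & 1:
                current += 1
                best = max(best, current)
            else:
                current = 0
    return best


def visit_days(bitmap):
    """Total number of set bits, the client-side equivalent of BITCOUNT."""
    return sum(bin(byte).count("1") for byte in bitmap)


# Days 1-2 visited, then days 5-10 visited (a streak crossing the byte boundary).
sample = bytes([0b01100111, 0b11100000])

print(longest_streak(sample))
print(visit_days(sample))
```

For very long histories, a server-side approach with a few BITPOS calls avoids transferring the whole string, which is the trade-off the text alludes to.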

Bitmaps are trivial to split into multiple keys, for example for the sake of sharding the data set, and because in general it is better to avoid working with huge keys. To split a bitmap across different keys instead of setting all the bits into a single key, a trivial strategy is to store M bits per key: the key name is derived from bit-number/M (integer division), and the bit to address inside that key is bit-number MOD M.

位图可以很容易的拆分到多个键中,例如为了给数据集分片,也因为通常来说最好避免使用巨大的键。要把一个位图拆分到不同的键上,而不是将所有的位设置到一个键里,一个简单的策略就是每个键保存M位:用bit-number/M(整数除法)得到键的名字,用bit-number MOD M得到要在该键内部寻址的位。
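The splitting strategy amounts to one integer division and one remainder. A minimal Python sketch follows; the `base:shard` key naming scheme and the function name `locate_bit` are assumptions, not something the text prescribes.

```python
def locate_bit(base_key, bit_number, bits_per_key):
    """Map a global bit number to (key name, bit offset inside that key)."""
    shard, offset = divmod(bit_number, bits_per_key)
    return f"{base_key}:{shard}", offset


M = 1000  # bits stored per key, chosen arbitrarily for the example

first = locate_bit("visits", 0, M)
edge = locate_bit("visits", 999, M)
later = locate_bit("visits", 2500, M)
print(first, edge, later)
```

The returned pair is what you would pass to SETBIT or GETBIT as the key and the bit number.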

HyperLogLogs

A HyperLogLog is a probabilistic data structure used to count unique things (technically this is referred to as estimating the cardinality of a set). Usually, counting unique items requires an amount of memory proportional to the number of items you want to count, because you need to remember the elements you have already seen in order to avoid counting them multiple times. However, there is a set of algorithms that trade memory for precision: you end up with an estimated measure with a standard error, which in the case of the Redis implementation is less than 1%. The magic of this algorithm is that you no longer need to use an amount of memory proportional to the number of items counted, and instead can use a constant amount of memory: 12k bytes in the worst case, or a lot less if your HyperLogLog (we'll just call them HLLs from now on) has seen very few elements.

HLLs in Redis, while technically a different data structure, are encoded as a Redis string, so you can call GET to serialize a HLL, and SET to deserialize it back to the server.

Conceptually the HLL API is like using Sets to do the same task. You would SADD every observed element into a set, and would use SCARD to check the number of elements inside the set, which are unique since SADD will not re-add an existing element.

While you don't really add items into an HLL, because the data structure only contains a state that does not include actual elements, the API is the same:

  • Every time you see a new element, you add it to the count with PFADD.

  • Every time you want to retrieve the current approximation of the unique elements added with PFADD so far, you use PFCOUNT.

    > pfadd hll a b c d
    (integer) 1
    > pfcount hll
    (integer) 4
    

An example use case for this data structure is counting the unique queries performed by users in a search form every day.

Redis is also able to perform the union of HLLs. Please check the full documentation for more information.

Other notable features

There are other important things in the Redis API that can't be explored in the context of this document, but are worth your attention:

  • It is possible to iterate the key space of a large collection incrementally.
  • It is possible to run Lua scripts server side to improve latency and bandwidth.
  • Redis is also a Pub-Sub server.

Learn more

This tutorial is in no way complete and has covered just the basics of the API. Read the command reference to discover a lot more.

Thanks for reading, and have fun hacking with Redis!
