Official documentation: https://redis.io/topics/data-types-intro
Redis is not a plain key-value store; it is a data structures server that supports different kinds of values. In traditional key-value stores you associate string keys with string values, whereas in Redis the value is not limited to a simple string and can also hold more complex data structures. The data structures covered in this tutorial include strings, lists, hashes, sets, sorted sets, and bitmaps.
It's not always trivial to grasp from the command reference how these data types work and which one to use to solve a given problem, so this document is a crash course in Redis data types and their most common patterns.
For all the examples we'll use redis-cli, a simple but handy command-line utility, to issue commands against the Redis server.
Redis keys are binary safe: you can use any binary sequence as a key, from a string like "foo" to the content of a JPEG file. The empty string is also a valid key.
A few other rules about keys:
The Redis String type is the simplest type of value you can associate with a Redis key. It is the only data type in Memcached, so it is also very natural for newcomers to use it in Redis.
Since Redis keys are strings, when we use the string type as a value too, we are mapping a string to another string. The string data type is useful for a number of use cases, like caching HTML fragments or pages.
Let's play a bit with the string type, using redis-cli (all the examples in this tutorial are performed via redis-cli).
> set mykey somevalue
OK
> get mykey
"somevalue"
As you can see, SET and GET are the commands we use to set and retrieve a string value. Note that SET replaces any value already stored at the key if the key already exists, even if the key is associated with a non-string value. So SET performs an assignment.
Values can be strings (including binary data) of every kind; for instance, you can store a JPEG image inside a value. A value can't be bigger than 512 MB.
The SET command has interesting options, provided as additional arguments. For example, I may ask SET to fail if the key already exists, or the opposite, to succeed only if the key already exists:
> set mykey newval nx
(nil)
> set mykey newval xx
OK
Even though strings are the basic values of Redis, there are interesting operations you can perform with them. One of them is atomic increment:
> set counter 100
OK
> incr counter
(integer) 101
> incr counter
(integer) 102
> incrby counter 50
(integer) 152
The INCR command parses the string value as an integer, increments it by one, and finally sets the obtained value as the new value. There are other similar commands like INCRBY, DECR and DECRBY. Internally it's always the same command, acting in a slightly different way.
What does it mean that INCR is atomic? It means that even multiple clients issuing INCR against the same key will never enter into a race condition. For instance, it will never happen that client 1 reads "10", client 2 reads "10" at the same time, both increment to 11, and both set the new value to 11. The final value will always be 12, because the read-increment-set operation is performed while no other client is executing a command.
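To make the race concrete, here is a small Python sketch that replays the interleaving described above on a plain dictionary. It is only a model under stated assumptions: the `store` dictionary and both helper functions are hypothetical names, and no Redis server is involved.

```python
# Toy model of the lost-update race that INCR avoids.
# `store` stands in for the Redis key space; nothing here talks to Redis.

def racy_interleaving(store, key):
    """Two clients each do read -> add one -> write, with the reads interleaved."""
    seen_by_client_1 = int(store[key])       # client 1 reads "10"
    seen_by_client_2 = int(store[key])       # client 2 reads "10" at the same time
    store[key] = str(seen_by_client_1 + 1)   # client 1 writes "11"
    store[key] = str(seen_by_client_2 + 1)   # client 2 also writes "11"
    return store[key]

def incr(store, key):
    """Read-increment-set as one indivisible step, the way INCR behaves."""
    store[key] = str(int(store.get(key, "0")) + 1)
    return int(store[key])

racy = racy_interleaving({"counter": "10"}, "counter")

atomic_store = {"counter": "10"}
incr(atomic_store, "counter")   # client 1
incr(atomic_store, "counter")   # client 2
atomic = atomic_store["counter"]

print(racy, atomic)  # 11 12
```

The interleaved version loses one increment, while two indivisible increments always end at 12.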
There are a number of commands for operating on strings. For example, the GETSET command sets a key to a new value and returns the old value as the result. You can use this command if, for example, you have a system that increments a Redis key using INCR every time your web site receives a new visitor. You may want to collect this information once every hour without losing a single increment: you can GETSET the key, assigning it the new value "0" and reading the old value back.
The ability to set or retrieve the value of multiple keys in a single command is also useful for reducing latency. For this reason there are the MSET and MGET commands:
> mset a 10 b 20 c 30
OK
> mget a b c
1) "10"
2) "20"
3) "30"
When MGET is used, Redis returns an array of values.
Some commands are not defined on particular types but are useful for interacting with the key space, and thus can be used with keys of any type.
For example, the EXISTS command returns 1 or 0 to signal whether a given key exists in the database, while the DEL command deletes a key and its associated value, whatever the value is.
> set mykey hello
OK
> exists mykey
(integer) 1
> del mykey
(integer) 1
> exists mykey
(integer) 0
From the examples you can also see how DEL itself returns 1 or 0 depending on whether the key was removed (it existed) or not (there was no key with that name).
There are many key-space-related commands, but the above two are the essential ones, together with the TYPE command, which returns the kind of value stored at the specified key:
> set mykey x
OK
> type mykey
string
> del mykey
(integer) 1
> type mykey
none
Before continuing with more complex data structures, we need to discuss another feature that works regardless of the value type: Redis expires. Basically, you can set a timeout for a key, that is, a limited time to live. When the time to live elapses, the key is automatically destroyed, exactly as if the user had called the DEL command with the key.
A few quick facts about Redis expires:
Setting an expire is trivial:
> set key some-value
OK
> expire key 5
(integer) 1
> get key (immediately)
"some-value"
> get key (after some time)
(nil)
The key vanished between the two GET calls, since the second call was delayed by more than 5 seconds. In the example above we used EXPIRE to set the expire (it can also be used to set a different expire on a key that already has one, just as PERSIST can be used to remove the expire and make the key persistent forever). However, we can also create keys with expires using other Redis commands, for example using SET options:
> set key 100 ex 10
OK
> ttl key
(integer) 9
The example above sets a key with the string value 100, having an expire of ten seconds. Later the TTL command is called to check the remaining time to live for the key.
To set and check expires in milliseconds, see the PEXPIRE and PTTL commands, and the full list of SET options.
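The expire behaviour can be sketched with a toy in-memory store. This is a model for illustration, not how Redis implements expiry: the `ExpiringStore` class and the fake clock are hypothetical, and a real application would use a Redis client instead.

```python
import time

class ExpiringStore:
    """Toy in-memory model of keys with a time to live (not a Redis client)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}       # key -> value
        self._deadline = {}   # key -> absolute expiry time

    def set(self, key, value, ex=None):
        self._data[key] = value
        self._deadline.pop(key, None)   # a plain SET clears any previous expire
        if ex is not None:
            self._deadline[key] = self._clock() + ex

    def expire(self, key, seconds):
        if self.get(key) is None:
            return 0                    # no such key, nothing to expire
        self._deadline[key] = self._clock() + seconds
        return 1

    def get(self, key):
        deadline = self._deadline.get(key)
        if deadline is not None and self._clock() >= deadline:
            # Drop the key lazily, as if DEL had been called on it.
            self._data.pop(key, None)
            self._deadline.pop(key, None)
        return self._data.get(key)

now = [0.0]                             # fake clock so the example runs instantly
store = ExpiringStore(clock=lambda: now[0])
store.set("key", "some-value")
store.expire("key", 5)
before = store.get("key")               # still within the 5 seconds
now[0] += 6                             # pretend six seconds have passed
after = store.get("key")                # the key is gone
```

Injecting the clock keeps the example deterministic; with the default `time.monotonic` the same calls would need a real pause between the two reads.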
To explain the List data type it's better to start with a little bit of theory, as the term List is often used improperly by information technology folks. For instance, "Python Lists" are not what the name may suggest (Linked Lists), but rather Arrays (the same data type is actually called Array in Ruby).
From a very general point of view a List is just a sequence of ordered elements: 10,20,1,2,3 is a list. But the properties of a List implemented using an Array are very different from the properties of a List implemented using a Linked List.
Redis lists are implemented via Linked Lists. This means that even if you have millions of elements inside a list, adding a new element at the head or at the tail of the list is performed in constant time. Adding a new element with the LPUSH command to the head of a list with ten elements is as fast as adding an element to the head of a list with 10 million elements.
What's the downside? Accessing an element by index is very fast in lists implemented with an Array (constant time indexed access) and not so fast in lists implemented with linked lists (where the operation requires an amount of work proportional to the index of the accessed element).
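The trade-off can be seen in Python itself. The sketch below contrasts the built-in `list` (an array) with `collections.deque` (a linked structure with constant-time pushes at both ends and slower indexed access in the middle); the two helper functions are hypothetical names used only for illustration.

```python
from collections import deque

def push_head_array(items, value):
    """Array-backed list: every existing element shifts, so this is O(N)."""
    items.insert(0, value)

def push_head_linked(items, value):
    """Linked structure: only the head changes, so this is O(1)."""
    items.appendleft(value)

array_list = ["A", "B"]
linked_list = deque(["A", "B"])
push_head_array(array_list, "first")
push_head_linked(linked_list, "first")
```

Both containers end up holding the same sequence; what differs is how the cost of the head insertion grows with the length of the list.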
Redis Lists are implemented with linked lists because for a database system it is crucial to be able to add elements to a very long list very quickly. Another strong advantage, as you'll see in a moment, is that Redis Lists can be kept at a constant length in constant time.
When fast access to the middle of a large collection of elements is important, a different data structure can be used, called sorted sets. Sorted sets are covered later in this tutorial.
The LPUSH command adds a new element to a list on the left (at the head), while the RPUSH command adds a new element to a list on the right (at the tail). Finally, the LRANGE command extracts ranges of elements from lists:
> rpush mylist A
(integer) 1
> rpush mylist B
(integer) 2
> lpush mylist first
(integer) 3
> lrange mylist 0 -1
1) "first"
2) "A"
3) "B"
Note that LRANGE takes two indexes: the first and the last element of the range to return. Both indexes can be negative, telling Redis to start counting from the end: -1 is the last element, -2 is the penultimate element of the list, and so forth.
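The indexing convention differs from Python slicing in one respect: both ends of an LRANGE range are inclusive. A minimal sketch of the convention on a Python list, where the `lrange` helper is a hypothetical stand-in for the command:

```python
def lrange(items, start, stop):
    """Model of LRANGE indexing: both ends inclusive, negatives count from the tail."""
    n = len(items)
    if start < 0:
        start = max(n + start, 0)
    if stop < 0:
        stop = n + stop
    # Python slices exclude the end index, so add one to include `stop`.
    return items[start:stop + 1]

mylist = ["first", "A", "B"]
everything = lrange(mylist, 0, -1)   # whole list
last_two = lrange(mylist, -2, -1)    # penultimate and last element
```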
As you can see, RPUSH appended the elements on the right of the list, while the final LPUSH added the element on the left.
Both commands are variadic, meaning that you are free to push multiple elements into a list in a single call:
> rpush mylist 1 2 3 4 5 "foo bar"
(integer) 9
> lrange mylist 0 -1
1) "first"
2) "A"
3) "B"
4) "1"
5) "2"
6) "3"
7) "4"
8) "5"
9) "foo bar"
An important operation defined on Redis lists is the ability to pop elements. Popping an element both retrieves it from the list and removes it from the list at the same time. You can pop elements from the left and from the right, just as you can push elements on both sides of the list:
> rpush mylist a b c
(integer) 3
> rpop mylist
"c"
> rpop mylist
"b"
> rpop mylist
"a"
We added three elements and popped three elements, so at the end of this sequence of commands the list is empty and there are no more elements to pop. If we try to pop yet another element, this is the result we get:
> rpop mylist
(nil)
Redis returned a NULL value to signal that there are no elements in the list.
Lists are useful for a number of tasks; two very representative use cases are the following:
For example, the popular Ruby libraries resque and sidekiq both use Redis lists under the hood to implement background jobs.
The popular Twitter social network keeps the latest tweets posted by users in Redis lists.
To describe a common use case step by step, imagine your home page shows the latest photos published in a photo-sharing social network and you want to speed up access.
LRANGE 0 9 is used in order to get the latest 10 posted items.
In many use cases we just want to use lists to store the latest items, whatever they are: social network updates, logs, or anything else.
Redis allows us to use lists as a capped collection, remembering only the latest N items and discarding all the oldest items using the LTRIM command.
The LTRIM command is similar to LRANGE, but instead of displaying the specified range of elements it sets this range as the new list value. All the elements outside the given range are removed.
An example will make it clearer:
> rpush mylist 1 2 3 4 5
(integer) 5
> ltrim mylist 0 2
OK
> lrange mylist 0 -1
1) "1"
2) "2"
3) "3"
The above LTRIM command tells Redis to keep just the list elements from index 0 to 2; everything else is discarded. This allows for a very simple but useful pattern: doing a List push operation and a List trim operation together in order to add a new element and discard elements exceeding a limit:
LPUSH mylist <some element>
LTRIM mylist 0 999
The above combination adds a new element and keeps only the 1000 newest elements in the list. With LRANGE you can access the top items without any need to remember very old data.
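The push-then-trim pattern can be sketched on a plain Python list. This is only a model of the two commands; `push_capped` and the photo IDs are hypothetical.

```python
def push_capped(items, value, limit):
    """LPUSH followed by LTRIM 0 limit-1, modelled on a plain Python list."""
    items.insert(0, value)   # LPUSH: the newest element goes to the head
    del items[limit:]        # LTRIM 0 limit-1: drop everything past the cap

latest = []
for photo_id in range(1, 1006):      # 1005 hypothetical photo IDs, oldest first
    push_capped(latest, photo_id, 1000)

newest_ten = latest[:10]             # what LRANGE 0 9 would return
```

After 1005 pushes the list still holds exactly 1000 entries, and the five oldest IDs have been discarded.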
Note: while LRANGE is technically an O(N) command, accessing small ranges towards the head or the tail of the list is a constant time operation.
Lists have a special feature that makes them suitable for implementing queues, and in general as a building block for inter-process communication systems: blocking operations.
Imagine you want to push items into a list with one process, and use a different process to actually do some kind of work with those items. This is the usual producer / consumer setup, and can be implemented in the following simple way:
However, it is possible that sometimes the list is empty and there is nothing to process, so RPOP just returns NULL. In this case a consumer is forced to wait some time and retry with RPOP. This is called polling, and is not a good idea in this context because it has several drawbacks:
So Redis implements commands called BRPOP and BLPOP, which are versions of RPOP and LPOP able to block if the list is empty: they return to the caller only when a new element is added to the list, or when a user-specified timeout is reached.
This is an example of a BRPOP call we could use in the worker:
> brpop tasks 5
1) "tasks"
2) "do_something"
It means: "wait for elements in the list tasks, but return if after 5 seconds no element is available".
Note that you can use 0 as the timeout to wait for elements forever, and you can also specify multiple lists rather than just one, in order to wait on multiple lists at the same time and get notified when the first list receives an element.
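The same blocking producer / consumer shape exists in Python's standard library, which makes for a quick local illustration. The sketch below uses `queue.Queue` in place of a Redis list; the `brpop` helper is a hypothetical wrapper, and unlike BRPOP (where 0 means forever) Python uses `timeout=None` to wait indefinitely.

```python
import queue
import threading

def brpop(tasks, timeout):
    """Block until an item arrives or `timeout` seconds pass; None on timeout."""
    try:
        return tasks.get(timeout=timeout)
    except queue.Empty:
        return None

tasks = queue.Queue()          # FIFO: items enter on one side and leave on the other
results = []

worker = threading.Thread(target=lambda: results.append(brpop(tasks, 5)))
worker.start()                 # the worker waits instead of polling
tasks.put("do_something")      # the producer pushes a job
worker.join()

timed_out = brpop(tasks, 0.1)  # nothing left, so this gives up after 0.1 s
```

The worker never spins in a retry loop: it sleeps inside the blocking call and wakes up as soon as an item is available or the timeout expires.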
A few things to note about BRPOP:
There are more things you should know about lists and blocking ops. We suggest that you read more on the following:
So far in our examples we never had to create empty lists before pushing elements, or remove empty lists when they no longer have elements inside. It is Redis' responsibility to delete keys when lists are left empty, or to create an empty list if the key does not exist and we are trying to add elements to it, for example, with LPUSH.
This is not specific to lists; it applies to all the Redis data types composed of multiple elements: Streams, Sets, Sorted Sets and Hashes.
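Basically we can summarize the behavior with three rules:
1. When we add an element to an aggregate data type, if the target key does not exist, an empty aggregate data type is created before adding the element.
2. When we remove elements from an aggregate data type, if the value remains empty, the key is automatically destroyed (the Stream data type is the only exception to this rule).
3. Calling a read-only command such as LLEN (which returns the length of the list), or a write command that removes elements, against a non-existent key produces the same result as if the key held an empty aggregate of the type the command expects.
Examples of rule 1: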
> del mylist
(integer) 1
> lpush mylist 1 2 3
(integer) 3
However, we can't perform operations against the wrong type if the key exists:
> set foo bar
OK
> lpush foo 1 2 3
(error) WRONGTYPE Operation against a key holding the wrong kind of value
> type foo
string
Example of rule 2:
> lpush mylist 1 2 3
(integer) 3
> exists mylist
(integer) 1
> lpop mylist
"3"
> lpop mylist
"2"
> lpop mylist
"1"
> exists mylist
(integer) 0
The key no longer exists after all the elements are popped.
Example of rule 3:
> del mylist
(integer) 0
> llen mylist
(integer) 0
> lpop mylist
(nil)
Redis hashes look exactly how one might expect a "hash" to look, with field-value pairs:
> hmset user:1000 username antirez birthyear 1977 verified 1
OK
> hget user:1000 username
"antirez"
> hget user:1000 birthyear
"1977"
> hgetall user:1000
1) "username"
2) "antirez"
3) "birthyear"
4) "1977"
5) "verified"
6) "1"
While hashes are handy for representing objects, the number of fields you can put inside a hash has no practical limit (other than available memory), so you can use hashes in many different ways inside your application.
The command HMSET sets multiple fields of the hash, while HGET retrieves a single field. HMGET is similar to HGET but returns an array of values:
> hmget user:1000 username birthyear no-such-field
1) "antirez"
2) "1977"
3) (nil)
There are also commands that perform operations on individual fields, like HINCRBY:
> hincrby user:1000 birthyear 10
(integer) 1987
> hincrby user:1000 birthyear 10
(integer) 1997
You can find the full list of hash commands in the documentation.
It is worth noting that small hashes (i.e., a few elements with small values) are encoded in a special way in memory that makes them very memory efficient.
Redis Sets are unordered collections of strings. The SADD command adds new elements to a set. It's also possible to do a number of other operations against sets, like testing whether a given element already exists, or performing the intersection, union or difference between multiple sets, and so forth.
> sadd myset 1 2 3
(integer) 3
> smembers myset
1) "3"
2) "1"
3) "2"
Here I've added three elements to my set and told Redis to return all the elements. As you can see they are not sorted -- Redis is free to return the elements in any order at every call, since there is no contract with the user about element ordering.
Redis has commands to test for membership. For example, checking if an element exists:
> sismember myset 3
(integer) 1
> sismember myset 30
(integer) 0
"3" is a member of the set, while "30" is not.
Sets are good for expressing relations between objects. For instance, we can easily use sets to implement tags.
A simple way to model this problem is to have a set for every object we want to tag. The set contains the IDs of the tags associated with the object.
One illustration is tagging news articles. If article ID 1000 is tagged with tags 1, 2, 5 and 77, a set can associate these tag IDs with the news item:
> sadd news:1000:tags 1 2 5 77
(integer) 4
We may also want the inverse relation: the list of all the news items tagged with a given tag:
> sadd tag:1:news 1000
(integer) 1
> sadd tag:2:news 1000
(integer) 1
> sadd tag:5:news 1000
(integer) 1
> sadd tag:77:news 1000
(integer) 1
Getting all the tags for a given object is trivial:
> smembers news:1000:tags
1) "5"
2) "1"
3) "77"
4) "2"
Note: in the example we assume you have another data structure, for example a Redis hash, which maps tag IDs to tag names.
There are other non-trivial operations that are still easy to implement using the right Redis commands. For instance, we may want a list of all the objects that carry the tags 1, 2, 10, and 27 together. We can do this using the SINTER command, which performs the intersection between different sets. We can use:
> sinter tag:1:news tag:2:news tag:10:news tag:27:news
... results here ...
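What the intersection computes can be illustrated with Python sets. The article IDs below are made up for the example, and `sinter` is a hypothetical helper that mirrors the command's semantics on in-memory sets.

```python
# Hypothetical tag -> article ID sets, mirroring the tag:<id>:news keys.
tag_news = {
    1: {1000, 1001, 1002},
    2: {1000, 1002},
    10: {1000, 1002, 1003},
    27: {1000, 1002, 1004},
}

def sinter(*sets):
    """Intersection of all the given sets, as SINTER computes it."""
    return set.intersection(*sets)

tagged_with_all = sinter(tag_news[1], tag_news[2], tag_news[10], tag_news[27])
```

Only the articles present in every one of the four sets survive, here 1000 and 1002.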
In addition to intersection you can also perform unions and differences, extract a random element, and so forth.
The command to extract an element is called SPOP, and is handy for modeling certain problems. For example, in order to implement a web-based poker game, you may want to represent your deck with a set. Imagine we use a one-char prefix for (C)lubs, (D)iamonds, (H)earts, (S)pades:
> sadd deck C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 CJ CQ CK
D1 D2 D3 D4 D5 D6 D7 D8 D9 D10 DJ DQ DK H1 H2 H3
H4 H5 H6 H7 H8 H9 H10 HJ HQ HK S1 S2 S3 S4 S5 S6
S7 S8 S9 S10 SJ SQ SK
(integer) 52
Now we want to provide each player with 5 cards. The SPOP command removes a random element, returning it to the client, so it is the perfect operation in this case.
However, if we call it against our deck directly, in the next play of the game we'll need to populate the deck of cards again, which may not be ideal. So to start, we can make a copy of the set stored in the deck key into the game:1:deck key.
This is accomplished using SUNIONSTORE, which normally performs the union between multiple sets and stores the result into another set. However, since the union of a single set is the set itself, I can copy my deck with:
> sunionstore game:1:deck deck
(integer) 52
Now I'm ready to provide the first player with five cards:
> spop game:1:deck
"C6"
> spop game:1:deck
"CQ"
> spop game:1:deck
"D1"
> spop game:1:deck
"CJ"
> spop game:1:deck
"SJ"
One pair of jacks, not great...
This is a good time to introduce the set command that provides the number of elements inside a set. This is often called the cardinality of a set in the context of set theory, so the Redis command is called SCARD.
> scard game:1:deck
(integer) 47
The math works: 52 - 5 = 47.
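The whole dealing sequence (copy the deck, pop five random cards, count what is left) can be modelled with Python sets. The `spop` helper below is a hypothetical stand-in for the command, and the cards drawn will differ from run to run.

```python
import random

ranks = [str(n) for n in range(1, 11)] + ["J", "Q", "K"]
deck = {suit + rank for suit in "CDHS" for rank in ranks}   # 52 cards

def spop(cards, rng=random):
    """Remove and return one random member, like SPOP."""
    card = rng.choice(sorted(cards))   # sort first: choice() needs a sequence
    cards.remove(card)
    return card

game_deck = set(deck)                  # copy, so the master deck stays intact
hand = [spop(game_deck) for _ in range(5)]
remaining = len(game_deck)             # what SCARD would report
```

Copying the set before popping plays the role of SUNIONSTORE: the master deck keeps all 52 cards for the next game.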
When you just need random elements without removing them from the set, the SRANDMEMBER command is suitable for the task. It can return both repeating and non-repeating elements.
Sorted sets are a data type similar to a mix between a Set and a Hash. Like sets, sorted sets are composed of unique, non-repeating string elements, so in some sense a sorted set is a set as well.
However, while elements inside sets are not ordered, every element in a sorted set is associated with a floating point value, called the score (this is why the type is also similar to a hash, since every element is mapped to a value).
Moreover, elements in a sorted set are kept in order (so they are not ordered on request; order is a peculiarity of the data structure used to represent sorted sets). They are ordered by score, and elements with the same score are ordered lexicographically by their string value.
Let's start with a simple example, adding a few selected hackers' names as sorted set elements, with their year of birth as the "score".
> zadd hackers 1940 "Alan Kay"
(integer) 1
> zadd hackers 1957 "Sophie Wilson"
(integer) 1
> zadd hackers 1953 "Richard Stallman"
(integer) 1
> zadd hackers 1949 "Anita Borg"
(integer) 1
> zadd hackers 1965 "Yukihiro Matsumoto"
(integer) 1
> zadd hackers 1914 "Hedy Lamarr"
(integer) 1
> zadd hackers 1916 "Claude Shannon"
(integer) 1
> zadd hackers 1969 "Linus Torvalds"
(integer) 1
> zadd hackers 1912 "Alan Turing"
(integer) 1
As you can see, ZADD is similar to SADD, but takes one additional argument (placed before the element to be added), which is the score. ZADD is also variadic, so you are free to specify multiple score-value pairs, even if this is not used in the example above.
With sorted sets it is trivial to return a list of hackers sorted by their birth year because they are already sorted.
Implementation note: Sorted sets are implemented via a dual-ported data structure containing both a skip list and a hash table, so every time we add an element Redis performs an O(log(N)) operation. That's good, but when we ask for sorted elements Redis does not have to do any work at all; everything is already sorted:
> zrange hackers 0 -1
1) "Alan Turing"
2) "Hedy Lamarr"
3) "Claude Shannon"
4) "Alan Kay"
5) "Anita Borg"
6) "Richard Stallman"
7) "Sophie Wilson"
8) "Yukihiro Matsumoto"
9) "Linus Torvalds"
Note: 0 and -1 mean from element index 0 to the last element (-1 works here just as it does in the case of the LRANGE command).
What if I want to order them the opposite way, youngest to oldest? Use ZREVRANGE instead of ZRANGE:
> zrevrange hackers 0 -1
1) "Linus Torvalds"
2) "Yukihiro Matsumoto"
3) "Sophie Wilson"
4) "Richard Stallman"
5) "Anita Borg"
6) "Alan Kay"
7) "Claude Shannon"
8) "Hedy Lamarr"
9) "Alan Turing"
It is possible to return scores as well, using the WITHSCORES argument:
> zrange hackers 0 -1 withscores
1) "Alan Turing"
2) "1912"
3) "Hedy Lamarr"
4) "1914"
5) "Claude Shannon"
6) "1916"
7) "Alan Kay"
8) "1940"
9) "Anita Borg"
10) "1949"
11) "Richard Stallman"
12) "1953"
13) "Sophie Wilson"
14) "1957"
15) "Yukihiro Matsumoto"
16) "1965"
17) "Linus Torvalds"
18) "1969"
Sorted sets are more powerful than this: they can operate on ranges. Let's get all the individuals born up to 1950 inclusive. We use the ZRANGEBYSCORE command to do it:
> zrangebyscore hackers -inf 1950
1) "Alan Turing"
2) "Hedy Lamarr"
3) "Claude Shannon"
4) "Alan Kay"
5) "Anita Borg"
We asked Redis to return all the elements with a score between negative infinity and 1950 (both extremes are included).
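The range query is easy to mirror in Python, which helps check the expected result. The sketch below models the sorted set as a dictionary from member to score; `zrangebyscore` is a hypothetical helper that scans every member, whereas Redis uses its ordered structure to avoid a full scan.

```python
import math

hackers = {   # member -> score (birth year), as in the ZADD calls above
    "Alan Kay": 1940, "Sophie Wilson": 1957, "Richard Stallman": 1953,
    "Anita Borg": 1949, "Yukihiro Matsumoto": 1965, "Hedy Lamarr": 1914,
    "Claude Shannon": 1916, "Linus Torvalds": 1969, "Alan Turing": 1912,
}

def zrangebyscore(zset, low, high):
    """Members with low <= score <= high, ordered by ascending score."""
    hits = [(score, member) for member, score in zset.items() if low <= score <= high]
    return [member for score, member in sorted(hits)]

born_up_to_1950 = zrangebyscore(hackers, -math.inf, 1950)
```

`-math.inf` plays the role of `-inf`: it compares lower than every score, so only the upper bound filters anything out.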
It's also possible to remove ranges of elements. Let's remove all the hackers born between 1940 and 1960 from the sorted set:
> zremrangebyscore hackers 1940 1960
(integer) 4
ZREMRANGEBYSCORE is perhaps not the best command name, but it can be very useful, and it returns the number of removed elements.
Another extremely useful operation defined for sorted set elements is the get-rank operation: you can ask for the position of an element in the set of ordered elements.
> zrank hackers "Anita Borg"
(integer) 4
The ZREVRANK command is also available to get the rank when the elements are sorted in descending order.
With recent versions of Redis 2.8, a new feature was introduced that allows getting ranges lexicographically, assuming elements in a sorted set are all inserted with the same score (elements are compared with the C memcmp function, so it is guaranteed that there is no collation, and every Redis instance will reply with the same output).
The main commands to operate with lexicographical ranges are ZRANGEBYLEX, ZREVRANGEBYLEX, ZREMRANGEBYLEX and ZLEXCOUNT.
For example, let's add our list of famous hackers again, but this time use a score of zero for all the elements:
> zadd hackers 0 "Alan Kay" 0 "Sophie Wilson" 0 "Richard Stallman" 0
"Anita Borg" 0 "Yukihiro Matsumoto" 0 "Hedy Lamarr" 0 "Claude Shannon"
0 "Linus Torvalds" 0 "Alan Turing"
Because of the sorted set ordering rules, they are already sorted lexicographically:
> zrange hackers 0 -1
1) "Alan Kay"
2) "Alan Turing"
3) "Anita Borg"
4) "Claude Shannon"
5) "Hedy Lamarr"
6) "Linus Torvalds"
7) "Richard Stallman"
8) "Sophie Wilson"
9) "Yukihiro Matsumoto"
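Both orderings seen so far (by birth year, and alphabetical when every score is zero) follow from a single sort key: the score first, then the raw bytes of the member. A short sketch with a hypothetical `zrange_all` helper on a member-to-score dictionary:

```python
def zrange_all(zset):
    """All members ordered by (score, member bytes), lowest first."""
    ordered = sorted(zset.items(), key=lambda kv: (kv[1], kv[0].encode()))
    return [member for member, score in ordered]

by_year = zrange_all({"Alan Kay": 1940, "Hedy Lamarr": 1914, "Alan Turing": 1912})
same_score = zrange_all({"Linus Torvalds": 0, "Alan Turing": 0, "Alan Kay": 0})
```

With distinct scores the names play no part in the order; with identical scores the byte-wise comparison of the names decides everything.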
Using ZRANGEBYLEX we can ask for lexicographical ranges:
> zrangebylex hackers [B [P
1) "Claude Shannon"
2) "Hedy Lamarr"
3) "Linus Torvalds"
Ranges can be inclusive or exclusive depending on the first character ([ for inclusive, ( for exclusive), and the positive and negative infinite strings are specified with + and - respectively. See the documentation for more information.
范围可以是包含的或不包含的,取决于第一个字符([ 表示包含,( 表示不包含),正无穷和负无穷字符串分别用 + 和 - 指定。有关更多信息,请参见文档。
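The range syntax can be illustrated with a small client-side emulation. This is only a sketch of the semantics, not how Redis implements the command: the function name `zrangebylex` and its list-based storage are illustrative assumptions, and Python's byte-string comparison stands in for memcmp.

```python
from bisect import bisect_left, bisect_right


def zrangebylex(members, low, high):
    """Emulate ZRANGEBYLEX for members that all share the same score.

    low/high use the Redis range syntax: '[' inclusive, '(' exclusive,
    '-' for negative infinity and '+' for positive infinity.
    """
    # Python compares bytes objects byte by byte, like memcmp.
    ordered = sorted(m.encode() for m in members)

    def bound(spec, is_lower):
        if spec == "-":
            return 0
        if spec == "+":
            return len(ordered)
        if spec[0] not in "[(":
            raise ValueError("range item must start with [ or (")
        inclusive, value = spec[0] == "[", spec[1:].encode()
        if is_lower:
            return bisect_left(ordered, value) if inclusive else bisect_right(ordered, value)
        return bisect_right(ordered, value) if inclusive else bisect_left(ordered, value)

    start, stop = bound(low, True), bound(high, False)
    return [m.decode() for m in ordered[start:stop]]


hackers = [
    "Alan Kay", "Sophie Wilson", "Richard Stallman", "Anita Borg",
    "Yukihiro Matsumoto", "Hedy Lamarr", "Claude Shannon",
    "Linus Torvalds", "Alan Turing",
]
print(zrangebylex(hackers, "[B", "[P"))
# ['Claude Shannon', 'Hedy Lamarr', 'Linus Torvalds']
```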
This feature is important because it allows us to use sorted sets as a generic index. For example, if you want to index elements by a 128-bit unsigned integer argument, all you need to do is add elements to a sorted set with the same score (for example 0) but with a 16-byte prefix consisting of the 128-bit number in big endian. Since numbers in big endian, when ordered lexicographically (in raw byte order), are also ordered numerically, you can ask for ranges in the 128-bit space and get the element's value by discarding the prefix.
这个特性很重要,因为它允许我们使用排序集作为通用索引。例如,如果希望用128位无符号整数参数对元素进行索引,那么只需将元素以相同的得分(例如0)添加到一个排序集中,并为每个元素加上16字节的前缀,该前缀是以大端序(big endian)表示的128位数。由于大端序的数字按字典序(按原始字节顺序)排列时实际上也是按数值顺序排列的,因此您可以在128位空间中查询范围,并通过丢弃前缀得到元素的值。
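The big-endian prefix trick can be checked without a server. The following sketch assumes hypothetical helper names (`index_entry`, `split_entry`) and simply verifies that sorting the raw bytes yields the same order as sorting the numbers:

```python
def index_entry(number: int, payload: bytes) -> bytes:
    """Build a sorted set member: 16-byte big-endian prefix + payload."""
    return number.to_bytes(16, "big") + payload


def split_entry(entry: bytes):
    """Recover (number, payload) by cutting off the 16-byte prefix."""
    return int.from_bytes(entry[:16], "big"), entry[16:]


numbers = [2 ** 100, 5, 2 ** 64 + 1, 300]
entries = [index_entry(n, b"item-%d" % i) for i, n in enumerate(numbers)]

# Byte-wise (lexicographic) order of the entries ...
by_bytes = [split_entry(e)[0] for e in sorted(entries)]
# ... matches the numeric order of the prefixes.
print(by_bytes == sorted(numbers))  # True
```

With little endian the property would not hold: 256 would sort before 1, because its first byte is zero.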
If you want to see the feature in the context of a more serious demo, check the Redis autocomplete demo.
如果您希望在更严肃的演示上下文中看到该特性,请查看 Redis 自动完成演示。
Just a final note about sorted sets before switching to the next topic. Sorted sets' scores can be updated at any time. Just calling ZADD against an element already included in the sorted set will update its score (and position) with O(log(N)) time complexity. As such, sorted sets are suitable when there are tons of updates.
在切换到下一个主题之前,最后说明一下排序集。排序集的得分可以随时更新。只需对已经包含在排序集中的元素调用 ZADD,就会以 O(log(N)) 的时间复杂度更新其得分(和位置)。因此,排序集适用于存在大量更新的场景。
Because of this characteristic a common use case is leader boards. The typical application is a Facebook game where you combine the ability to take users sorted by their high score, plus the get-rank operation, in order to show the top-N users, and the user rank in the leader board (e.g., "you are the #4932 best score here").
由于这个特点,一个常见的用例就是排行榜。典型的应用是一个 Facebook 游戏:结合按最高分排序获取用户的能力和获取排名的操作,就可以显示前 N 名用户,以及某个用户在排行榜中的名次(例如,“你的得分在这里排第 4932 名”)。
Bitmaps are not an actual data type, but a set of bit-oriented operations defined on the String type. Since strings are binary safe blobs and their maximum length is 512 MB, they are suitable to set up to 2^32 different bits.
位图不是实际的数据类型,而是在 String 类型上定义的一组面向位的操作。由于字符串是二进制安全的 blob,它们的最大长度为512 MB,因此它们最多可以设置 2^32 个不同的位。
Bit operations are divided into two groups: constant-time single bit operations, like setting a bit to 1 or 0, or getting its value, and operations on groups of bits, for example counting the number of set bits in a given range of bits (e.g., population counting).
位操作可分为两组:常数时间的单个位操作,如将某一位设置为1或0,或获取其值;以及对一组位的操作,例如统计给定位范围内被设置的位数(即 population counting)。
One of the biggest advantages of bitmaps is that they often provide extreme space savings when storing information. For example, in a system where different users are represented by incremental user IDs, it is possible to remember a single bit of information (for example, whether a user wants to receive a newsletter) for 4 billion users using just 512 MB of memory.
位图的最大优点之一是,当存储信息时,它们通常可以极大地节省空间。例如,在一个系统中,不同的用户由增量用户 id 表示,只需512mb 内存就可以记住40亿用户的单个位信息(例如,知道用户是否希望收到新闻稿)。
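The arithmetic behind these figures is easy to reproduce. In the sketch below, `bitmap_bytes` is a hypothetical helper, not a Redis call; it only computes how many bytes a string must hold so that a given bit number exists.

```python
MAX_STRING_BYTES = 512 * 1024 * 1024  # the 512 MB string limit mentioned above

# Every byte holds 8 bits, so the limit gives 2^32 addressable bits.
addressable_bits = MAX_STRING_BYTES * 8
print(addressable_bits == 2 ** 32)  # True


def bitmap_bytes(highest_id: int) -> int:
    """Bytes needed so that bit number `highest_id` fits in the string."""
    return highest_id // 8 + 1


print(bitmap_bytes(999_999))                       # 125000 bytes for one million users
print(bitmap_bytes(2 ** 32 - 1) // (1024 * 1024))  # 512 (MB) for about 4 billion users
```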
Bits are set and retrieved using the SETBIT and GETBIT commands:
使用 SETBIT 和 GETBIT 命令设置和检索位:
> setbit key 10 1
(integer) 0
> getbit key 10
(integer) 1
> getbit key 11
(integer) 0
The SETBIT command takes as its first argument the bit number, and as its second argument the value to set the bit to, which is 1 or 0. The command automatically enlarges the string if the addressed bit is outside the current string length.
SETBIT 命令的第一个参数是位数,第二个参数是将位设置为的值,即1或0。如果寻址位超出当前字符串长度,则命令自动放大字符串。
GETBIT just returns the value of the bit at the specified index. Out of range bits (addressing a bit that is outside the length of the string stored into the target key) are always considered to be zero.
GETBIT 只返回指定索引处的位的值。超出范围的位(寻址超出存储在目标键中的字符串长度的位)始终被认为是零。
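The automatic enlargement and the zero value of out-of-range bits can be modeled on a plain byte buffer. This is a sketch of the semantics only, with hypothetical `setbit` and `getbit` functions working on a Python bytearray instead of a Redis key; bit 0 is taken to be the most significant bit of the first byte, as in Redis.

```python
def setbit(buf: bytearray, offset: int, value: int) -> int:
    """Set the bit at `offset`, growing the buffer with zero bytes if needed.

    Returns the bit's previous value.
    """
    byte_index, shift = offset // 8, 7 - offset % 8
    if byte_index >= len(buf):
        buf.extend(bytes(byte_index + 1 - len(buf)))  # pad with zero bytes
    old = (buf[byte_index] >> shift) & 1
    if value:
        buf[byte_index] |= 1 << shift
    else:
        buf[byte_index] &= ~(1 << shift) & 0xFF
    return old


def getbit(buf: bytearray, offset: int) -> int:
    """Return the bit at `offset`; bits past the end of the buffer read as 0."""
    byte_index, shift = offset // 8, 7 - offset % 8
    if byte_index >= len(buf):
        return 0
    return (buf[byte_index] >> shift) & 1


key = bytearray()
setbit(key, 10, 1)
print(len(key))           # 2: the empty buffer grew to cover bit 10
print(getbit(key, 10))    # 1
print(getbit(key, 11))    # 0
print(getbit(key, 1000))  # 0: far outside the buffer
```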
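There are three commands operating on groups of bits: BITOP performs bit-wise operations (AND, OR, XOR and NOT) between different strings, BITCOUNT performs population counting, reporting the number of bits set to 1, and BITPOS finds the first bit having the specified value of 0 or 1.
有三个命令可以对一组位进行操作:BITOP 在不同的字符串之间执行按位运算(AND、OR、XOR 和 NOT),BITCOUNT 统计被设置为1的位数,BITPOS 查找第一个具有指定值(0或1)的位。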
Both BITPOS and BITCOUNT are able to operate with byte ranges of the string, instead of running for the whole length of the string. The following is a trivial example of BITCOUNT call:
BITPOS 和 BITCOUNT 都能够操作字符串的字节范围,而不是运行整个字符串的长度。下面是一个简单的 BITCOUNT 调用示例:
> setbit key 0 1
(integer) 0
> setbit key 100 1
(integer) 0
> bitcount key
(integer) 2
Common use cases for bitmaps are:
位图的常见用例如下:
For example imagine you want to know the longest streak of daily visits of your web site users. You start counting days starting from zero, that is the day you made your web site public, and set a bit with SETBIT every time the user visits the web site. As a bit index you simply take the current unix time, subtract the initial offset, and divide by the number of seconds in a day (normally, 3600*24).
例如,假设你想知道网站用户最长的连续每日访问天数。你从零开始计算天数,第零天就是网站公开的那一天,并在用户每次访问网站时用 SETBIT 设置一个位。作为位索引,您只需要获取当前 unix 时间,减去初始偏移量,然后除以每天的秒数(通常是3600 * 24)。
This way for each user you have a small string containing the visit information for each day. With BITCOUNT it is possible to easily get the number of days a given user visited the web site, while with a few BITPOS calls, or simply fetching and analyzing the bitmap client-side, it is possible to easily compute the longest streak.
这样,每个用户都有一个小字符串,其中包含每天的访问信息。使用 BITCOUNT 可以很容易地得到给定用户访问网站的天数,而使用少量 BITPOS 调用,或者直接把位图取回客户端进行分析,就可以很容易地计算出最长的连续访问天数。
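The client-side variant of this analysis might look like the sketch below. It assumes the bitmap has already been fetched as raw bytes (for example with GET); the function names are hypothetical, and bit 0 is taken to be the most significant bit of the first byte, as in Redis.

```python
SECONDS_PER_DAY = 3600 * 24


def day_index(now: int, launch: int) -> int:
    """Bit index for a visit: whole days elapsed since the launch timestamp."""
    return (now - launch) // SECONDS_PER_DAY


def days_visited(bitmap: bytes) -> int:
    """Client-side equivalent of BITCOUNT over the whole string."""
    return sum(bin(byte).count("1") for byte in bitmap)


def longest_streak(bitmap: bytes) -> int:
    """Length of the longest run of consecutive 1 bits."""
    best = current = 0
    for byte in bitmap:
        for shift in range(7, -1, -1):  # most significant bit first
            if (byte >> shift) & 1:
                current += 1
                best = max(best, current)
            else:
                current = 0
    return best


# Visits on days 0, 1, 2, 5, 6, 7 and 8.
bitmap = bytes([0b11100111, 0b10000000])
print(days_visited(bitmap))    # 7
print(longest_streak(bitmap))  # 4 (days 5 to 8)
```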
Bitmaps are trivial to split into multiple keys, for example for the sake of sharding the data set and because in general it is better to avoid working with huge keys. To split a bitmap across different keys instead of setting all the bits into a key, a trivial strategy is just to store M bits per key and obtain the key name with bit-number/M and the Nth bit to address inside the key with bit-number MOD M.
位图很容易分割成多个键,例如为了分割数据集,因为通常最好避免使用巨大的键。为了在不同的key之间分割位图,而不是将所有的位都设置为一个key,一个简单的策略就是每个key存储 m 位,然后获得位数/m 的key名,n 位用位数 MOD m 在key内部寻址。
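The key and offset computation described above can be sketched as follows. The `bitmap:<n>` key naming scheme and the `locate_bit` helper are assumptions made for illustration.

```python
def locate_bit(bit_number: int, bits_per_key: int, prefix: str = "bitmap"):
    """Map a global bit number to (key name, bit offset inside that key)."""
    key_index = bit_number // bits_per_key  # bit-number/M
    offset = bit_number % bits_per_key      # bit-number MOD M
    return f"{prefix}:{key_index}", offset


# With M = 1,000,000 bits per key, bit 2,500,000 lives in the third key.
print(locate_bit(2_500_000, 1_000_000))  # ('bitmap:2', 500000)
```

The returned pair is what a client would pass to SETBIT or GETBIT as the key and the bit number.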
A HyperLogLog is a probabilistic data structure used to count unique things (technically this is referred to as estimating the cardinality of a set). Usually counting unique items requires using an amount of memory proportional to the number of items you want to count, because you need to remember the elements you have already seen in the past in order to avoid counting them multiple times. However, there is a set of algorithms that trade memory for precision: you end up with an estimated measure with a standard error, which in the case of the Redis implementation is less than 1%. The magic of this algorithm is that you no longer need to use an amount of memory proportional to the number of items counted, and instead can use a constant amount of memory: 12k bytes in the worst case, or a lot less if your HyperLogLog (we'll just call them HLLs from now on) has seen very few elements.
HyperLogLog 是一种概率数据结构,用于计算唯一的东西(从技术上讲,这是指估计集合的基数)。通常计算独特的项目需要使用与要计算的项目数量成比例的内存量,因为您需要记住过去已经看到的元素,以避免多次计算它们。然而,有一组用内存换取精确度的算法: 您以一个标准误差的估计度量结束,在 Redis 实现的情况下,这个误差小于1% 。这个算法的神奇之处在于,您不再需要使用与计算的项目数量成比例的内存量,而是可以使用一个恒定的内存量!最坏的情况是12k 字节,如果 HyperLogLog (我们从现在开始就称它们为 HLL)只看到很少的元素,那么字节就会少很多。
HLLs in Redis, while technically a different data structure, are encoded as a Redis string, so you can call GET to serialize an HLL, and SET to deserialize it back to the server.
Redis 的 HLLs 虽然在技术上是不同的数据结构,但它被编码为 Redis 字符串,因此您可以调用 GET 来序列化 HLL,调用 SET 来将其反序列化回服务器。
Conceptually the HLL API is like using Sets to do the same task. You would SADD every observed element into a set, and would use SCARD to check the number of elements inside the set, which are unique since SADD will not re-add an existing element.
从概念上讲,HLL API 类似于使用 set 来执行相同的任务。您可以将每个观察到的元素添加到一个集合中,并使用 SCARD 检查集合中的元素数量,这些元素是唯一的,因为 SADD 不会重新添加现有的元素。
While you don't really add items into an HLL, because the data structure only contains a state that does not include actual elements, the API is the same:
虽然你不会真的在 HLL 中添加条目,因为数据结构只包含一个不包含实际元素的状态,但是 API 是一样的:
Every time you see a new element, you add it to the count with PFADD. Every time you want to retrieve the current approximation of the unique elements added with PFADD so far, you use PFCOUNT.
每当看到一个新元素时,使用 PFADD 将其计入。每当想要获取到目前为止通过 PFADD 添加的唯一元素数量的当前近似值时,使用 PFCOUNT。
> pfadd hll a b c d
(integer) 1
> pfcount hll
(integer) 4
An example use case for this data structure is counting the unique queries performed by users in a search form every day.
这种数据结构的一个用例是计算用户每天在搜索表单中执行的唯一查询。
Redis is also able to perform the union of HLLs; please check the full documentation for more information.
Redis 也可以执行 HLLs 的联合,请查看完整的文档以获得更多信息。
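To make the constant-memory idea more concrete, here is a minimal textbook-style HyperLogLog in Python. It is a sketch under stated assumptions and not the Redis implementation: the class name `TinyHLL`, the use of SHA-1 as the hash function, and the simple small-range correction are all choices made for this example. It uses 16384 registers; at 6 bits each that would be about 12k bytes, although this sketch stores one byte per register for simplicity.

```python
import hashlib
import math


class TinyHLL:
    """Toy HyperLogLog: a fixed number of registers, no stored elements."""

    def __init__(self, precision: int = 14):
        self.p = precision
        self.m = 1 << precision          # number of registers (16384)
        self.registers = bytearray(self.m)

    def add(self, item: str) -> None:
        # 64-bit hash of the item (SHA-1 is an arbitrary choice here).
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        index = h >> (64 - self.p)                 # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Position of the leftmost 1 bit in `rest`, counting from 1.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def merge(self, other: "TinyHLL") -> None:
        """Union of two HLLs with the same precision: register-wise maximum."""
        for i, rank in enumerate(other.registers):
            if rank > self.registers[i]:
                self.registers[i] = rank

    def count(self) -> int:
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)
        estimate = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if estimate <= 2.5 * m and zeros:
            # Small-range correction (linear counting).
            estimate = m * math.log(m / zeros)
        return round(estimate)


hll = TinyHLL()
for i in range(10_000):
    hll.add(f"query-{i}")
    hll.add(f"query-{i}")  # duplicates do not change the registers
print(hll.count())  # an estimate close to 10000
```

The `merge` method mirrors the union operation mentioned above: because each register keeps only a maximum, two HLLs can be combined without access to the original elements.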
There are other important things in the Redis API that can't be explored in the context of this document, but are worth your attention:
在 Redis API 中还有其他重要的东西不能在本文中探讨,但是值得你注意:
This tutorial is in no way complete and has covered just the basics of the API. Read the command reference to discover a lot more.
本教程并不完整,仅仅介绍了 API 的基本知识。请阅读命令参考,以了解更多信息。
Thanks for reading, and have fun hacking with Redis!
感谢阅读,祝你用 Redis 玩得开心!