redis 大量数据的插入处理

Redis Mass Insertion

Sometimes Redis instances needs to be loaded with big amount of preexisting or user generated data in a short amount of time, so that millions of keys will be created as fast as possible.

This is called a mass insertion, and the goal of this document is to provide information about how to feed Redis with data as fast as possible.

Use the protocol, Luke

Using a normal Redis client to perform mass insertion is not a good idea for a few reasons: the naive approach of sending one command after the other is slow because you have to pay for the round trip time for every command. It is possible to use pipelining, but for mass insertion of many records you need to write new commands while you read replies at the same time to make sure you are inserting as fast as possible.

Only a small percentage of clients support non-blocking I/O, and not all the clients are able to parse the replies in an efficient way in order to maximize throughput. For all this reasons the preferred way to mass import data into Redis is to generate a text file containing the Redis protocol, in raw format, in order to call the commands needed to insert the required data.

For instance if I need to generate a large data set where there are billions of keys in the form: `keyN -> ValueN' I will create a file containing the following commands in the Redis protocol format:

SET Key0 Value0
SET Key1 Value1
...
SET KeyN ValueN

Once this file is created, the remaining action is to feed it to Redis as fast as possible. In the past the way to do this was to use the netcat with the following command:

(cat data.txt; sleep 10) | nc localhost 6379 > /dev/null

However this is not a very reliable way to perform mass import because netcat does not really know when all the data was transferred and can't check for errors. In the unstable branch of Redis at github the redis-cli utility supports a new mode called pipe mode that was designed in order to perform mass insertion. (This feature will be available in a few days in Redis 2.6-RC4 and in Redis 2.4.14).

Using the pipe mode the command to run looks like the following:

cat data.txt | redis-cli --pipe

That will produce an output similar to this:

All data transferred. Waiting for the last reply...
Last reply received from server.
errors: 0, replies: 1000000

The redis-cli utility will also make sure to only redirect errors received from the Redis instance to the standard output.

你可能感兴趣的:(redis)