Source: http://blog.liveramp.com/2013/04/08/kafka-0-8-producer-performance-2/
At LiveRamp, we constantly face scaling challenges as the volume of data that our infrastructure must deal with continues to grow. One such challenge involves the logging system. At present we use Scribe as the transport mechanism to get logs from our webapp servers into our HDFS cluster. Scribe has served us well, but we are looking for alternatives because it has the following shortcomings:
One of the most promising alternatives to Scribe that addresses all of the above is Kafka. We used Kafka to build a real-time stats system prototype during our last Hackweek, and saw enough promise to do some more in-depth testing. In this post we will focus on producer performance and scaling. Since we intend to put producers in our webapp servers, we are interested in both high overall throughput and low latency when sending individual messages.
At the time of this writing, Kafka 0.8 has not been released, and documentation for it is scarce. However, since it is a backwards incompatible release that introduces a number of important features, it would make little sense for anyone just getting started with Kafka to invest development effort in the previous version.
All tests in this post were run on this revision of the 0.8 branch.
We are starting with a modestly sized cluster of three machines. The specs are as follows:
|
Num CPUs: 2
CPU Model: Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
CPU Speed: 2400 MHz
Memory MB: 32768
Disk Controller Config
Layout: RAID-1
Size: 1,862.50 GB (1999844147200 bytes)
Layout: RAID-1
Size: 1,862.50 GB (1999844147200 bytes)
Disk config for controller 0:
4 Capacity: 1,862.50 GB (1999844147200 bytes)
7200 RPM 64MB Cache
|
Each machine has two pairs of disks in a mirroring configuration (RAID-1), which allows us to take advantage of the new multiple data directories feature introduced in Kafka 0.8. This makes it possible for a topic to have separate partitions on different disks, which should significantly increase the throughput per broker. This behavior is configured in the log.dirs setting as shown in the broker configuration below. We used default values for most other settings.
|
broker.id=1
port=9092
num.network.threads=2
num.io.threads=2
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
log.dirs=/data1/kafka,/data2/kafka
num.partitions=1
log.flush.interval.messages=10000
log.flush.interval.ms=3000
log.retention.hours=168
log.segment.bytes=536870912
log.cleanup.interval.mins=1
enable.zookeeper=true
zk.connect=zookeeper01:2181,zookeeper02:2181,zookeeper03:2181/kafka
zk.connectiontimeout.ms=1000000
kafka.metrics.polling.interval.secs=5
kafka.metrics.reporters=kafka.metrics.KafkaCSVMetricsReporter
kafka.csv.metrics.dir=/tmp/kafka_metrics
kafka.csv.metrics.reporter.enabled=false
|
As recommended by the Kafka documentation, we use a separate cluster of three dedicated machines for ZooKeeper. All machines are connected with gigabit links.
Our real use case involves a number of webapp servers each producing a relatively modest volume of logs. For this test, however, we used only a few dedicated producer machines using a custom-made tool that simulates the real load. Each producer was configured as follows:
|
Properties props = new Properties();
props.put("broker.list", "kafka01:9092,kafka02:9092,kafka03:9092");
props.put("serializer.class", "kafka.serializer.StringEncoder");
props.put("producer.type", "async");
props.put("queue.enqueue.timeout.ms", "-1");
props.put("batch.num.messages", "200");
props.put("compression.codec", "1");
|
The most important setting here is producer.type, which we set to async. Asynchronous mode is essential to get the most out of Kafka in terms of throughput. In this mode, each producer keeps an in-memory queue of messages that are sent to the broker in batches whenever a pre-configured batch size or time interval has been reached. This makes compression much more efficient, especially in a use case like ours, in which log lines hold string representations of JSON objects and the same keys are repeated over and over across lines. Having fewer, larger messages also helps to achieve better network utilization.
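The batching behavior described above can be illustrated with a small, self-contained sketch. This is not Kafka's implementation: the class `BatchingBuffer` and its methods are hypothetical names, the caller passes the clock value explicitly so that the behavior is deterministic, and the time trigger is only checked when a message is added (a real producer would flush from a background thread).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of size- or time-triggered batching; not Kafka code. */
public class BatchingBuffer {
    private final int batchSize;     // plays the role of batch.num.messages
    private final long maxWaitMs;    // plays the role of the flush time interval
    private final List<String> pending = new ArrayList<>();
    private final List<List<String>> sentBatches = new ArrayList<>();
    private long firstEnqueueMs = -1;

    public BatchingBuffer(int batchSize, long maxWaitMs) {
        this.batchSize = batchSize;
        this.maxWaitMs = maxWaitMs;
    }

    /** Queues a message and flushes if the batch is full or its oldest message waited too long. */
    public void add(String message, long nowMs) {
        if (pending.isEmpty()) {
            firstEnqueueMs = nowMs;
        }
        pending.add(message);
        if (pending.size() >= batchSize || nowMs - firstEnqueueMs >= maxWaitMs) {
            flush();
        }
    }

    /** Hands the pending messages over as one batch (a stand-in for one request to a broker). */
    public void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sentBatches.add(new ArrayList<>(pending));
        pending.clear();
    }

    public int batchesSent() {
        return sentBatches.size();
    }

    public int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        BatchingBuffer buffer = new BatchingBuffer(200, 5000);
        for (int i = 0; i < 450; i++) {
            buffer.add("{\"event\":\"click\",\"id\":" + i + "}", 0L);
        }
        // prints "2 batches sent, 50 pending"
        System.out.println(buffer.batchesSent() + " batches sent, " + buffer.pendingCount() + " pending");
    }
}
```

Because every batch groups many similar JSON lines, a compressor applied per batch sees the repeated keys together, which is the effect the paragraph above relies on.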
The Kafka distribution provides a producer performance tool that can be invoked with the script bin/kafka-producer-perf-test.sh. While this tool is very useful and flexible, we only used it to corroborate that the results obtained with our own custom tool made sense. This is due to the following reasons:
The Kafka documentation claims that producers can push about 50MB/sec through a system with a single broker as long as the batch size is not too small (the default value of 200 should be large enough). We were able to verify this claim very quickly for Kafka 0.7.2 by running the following command on a fresh installation
|
bin/kafka-producer-perf-test.sh --brokerinfo broker.list=0:localhost:9092 --messages 10000000 --topic test --threads 10 --message-size 1000 --batch-size 200 --compression-codec 1 --async
|
and obtaining the following results:
|
start.time, end.time, compression, message.size, batch.size, total.data.sent.in.MB, MB.sec, total.data.sent.in.nMsg, nMsg.sec
2013-04-09 11:52:43:192, 2013-04-09 11:56:06:136, 1, 1000, 200, 9536.74, 46.9920, 10000000, 49274.6768
|
Running an equivalent command on a fresh installation of Kafka 0.8, however, gave us markedly worse results:
|
bin/kafka-producer-perf-test.sh --broker-list=localhost:9092 --messages 10000000 --topic test --threads 10 --message-size 1000 --batch-size 200 --compression-codec 1
|
|
start.time, end.time, compression, message.size, batch.size, total.data.sent.in.MB, MB.sec, total.data.sent.in.nMsg, nMsg.sec
2013-04-02 17:16:51:933, 2013-04-02 17:24:04:916, 1, 1000, 200, 9536.74, 22.0257, 10000000, 23095.5950
|
This is because in an effort to increase availability and durability, version 0.8 introduced intra-cluster replication support, and by default a producer waits for an acknowledgement response from the broker on every message (or batch of messages if async mode is used). It is possible to mimic the old behavior, but we were not very interested in that given that we intend to use replication in production.
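The MB.sec column in the two result rows above can be cross-checked from the other columns: it is the total data sent divided by the elapsed time between start.time and end.time. The sketch below does that arithmetic for both runs. The class and method names are hypothetical, and the timestamp pattern is an assumption inferred from the rows as printed.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Recomputes throughput from the start/end timestamps of a perf-test result row. */
public class PerfRowCheck {
    // Assumed layout of the timestamps in the rows quoted above, e.g. 2013-04-09 11:52:43:192.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss:SSS");

    /** Elapsed wall-clock time between the two timestamps, in seconds. */
    static double elapsedSeconds(String start, String end) {
        LocalDateTime from = LocalDateTime.parse(start, FORMAT);
        LocalDateTime to = LocalDateTime.parse(end, FORMAT);
        return Duration.between(from, to).toMillis() / 1000.0;
    }

    /** Throughput in MB/sec given the total MB sent and the run's timestamps. */
    static double mbPerSec(double totalMb, String start, String end) {
        return totalMb / elapsedSeconds(start, end);
    }

    public static void main(String[] args) {
        double v07 = mbPerSec(9536.74, "2013-04-09 11:52:43:192", "2013-04-09 11:56:06:136");
        double v08 = mbPerSec(9536.74, "2013-04-02 17:16:51:933", "2013-04-02 17:24:04:916");
        System.out.println(String.format("0.7.2: %.4f MB/sec", v07));
        System.out.println(String.format("0.8:   %.4f MB/sec", v08));
        System.out.println(String.format("0.8 relative to 0.7.2: %.2f", v08 / v07));
    }
}
```

The 0.7.2 run lasted about 203 seconds and the 0.8 run about 433 seconds for the same 10,000,000 messages, so the 0.8 run reached a little under half the throughput of the 0.7.2 run.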
Performance degraded further once we started using a sample of real ~1KB sized log messages rather than the synthetic messages produced by the Kafka tool, resulting in a throughput of about 10 MB/sec.
All throughput numbers refer to uncompressed data.
Our first test consisted of evaluating the impact of adding producer machines.
By adding identically configured producer machines, each pushing as many messages as it can, the overall throughput increases slightly. We also observed that throughput was distributed very evenly across the machines.
Next, using all ten machines at our disposal we tested the effect of using different numbers of partitions.
Throughput increases markedly at first as more brokers, and more disks on them, start hosting different partitions. Once all brokers and disks are in use, though, adding more partitions does not seem to have any effect.
As we saw in the baseline performance tests, even using a single replica represents a big performance hit when compared to the old system which had no support for replication at all. We were interested in knowing how much of an additional hit we would get when using two and three replicas.
Fortunately, the extra performance hit turned out to be quite small.
Finally, we tested the effect of increasing the number of topics. Our use case requires only a handful of topics, so we only experimented with small numbers.
Update: Michael G. Noll (see comment below) kindly pointed out that throughput could be improved by disabling ack messages, and provided this post as a reference for what could be expected. I reran some of the tests and here are some preliminary results:
Having an idea of the maximum throughput that can be achieved, we investigated the average and maximum latency of sending an individual message, which directly impacts the loading time on a browser hitting our webapp servers (this is the time for a thread using the Kafka producer to return from a call to send, NOT the full producer-broker-consumer cycle). To do this, we configured our tool to limit the rate at which it pushes messages according to a target throughput, and monitored latency for different values of throughput.
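A measurement of this kind can be sketched as a paced loop that times each call to send. The code below is not the custom tool used for these tests. `SendLatencyProbe`, `Sender` and `Result` are hypothetical names, and the `Sender` interface is a stand-in for whatever producer call is being timed.

```java
import java.util.concurrent.locks.LockSupport;

/** Paces calls to a sender at a target rate and records average and maximum call latency. */
public class SendLatencyProbe {
    /** Hypothetical stand-in for the producer call being timed. */
    public interface Sender {
        void send(String message);
    }

    public static final class Result {
        public final double avgMs;
        public final double maxMs;

        Result(double avgMs, double maxMs) {
            this.avgMs = avgMs;
            this.maxMs = maxMs;
        }
    }

    public static Result measure(Sender sender, int messages, double targetPerSec) {
        if (messages <= 0 || targetPerSec <= 0) {
            throw new IllegalArgumentException("messages and targetPerSec must be positive");
        }
        long intervalNanos = (long) (1_000_000_000L / targetPerSec);
        long next = System.nanoTime();
        long totalNanos = 0;
        long maxNanos = 0;
        for (int i = 0; i < messages; i++) {
            // Wait until the next send slot so the offered load matches the target rate.
            long wait;
            while ((wait = next - System.nanoTime()) > 0) {
                LockSupport.parkNanos(wait);
            }
            // Time only the call itself, i.e. how long the calling thread is held up.
            long start = System.nanoTime();
            sender.send("message-" + i);
            long elapsed = System.nanoTime() - start;
            totalNanos += elapsed;
            maxNanos = Math.max(maxNanos, elapsed);
            next += intervalNanos;
        }
        return new Result(totalNanos / 1e6 / messages, maxNanos / 1e6);
    }

    public static void main(String[] args) {
        // A no-op sender keeps the example runnable without a broker.
        Result result = measure(message -> { }, 100, 1000.0);
        System.out.println("avg ms: " + result.avgMs + ", max ms: " + result.maxMs);
    }
}
```

Reporting the maximum alongside the average matters here, since a handful of slow calls can stay hidden in an average taken over many fast ones.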
The average latency stays consistently below 0.02 ms as long as the target throughput does not reach the maximum throughput. Unfortunately, the maximum latency hovers around 120 ms even for very low values of throughput. Once the producers start trying to push more messages than the brokers can handle, both average and maximum latency increase dramatically.
Finally, we set queue.enqueue.timeout.ms to 0 in an attempt to prevent the Kafka producer from ever blocking on a call to send, hoping that this would decrease the maximum latency. Unfortunately, this had no effect whatsoever: we got results identical to the graphs above. The only difference was that, as expected, producers started throwing exceptions (kafka.common.QueueFullException) when the target throughput reached the maximum throughput. We also observed that once exceptions were thrown, the producers would hang indefinitely despite invoking the close method, and a call to System.exit was required to force the application to quit.
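The difference between a blocking and a non-blocking enqueue can be shown with a plain bounded queue from the Java standard library. This is a sketch, not Kafka's producer code. It assumes that a timeout of 0 means "fail immediately when the queue is full", a negative value means "block until there is room", and a positive value means "wait up to that many milliseconds". `EnqueuePolicy` and its nested `QueueFullException` are hypothetical stand-ins.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Illustrates blocking versus non-blocking enqueue on a bounded in-memory queue. */
public class EnqueuePolicy {
    /** Hypothetical analogue of kafka.common.QueueFullException. */
    public static class QueueFullException extends RuntimeException {
        public QueueFullException(String message) {
            super(message);
        }
    }

    /** Enqueues according to the assumed timeout semantics described in the lead-in. */
    static void enqueue(BlockingQueue<String> queue, String message, long timeoutMs)
            throws InterruptedException {
        boolean added;
        if (timeoutMs == 0) {
            added = queue.offer(message);              // returns at once, never blocks
        } else if (timeoutMs < 0) {
            queue.put(message);                        // blocks until space is available
            added = true;
        } else {
            added = queue.offer(message, timeoutMs, TimeUnit.MILLISECONDS);
        }
        if (!added) {
            throw new QueueFullException("queue full, dropped: " + message);
        }
    }

    /** Counts rejected messages when nothing drains the queue and the timeout is 0. */
    static int rejectedCount(int capacity, int messages) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        int rejected = 0;
        for (int i = 0; i < messages; i++) {
            try {
                enqueue(queue, "message-" + i, 0);
            } catch (QueueFullException e) {
                rejected++;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        // prints "rejected: 3"
        System.out.println("rejected: " + rejectedCount(2, 5));
    }
}
```

With a timeout of 0 the caller never waits on a full queue, so an application that chooses this mode has to decide what to do with the rejected messages, for example dropping them or retrying later.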
Based on the numbers obtained above, we can draw the following preliminary conclusions:
We have just scratched the surface and there is still a lot of work to be done. Following is a list of some of the things we will probably look into:
It is our hope that the information we provided will be useful for people considering using Kafka for the first time or switching from 0.7 to 0.8. If you have any questions, comments or suggestions please leave them below.