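The Producer is one of Kafka's core components, so it is worth studying and analyzing. The producer configuration file lives in Kafka's config directory;
configuring it well pays off many times over in later study and work.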
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# format: host1:port1,host2:port2 ...
# Broker list (required; a cluster should list every node, this setup is a single machine)
# Partitioner class. Default: kafka.producer.DefaultPartitioner
partitioner.class=kafka.producer.DefaultPartitioner
# specifies whether the messages are sent asynchronously (async) or synchronously (sync)
# sync (the default) sends synchronously; async sends asynchronously and can raise send throughput
# the old config values work as well: 0, 1, 2, 3 for none, gzip, snappy, lz4, respectively
# Compression: 0 = no compression, 1 = gzip, 2 = snappy
compression.codec=0
# message encoder
# Serializer class
serializer.class=kafka.serializer.DefaultEncoder
# allow topic level compression
# When compressing messages, lists the topics to compress. Default is empty, meaning no compression
#compressed.topics=
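# Whether the producer requires an acknowledgement from the server; three values: 0, 1, -1
# 0: the producer does not wait for an ack from the broker
# 1: the ack is sent once the leader has received the message
# -1: the ack is sent once all in-sync followers have replicated the message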
request.required.acks=0
# maximum time, in milliseconds, for buffering data on the producer queue
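# In async mode, buffered messages are sent to the broker as a batch once this time elapses. Default: 5000 ms
queue.buffering.max.ms=5000
# the maximum size of the blocking queue for buffering on the producer
# In async mode, the maximum number of messages the producer may buffer
queue.buffering.max.messages=20000
# Maximum time the broker waits before sending an ack to the producer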
request.timeout.ms=10000
# +ve: enqueue will block up to this many milliseconds if the queue is full
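# Once the number of messages queued on the producer reaches "queue.buffering.max.messages"
# and, after blocking for some time, the queue still has no room (the producer has still not sent anything),
# the producer can either keep blocking or drop the message
# -1: no limit on blocking time; messages are never dropped
# 0: do not block; the message is dropped immediately if the queue is full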
queue.enqueue.timeout.ms=-1
# the number of messages batched at the producer
# In async mode, the number of messages sent per batch. Default: 200
batch.num.messages=200
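
To make the file format above concrete, here is a minimal sketch in Python that reads producer.properties-style text and runs a few sanity checks taken from the comments (broker list required; acks limited to 0, 1, -1; producer type limited to sync or async). The helper names `parse_properties` and `check_producer_config` are hypothetical and not part of Kafka, and the broker address `localhost:9092` is only a placeholder, not a value from the original file.

```python
def parse_properties(text):
    """Parse .properties-style text into a dict.

    Simplified on purpose: no escape sequences and no line continuations.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and whole-line comments.
        if not line or line.startswith("#") or line.startswith("!"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_producer_config(props):
    """Return a list of problems found (hypothetical helper, not a Kafka API)."""
    problems = []
    if not props.get("metadata.broker.list"):
        problems.append("metadata.broker.list is required")
    if props.get("request.required.acks", "0") not in {"0", "1", "-1"}:
        problems.append("request.required.acks must be 0, 1 or -1")
    if props.get("producer.type", "sync") not in {"sync", "async"}:
        problems.append("producer.type must be sync or async")
    return problems


SAMPLE = """\
# placeholder broker address, not taken from the original file
metadata.broker.list=localhost:9092
producer.type=async
request.required.acks=0
batch.num.messages=200
"""

props = parse_properties(SAMPLE)
problems = check_producer_config(props)
```

Like `java.util.Properties`, this parser only recognizes a comment when `#` starts the line, so a setting and its comment need to sit on separate lines; a `#` after a value would become part of the value.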
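Supplement: detailed producer.properties settings
Property | Default | Description |
---|---|---|
metadata.broker.list | | The list of brokers the producer queries at startup; it can be a subset of all brokers in the cluster. Note that this parameter is only used to fetch topic metadata; the producer picks the appropriate brokers from that metadata and opens socket connections to them. Format: host1:port1,host2:port2. |
request.required.acks | 0 | See Section 3.2. |
request.timeout.ms | 10000 | How long the broker waits for acks; if the wait exceeds this value, an error is returned to the client. |
producer.type | sync | Sync or async mode: async means asynchronous, sync means synchronous. In async mode the producer can push data in batches, which greatly improves broker performance; async is the recommended setting. |
serializer.class | kafka.serializer.DefaultEncoder | Serializer class; the default serializes to byte[]. |
key.serializer.class | | Serializer class for keys; defaults to the same class as serializer.class. |
partitioner.class | kafka.producer.DefaultPartitioner | Partitioner class; the default hashes the key. |
compression.codec | none | Compression format for producer messages. Valid values: "none", "gzip" and "snappy". See Section 4.1 for more on compression. |
compressed.topics | null | Names of the topics to compress. If a codec is selected above, compression applies only to the topics listed here; if this parameter is empty, it applies to all topics. |
message.send.max.retries | 3 | Number of retries when a send fails. If the network has problems, this can lead to repeated retries. |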
retry.backoff.ms | 100 | Before each retry, the producer refreshes the metadata of relevant topics to see if a new leader has been elected. Since leader election takes a bit of time, this property specifies the amount of time that the producer waits before refreshing the metadata. |
topic.metadata.refresh.interval.ms | 600 * 1000 | The producer generally refreshes the topic metadata from brokers when there is a failure (partition missing, leader not available…). It will also poll regularly (default: every 10 min, i.e. 600000 ms). If you set this to a negative value, metadata will only get refreshed on failure. If you set this to zero, the metadata will get refreshed after each message sent (not recommended). Important note: the refresh happens only AFTER the message is sent, so if the producer never sends a message the metadata is never refreshed. |
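queue.buffering.max.ms | 5000 | In async mode, how long the producer buffers messages. For example, with a value of 1000 it buffers one second of data and sends it in one go, which greatly increases broker throughput but reduces timeliness. |
queue.buffering.max.messages | 10000 | In async mode, the maximum number of messages held in the producer's buffer queue. Beyond this, the producer either blocks or drops messages. |
queue.enqueue.timeout.ms | -1 | How long the producer blocks once the limit above is reached. With 0, the producer does not block when the buffer queue is full and the message is dropped immediately. With -1, the producer blocks and no message is dropped. |
batch.num.messages | 200 | In async mode, the number of messages buffered per batch. The producer sends once this count is reached or queue.buffering.max.ms elapses, whichever comes first. |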
send.buffer.bytes | 100 * 1024 | Socket write buffer size |
client.id | “” | The client id is a user-specified string sent in each request to help trace calls. It should logically identify the application making the request. |
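
The interplay of queue.buffering.max.messages, queue.enqueue.timeout.ms and batch.num.messages described in the table can be illustrated with a toy model. The sketch below is written in Python with the standard `queue` module; it is not Kafka's actual implementation. The class `AsyncBufferSketch` and its method names are hypothetical, and the time-based trigger (queue.buffering.max.ms) is left out to keep the example short.

```python
import queue


class AsyncBufferSketch:
    """Toy model of an async producer buffer (illustration only)."""

    def __init__(self, max_messages=10000, batch_num_messages=200,
                 enqueue_timeout_ms=-1):
        # Bounded queue, playing the role of queue.buffering.max.messages.
        self._queue = queue.Queue(maxsize=max_messages)
        self._batch_size = batch_num_messages
        self._timeout_ms = enqueue_timeout_ms

    def enqueue(self, message):
        """Return True if the message was buffered, False if it was dropped."""
        try:
            if self._timeout_ms < 0:
                # -1: block until there is room; the message is never dropped.
                self._queue.put(message)
            elif self._timeout_ms == 0:
                # 0: never block; drop at once when the buffer is full.
                self._queue.put_nowait(message)
            else:
                # Positive value: block up to the timeout, then drop.
                self._queue.put(message, timeout=self._timeout_ms / 1000.0)
            return True
        except queue.Full:
            return False

    def next_batch(self):
        """Take up to batch_num_messages buffered messages for one send."""
        batch = []
        while len(batch) < self._batch_size:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                break
        return batch


# Tiny limits so the behaviour is easy to see.
buf = AsyncBufferSketch(max_messages=3, batch_num_messages=2,
                        enqueue_timeout_ms=0)
accepted = [buf.enqueue("m%d" % i) for i in range(5)]
first_batch = buf.next_batch()
```

With a buffer of 3 and a timeout of 0, the fourth and fifth messages are dropped instead of blocking the caller, and the first batch holds the two oldest messages. Setting `enqueue_timeout_ms=-1` in the same single-threaded script would block forever on the fourth message, because nothing drains the queue in the meantime.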