Start ZooKeeper on each of the three nodes: master, slave1, and slave2.
[root@master bin]# ./zkServer.sh start
Once all three nodes are up, remember to check the status on each of them:
[root@master bin]# ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/src/zookeeper/zookeeper-3.4.5/bin/../conf/zoo.cfg
Mode: follower
[root@slave1 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/src/zookeeper/zookeeper-3.4.5/bin/../conf/zoo.cfg
Mode: follower
[root@slave2 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/src/zookeeper/zookeeper-3.4.5/bin/../conf/zoo.cfg
Mode: leader
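A healthy three-node ensemble should report exactly one leader and two followers, as it does above. As a small sketch (not part of the original setup), the check can be automated by parsing the Mode: line of each node's status output. The helper names parse_mode and ensemble_is_healthy are hypothetical, and the status text is pasted in by hand here rather than collected from the nodes.

```python
def parse_mode(status_output: str) -> str:
    """Return the value of the 'Mode:' line printed by `zkServer.sh status`."""
    for line in status_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("no 'Mode:' line found in status output")


def ensemble_is_healthy(modes: dict) -> bool:
    """True when exactly one node is the leader and all others are followers."""
    values = list(modes.values())
    only_known_modes = all(m in ("leader", "follower") for m in values)
    return only_known_modes and values.count("leader") == 1


# Status text copied from the transcript above (master node).
master_status = """JMX enabled by default
Using config: /usr/local/src/zookeeper/zookeeper-3.4.5/bin/../conf/zoo.cfg
Mode: follower"""

modes = {
    "master": parse_mode(master_status),
    "slave1": "follower",  # taken from the slave1 output above
    "slave2": "leader",    # taken from the slave2 output above
}
print(modes, ensemble_is_healthy(modes))
```

If no node reports leader, the ensemble has not formed a quorum yet and Kafka will not be able to use it.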
I start Kafka on master only:
[root@master bin]# ./kafka-server-start.sh ../config/server.properties
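Use kafka-topics.sh to create a test topic named mytest with a replication factor of 1 and 3 partitions. Because the only running broker is the one on master, all 3 partitions live on master. Running kafka-topics.sh without arguments prints its usage: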
[root@master bin]# ./kafka-topics.sh
Create, delete, describe, or change a topic.
Option Description
------ -----------
--alter Alter the number of partitions,
replica assignment, and/or
configuration for the topic.
--config A topic configuration override for the
topic being created or altered.The
following is a list of valid
configurations:
cleanup.policy
compression.type
delete.retention.ms
file.delete.delay.ms
flush.messages
flush.ms
follower.replication.throttled.
replicas
index.interval.bytes
leader.replication.throttled.replicas
max.message.bytes
message.format.version
message.timestamp.difference.max.ms
message.timestamp.type
min.cleanable.dirty.ratio
min.compaction.lag.ms
min.insync.replicas
preallocate
retention.bytes
retention.ms
segment.bytes
segment.index.bytes
segment.jitter.ms
segment.ms
unclean.leader.election.enable
See the Kafka documentation for full
details on the topic configs.
--create Create a new topic.
--delete Delete a topic
--delete-config A topic configuration override to be
removed for an existing topic (see
the list of configurations under the
--config option).
--describe List details for the given topics.
--disable-rack-aware Disable rack aware replica assignment
--force Suppress console prompts
--help Print usage information.
--if-exists if set when altering or deleting
topics, the action will only execute
if the topic exists
--if-not-exists if set when creating topics, the
action will only execute if the
topic does not already exist
--list List all available topics.
--partitions The number of partitions for the topic
being created or altered (WARNING:
If partitions are increased for a
topic that has a key, the partition
logic or ordering of the messages
will be affected
--replica-assignment
--replication-factor partition in the topic being created.
--topic The topic to be create, alter or
describe. Can also accept a regular
expression except for --create option
--topics-with-overrides if set when describing topics, only
show topics that have overridden
configs
--unavailable-partitions if set when describing topics, only
show partitions whose leader is not
available
--under-replicated-partitions if set when describing topics, only
show under replicated partitions
--zookeeper REQUIRED: The connection string for
the zookeeper connection in the form
host:port. Multiple URLS can be
given to allow fail-over.
[root@master bin]# ./kafka-topics.sh --zookeeper master:2181 --create --topic mytest --replication-factor 1 --partitions 3
Created topic "mytest".
[root@master bin]# ./kafka-topics.sh --zookeeper master:2181 --describe --topic mytest
Topic:mytest PartitionCount:3 ReplicationFactor:1 Configs:
Topic: mytest Partition: 0 Leader: 0 Replicas: 0 Isr: 0
Topic: mytest Partition: 1 Leader: 0 Replicas: 0 Isr: 0
Topic: mytest Partition: 2 Leader: 0 Replicas: 0 Isr: 0
Partition is the partition number. Leader, Replicas, and Isr contain broker ids, and the broker id (the broker.id property) is configured in $KAFKA_HOME/config/server.properties.
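To make the columns of the --describe output concrete, here is a minimal sketch that parses the partition lines into dictionaries. It assumes the output format shown above (this Kafka version); the regular expression and the function name parse_partitions are my own and not part of any Kafka tooling.

```python
import re

# Output copied from the --describe call above.
DESCRIBE_OUTPUT = """\
Topic:mytest PartitionCount:3 ReplicationFactor:1 Configs:
Topic: mytest Partition: 0 Leader: 0 Replicas: 0 Isr: 0
Topic: mytest Partition: 1 Leader: 0 Replicas: 0 Isr: 0
Topic: mytest Partition: 2 Leader: 0 Replicas: 0 Isr: 0
"""

# Matches only the per-partition lines; the summary line has no "Partition:" field.
PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def parse_partitions(text):
    """Return one dict per partition with broker ids converted to integers."""
    rows = []
    for line in text.splitlines():
        match = PARTITION_LINE.search(line)
        if match is None:
            continue
        rows.append({
            "topic": match.group("topic"),
            "partition": int(match.group("partition")),
            "leader": int(match.group("leader")),
            "replicas": [int(b) for b in match.group("replicas").split(",")],
            "isr": [int(b) for b in match.group("isr").split(",") if b],
        })
    return rows


for row in parse_partitions(DESCRIBE_OUTPUT):
    print(row)
```

With a single broker, every partition has broker 0 as its leader, its only replica, and its only in-sync replica.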
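Because kafka-topics.sh requires --zookeeper when creating a topic, ZooKeeper now holds the metadata of the topic and its partitions. Run zkCli.sh to open the ZooKeeper client and take a look: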
[root@master bin]# ./zkCli.sh
Connecting to localhost:2181
2019-06-27 09:46:24,135 [myid:] - INFO [main:Environment@100] - Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
2019-06-27 09:46:24,139 [myid:] - INFO [main:Environment@100] - Client environment:host.name=master
2019-06-27 09:46:24,139 [myid:] - INFO [main:Environment@100] - Client environment:java.version=1.8.0_172
2019-06-27 09:46:24,139 [myid:] - INFO [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2019-06-27 09:46:24,140 [myid:] - INFO [main:Environment@100] - Client environment:java.home=/usr/local/src/java/jdk1.8.0_172/jre
2019-06-27 09:46:24,140 [myid:] - INFO [main:Environment@100] - Client environment:java.class.path=/usr/local/src/zookeeper/zookeeper-3.4.5/bin/../build/classes:/usr/local/src/zookeeper/zookeeper-3.4.5/bin/../build/lib/*.jar:/usr/local/src/zookeeper/zookeeper-3.4.5/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/local/src/zookeeper/zookeeper-3.4.5/bin/../lib/slf4j-api-1.6.1.jar:/usr/local/src/zookeeper/zookeeper-3.4.5/bin/../lib/netty-3.2.2.Final.jar:/usr/local/src/zookeeper/zookeeper-3.4.5/bin/../lib/log4j-1.2.15.jar:/usr/local/src/zookeeper/zookeeper-3.4.5/bin/../lib/jline-0.9.94.jar:/usr/local/src/zookeeper/zookeeper-3.4.5/bin/../zookeeper-3.4.5.jar:/usr/local/src/zookeeper/zookeeper-3.4.5/bin/../src/java/lib/*.jar:/usr/local/src/zookeeper/zookeeper-3.4.5/bin/../conf:.:/usr/local/src/java/jdk1.8.0_172/lib:/usr/local/src/java/jdk1.8.0_172/jre/lib:/usr/local/src/hive/apache-hive-1.2.2-bin/lib
2019-06-27 09:46:24,140 [myid:] - INFO [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2019-06-27 09:46:24,141 [myid:] - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2019-06-27 09:46:24,141 [myid:] - INFO [main:Environment@100] - Client environment:java.compiler=
2019-06-27 09:46:24,141 [myid:] - INFO [main:Environment@100] - Client environment:os.name=Linux
2019-06-27 09:46:24,141 [myid:] - INFO [main:Environment@100] - Client environment:os.arch=amd64
2019-06-27 09:46:24,141 [myid:] - INFO [main:Environment@100] - Client environment:os.version=3.10.0-862.el7.x86_64
2019-06-27 09:46:24,142 [myid:] - INFO [main:Environment@100] - Client environment:user.name=root
2019-06-27 09:46:24,142 [myid:] - INFO [main:Environment@100] - Client environment:user.home=/root
2019-06-27 09:46:24,142 [myid:] - INFO [main:Environment@100] - Client environment:user.dir=/usr/local/src/zookeeper/zookeeper-3.4.5/bin
2019-06-27 09:46:24,145 [myid:] - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@41906a77
2019-06-27 09:46:24,204 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@966] - Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
Welcome to ZooKeeper!
JLine support is enabled
2019-06-27 09:46:24,352 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@849] - Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session
2019-06-27 09:46:24,389 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1207] - Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x16b964e816b000a, negotiated timeout = 30000
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 7] ls /
[cluster, controller_epoch, controller, brokers, zookeeper, admin, isr_change_notification, consumers, config, hbase]
[zk: localhost:2181(CONNECTED) 8] ls /brokers
[ids, topics, seqid]
[zk: localhost:2181(CONNECTED) 9] ls /brokers/topics
[mytest, test, __consumer_offsets]
[zk: localhost:2181(CONNECTED) 10] ls /brokers/topics/mytest
[partitions]
[zk: localhost:2181(CONNECTED) 11] ls /brokers/topics/mytest/partitions
[0, 1, 2]
[zk: localhost:2181(CONNECTED) 12] ls /brokers/topics/mytest/partitions/0
[state]
[zk: localhost:2181(CONNECTED) 13] ls /brokers/topics/mytest/partitions/0/state
[]
[zk: localhost:2181(CONNECTED) 14] get /brokers/topics/mytest/partitions/0/state
{"controller_epoch":4,"leader":0,"version":1,"leader_epoch":0,"isr":[0]}
cZxid = 0x7000000bf
ctime = Thu Jun 27 08:46:04 CST 2019
mZxid = 0x7000000bf
mtime = Thu Jun 27 08:46:04 CST 2019
pZxid = 0x7000000bf
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 72
numChildren = 0
[zk: localhost:2181(CONNECTED) 15]
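The data stored in the state znode is plain JSON, so it is easy to inspect programmatically. The sketch below only decodes the string printed by the get command above; it does not connect to ZooKeeper, and the variable names are my own.

```python
import json

# Data of /brokers/topics/mytest/partitions/0/state, copied from the session above.
state_json = '{"controller_epoch":4,"leader":0,"version":1,"leader_epoch":0,"isr":[0]}'

state = json.loads(state_json)
leader = state["leader"]   # broker id of the partition leader
isr = state["isr"]         # broker ids of the in-sync replicas

print("leader broker:", leader)
print("in-sync replicas:", isr)
print("leader is in the ISR:", leader in isr)

# dataLength in the znode stat is the size of the stored data in bytes.
print("data length:", len(state_json.encode("utf-8")))
```

The length printed on the last line matches the dataLength = 72 field in the stat output.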
Use kafka-console-producer.sh to create a producer. Running it without arguments prints its usage:
[root@master bin]# ./kafka-console-producer.sh
Read data from standard input and publish it to Kafka.
Option Description
------ -----------
--batch-size Number of messages to send in a single
batch if they are not being sent
synchronously. (default: 200)
--broker-list REQUIRED: The broker list string in
the form HOST1:PORT1,HOST2:PORT2.
--compression-codec [String: The compression codec: either 'none',
compression-codec] 'gzip', 'snappy', or 'lz4'.If
specified without value, then it
defaults to 'gzip'
--key-serializer implementation to use for
serializing keys. (default: kafka.
serializer.DefaultEncoder)
--line-reader The class name of the class to use for
reading lines from standard in. By
default each line is read as a
separate message. (default: kafka.
tools.
ConsoleProducer$LineMessageReader)
--max-block-ms block for during a send request
(default: 60000)
--max-memory-bytes to buffer records waiting to be sent
to the server. (default: 33554432)
--max-partition-memory-bytes partition. When records are received
which are smaller than this size the
producer will attempt to
optimistically group them together
until this size is reached.
(default: 16384)
--message-send-max-retries Brokers can fail receiving the message
for multiple reasons, and being
unavailable transiently is just one
of them. This property specifies the
number of retires before the
producer give up and drop this
message. (default: 3)
--metadata-expiry-ms after which we force a refresh of
metadata even if we haven't seen any
leadership changes. (default: 300000)
--old-producer Use the old producer implementation.
--producer-property properties in the form key=value to
the producer.
--producer.config Producer config properties file. Note
that [producer-property] takes
precedence over this config.
--property A mechanism to pass user-defined
properties in the form key=value to
the message reader. This allows
custom configuration for a user-
defined message reader.
--queue-enqueuetimeout-ms 2147483647)
--queue-size If set and the producer is running in
asynchronous mode, this gives the
maximum amount of messages will
queue awaiting sufficient batch
size. (default: 10000)
--request-required-acks requests (default: 1)
--request-timeout-ms requests. Value must be non-negative
and non-zero (default: 1500)
--retry-backoff-ms Before each retry, the producer
refreshes the metadata of relevant
topics. Since leader election takes
a bit of time, this property
specifies the amount of time that
the producer waits before refreshing
the metadata. (default: 100)
--socket-buffer-size The size of the tcp RECV size.
(default: 102400)
--sync If set message send requests to the
brokers are synchronously, one at a
time as they arrive.
--timeout If set and the producer is running in
asynchronous mode, this gives the
maximum amount of time a message
will queue awaiting sufficient batch
size. The value is given in ms.
(default: 1000)
--topic REQUIRED: The topic id to produce
messages to.
--value-serializer implementation to use for
serializing values. (default: kafka.
serializer.DefaultEncoder)
[root@master bin]# ./kafka-console-producer.sh --broker-list master:9092 --topic mytest
Use kafka-console-consumer.sh to create a consumer and put it in the consumer group group_mytest. Running the script without arguments prints its usage:
[root@master bin]# ./kafka-console-consumer.sh
The console consumer is a tool that reads data from Kafka and outputs it to standard output.
Option Description
------ -----------
--blacklist Blacklist of topics to exclude from
consumption.
--bootstrap-server used): The server to connect to.
--consumer-property properties in the form key=value to
the consumer.
--consumer.config Consumer config properties file. Note
that [consumer-property] takes
precedence over this config.
--csv-reporter-enabled If set, the CSV metrics reporter will
be enabled
--delete-consumer-offsets If specified, the consumer path in
zookeeper is deleted when starting up
--enable-systest-events Log lifecycle events of the consumer
in addition to logging consumed
messages. (This is specific for
system tests.)
--formatter The name of a class to use for
formatting kafka messages for
display. (default: kafka.tools.
DefaultMessageFormatter)
--from-beginning If the consumer does not already have
an established offset to consume
from, start with the earliest
message present in the log rather
than the latest message.
--key-deserializer
--max-messages The maximum number of messages to
consume before exiting. If not set,
consumption is continual.
--metrics-dir this parameter isset, the csv
metrics will be outputed here
--new-consumer Use the new consumer implementation.
This is the default.
--offset The offset id to consume from (a non-
negative number), or 'earliest'
which means from beginning, or
'latest' which means from end
(default: latest)
--partition The partition to consume from.
--property The properties to initialize the
message formatter.
--skip-message-on-error If there is an error when processing a
message, skip it instead of halt.
--timeout-ms If specified, exit if no message is
available for consumption for the
specified interval.
--topic The topic id to consume on.
--value-deserializer
--whitelist Whitelist of topics to include for
consumption.
--zookeeper REQUIRED (only when using old
consumer): The connection string for
the zookeeper connection in the form
host:port. Multiple URLS can be
given to allow fail-over.
[root@master bin]# ./kafka-console-consumer.sh --bootstrap-server master:9092 --topic mytest --consumer-property group.id=group_mytest
[root@master bin]# ./kafka-console-producer.sh --broker-list master:9092 --topic mytest
a
b
c
d
e
a
b
c
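The consumer receives the messages, but not in the order in which the producer sent them. When a topic has more than one partition, Kafka only preserves order within each partition, not across partitions. The consumer fetches from the partitions independently, and because network and read/write performance differ from partition to partition, the records of different partitions interleave in an unpredictable way.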
[root@master bin]# ./kafka-console-consumer.sh --bootstrap-server master:9092 --topic mytest --consumer-property group.id=group_mytest
a
c
b
d
a
e
b
c
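The ordering behaviour can be reproduced without a broker. The sketch below assumes, purely for illustration, that the eight keyless messages were spread over the three partitions round-robin (the real assignment depends on the producer's partitioner), tags each message with its send sequence number, and checks that the order printed by the consumer above keeps the order inside every partition even though the global order is lost. The function name preserves_partition_order is hypothetical.

```python
from collections import defaultdict

sent = ["a", "b", "c", "d", "e", "a", "b", "c"]
NUM_PARTITIONS = 3

# Assumption: round-robin placement of keyless messages.
partitions = defaultdict(list)
for seq, value in enumerate(sent):
    partitions[seq % NUM_PARTITIONS].append((seq, value))


def preserves_partition_order(consumed, partitions):
    """True if the records of every partition appear in `consumed` in log order."""
    position = {record: i for i, record in enumerate(consumed)}
    for log in partitions.values():
        indexes = [position[record] for record in log]
        if indexes != sorted(indexes):
            return False
    return True


# The consumer output above (a c b d a e b c), tagged with send sequence numbers
# under the round-robin assumption.
consumed = [(0, "a"), (2, "c"), (1, "b"), (3, "d"),
            (5, "a"), (4, "e"), (6, "b"), (7, "c")]

sequence = [seq for seq, _ in consumed]
print("globally ordered:", sequence == sorted(sequence))
print("ordered within each partition:", preserves_partition_order(consumed, partitions))
```

If strict global ordering matters, the usual options are a single partition or a message key that routes related messages to the same partition.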
Use kafka-consumer-offset-checker.sh to check the consumer's offsets and see whether messages are piling up. Running it without arguments prints its usage:
[root@master bin]# ./kafka-consumer-offset-checker.sh
[2019-06-27 08:47:42,469] WARN WARNING: ConsumerOffsetChecker is deprecated and will be dropped in releases following 0.9.0. Use ConsumerGroupCommand instead. (kafka.tools.ConsumerOffsetChecker$)
Check the offset of your consumers.
Option Description
------ -----------
--broker-info Print broker info
--group Consumer group.
--help Print this message.
--retry.backoff.ms Retry back-off to use for failed offset queries.
(default: 3000)
--socket.timeout.ms Socket timeout to use when querying for offsets.
(default: 6000)
--topic Comma-separated list of consumer topics (all
topics if absent).
--zookeeper ZooKeeper connect string. (default: localhost:
2181)
[root@master bin]# ./kafka-consumer-offset-checker.sh --zookeeper master:2181 --topic mytest --group group_mytest --broker-info
[2019-06-27 08:49:31,431] WARN WARNING: ConsumerOffsetChecker is deprecated and will be dropped in releases following 0.9.0. Use ConsumerGroupCommand instead. (kafka.tools.ConsumerOffsetChecker$)
Group Topic Pid Offset logSize Lag Owner
group_mytest mytest 0 3 3 0 none
group_mytest mytest 1 2 2 0 none
group_mytest mytest 2 3 3 0 none
BROKER INFO
0 -> master:9092
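The output shows that the consumer belongs to the group group_mytest and consumes the topic mytest. Pid lists the three partitions being consumed, Offset is the consumer's committed offset, logSize is the log end offset of each partition, and Lag is how far the consumer is behind (messages that have piled up and have not yet been consumed and committed). Here Offset = logSize and Lag = 0 on every partition, so all messages in the topic's partitions have been consumed and nothing is pending. The producer sent 8 messages (one per line), and the logSize values also add up to 8 (3 + 2 + 3). The last line is the broker information: the Kafka instance with broker id 0 (if a machine runs only one Kafka instance, the broker also identifies the machine), reachable on port 9092.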
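The relationship between the three columns is simply Lag = logSize - Offset per partition. As a minimal sketch, the class and function names below are my own; the first data set is the table above, and the second is the later check in this article where each partition reports a Lag of 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionOffset:
    partition: int
    offset: int    # consumer's committed offset (Offset column)
    log_size: int  # log end offset of the partition (logSize column)

    @property
    def lag(self) -> int:
        """Messages written to the partition but not yet committed by the group."""
        return self.log_size - self.offset


def total_lag(rows):
    return sum(row.lag for row in rows)


caught_up = [PartitionOffset(0, 3, 3), PartitionOffset(1, 2, 2), PartitionOffset(2, 3, 3)]
behind = [PartitionOffset(0, 8, 10), PartitionOffset(1, 7, 9), PartitionOffset(2, 7, 9)]

print("messages in topic:", sum(row.log_size for row in caught_up))
print("total lag when caught up:", total_lag(caught_up))
print("per-partition lag when behind:", [row.lag for row in behind])
```

A total lag that keeps growing between two checks means the group consumes more slowly than the producer writes.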
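Note also that a consumer printing a message does not mean the message has been consumed. The consumer pulls the messages it needs from Kafka, processes them, and only then counts them as consumed and commits the offset (updates Offset). Sometimes, because of network problems, the consumer has already printed the messages on the console while the offset check still shows a non-zero Lag, that is, Offset is still smaller than logSize.
The commands and output for the steps described next are collected in the sections that follow, in the same order.
Because the consumer above was created without specifying ZooKeeper, /consumers in ZooKeeper holds no information about the group_mytest consumer group. The group_test entry listed there is a consumer group for a different topic, test, not the group_mytest group we just created for mytest.
If the consumer is created with --zookeeper instead and the producer then sends a few more messages, the offset check shows that messages have piled up.
At that point the data can be consumed starting from a specified offset.
Going back into ZooKeeper now shows the offsets of the group_mytest consumer group for each partition of the mytest topic.
For mytest, create two consumers in the group_mytest consumer group so that they jointly consume the messages in the topic's three partitions. Then have the producer send another batch of messages and look at what each of the two consumers, consumer1 and consumer2, receives.
The two consumers in the same group each receive a different part of the messages in the same topic. This confirms that a partition of a topic is consumed by exactly one consumer within a consumer group and never by several consumers of the same group.
Different consumer groups: consumer group 1 has two consumers (consumer1 and consumer2) and consumer group 2 has one consumer (consumer3).
When a consumer group has only one consumer, that consumer consumes the messages of all partitions of the topic. When a group has several consumers, each of them consumes the messages of some of the topic's partitions, but a partition is never consumed by more than one consumer of the group at the same time.
To delete a topic:
1. Delete the topic's directories under the Kafka storage directory (the log.dirs setting in server.properties, "/tmp/kafka-logs" by default).
You can list all topics with kafka-topics.sh --list. To really delete a topic, log in to the ZooKeeper client and then: (2) find the directory that holds the topics; (3) find the topic to delete and remove it, at which point the topic is completely deleted; (4) restart Kafka.
Viewing consumer offsets in ZooKeeper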
[root@master bin]# ./zkCli.sh
[zk: localhost:2181(CONNECTED) 1] ls /
[cluster, controller_epoch, controller, brokers, zookeeper, admin, isr_change_notification, consumers, config, hbase]
[zk: localhost:2181(CONNECTED) 2] ls /consumers
[group_test]
[root@master bin]# ./kafka-console-consumer.sh --zookeeper master:2181 --topic mytest --consumer-property group.id=group_mytest
[root@master bin]# ./kafka-consumer-offset-checker.sh --zookeeper master:2181 --topic mytest --group group_mytest --broker-info
[2019-06-27 10:16:59,703] WARN WARNING: ConsumerOffsetChecker is deprecated and will be dropped in releases following 0.9.0. Use ConsumerGroupCommand instead. (kafka.tools.ConsumerOffsetChecker$)
Group Topic Pid Offset logSize Lag Owner
group_mytest mytest 0 8 10 2 group_mytest_master-1561601666132-ea1c06cb-0
group_mytest mytest 1 7 9 2 group_mytest_master-1561601666132-ea1c06cb-0
group_mytest mytest 2 7 9 2 group_mytest_master-1561601666132-ea1c06cb-0
BROKER INFO
0 -> master:9092
./kafka-console-consumer.sh --bootstrap-server master:9092 --topic mytest --consumer-property group.id=group_mytest --offset 8 --partition 0
[zk: localhost:2181(CONNECTED) 15] ls /
[cluster, controller_epoch, controller, brokers, zookeeper, admin, isr_change_notification, consumers, config, hbase]
[zk: localhost:2181(CONNECTED) 16] ls /consumers
[group_test, group_mytest]
[zk: localhost:2181(CONNECTED) 18] ls /consumers/group_mytest
[ids, owners, offsets]
[zk: localhost:2181(CONNECTED) 19] ls /consumers/group_mytest/offsets
[mytest]
[zk: localhost:2181(CONNECTED) 20] ls /consumers/group_mytest/offsets/mytest
[0, 1, 2]
[zk: localhost:2181(CONNECTED) 21] ls /consumers/group_mytest/offsets/mytest/0
[]
[zk: localhost:2181(CONNECTED) 22] get /consumers/group_mytest/offsets/mytest/0
10
cZxid = 0x7000000e4
ctime = Thu Jun 27 10:06:24 CST 2019
mZxid = 0x700000100
mtime = Thu Jun 27 10:15:26 CST 2019
pZxid = 0x7000000e4
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 2
numChildren = 0
Experiment: consumers in the same consumer group versus consumers in different consumer groups
producer:[root@master bin]# ./kafka-console-producer.sh --broker-list master:9092 --topic mytest
1
2
3
4
5
6
7
8
9
10
11
12
13
consumer1:[root@master bin]# ./kafka-console-consumer.sh --zookeeper master:2181 --topic mytest --consumer-property group.id=group_mytest
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
2
1
4
5
7
8
11
10
13
consumer2:[root@master bin]# ./kafka-console-consumer.sh --zookeeper master:2181 --topic mytest --consumer-property group.id=group_mytest
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
3
6
9
12
producer:[root@master bin]# ./kafka-console-producer.sh --broker-list master:9092 --topic mytest
1
2
3
4
5
6
7
8
9
0
10
11
12
13
14
15
16
consumer1:[root@master bin]# ./kafka-console-consumer.sh --zookeeper master:2181 --topic mytest --consumer-property group.id=group_mytest
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
2
1
5
4
7
8
0
10
12
13
15
16
consumer2:[root@master bin]# ./kafka-console-consumer.sh --zookeeper master:2181 --topic mytest --consumer-property group.id=group_mytest
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
3
6
9
11
14
consumer3:[root@master bin]# ./kafka-console-consumer.sh --zookeeper master:2181 --topic mytest --consumer-property group.id=group_addmytest
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
2
1
4
3
5
7
6
8
9
0
10
11
12
13
14
15
16
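The split seen in the transcripts (one consumer of group_mytest receives roughly two thirds of the messages, the other one third, and the single consumer of group_addmytest receives everything) is what you get when three partitions are divided among the members of each group independently. The sketch below imitates a range-style assignment; that this is the exact strategy used here is my assumption, and the function name range_assign and the consumer names are only illustrative.

```python
def range_assign(partitions, consumers):
    """Give each consumer a contiguous block of partitions; earlier consumers
    get one extra partition when the division is not even."""
    consumers = sorted(consumers)
    per_consumer, extra = divmod(len(partitions), len(consumers))
    assignment = {}
    start = 0
    for index, consumer in enumerate(consumers):
        count = per_consumer + (1 if index < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment


partitions = [0, 1, 2]
groups = {
    "group_mytest": ["consumer1", "consumer2"],
    "group_addmytest": ["consumer3"],
}

# Each group is assigned all partitions on its own, independently of other groups.
for group, members in groups.items():
    print(group, range_assign(partitions, members))
```

Within a group no partition appears twice, while across groups every group sees every partition, which is why consumer3 prints all messages.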
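Deleting the topic
2. Delete the topic in Kafka (the topic has to be deleted on every node in the cluster):
If the server.properties file that Kafka loads at startup does not set "delete.topic.enable=true", the deletion is not a real deletion; the topic is only flagged as "marked for deletion".
kafka/bin/kafka-topics.sh --list --zookeeper <ZooKeeper address>
(1) Log in to the ZooKeeper client:
./zkCli.sh -server 0.0.0.0:2181,0.0.0.0:2181,0.0.0.0:2181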
ls /brokers/topics
rmr /brokers/topics/new_skxb_fatigueChatReset
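Before removing anything by hand it helps to write down exactly what belongs to the topic. The sketch below only builds the list of candidates and deletes nothing. It relies on Kafka naming each partition directory <topic>-<partition> under log.dirs; the function name is hypothetical, and the two extra znodes under /config/topics and /admin/delete_topics are not mentioned in this article, so treat them as an assumption to verify with ls in zkCli.sh first.

```python
import posixpath


def topic_cleanup_targets(topic, num_partitions, log_dirs="/tmp/kafka-logs"):
    """List the on-disk directories and znodes that belong to a topic."""
    log_directories = [
        posixpath.join(log_dirs, f"{topic}-{partition}")
        for partition in range(num_partitions)
    ]
    znodes = [
        f"/brokers/topics/{topic}",       # removed with rmr in step (3) above
        f"/config/topics/{topic}",        # assumption: per-topic config overrides
        f"/admin/delete_topics/{topic}",  # assumption: pending "marked for deletion" flag
    ]
    return {"log_directories": log_directories, "znodes": znodes}


targets = topic_cleanup_targets("mytest", 3)
for kind, paths in targets.items():
    print(kind)
    for path in paths:
        print("  ", path)
```

On a multi-broker cluster the directory part has to be repeated on every broker that hosts a replica of the topic.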