As I mentioned before, I want to get hands-on quickly, so I chose Docker to install the components and docker-compose to handle the communication between them. Same approach this time: first stand up a Kafka cluster with Docker to play with, and later, when I have time, I will build a cluster from scratch end to end.
Create a Docker network named `zookeeper_network` with `docker network create zookeeper_network`. You can then confirm it exists with `docker network ls`.
To keep things decoupled, we split the ZooKeeper cluster and the Kafka cluster into two separate docker-compose.yml files. First, the ZooKeeper cluster:
```yaml
version: '3.8'

networks:
  default:
    external:
      name: zookeeper_network

services:
  zoo1:
    image: zookeeper:3.7.0
    container_name: zoo1
    hostname: zoo1
    ports:
      - 2181:2181
    volumes:
      - "/root/kafka_learn/zookeeper/zoo1/data:/data"
      - "/root/kafka_learn/zookeeper/zoo1/datalog:/datalog"
    environment:
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=0.0.0.0:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888;2181
  zoo2:
    image: zookeeper:3.7.0
    container_name: zoo2
    hostname: zoo2
    ports:
      - 2182:2181
    volumes:
      - "/root/kafka_learn/zookeeper/zoo2/data:/data"
      - "/root/kafka_learn/zookeeper/zoo2/datalog:/datalog"
    environment:
      ZOO_MY_ID: 2
      ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=0.0.0.0:2888:3888;2181 server.3=zoo3:2888:3888;2181
  zoo3:
    image: zookeeper:3.7.0
    container_name: zoo3
    hostname: zoo3
    ports:
      - 2183:2181
    volumes:
      - "/root/kafka_learn/zookeeper/zoo3/data:/data"
      - "/root/kafka_learn/zookeeper/zoo3/datalog:/datalog"
    environment:
      ZOO_MY_ID: 3
      ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=0.0.0.0:2888:3888;2181
```
Then start the ZooKeeper cluster with `docker-compose up -d`.
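As a quick sanity check that the ensemble is reachable from outside the containers, you can connect to the mapped ports with the ZooKeeper Java client. This is only a minimal sketch, assuming it runs on the Docker host itself (so `localhost:2181`-`2183` resolve to the mapped ports); from another machine you would use the server IP instead.

```java
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

import java.util.List;
import java.util.concurrent.CountDownLatch;

public class ZkSmokeTest {
    public static void main(String[] args) throws Exception {
        CountDownLatch connected = new CountDownLatch(1);
        // Connect to all three ZooKeeper nodes via the ports mapped in docker-compose.
        ZooKeeper zk = new ZooKeeper("localhost:2181,localhost:2182,localhost:2183", 5000, event -> {
            if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connected.countDown();
            }
        });
        connected.await();                              // wait until the session is established
        List<String> children = zk.getChildren("/", false);
        System.out.println("root znodes: " + children); // e.g. [zookeeper] before Kafka registers anything
        zk.close();
    }
}
```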
Next, the docker-compose.yml for the Kafka cluster and Kafka Manager:

```yaml
version: '3.8'

networks:
  default:
    external:
      name: zookeeper_network

services:
  kafka1:
    image: wurstmeister/kafka:2.12-2.4.1
    restart: unless-stopped
    container_name: kafka1
    hostname: kafka1
    ports:
      - "9092:9092"
    external_links:
      - zoo1
      - zoo2
      - zoo3
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://192.168.0.225:9092   ## host machine IP
      KAFKA_ADVERTISED_HOST_NAME: kafka1
      KAFKA_ADVERTISED_PORT: 9092
      KAFKA_ZOOKEEPER_CONNECT: "zoo1:2181,zoo2:2181,zoo3:2181"
    volumes:
      - "/root/kafka_learn/kafka/kafka1/data/:/kafka"
  kafka2:
    image: wurstmeister/kafka:2.12-2.4.1
    restart: unless-stopped
    container_name: kafka2
    hostname: kafka2
    ports:
      - "9093:9092"
    external_links:
      - zoo1
      - zoo2
      - zoo3
    environment:
      KAFKA_BROKER_ID: 2
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://192.168.0.225:9093   ## host machine IP
      KAFKA_ADVERTISED_HOST_NAME: kafka2
      KAFKA_ADVERTISED_PORT: 9093
      KAFKA_ZOOKEEPER_CONNECT: "zoo1:2181,zoo2:2181,zoo3:2181"
    volumes:
      - "/root/kafka_learn/kafka/kafka2/data/:/kafka"
  kafka3:
    image: wurstmeister/kafka:2.12-2.4.1
    restart: unless-stopped
    container_name: kafka3
    hostname: kafka3
    ports:
      - "9094:9092"
    external_links:
      - zoo1
      - zoo2
      - zoo3
    environment:
      KAFKA_BROKER_ID: 3
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://192.168.0.225:9094   ## host machine IP
      KAFKA_ADVERTISED_HOST_NAME: kafka3
      KAFKA_ADVERTISED_PORT: 9094
      KAFKA_ZOOKEEPER_CONNECT: "zoo1:2181,zoo2:2181,zoo3:2181"
    volumes:
      - "/root/kafka_learn/kafka/kafka3/data/:/kafka"
  kafka-manager:               # Kafka web management UI
    image: sheepkiller/kafka-manager:latest
    restart: unless-stopped
    container_name: kafka-manager
    hostname: kafka-manager
    ports:
      - "9000:9000"
    links:                     # containers created by this compose file
      - kafka1
      - kafka2
      - kafka3
    external_links:            # containers created by the other compose file
      - zoo1
      - zoo2
      - zoo3
    environment:
      ZK_HOSTS: zoo1:2181,zoo2:2181,zoo3:2181
      KAFKA_BROKERS: kafka1:9092,kafka2:9093,kafka3:9094
```
Then start the Kafka cluster with `docker-compose up -d` and check with `docker ps` that the containers are up. You can also open the Kafka Manager UI at `<server-ip>:9000` and add our cluster there.

Now test inside the containers. Start a console producer in kafka1:

```bash
docker exec -it kafka1 /bin/bash
kafka-console-producer.sh --broker-list kafka1:9092,kafka2:9092,kafka3:9092 --topic test
>msg1
>msg2
```

And a console consumer in kafka2:

```bash
docker exec -it kafka2 /bin/bash
kafka-console-consumer.sh --bootstrap-server kafka1:9092,kafka2:9092,kafka3:9092 --topic test --from-beginning
msg1
msg2
```
Try it yourself: each time the producer sends a message, a new message shows up on the consumer side.
The cluster built above passes the producer/consumer test inside the containers without any problems, but when I tried connecting with kafka tools and from a Java program, I could not connect no matter what I did. So I figured applications outside Docker were unable to reach the applications inside it, and started debugging.
I am on a Huawei Cloud server: the public IP is 123.xxxxxx and the private IP is 192.xxxxxx. My configuration above was `KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://192.168.0.225:9093`, i.e. the private IP. So let's switch it to the public IP and try again, which produces a new problem: `Broker may not be available`
…
Fine. Determined not to go to bed until this was solved, I patiently went through solutions and the documentation, and found the answer:
When Kafka is deployed directly (without Docker), its configuration takes this form:
```properties
broker.id=1
# listen on port 9093
listeners=PLAINTEXT://:9093
# address exposed to the outside: host IP and port
advertised.listeners=PLAINTEXT://192.168.206.155:9093
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs-1
```
while in our docker-compose.yml the same settings take this form:
```yaml
environment:
  KAFKA_BROKER_ID: 2
  KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
  KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://192.168.0.225:9093   ## host machine IP
  KAFKA_ADVERTISED_HOST_NAME: kafka2
  KAFKA_ADVERTISED_PORT: 9093
  KAFKA_ZOOKEEPER_CONNECT: "zoo1:2181,zoo2:2181,zoo3:2181"
```
So it is really just the original Kafka configuration keys with everything uppercased, dots replaced by underscores, and a `KAFKA_` prefix added.
You can see that we configure two listener settings here. What is each of them for?
- `advertised.host.name`: DEPRECATED: only used when advertised.listeners or listeners are not set. Use advertised.listeners instead. Hostname to publish to ZooKeeper for clients to use. In IaaS environments, this may need to be different from the interface to which the broker binds. If this is not set, it will use the value for host.name if configured. Otherwise it will use the value returned from java.net.InetAddress.getCanonicalHostName().
- `advertised.listeners`: Listeners to publish to ZooKeeper for clients to use, if different than the listeners config property. In IaaS environments, this may need to be different from the interface to which the broker binds. If this is not set, the value for listeners will be used. Unlike listeners it is not valid to advertise the 0.0.0.0 meta-address.
To paraphrase: if no listener parameter is set, the hostname and port are what gets published for clients to use; if a listener parameter is set, its value is published instead. `listeners` and `advertised.listeners` exist to distinguish the internal network from the external one: if only internal access is needed, `listeners` alone is enough, but if an external network is involved, for example when deploying on a cloud server, you also need the `advertised.listeners` parameter.
Right now our internal setting is `KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092`; to allow external access we can set `KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://<server-ip>:<exposed-port>`. `advertised.listeners` is the address handed out to clients; if it is not set, `listeners` is used directly, and our `listeners` binds to `0.0.0.0`, which the outside world obviously cannot use to reach us!
To understand this, we first need to know what ZooKeeper does for Kafka. Let me briefly mention the one use that matters most here: broker registration. Brokers are deployed in a distributed way and are independent of one another, so a registry is needed to manage all the brokers in the cluster, and that is where ZooKeeper comes in. ZooKeeper keeps a dedicated node for recording the list of broker servers: `/brokers/ids`.
Let's verify this. Enter the zoo1 node with `docker exec -it zoo1 /bin/sh`, run `zkCli.sh`, and use `ls /brokers/ids` to see the three broker nodes created by our yml file.
Now let's see what data each of those nodes stores, with `get /brokers/ids/1`:

```json
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},"endpoints":["PLAINTEXT://123.xxxxxxx:9092"],"jmx_port":-1,"host":"123.xxxxxxxx","timestamp":"1650424784738","port":9092,"version":4}
```
[When a Kafka client connects to a broker, the broker checks the information attached to the client's request against the `listeners` information the broker reported to ZooKeeper at startup, i.e. the host (the domain or hostname part of the listener) and the port (the port part of the listener). If the check passes, the connection is established and the request is executed; otherwise the connection fails.]{.red}
In other words, whatever you configure in `advertised.listeners` is exactly what ends up in the `endpoints` field of the ZooKeeper node, and connections are then matched strictly against `endpoints`; if they do not match, the connection is refused.
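A handy way to see, from the client side, exactly which addresses the brokers are advertising (i.e. what ended up in `endpoints`) is to ask for the cluster metadata with the Java `AdminClient`. This is just a sketch; the bootstrap address is a placeholder that you must replace with whatever your `advertised.listeners` actually contains.

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.common.Node;

import java.util.Properties;

public class ShowAdvertisedEndpoints {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder: must match what follows PLAINTEXT:// in advertised.listeners.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "<public-ip>:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // The broker addresses returned here are the advertised endpoints;
            // the client uses them for all subsequent connections.
            for (Node node : admin.describeCluster().nodes().get()) {
                System.out.println("broker " + node.id() + " -> " + node.host() + ":" + node.port());
            }
        }
    }
}
```

If the host and port printed here are something your machine cannot actually reach (a private IP, `0.0.0.0`, or an unresolvable hostname), the client connection will fail exactly as described above.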
This matching rule also explains nicely why some blog posts that configure `advertised.listeners=PLAINTEXT://kafka1:9092` tell you to configure `kafka1:9092` in the Java program!
To put it plainly:

- With `advertised.listeners=PLAINTEXT://123.xxxxxx:9092`, your Java config needs `props.put("bootstrap.servers", "123.xxxxx:9092");`.
- With `advertised.listeners=PLAINTEXT://kafka1:9092`, your Java config needs `props.put("bootstrap.servers", "kafka1:9092");`, plus one extra step: since you wrote `kafka1`, your Java program has no idea which IP to send requests to, so whichever machine runs the Java program must have the IP-to-hostname mapping added to its hosts file: `123.xxxxxx kafka1`.
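For completeness, here is a minimal Java producer sketch for that external test, assuming the `kafka1:9092` form of `advertised.listeners` and the hosts-file mapping just described (with the public-IP form you would put that address in `bootstrap.servers` instead). The topic name `test` is the one from the console test earlier.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class ExternalProducerTest {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Must match the advertised listener; requires the "123.xxxxxx kafka1" hosts entry here.
        props.put("bootstrap.servers", "kafka1:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Synchronous send so a connection problem surfaces immediately as an exception.
            RecordMetadata meta = producer.send(new ProducerRecord<>("test", "msg-from-java")).get();
            System.out.println("written to " + meta.topic() + "-" + meta.partition() + "@" + meta.offset());
        }
    }
}
```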
I actually tried both forms, public IP and hostname, and both worked. As for the `Broker may not be available` error I hit earlier with the public IP, I will chalk it up to network flakiness: I saw the error late at night using the public IP, could not figure it out, tested the public IP again during the day, and it worked...
To sum up: for access from outside, you must configure the `advertised.listeners` parameter, and it must use the public IP! And the address your client connects to must be exactly the same as whatever follows `PLAINTEXT://` in `advertised.listeners`!