Building a Kafka Cluster with Docker (Docker & Kafka & Cluster)


  • Environment preparation
    • Operating system
    • Install Docker
  • Single instance (without Docker)
    • Install the JDK
    • Download the package
    • Start the processes
    • Test
  • Cluster installation (without Docker)
    • Preparation
    • Start the ZooKeeper cluster
    • Change the configuration
    • Start
    • Test
  • Docker single instance
    • Start the ZooKeeper cluster
    • Start the image
    • Test
  • Monitoring node
  • Overlay-network cluster
    • Prepare the base environment
    • Create the swarm overlay network
    • Create data directories
    • Start ZooKeeper
    • Start commands
    • Check the nodes
    • Test
  • Building the cluster with Docker Stack
    • Start ZooKeeper
    • Write the configuration
    • Start the cluster
    • Check the nodes
    • Test

Environment preparation

Operating system

CentOS 7.6

Install Docker

Refer to the Docker installation guide.
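For reference, a minimal sketch of installing Docker CE on CentOS 7 from the official yum repository (the repository URL and package names below are the standard docker-ce ones, not taken from the linked guide):

yum install -y yum-utils
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install -y docker-ce docker-ce-cli containerd.io
#Start Docker now and enable it at boot
systemctl enable --now docker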

Single instance (without Docker)

Install the JDK

Download the 1.8 tar.gz from the official site. If you install via yum or from an rpm package, some files required by Scala 2.11 will be missing.

mkdir -p /usr/lib/jvm
tar xf jdk-8u221-linux-x64.tar.gz -C /usr/lib/jvm
rm -rf /usr/bin/java
ln -s /usr/lib/jvm/jdk1.8.0_221/bin/java /usr/bin/java

Edit the file

vim /etc/profile.d/java.sh

Add the following:

export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_221
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=${JAVA_HOME}/lib:${JRE_HOME}/lib:$CLASSPATH
export PATH=${JAVA_HOME}/bin:$PATH

Then apply the environment variables:

source /etc/profile

Run the following command to check the environment variable:

[root@vm1 bin]# echo $JAVA_HOME
/usr/lib/jvm/jdk1.8.0_221

Download the package

Get the download URL from the official site.

#closer.cgi only lists mirrors; download the tarball directly from a mirror or the Apache archive
wget https://archive.apache.org/dist/kafka/2.3.0/kafka_2.11-2.3.0.tgz

Extract it:

tar xf kafka_2.11-2.3.0.tgz -C /opt/

Start the processes

Kafka depends on ZooKeeper, so start ZooKeeper first:

cd /opt/kafka_2.11-2.3.0/bin
./zookeeper-server-start.sh -daemon ../config/zookeeper.properties 

Once ZooKeeper is up, start the Kafka process:

./kafka-server-start.sh -daemon ../config/server.properties

Check ports 2181 and 9092 to confirm that ZooKeeper and Kafka are running.
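One way to check the listening ports (ss ships with iproute on CentOS 7; netstat works just as well if you prefer it):

#Both 2181 (ZooKeeper) and 9092 (Kafka) should appear in LISTEN state
ss -tlnp | grep -E '2181|9092'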

Test

#Create a topic
./kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test
#List topics
./kafka-topics.sh --list --bootstrap-server localhost:9092
#Produce messages
./kafka-console-producer.sh --broker-list localhost:9092 --topic test
This is a message
This is another message
#Consume messages
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning

Make sure all of the above operations complete successfully.

Cluster installation (without Docker)

Preparation

Start three virtual machines, vm1, vm2, and vm3, on the same subnet, and install Kafka on each of them following the single-instance steps.

Start the ZooKeeper cluster

For the ZooKeeper cluster configuration, see the separate ZooKeeper guide.
You can build the cluster with the ZooKeeper bundled inside Kafka, or use a Dockerized ZooKeeper cluster.
The steps below use the Dockerized ZooKeeper cluster, i.e. the connection string is "localhost:2181,localhost:2182,localhost:2183".
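If that cluster is not already running, here is a minimal sketch of a three-node ZooKeeper ensemble on a single Docker host (the container names, the zk-net bridge network, and the port mappings are assumptions made for this sketch, not the configuration from the linked guide):

docker network create zk-net
#Node 1, client port published on 2181
docker run -d --name zk-1 --hostname zk-1 --network zk-net -p 2181:2181 -e ZOO_MY_ID=1 \
  -e ZOO_SERVERS="server.1=zk-1:2888:3888;2181 server.2=zk-2:2888:3888;2181 server.3=zk-3:2888:3888;2181" zookeeper
#Node 2, client port published on 2182
docker run -d --name zk-2 --hostname zk-2 --network zk-net -p 2182:2181 -e ZOO_MY_ID=2 \
  -e ZOO_SERVERS="server.1=zk-1:2888:3888;2181 server.2=zk-2:2888:3888;2181 server.3=zk-3:2888:3888;2181" zookeeper
#Node 3, client port published on 2183
docker run -d --name zk-3 --hostname zk-3 --network zk-net -p 2183:2181 -e ZOO_MY_ID=3 \
  -e ZOO_SERVERS="server.1=zk-1:2888:3888;2181 server.2=zk-2:2888:3888;2181 server.3=zk-3:2888:3888;2181" zookeeper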

Change the configuration

  • Give each Kafka instance a distinct broker id; here the instances on vm1, vm2, and vm3 get broker ids 1, 2, and 3.
  • Point zookeeper.connect at the cluster address.
  • meta.properties under /tmp/kafka-logs records the broker id of a previous run and can cause a broker id mismatch, so it is safe to delete the directory.

Run the following sed commands to apply the changes:

#vm1
sed -i 's/broker.id=0/broker.id=1/g' /opt/kafka_2.11-2.3.0/config/server.properties
sed -i 's/zookeeper.connect=localhost:2181/zookeeper.connect=localhost:2181,localhost:2182,localhost:2183\/kafka/g' /opt/kafka_2.11-2.3.0/config/server.properties
rm -rf /tmp/kafka-logs
#vm2
sed -i 's/broker.id=0/broker.id=2/g' /opt/kafka_2.11-2.3.0/config/server.properties
sed -i 's/zookeeper.connect=localhost:2181/zookeeper.connect=localhost:2181,localhost:2182,localhost:2183\/kafka/g' /opt/kafka_2.11-2.3.0/config/server.properties
rm -rf /tmp/kafka-logs
#vm3
sed -i 's/broker.id=0/broker.id=3/g' /opt/kafka_2.11-2.3.0/config/server.properties
sed -i 's/zookeeper.connect=localhost:2181/zookeeper.connect=localhost:2181,localhost:2182,localhost:2183\/kafka/g' /opt/kafka_2.11-2.3.0/config/server.properties
rm -rf /tmp/kafka-logs

Start

Run the start command on vm1, vm2, and vm3:

./kafka-server-start.sh -daemon ../config/server.properties

Run jps to check that the Kafka process is running.
Then verify through ZooKeeper:

docker run -it --rm  zookeeper zkCli.sh -server vm1:2181

Once inside the shell, run:

ls /kafka/brokers/ids

Check that broker nodes 1, 2, and 3 are all present. If some are missing, restarting the broker sometimes fixes it.
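With all three brokers registered, the command should return the same listing as in the Docker-based sections below:

[1, 2, 3]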

Test

#Create a topic
./kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 3 --partitions 3 --topic test-cluster
#List topics
./kafka-topics.sh --list --bootstrap-server localhost:9092
#Produce messages
./kafka-console-producer.sh --broker-list localhost:9092 --topic test-cluster 
This is a message
This is another message
#Consume messages
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-cluster --from-beginning

Docker single instance

Start the ZooKeeper cluster

Continue to use the ZooKeeper cluster built above.

Start the image

docker run -d  --name kafka --hostname kafka \
-p 9092:9092 --restart=always \
-e KAFKA_ADVERTISED_HOST_NAME=vm1 -e KAFKA_ADVERTISED_PORT=9092 \
-e KAFKA_ZOOKEEPER_CONNECT=vm1:2181,vm1:2182,vm1:2183 \
wurstmeister/kafka:latest

KAFKA_ADVERTISED_HOST_NAME is the Docker host's IP (or a resolvable hostname).
KAFKA_ADVERTISED_PORT is the port published on the Docker host.
Most settings in server.properties can be overridden through environment variables; for example, KAFKA_ADVERTISED_HOST_NAME maps to advertised.host.name.
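A few more illustrative mappings (the property names are standard Kafka settings; the specific overrides below are examples, not values required by this setup):

#KAFKA_<UPPERCASE_WITH_UNDERSCORES> becomes the lower-case, dot-separated property
-e KAFKA_LOG_RETENTION_HOURS=168          # -> log.retention.hours=168
-e KAFKA_NUM_PARTITIONS=3                 # -> num.partitions=3
-e KAFKA_AUTO_CREATE_TOPICS_ENABLE=false  # -> auto.create.topics.enable=false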

Test

#Create a topic
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-topics.sh --create --bootstrap-server vm1:9092 --replication-factor 1 --partitions 1 --topic test
#List topics
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-topics.sh --list --bootstrap-server vm1:9092
#Produce messages
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-console-producer.sh --broker-list vm1:9092 --topic test
This is a message
This is another message
#Consume messages
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server vm1:9092 --topic test --from-beginning

Monitoring node

docker run -itd --name kafka-manager --hostname kafka-manager \
-p 9000:9000 --restart=always \
-e ZK_HOSTS=vm1:2181,vm1:2182,vm1:2183 \
sheepkiller/kafka-manager

Open the monitoring UI at http://vm1:9000.

Overlay-network cluster

Prepare the base environment

Deploy three machines, vm1, vm2, and vm3, with the same base environment as the single instance.

Create the swarm overlay network

Create the swarm overlay network (see the separate guide).
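A minimal sketch, assuming vm1 acts as the swarm manager and that the network is named overlay as used below (the join token is machine-specific, so the value shown is a placeholder):

#On vm1: initialize the swarm; this prints a "docker swarm join ..." command with a token
docker swarm init --advertise-addr <vm1-ip>
#On vm2 and vm3: join the swarm with the printed command
docker swarm join --token <token> <vm1-ip>:2377
#On vm1: create an attachable overlay network named "overlay"
#--attachable is required so that standalone `docker run --network=overlay` containers can join it
docker network create -d overlay --attachable overlay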

Create data directories

Create the directories on vm1, vm2, and vm3:

#vm1
mkdir -p /opt/volumns/kafka-1/kafka-logs-kafka
mkdir -p /opt/volumns/kafka-1/logs
#vm2
mkdir -p /opt/volumns/kafka-2/kafka-logs-kafka
mkdir -p /opt/volumns/kafka-2/logs
#vm3
mkdir -p /opt/volumns/kafka-3/kafka-logs-kafka
mkdir -p /opt/volumns/kafka-3/logs

Start ZooKeeper

#vm1
docker run -d --name=zookeeper-1 --hostname=zookeeper-1 \
--network=overlay --restart=always \
-p 2181:2181 -p 8080:8080 -e ZOO_MY_ID=1 \
-e ZOO_SERVERS="server.1=zookeeper-1:2888:3888;2181 server.2=zookeeper-2:2888:3888;2181 server.3=zookeeper-3:2888:3888;2181" \
-v /etc/localtime:/etc/localtime \
zookeeper
#vm2
docker run -d --name=zookeeper-2 --hostname=zookeeper-2 \
--network=overlay --restart=always \
-p 2181:2181 -p 8080:8080 -e ZOO_MY_ID=2 \
-e ZOO_SERVERS="server.1=zookeeper-1:2888:3888;2181 server.2=zookeeper-2:2888:3888;2181 server.3=zookeeper-3:2888:3888;2181" \
-v /etc/localtime:/etc/localtime \
zookeeper
#vm3
docker run -d --name=zookeeper-3 --hostname=zookeeper-3 \
--network=overlay --restart=always \
-p 2181:2181 -p 8080:8080 -e ZOO_MY_ID=3 \
-e ZOO_SERVERS="server.1=zookeeper-1:2888:3888;2181 server.2=zookeeper-2:2888:3888;2181 server.3=zookeeper-3:2888:3888;2181" \
-v /etc/localtime:/etc/localtime \
zookeeper

The ZooKeeper containers here must be attached to the overlay network.
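A quick sanity check that the ensemble is reachable over the overlay network (this assumes the network was created with --attachable as sketched above):

#Run a throwaway zkCli container on the same overlay network and list the root znode
docker run -it --rm --network overlay zookeeper zkCli.sh -server zookeeper-1:2181 ls /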

Start commands

#vm1
docker run -d --name kafka-1 --hostname kafka-1 -p 9092:9092 \
--restart=always --network=overlay \
-e KAFKA_ADVERTISED_HOST_NAME=vm1 -e KAFKA_ADVERTISED_PORT=9092 \
-e KAFKA_ZOOKEEPER_CONNECT=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181 \
-e KAFKA_HOST_NAME=0.0.0.0 -e KAFKA_BROKER_ID=1 \
-v /opt/volumns/kafka-1/kafka-logs-kafka:/kafka/kafka-logs-kafka \
-v /opt/volumns/kafka-1/logs:/opt/kafka/logs \
-v /etc/localtime:/etc/localtime \
wurstmeister/kafka:latest

#vm2
docker run -d --name kafka-2 --hostname kafka-2 -p 9092:9092 \
--restart=always --network=overlay \
-e KAFKA_ADVERTISED_HOST_NAME=vm2 -e KAFKA_ADVERTISED_PORT=9092 \
-e KAFKA_ZOOKEEPER_CONNECT=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181 \
-e KAFKA_HOST_NAME=0.0.0.0 -e KAFKA_BROKER_ID=2 \
-v /opt/volumns/kafka-2/kafka-logs-kafka:/kafka/kafka-logs-kafka \
-v /opt/volumns/kafka-2/logs:/opt/kafka/logs \
-v /etc/localtime:/etc/localtime \
wurstmeister/kafka:latest

#vm3
docker run -d --name kafka-3 --hostname kafka-3 -p 9092:9092 \
--restart=always --network=overlay \
-e KAFKA_ADVERTISED_HOST_NAME=vm3 -e KAFKA_ADVERTISED_PORT=9092 \
-e KAFKA_ZOOKEEPER_CONNECT=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181 \
-e KAFKA_HOST_NAME=0.0.0.0 -e KAFKA_BROKER_ID=3 \
-v /opt/volumns/kafka-3/kafka-logs-kafka:/kafka/kafka-logs-kafka \
-v /opt/volumns/kafka-3/logs:/opt/kafka/logs \
-v /etc/localtime:/etc/localtime \
wurstmeister/kafka:latest

  • KAFKA_ADVERTISED_HOST_NAME must be the Docker host's IP (or a DNS name), not the container's hostname.
    The advertised.* settings are the addresses the broker hands back to clients, so they must be an IP and port that clients can actually reach.
  • KAFKA_ADVERTISED_PORT must be the port published on the Docker host (or via DNS), for the same reason.
  • KAFKA_HOST_NAME should be 0.0.0.0 so that the -p mapping can forward traffic to Kafka; you can verify what each broker advertises with the zkCli check below.
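For example, to see what broker 1 actually registered (no chroot is used here, so the znode path is /brokers/ids/1; the stored JSON includes the advertised host and port):

docker run -it --rm zookeeper zkCli.sh -server vm1:2181 get /brokers/ids/1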

Check the nodes

Check that all Kafka brokers have registered in ZooKeeper:

docker run -it --rm  zookeeper zkCli.sh -server vm1:2181

Once inside, run:

ls /brokers/ids

It should show:

[1, 2, 3]

Test

#Create a topic
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-topics.sh --create --bootstrap-server  vm1:9092,vm2:9092,vm3:9092 --replication-factor 3 --partitions 3 --topic test
#List topics
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-topics.sh --list --bootstrap-server  vm1:9092,vm2:9092,vm3:9092
#Produce messages
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-console-producer.sh --broker-list  vm1:9092,vm2:9092,vm3:9092 --topic test
This is a message
This is another message
#Consume messages
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server vm1:9092,vm2:9092,vm3:9092 --topic test --from-beginning

Building the cluster with Docker Stack

Start ZooKeeper

For the ZooKeeper cluster configuration, see the separate ZooKeeper guide and the Docker Stack deployment described earlier.
Reuse the directories and configuration from above, but do not start the containers: the ZooKeeper startup configuration is placed together with Kafka in the YAML file below and deployed as a single stack.

Write the configuration

Because docker stack deploy ignores the depends_on directive, Kafka is not declared to depend on ZooKeeper here.
You can look into a tool such as dockerize to wait for dependencies; a simpler wait-loop idea is sketched below, followed by the full stack file.
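A lightweight alternative to dockerize is to override the Kafka service's command with a wait loop. This is only a sketch: it assumes the wurstmeister image's start script is start-kafka.sh and that nc is available inside the image, both of which you should verify before relying on it.

#Hypothetical command override for a kafka service: block until zookeeper-1 answers, then start the broker
sh -c 'until nc -z zookeeper-1 2181; do echo "waiting for zookeeper"; sleep 2; done; exec start-kafka.sh'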

version: "3"
services:
  zookeeper-1:
    image: zookeeper
    hostname: zookeeper-1
    networks:
      - overlay
    ports:
      - 2181:2181
      - 8080:8080
    environment:
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=zookeeper-1:2888:3888;2181 server.2=zookeeper-2:2888:3888;2181 server.3=zookeeper-3:2888:3888;2181
    volumes:
      - /etc/localtime:/etc/localtime
      - /opt/volumns/zookeeper-1/data:/data
      - /opt/volumns/zookeeper-1/datalog:/datalog
    deploy:
      restart_policy:
        condition: on-failure
      replicas: 1
      placement:
        constraints:
          - node.hostname==vm1
  zookeeper-2:
    image: zookeeper
    hostname: zookeeper-2
    networks:
      - overlay
    ports:
      - 2182:2181
      - 8081:8080
    environment:
      ZOO_MY_ID: 2
      ZOO_SERVERS: server.1=zookeeper-1:2888:3888;2181 server.2=zookeeper-2:2888:3888;2181 server.3=zookeeper-3:2888:3888;2181
    volumes:
      - /etc/localtime:/etc/localtime
      - /opt/volumns/zookeeper-2/data:/data
      - /opt/volumns/zookeeper-2/datalog:/datalog
    deploy:
      restart_policy:
        condition: on-failure
      replicas: 1
      placement:
        constraints:
          - node.hostname==vm2
  zookeeper-3:
    image: zookeeper
    hostname: zookeeper-3
    networks:
      - overlay
    ports:
      - 2183:2181
      - 8082:8080
    environment:
      ZOO_MY_ID: 3
      ZOO_SERVERS: server.1=zookeeper-1:2888:3888;2181 server.2=zookeeper-2:2888:3888;2181 server.3=zookeeper-3:2888:3888;2181
    volumes:
      - /etc/localtime:/etc/localtime
      - /opt/volumns/zookeeper-3/data:/data
      - /opt/volumns/zookeeper-3/datalog:/datalog
    deploy:
      restart_policy:
        condition: on-failure
      replicas: 1
      placement:
        constraints:
          - node.hostname==vm3
  kafka-1:
    image: wurstmeister/kafka
    hostname: kafka-1
    networks:
      - overlay
    ports:
      - 9092:9092
    environment:
      - KAFKA_ADVERTISED_HOST_NAME=vm1
      - KAFKA_ADVERTISED_PORT=9092
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181
      - KAFKA_HOST_NAME=0.0.0.0
      - KAFKA_BROKER_ID=1
    volumes:
      - /etc/localtime:/etc/localtime
      - /opt/volumns/kafka-1/kafka-logs-kafka:/kafka/kafka-logs-kafka
      - /opt/volumns/kafka-1/logs:/opt/kafka/logs
    deploy:
      restart_policy:
        condition: on-failure
      replicas: 1
      placement:
        constraints:
          - node.hostname==vm1
  kafka-2:
    image: wurstmeister/kafka
    hostname: kafka-2
    networks:
      - overlay
    ports:
      - 9093:9092
    environment:
      - KAFKA_ADVERTISED_HOST_NAME=vm1
      - KAFKA_ADVERTISED_PORT=9093
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181
      - KAFKA_HOST_NAME=0.0.0.0
      - KAFKA_BROKER_ID=2
    volumes:
      - /etc/localtime:/etc/localtime
      - /opt/volumns/kafka-2/kafka-logs-kafka:/kafka/kafka-logs-kafka
      - /opt/volumns/kafka-2/logs:/opt/kafka/logs
    deploy:
      restart_policy:
        condition: on-failure
      replicas: 1
      placement:
        constraints:
          - node.hostname==vm2
  kafka-3:
    image: wurstmeister/kafka
    hostname: kafka-3
    networks:
      - overlay
    ports:
      - 9094:9092
    environment:
      - KAFKA_ADVERTISED_HOST_NAME=vm1
      - KAFKA_ADVERTISED_PORT=9094
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181
      - KAFKA_HOST_NAME=0.0.0.0
      - KAFKA_BROKER_ID=3
    volumes:
      - /etc/localtime:/etc/localtime
      - /opt/volumns/kafka-3/kafka-logs-kafka:/kafka/kafka-logs-kafka
      - /opt/volumns/kafka-3/logs:/opt/kafka/logs
    deploy:
      restart_policy:
        condition: on-failure
      replicas: 1
      placement:
        constraints:
          - node.hostname==vm3

networks:
  overlay:
    driver: overlay

Start the cluster

 docker stack deploy -c kafka.yaml kafka
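To confirm that the stack came up:

#Each service should report 1/1 replicas once its container is running
docker stack services kafka
#Show the individual tasks and the nodes they were scheduled on
docker stack ps kafka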

Check the nodes

Check that all Kafka brokers have registered in ZooKeeper:

docker run -it --rm  zookeeper zkCli.sh -server vm2:2182

Once inside, run:

ls /brokers/ids

It should show:

[1, 2, 3]

Test

#Create a topic
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-topics.sh --create --bootstrap-server vm1:9092,vm1:9093,vm1:9094 --replication-factor 3 --partitions 3 --topic test
#List topics
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-topics.sh --list --bootstrap-server vm1:9092,vm1:9093,vm1:9094
#Produce messages
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-console-producer.sh --broker-list vm1:9092,vm1:9093,vm1:9094 --topic test
This is a message
This is another message
#Consume messages
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server vm1:9092,vm1:9093,vm1:9094 --topic test --from-beginning
