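Dual-Master, Dual-Slave Cluster
In production, a RocketMQ cluster must have no single point of failure, so a dual-master, dual-slave topology is used to achieve high availability. This deployment needs four machines: two of them each run a Broker master plus a NameServer, and the other two each run a Broker slave plus a NameServer.
Figure: topology of the RocketMQ dual-master, dual-slave cluster.
Environment Preparation
Machines
Because we are building a dual-master, dual-slave cluster, we first need four machines, as listed in the table below: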
IP | hostname | Role | Memory | CPU |
---|---|---|---|---|
192.168.243.169 | rocketmq01 | NameServer & Master1 | 4 GB | 2 cores |
192.168.243.170 | rocketmq02 | NameServer & Master2 | 4 GB | 2 cores |
192.168.243.171 | rocketmq03 | NameServer & slave of Master1 | 4 GB | 2 cores |
192.168.243.172 | rocketmq04 | NameServer & slave of Master2 | 4 GB | 2 cores |
Configure the hosts file on all four machines as follows:
$ vim /etc/hosts
192.168.243.169 rocketmq01 rocketmq-nameserver1 rocketmq-master1
192.168.243.170 rocketmq02 rocketmq-nameserver2 rocketmq-master2
192.168.243.171 rocketmq03 rocketmq-nameserver3 rocketmq-master1-slave1
192.168.243.172 rocketmq04 rocketmq-nameserver4 rocketmq-master2-slave1
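Editing the file by hand on four machines is easy to get wrong, so the entries can also be appended by a small script. The following is a minimal sketch, not part of the original procedure: the function name `add_host_entry` and the `HOSTS_FILE` override are assumptions introduced here so the function can be tried on a scratch file before touching /etc/hosts.

```shell
#!/bin/sh
# Append "IP name1 name2 ..." to the hosts file unless a line for that IP
# already exists. HOSTS_FILE is a hypothetical override for dry runs.
add_host_entry() {
    hosts_file="${HOSTS_FILE:-/etc/hosts}"
    ip="$1"
    shift
    # Exact match on the first field, so 192.168.243.17 does not match .170
    if awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }' \
        "$hosts_file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$*" >> "$hosts_file"
}

# Example (run as root on each node):
#   add_host_entry 192.168.243.169 rocketmq01 rocketmq-nameserver1 rocketmq-master1
```

Running the function twice with the same IP leaves a single line, which makes it safe to re-run on all nodes.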
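Building and Installing RocketMQ
RocketMQ has to be installed on every machine; the rocketmq01 node is used as the example here. The first step is to prepare the Java and Maven environments on all machines: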
[root@rocketmq01 ~]# java -version
java version "1.8.0_261"
Java(TM) SE Runtime Environment (build 1.8.0_261-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.261-b12, mixed mode)
[root@rocketmq01 ~]# mvn -v
Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
Maven home: /usr/local/maven
Java version: 1.8.0_261, vendor: Oracle Corporation, runtime: /usr/local/jdk/1.8/jre
Default locale: zh_CN, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-1062.el7.x86_64", arch: "amd64", family: "unix"
[root@rocketmq01 ~]#
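- Tip: JDK 1.8 is recommended, because the startup scripts of the current RocketMQ release are written for 1.8; with a newer JDK you would have to modify the startup scripts yourself, which is tedious.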
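Since the tip applies to all four machines, a quick check of the JDK major version can be scripted. This is only a sketch: the helper names `java_major` and `current_java_version` are made up for illustration, and the parsing assumes the usual `java -version` format shown above.

```shell
#!/bin/sh
# Print the major version for a Java version string:
# "1.8.0_261" -> 8 (legacy 1.x scheme), "11.0.9" -> 11.
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 ;;
    esac
}

# Extract the quoted version string from `java -version` (written to stderr).
current_java_version() {
    java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }'
}

# Example:
#   [ "$(java_major "$(current_java_version)")" = "8" ] || echo "JDK 1.8 expected"
```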
Download the latest source package as described in the official documentation:
- http://rocketmq.apache.org/docs/quick-start/
Then upload it to the server:
[root@rocketmq01 /usr/local/src]# ls
rocketmq-all-4.7.1-source-release.zip
[root@rocketmq01 /usr/local/src]#
Extract the source package:
[root@rocketmq01 /usr/local/src]# unzip rocketmq-all-4.7.1-source-release.zip
[root@rocketmq01 /usr/local/src]# cd rocketmq-all-4.7.1-source-release
[root@rocketmq01 /usr/local/src/rocketmq-all-4.7.1-source-release]# ls
acl BUILDING common dev docs filter logappender namesrv openmessaging README.md srvutil style tools
broker client CONTRIBUTING.md distribution example LICENSE logging NOTICE pom.xml remoting store test
[root@rocketmq01 /usr/local/src/rocketmq-all-4.7.1-source-release]#
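Layout of the RocketMQ source package:
- broker: the core business logic, including message sending and receiving, master-slave replication, and page cache handling
- client: client APIs, such as the producer and the consumer
- example: sample code, such as producers and consumers
- common: shared data structures and similar code
- distribution: the packaging module that produces the build output
- filter: message filtering on the Broker side, so that messages a consumer is not interested in are not transmitted, which reduces bandwidth pressure
- logappender, logging: logging
- namesrv: the NameServer service, used for service coordination
- openmessaging: OpenMessaging support for exposing the service externally
- remoting: remote invocation interfaces that wrap the underlying Netty communication
- srvutil: shared utility methods, such as command-line argument parsing
- store: message storage
- tools: management tools, such as the well-known mqadmin tool
Then build the source with the following command: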
[root@rocketmq01 /usr/local/src/rocketmq-all-4.7.1-source-release]# mvn -Prelease-all -DskipTests clean install -U
When the build succeeds, the output ends as follows, with every module reporting SUCCESS:
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Apache RocketMQ 4.7.1 4.7.1:
[INFO]
[INFO] Apache RocketMQ 4.7.1 .............................. SUCCESS [04:15 min]
[INFO] rocketmq-logging 4.7.1 ............................. SUCCESS [ 24.843 s]
[INFO] rocketmq-remoting 4.7.1 ............................ SUCCESS [ 13.108 s]
[INFO] rocketmq-common 4.7.1 .............................. SUCCESS [ 26.304 s]
[INFO] rocketmq-client 4.7.1 .............................. SUCCESS [ 11.852 s]
[INFO] rocketmq-store 4.7.1 ............................... SUCCESS [ 8.760 s]
[INFO] rocketmq-srvutil 4.7.1 ............................. SUCCESS [ 0.855 s]
[INFO] rocketmq-filter 4.7.1 .............................. SUCCESS [ 6.202 s]
[INFO] rocketmq-acl 4.7.1 ................................. SUCCESS [ 7.349 s]
[INFO] rocketmq-broker 4.7.1 .............................. SUCCESS [ 2.162 s]
[INFO] rocketmq-tools 4.7.1 ............................... SUCCESS [ 1.289 s]
[INFO] rocketmq-namesrv 4.7.1 ............................. SUCCESS [ 0.628 s]
[INFO] rocketmq-logappender 4.7.1 ......................... SUCCESS [ 3.754 s]
[INFO] rocketmq-openmessaging 4.7.1 ....................... SUCCESS [ 26.613 s]
[INFO] rocketmq-example 4.7.1 ............................. SUCCESS [ 0.729 s]
[INFO] rocketmq-test 4.7.1 ................................ SUCCESS [ 14.090 s]
[INFO] rocketmq-distribution 4.7.1 ........................ SUCCESS [01:20 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 08:08 min
[INFO] Finished at: 2020-11-30T10:40:29+08:00
[INFO] ------------------------------------------------------------------------
The built distributable packages are then available under the distribution/target/ directory:
[root@rocketmq01 /usr/local/src/rocketmq-all-4.7.1-source-release]# ls distribution/target/
archive-tmp checkstyle-cachefile checkstyle-checker.xml checkstyle-result.xml maven-shared-archive-resources rocketmq-4.7.1 rocketmq-4.7.1.tar.gz rocketmq-4.7.1.zip
[root@rocketmq01 /usr/local/src/rocketmq-all-4.7.1-source-release]#
Extract the distributable package to a suitable directory and change into it:
[root@rocketmq01 /usr/local/src/rocketmq-all-4.7.1-source-release]# tar -zxvf distribution/target/rocketmq-4.7.1.tar.gz -C /usr/local
[root@rocketmq01 /usr/local/src/rocketmq-all-4.7.1-source-release]# cd /usr/local/rocketmq-4.7.1/
[root@rocketmq01 /usr/local/rocketmq-4.7.1]# ls
benchmark bin conf lib LICENSE NOTICE README.md
[root@rocketmq01 /usr/local/rocketmq-4.7.1]#
Create the data storage directories:
[root@rocketmq01 /usr/local/rocketmq-4.7.1]# mkdir -p store/commitlog store/consumequeue store/index
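- commitlog: stores the messages that producers send to RocketMQ
- consumequeue: stores offset data that indexes the contents of the commitlog
- index: index files used for random reads
Create a log directory and point the logging configuration files at the installation directory instead of the user's home directory: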
[root@rocketmq01 /usr/local/rocketmq-4.7.1]# mkdir logs
[root@rocketmq01 /usr/local/rocketmq-4.7.1]# sed -i 's#${user.home}#/usr/local/rocketmq-4.7.1#g' conf/*.xml
If the machine has little memory (less than 8 GB), adjust the JVM options in the startup scripts to fit the machine, but do not set the heap below 1 GB. Keep the young generation (-Xmn) smaller than the heap:
[root@rocketmq01 /usr/local/rocketmq-4.7.1]# vim bin/runbroker.sh
JAVA_OPT="${JAVA_OPT} -server -Xms2g -Xmx2g -Xmn1g"
[root@rocketmq01 /usr/local/rocketmq-4.7.1]# vim bin/runserver.sh
JAVA_OPT="${JAVA_OPT} -server -Xms2g -Xmx2g -Xmn1g -XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=320m"
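The same change can be applied without opening an editor. The sketch below assumes the scripts contain `-Xms`, `-Xmx` and `-Xmn` flags in the usual form; the function name `set_jvm_heap` and the 2g/1g values in the example are illustrative, not recommendations.

```shell
#!/bin/sh
# Rewrite the -Xms/-Xmx/-Xmn values in a startup script in place.
# $1 = script path, $2 = heap size (e.g. 2g), $3 = young generation size (e.g. 1g)
set_jvm_heap() {
    script="$1"
    heap="$2"
    young="$3"
    tmp="$(mktemp)" || return 1
    sed -E \
        -e "s/-Xms[0-9]+[kmgKMG]/-Xms${heap}/" \
        -e "s/-Xmx[0-9]+[kmgKMG]/-Xmx${heap}/" \
        -e "s/-Xmn[0-9]+[kmgKMG]/-Xmn${young}/" \
        "$script" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Overwrite the contents but keep the original file's permissions
    cat "$tmp" > "$script"
    rm -f "$tmp"
}

# Example:
#   set_jvm_heap bin/runbroker.sh 2g 1g
#   set_jvm_heap bin/runserver.sh 2g 1g
```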
Then distribute the RocketMQ installation directory to the other machines:
[root@rocketmq01 ~]# scp -r /usr/local/rocketmq-4.7.1 rocketmq02:/usr/local/rocketmq-4.7.1
[root@rocketmq01 ~]# scp -r /usr/local/rocketmq-4.7.1 rocketmq03:/usr/local/rocketmq-4.7.1
[root@rocketmq01 ~]# scp -r /usr/local/rocketmq-4.7.1 rocketmq04:/usr/local/rocketmq-4.7.1
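With more nodes, the three scp commands are easier to maintain as a loop. The following sketch is an assumption-laden convenience, not part of the original steps: `distribute_rocketmq` and the `DRY_RUN` switch are hypothetical names, and the copy targets the parent directory so that re-running it does not nest a second copy inside an existing one.

```shell
#!/bin/sh
# Copy an installation directory to the same parent path on each host.
# With DRY_RUN=1 the scp commands are printed instead of executed.
distribute_rocketmq() {
    src="$1"
    shift
    parent="$(dirname "$src")/"
    for host in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "scp -r $src $host:$parent"
        else
            scp -r "$src" "$host:$parent" || return 1
        fi
    done
}

# Example:
#   DRY_RUN=1 distribute_rocketmq /usr/local/rocketmq-4.7.1 rocketmq02 rocketmq03 rocketmq04
```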
Configure the environment variables; this has to be done on every machine:
[root@rocketmq01 /usr/local/rocketmq-4.7.1]# vim /etc/profile
export ROCKETMQ_HOME=/usr/local/rocketmq-4.7.1
export PATH=$PATH:$ROCKETMQ_HOME/bin
[root@rocketmq01 /usr/local/rocketmq-4.7.1]# source /etc/profile
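Deploying the Dual-Master, Dual-Slave Cluster
Preparing the Configuration Files
Official documentation for the configuration options:
- https://github.com/apache/rocketmq/blob/master/docs/cn/best_practice.md
Once RocketMQ is installed on all four machines, we can deploy the dual-master, dual-slave cluster. The deployment itself is simple: it only requires preparing the right configuration files. Again using the rocketmq01 node as the example, first empty the following files: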
[root@rocketmq01 ~]# echo "" > /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/broker-a.properties
[root@rocketmq01 ~]# echo "" > /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/broker-b.properties
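Then edit broker-a.properties so that it contains the following:
# Name of the cluster this broker belongs to
brokerClusterName=rocketmq-cluster
# Broker name; note that each master's configuration file uses a different value
brokerName=broker-a
# 0 means master, a value greater than 0 means slave
brokerId=0
# NameServer addresses, separated by semicolons
namesrvAddr=rocketmq-nameserver1:9876;rocketmq-nameserver2:9876;rocketmq-nameserver3:9876;rocketmq-nameserver4:9876
# Default number of queues for a topic that is auto-created when a message is sent to a topic missing on the server
defaultTopicQueueNums=4
# Whether the broker may auto-create topics; recommended on outside production and off in production
autoCreateTopicEnable=true
# Whether the broker may auto-create subscription groups; recommended on outside production and off in production
autoCreateSubscriptionGroup=true
# Port on which the broker listens for client requests
listenPort=10911
# Time of day at which files are deleted; the default is 4 a.m.
deleteWhen=04
# File retention time in hours; set to 120 here
fileReservedTime=120
# Size of each commitLog file; the default is 1 GB
mapedFileSizeCommitLog=1073741824
# By default each ConsumeQueue file holds 300,000 entries; adjust to the workload
mapedFileSizeConsumeQueue=300000
#destroyMapedFileIntervalForcibly=120000
#redeleteHangedFileInterval=120000
# Disk usage threshold (percent) checked for the physical files
diskMaxUsedSpaceRatio=88
# Root storage path
storePathRootDir=/usr/local/rocketmq-4.7.1/store
# commitLog storage path
storePathCommitLog=/usr/local/rocketmq-4.7.1/store/commitlog
# Consume queue storage path
storePathConsumeQueue=/usr/local/rocketmq-4.7.1/store/consumequeue
# Message index storage path
storePathIndex=/usr/local/rocketmq-4.7.1/store/index
# checkpoint file path
storeCheckpoint=/usr/local/rocketmq-4.7.1/store/checkpoint
# abort file path
abortFile=/usr/local/rocketmq-4.7.1/store/abort
# Maximum message size in bytes
maxMessageSize=65536
#flushCommitLogLeastPages=4
#flushConsumeQueueLeastPages=2
#flushCommitLogThoroughInterval=10000
#flushConsumeQueueThoroughInterval=60000
# Broker role
#- ASYNC_MASTER: master with asynchronous replication
#- SYNC_MASTER: master with synchronous double write
#- SLAVE
brokerRole=SYNC_MASTER
# Disk flush mode
#- ASYNC_FLUSH: asynchronous flush
#- SYNC_FLUSH: synchronous flush
flushDiskType=ASYNC_FLUSH
#checkTransactionMessageEnable=false
# Size of the send-message thread pool
#sendMessageThreadPoolNums=128
# Size of the pull-message thread pool
#pullMessageThreadPoolNums=128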
The content of broker-b.properties is essentially the same as broker-a.properties; only brokerName needs to change:
brokerName=broker-b
That completes the master configuration. Next comes the slave configuration; empty the following files:
[root@rocketmq01 ~]# echo "" > /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/broker-a-s.properties
[root@rocketmq01 ~]# echo "" > /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/broker-b-s.properties
Edit broker-a-s.properties so that it contains the following:
# Name of the cluster this broker belongs to
brokerClusterName=rocketmq-cluster
# Broker name; a slave is paired with its master through brokerName
brokerName=broker-a
# 0 means master, a value greater than 0 means slave
brokerId=1
# NameServer addresses, separated by semicolons
namesrvAddr=rocketmq-nameserver1:9876;rocketmq-nameserver2:9876;rocketmq-nameserver3:9876;rocketmq-nameserver4:9876
# Default number of queues for a topic that is auto-created when a message is sent to a topic missing on the server
defaultTopicQueueNums=4
# Whether the broker may auto-create topics; recommended on outside production and off in production
autoCreateTopicEnable=true
# Whether the broker may auto-create subscription groups; recommended on outside production and off in production
autoCreateSubscriptionGroup=true
# Port on which the broker listens for client requests
listenPort=10911
# Time of day at which files are deleted; the default is 4 a.m.
deleteWhen=04
# File retention time in hours; set to 120 here
fileReservedTime=120
# Size of each commitLog file; the default is 1 GB
mapedFileSizeCommitLog=1073741824
# By default each ConsumeQueue file holds 300,000 entries; adjust to the workload
mapedFileSizeConsumeQueue=300000
#destroyMapedFileIntervalForcibly=120000
#redeleteHangedFileInterval=120000
# Disk usage threshold (percent) checked for the physical files
diskMaxUsedSpaceRatio=88
# Root storage path
storePathRootDir=/usr/local/rocketmq-4.7.1/store
# commitLog storage path
storePathCommitLog=/usr/local/rocketmq-4.7.1/store/commitlog
# Consume queue storage path
storePathConsumeQueue=/usr/local/rocketmq-4.7.1/store/consumequeue
# Message index storage path
storePathIndex=/usr/local/rocketmq-4.7.1/store/index
# checkpoint file path
storeCheckpoint=/usr/local/rocketmq-4.7.1/store/checkpoint
# abort file path
abortFile=/usr/local/rocketmq-4.7.1/store/abort
# Maximum message size in bytes
maxMessageSize=65536
#flushCommitLogLeastPages=4
#flushConsumeQueueLeastPages=2
#flushCommitLogThoroughInterval=10000
#flushConsumeQueueThoroughInterval=60000
# Broker role
#- ASYNC_MASTER: master with asynchronous replication
#- SYNC_MASTER: master with synchronous double write
#- SLAVE
brokerRole=SLAVE
# Disk flush mode
#- ASYNC_FLUSH: asynchronous flush
#- SYNC_FLUSH: synchronous flush
flushDiskType=ASYNC_FLUSH
#checkTransactionMessageEnable=false
# Size of the send-message thread pool
#sendMessageThreadPoolNums=128
# Size of the pull-message thread pool
#pullMessageThreadPoolNums=128
Likewise, the content of broker-b-s.properties is essentially the same as broker-a-s.properties; only brokerName needs to change:
brokerName=broker-b
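Because the four files differ only in brokerName, brokerId and brokerRole, the other three can be derived from broker-a.properties instead of being edited by hand. This is a sketch under that assumption; `derive_broker_conf` is a hypothetical helper, not a RocketMQ tool.

```shell
#!/bin/sh
# Derive a broker configuration from an existing one by overriding the three
# keys that differ between the four nodes.
# $1 = source file, $2 = destination file, $3 = brokerName, $4 = brokerId, $5 = brokerRole
derive_broker_conf() {
    src="$1"
    dst="$2"
    name="$3"
    id="$4"
    role="$5"
    sed -E \
        -e "s/^brokerName=.*/brokerName=${name}/" \
        -e "s/^brokerId=.*/brokerId=${id}/" \
        -e "s/^brokerRole=.*/brokerRole=${role}/" \
        "$src" > "$dst"
}

# Example, run inside conf/2m-2s-sync/:
#   derive_broker_conf broker-a.properties broker-b.properties   broker-b 0 SYNC_MASTER
#   derive_broker_conf broker-a.properties broker-a-s.properties broker-a 1 SLAVE
#   derive_broker_conf broker-a.properties broker-b-s.properties broker-b 1 SLAVE
```

Only lines that start with the key are rewritten, so the commented explanations above each key are left untouched.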
Once the configuration files are ready, distribute the directory that contains them to the other nodes:
[root@rocketmq01 ~]# scp -r /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/ rocketmq02:/usr/local/rocketmq-4.7.1/conf
[root@rocketmq01 ~]# scp -r /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/ rocketmq03:/usr/local/rocketmq-4.7.1/conf
[root@rocketmq01 ~]# scp -r /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/ rocketmq04:/usr/local/rocketmq-4.7.1/conf
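Starting the Cluster
After the configuration files have been distributed and verified, the cluster can be started. First, run the following command on all four machines to start the NameServer: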
$ nohup sh mqnamesrv &
Start the master Broker on rocketmq01:
[root@rocketmq01 ~]# nohup sh mqbroker -c /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/broker-a.properties >/dev/null 2>&1 &
Start the master Broker on rocketmq02:
[root@rocketmq02 ~]# nohup sh mqbroker -c /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/broker-b.properties >/dev/null 2>&1 &
Start the slave Broker on rocketmq03:
[root@rocketmq03 ~]# nohup sh mqbroker -c /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/broker-a-s.properties >/dev/null 2>&1 &
Start the slave Broker on rocketmq04:
[root@rocketmq04 ~]# nohup sh mqbroker -c /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/broker-b-s.properties >/dev/null 2>&1 &
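Each node starts the Broker with a different file, so a single start script needs a hostname-to-file mapping. The sketch below encodes the mapping used above; the function name `broker_conf_for_host` is an assumption, and the fallback path matches the installation directory used in this article.

```shell
#!/bin/sh
# Print the broker configuration file that belongs to a given hostname.
# Returns non-zero for hosts that are not part of the cluster.
broker_conf_for_host() {
    conf_dir="${ROCKETMQ_HOME:-/usr/local/rocketmq-4.7.1}/conf/2m-2s-sync"
    case "$1" in
        rocketmq01) echo "$conf_dir/broker-a.properties" ;;
        rocketmq02) echo "$conf_dir/broker-b.properties" ;;
        rocketmq03) echo "$conf_dir/broker-a-s.properties" ;;
        rocketmq04) echo "$conf_dir/broker-b-s.properties" ;;
        *) echo "unknown host: $1" >&2; return 1 ;;
    esac
}

# Example, identical on every node:
#   conf="$(broker_conf_for_host "$(hostname)")" && \
#       nohup sh mqbroker -c "$conf" >/dev/null 2>&1 &
```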
Check that the NameServer and Broker processes are running and that their ports are listening:
[root@rocketmq01 ~]# jps
[root@rocketmq01 ~]# netstat -ntlp |grep java
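The port check can be automated by scanning the netstat listing for the ports configured earlier (9876 for the NameServer and the Broker's listenPort, 10911). This is a rough sketch that assumes the standard Linux `netstat -ntl` column layout; `missing_ports` is a hypothetical helper name.

```shell
#!/bin/sh
# Read a `netstat -ntl` style listing on stdin and print every expected port
# that has no TCP socket in LISTEN state.
missing_ports() {
    listing="$(cat)"
    for port in "$@"; do
        printf '%s\n' "$listing" \
            | awk -v suffix=":${port}\$" \
                '$1 ~ /^tcp/ && $4 ~ suffix && $6 == "LISTEN" { found = 1 }
                 END { exit !found }' \
            || echo "$port"
    done
}

# Example: prints nothing when both ports are listening
#   netstat -ntl | missing_ports 9876 10911
```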
The NameServer and Broker logs can be viewed with the following commands:
[root@rocketmq01 ~]# tail -f -n 500 /usr/local/rocketmq-4.7.1/logs/rocketmqlogs/broker.log
[root@rocketmq01 ~]# tail -f -n 500 /usr/local/rocketmq-4.7.1/logs/rocketmqlogs/namesrv.log
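When the Broker is communicating with the NameServers normally, broker.log contains heartbeat entries like the following, which indicate that the nodes can reach each other: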
2020-12-02 11:37:38 INFO brokerOutApi_thread_1 - register broker[0]to name server rocketmq-nameserver1:9876 OK
2020-12-02 11:37:38 INFO brokerOutApi_thread_3 - register broker[0]to name server rocketmq-nameserver2:9876 OK
2020-12-02 11:37:38 INFO brokerOutApi_thread_2 - register broker[0]to name server rocketmq-nameserver3:9876 OK
2020-12-02 11:37:38 INFO brokerOutApi_thread_4 - register broker[0]to name server rocketmq-nameserver4:9876 OK
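Rather than reading the log by eye, the NameServers that acknowledged the registration can be extracted from it. The following sketch relies only on the log line format shown above; `registered_nameservers` is a name chosen for this example.

```shell
#!/bin/sh
# Print the distinct NameServer addresses for which the given broker.log
# contains a "register broker ... OK" entry.
registered_nameservers() {
    sed -n -E 's/.*register broker.*to name server ([^ ]+) OK$/\1/p' "$1" | sort -u
}

# Example: expect 4 on a healthy node of this cluster
#   registered_nameservers /usr/local/rocketmq-4.7.1/logs/rocketmqlogs/broker.log | wc -l
```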
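Setting Up the RocketMQ Console
Next we set up a RocketMQ console on any one of the nodes. The RocketMQ project provides a visual console built on Spring Boot, which makes it easy to inspect how RocketMQ is running and simplifies operations. The following repository hosts several extension components, including the console we need:
- https://github.com/apache/rocketmq-externals/tree/master/
To use the console, clone the source and adjust its configuration: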
[root@rocketmq01 /usr/local/src]# git clone https://github.com/apache/rocketmq-externals.git
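Edit the application.properties configuration file of the rocketmq-console project. Here only the listening port and the NameServer addresses are changed; other options can be adjusted as needed by following the comments in the file: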
[root@rocketmq01 /usr/local/src]# cd rocketmq-externals/rocketmq-console/
[root@rocketmq01 /usr/local/src/rocketmq-externals/rocketmq-console]# vim src/main/resources/application.properties
# Listening port of the console; the default is 8080
server.port=8999
# NameServer addresses; optional, because they can also be set after the console starts, under Ops > NameSvrAddrList in the navigation bar
rocketmq.config.namesrvAddr=rocketmq-nameserver1:9876;rocketmq-nameserver2:9876;rocketmq-nameserver3:9876;rocketmq-nameserver4:9876
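The same semicolon-separated address list appears in every broker file and in the console configuration, so it is worth generating it from the host list once. A minimal sketch, with `namesrv_addr` as a hypothetical helper name:

```shell
#!/bin/sh
# Join hosts into the "host:port;host:port" form used by namesrvAddr.
# $1 = port, remaining arguments = NameServer hostnames
namesrv_addr() {
    port="$1"
    shift
    addr=""
    for host in "$@"; do
        # Prefix a semicolon only when addr already holds an entry
        addr="${addr:+$addr;}$host:$port"
    done
    echo "$addr"
}

# Example:
#   namesrv_addr 9876 rocketmq-nameserver1 rocketmq-nameserver2 \
#                     rocketmq-nameserver3 rocketmq-nameserver4
```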
Then build and package the project with the following command:
[root@rocketmq01 /usr/local/src/rocketmq-externals/rocketmq-console]# mvn clean package -DskipTests
...
[INFO] Replacing main artifact with repackaged archive
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 04:52 min
[INFO] Finished at: 2020-11-30T14:26:20+08:00
[INFO] ------------------------------------------------------------------------
Once packaging completes, the RocketMQ console can be started:
[root@rocketmq01 /usr/local/src/rocketmq-externals/rocketmq-console]# java -jar target/rocketmq-console-ng-2.0.0.jar
While the RocketMQ console is running, it may log the following error:
ERROR Exception caught: mqAdminExt get broker stats data TOPIC_PUT_NUMS failed
This error does not affect normal operation; the cause is explained in the following article:
- https://juejin.cn/post/6870762347166728200
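Open the console in a browser; when everything is working, the console home page is displayed.
The "Cluster" page lists the information for every node in the cluster, which confirms that the cluster has been built successfully.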
Stopping the Cluster
Stopping the cluster works the same way as stopping a single node. First, shut down all Brokers:
$ sh mqshutdown broker
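- Tip: the command has to be run on every node, which becomes tedious as the number of nodes grows; consider writing a script that starts and stops the whole cluster in one step.
Then shut down all NameServers: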
$ sh mqshutdown namesrv
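As a starting point for the one-step script mentioned in the tip, the shutdown order above can be sketched as a loop over SSH. This assumes passwordless SSH from the control node to every host; `stop_cluster` and the `DRY_RUN` switch are hypothetical names, and the absolute path is used because a non-interactive SSH session may not load the PATH set in /etc/profile.

```shell
#!/bin/sh
# Stop all Brokers first, then all NameServers, on the given hosts.
# With DRY_RUN=1 the ssh commands are printed instead of executed.
stop_cluster() {
    rmq_bin="${ROCKETMQ_HOME:-/usr/local/rocketmq-4.7.1}/bin"
    for target in broker namesrv; do
        for host in "$@"; do
            cmd="sh $rmq_bin/mqshutdown $target"
            if [ "${DRY_RUN:-0}" = "1" ]; then
                echo "ssh $host $cmd"
            else
                ssh "$host" "$cmd"
            fi
        done
    done
}

# Example:
#   DRY_RUN=1 stop_cluster rocketmq01 rocketmq02 rocketmq03 rocketmq04
```

The outer loop runs over the component rather than the host, so every Broker is stopped before any NameServer goes down.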