In the previous article, I explained the internals of Ozone OM HA, but not how to actually configure and use it. In this article I walk through the HA setup I did in my test environment, covering the concrete configuration steps and the pitfalls you may hit along the way. For more on the principles behind OM HA, see my previous article, Ozone OM HA Internals.
Unlike the Active/Standby role split of the HDFS NameNode, OM HA uses the Raft protocol to keep service state consistent. Under Raft, the OM services are organized as one Leader plus multiple Followers, and we normally deploy 2N+1 instances. With 2N+1 nodes, a majority of N+1 votes always exists during leader election, which rules out tie votes; for example, with 3 nodes a leader is elected as soon as 2 nodes agree, and the cluster can tolerate one node failure.
Accordingly, I set up 3 OM services in my test environment, deployed across the following three machines: lyq-m1-xx.xx.xx.xx, lyq-m2-xx.xx.xx.xx and lyq-m3-xx.xx.xx.xx.
Because the HA implementation is Raft-based, we need to enable the OM Ratis option (Ratis is the Java implementation of Raft) and configure the location of the OM metastore DB:
<property>
  <name>ozone.om.db.dirs</name>
  <value>/home/hdfs/data/meta</value>
</property>
<property>
  <name>ozone.om.ratis.enable</name>
  <value>true</value>
</property>
OM HA mode also has the notion of a service; my service id is configured as follows:
<property>
  <name>ozone.om.service.ids</name>
  <value>om-service-test</value>
</property>
Next come the OM node ids under this om service, used to distinguish the individual OM instances within it:
<property>
  <name>ozone.om.nodes.om-service-test</name>
  <value>omNode-1,omNode-2,omNode-3</value>
</property>
Ozone OM does not yet support OM federation, so ozone.om.internal.service.id does not need to be configured. If OM gains multi-nameservice support in the future, however, the internal service id will be needed to tell an OM process which nameservice it should start under.
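To make that distinction concrete, here is a hedged sketch of what such a future configuration might look like; the second service id, om-service-backup, is purely hypothetical:
<!-- Hypothetical sketch: only meaningful once multiple OM service ids are supported -->
<property>
  <name>ozone.om.service.ids</name>
  <value>om-service-test,om-service-backup</value>
</property>
<property>
  <!-- Tells this OM process which of the service ids above it belongs to -->
  <name>ozone.om.internal.service.id</name>
  <value>om-service-test</value>
</property>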
With the basic OM HA settings defined, the next step is the RPC/HTTP address configuration under each om node id: 3 properties per OM node, so 9 properties for 3 OMs. The result is as follows:
<property>
  <name>ozone.om.address.om-service-test.omNode-1</name>
  <value>lyq-m1-xx.xx.xx.xx:9862</value>
</property>
<property>
  <name>ozone.om.http-address.om-service-test.omNode-1</name>
  <value>lyq-m1-xx.xx.xx.xx:9874</value>
</property>
<property>
  <name>ozone.om.https-address.om-service-test.omNode-1</name>
  <value>lyq-m1-xx.xx.xx.xx:9875</value>
</property>
<property>
  <name>ozone.om.address.om-service-test.omNode-2</name>
  <value>lyq-m2-xx.xx.xx.xx:9862</value>
</property>
<property>
  <name>ozone.om.http-address.om-service-test.omNode-2</name>
  <value>lyq-m2-xx.xx.xx.xx:9874</value>
</property>
<property>
  <name>ozone.om.https-address.om-service-test.omNode-2</name>
  <value>lyq-m2-xx.xx.xx.xx:9875</value>
</property>
<property>
  <name>ozone.om.address.om-service-test.omNode-3</name>
  <value>lyq-m3-xx.xx.xx.xx:9862</value>
</property>
<property>
  <name>ozone.om.http-address.om-service-test.omNode-3</name>
  <value>lyq-m3-xx.xx.xx.xx:9874</value>
</property>
<property>
  <name>ozone.om.https-address.om-service-test.omNode-3</name>
  <value>lyq-m3-xx.xx.xx.xx:9875</value>
</property>
At this point the OM HA configuration is essentially done; all that remains is to distribute this configuration file to the 3 nodes lyq-m1-xx.xx.xx.xx, lyq-m2-xx.xx.xx.xx and lyq-m3-xx.xx.xx.xx.
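For reference, a minimal distribution sketch, assuming the standard Ozone tarball layout (config at ozone/etc/hadoop/ozone-site.xml) and passwordless ssh for the hdfs user; both are assumptions on my part:
# Copy the HA-enabled ozone-site.xml to all three OM hosts (paths are assumptions)
for host in lyq-m1-xx.xx.xx.xx lyq-m2-xx.xx.xx.xx lyq-m3-xx.xx.xx.xx; do
  scp ozone/etc/hadoop/ozone-site.xml hdfs@${host}:ozone/etc/hadoop/
done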
With the HA configuration complete, the next step is to start the services.
At this point, if you rush straight to the OM start command,
ozone/bin/ozone --daemon start om
you will find that the OM does not start, and the log shows an "OM not initialized" error:
2020-01-17 08:57:55,210 [main] INFO - registered UNIX signal handlers for [TERM, HUP, INT]
2020-01-17 08:57:55,846 [main] INFO - ozone.om.internal.service.id is not defined, falling back to ozone.om.service.ids to find serviceID for OzoneManager if it is HA enabled cluster
2020-01-17 08:57:55,872 [main] INFO - Found matching OM address with OMServiceId: om-service-test, OMNodeId: omNode-1, RPC Address: lyq-m1-xx.xx.xx.xx:9862 and Ratis port: 9872
2020-01-17 08:57:55,872 [main] INFO - Setting configuration key ozone.om.http-address with value of key ozone.om.http-address.omNode-1: lyq-m1-xx.xx.xx.xx:9874
2020-01-17 08:57:55,872 [main] INFO - Setting configuration key ozone.om.https-address with value of key ozone.om.https-address.omNode-1: lyq-m1-xx.xx.xx.xx:9875
2020-01-17 08:57:55,872 [main] INFO - Setting configuration key ozone.om.address with value of key ozone.om.address.omNode-1: lyq-m1-xx.xx.xx.xx:9862
OM not initialized.
2020-01-17 08:57:55,887 [shutdown-hook-0] INFO - SHUTDOWN_MSG:
Here, we first need to run the OM directory initialization command:
ozone/bin/ozone om --init
After that, rerun the om daemon start command above and the OM will come up.
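Putting the two steps together, with a jps check added as a quick sanity test (the OzoneManager process name reflects the OM main class; treat the check itself as my habit rather than an official step):
# On each OM node: initialize once, start the daemon, then verify the JVM process
ozone/bin/ozone om --init
ozone/bin/ozone --daemon start om
jps | grep OzoneManager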
During OM startup, the OM compares the address of the node it is starting on against the per-node addresses in the configuration, and from the match identifies its own OM id:
2020-01-17 08:57:55,872 [main] INFO - Found matching OM address with OMServiceId: om-service-test, OMNodeId: omNode-1, RPC Address: lyq-m1-xx.xx.xx.xx:9862 and Ratis port: 9872
2020-01-17 08:57:55,872 [main] INFO - Setting configuration key ozone.om.http-address with value of key ozone.om.http-address.omNode-1: lyq-m1-xx.xx.xx.xx:9874
2020-01-17 08:57:55,872 [main] INFO - Setting configuration key ozone.om.https-address with value of key ozone.om.https-address.omNode-1: lyq-m1-xx.xx.xx.xx:9875
2020-01-17 08:57:55,872 [main] INFO - Setting configuration key ozone.om.address with value of key ozone.om.address.omNode-1: lyq-m1-xx.xx.xx.xx:9862
Once all 3 OM services are up, we can see the leader election play out in the OM logs. The log below shows that omNode-1 became the Leader, while omNode-2 and omNode-3 took the Follower role.
2020-01-19 00:27:10,878 INFO org.apache.ratis.server.impl.RaftServerImpl: omNode-2@group-C0483FFA3DBE: changes role from FOLLOWER to FOLLOWER at term 600 for recognizeCandidate:omNode-1
2020-01-19 00:27:10,878 INFO org.apache.ratis.server.impl.RoleInfo: omNode-2: shutdown FollowerState
2020-01-19 00:27:10,878 INFO org.apache.ratis.server.impl.RoleInfo: omNode-2: start FollowerState
2020-01-19 00:27:10,878 INFO org.apache.ratis.server.impl.FollowerState: omNode-2@group-C0483FFA3DBE-FollowerState was interrupted: java.lang.InterruptedException: sleep interrupted
2020-01-19 00:27:11,221 INFO org.apache.ratis.server.impl.RaftServerImpl: omNode-2@group-C0483FFA3DBE: change Leader from null to omNode-1 at term 600 for appendEntries, leader elected after 56782ms
2020-01-19 00:27:11,261 INFO org.apache.ratis.server.impl.RaftServerImpl: omNode-2@group-C0483FFA3DBE: set configuration 0: [omNode-3:lyq-m3-xx.xx.xx.xx:9872, omNode-1:lyq-m1-xx.xx.xx.xx:9872, omNode-2:lyq-m2-xx.xx.xx.xx:9872], old=null at 0
2020-01-19 00:27:11,267 INFO org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: omNode-2@group-C0483FFA3DBE-SegmentedRaftLogWorker: Starting segment from index:0
2020-01-19 00:27:11,397 INFO org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: omNode-2@group-C0483FFA3DBE-SegmentedRaftLogWorker: created new log segment /home/hdfs/data/meta/ratis/3bf88a72-722a-315e-b909-c0483ffa3dbe/current/log_inprogress_0
After the OM services are fully started, we can use the OM HA admin command to check the current role of each node. The command output looks like this (the id below is the om service id):
[hdfs@lyq-m1 yiqlin]$ ~/apache/ozone/bin/ozone admin om getserviceroles -id=om-service-test
omNode-2 : FOLLOWER
omNode-3 : FOLLOWER
omNode-1 : LEADER
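To watch a re-election in action (an optional check I find useful, not a required step; the stop command assumes the standard ozone --daemon syntax), stop the current leader and query the roles again; one of the former followers should be promoted:
# Run the stop on the current leader node (lyq-m1 here), then query from any node
ozone/bin/ozone --daemon stop om
ozone/bin/ozone admin om getserviceroles -id=om-service-test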
In OM HA mode, volume/bucket/key operations must additionally carry the o3:// schema prefix with the om service id, as in the command below:
[hdfs@lyq-m1 yiqlin]$ ~/apache/ozone/bin/ozone sh volume create o3://om-service-test/volumetest
2020-01-19 02:35:59,087 [main] INFO - Creating Volume: volumetest, with hdfs as owner.
Otherwise you will get the following error message:
[hdfs@lyq-m1 yiqlin]$ ~/apache/ozone/bin/ozone sh volume create /volumetest
Service ID or host name must not be omitted when ozone.om.service.ids is defined.
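The same o3://&lt;om-service-id&gt; prefix applies to bucket and key operations; the bucket name, key name and local file below are purely illustrative:
ozone/bin/ozone sh bucket create o3://om-service-test/volumetest/buckettest
ozone/bin/ozone sh key put o3://om-service-test/volumetest/buckettest/key1 ./localfile.txt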
Finally, here is the complete OM HA configuration I used in my tests:
<property>
  <name>ozone.om.db.dirs</name>
  <value>/home/hdfs/data/meta</value>
</property>
<property>
  <name>ozone.om.ratis.enable</name>
  <value>true</value>
</property>
<property>
  <name>ozone.om.service.ids</name>
  <value>om-service-test</value>
</property>
<property>
  <name>ozone.om.nodes.om-service-test</name>
  <value>omNode-1,omNode-2,omNode-3</value>
</property>
<property>
  <name>ozone.om.address.om-service-test.omNode-1</name>
  <value>lyq-m1-xx.xx.xx.xx:9862</value>
</property>
<property>
  <name>ozone.om.http-address.om-service-test.omNode-1</name>
  <value>lyq-m1-xx.xx.xx.xx:9874</value>
</property>
<property>
  <name>ozone.om.https-address.om-service-test.omNode-1</name>
  <value>lyq-m1-xx.xx.xx.xx:9875</value>
</property>
<property>
  <name>ozone.om.address.om-service-test.omNode-2</name>
  <value>lyq-m2-xx.xx.xx.xx:9862</value>
</property>
<property>
  <name>ozone.om.http-address.om-service-test.omNode-2</name>
  <value>lyq-m2-xx.xx.xx.xx:9874</value>
</property>
<property>
  <name>ozone.om.https-address.om-service-test.omNode-2</name>
  <value>lyq-m2-xx.xx.xx.xx:9875</value>
</property>
<property>
  <name>ozone.om.address.om-service-test.omNode-3</name>
  <value>lyq-m3-xx.xx.xx.xx:9862</value>
</property>
<property>
  <name>ozone.om.http-address.om-service-test.omNode-3</name>
  <value>lyq-m3-xx.xx.xx.xx:9874</value>
</property>
<property>
  <name>ozone.om.https-address.om-service-test.omNode-3</name>
  <value>lyq-m3-xx.xx.xx.xx:9875</value>
</property>