ZooKeeper from Beginner to Expert 13: Implementing Hadoop HA with ZooKeeper

Hadoop by itself does not provide automatic failover; ZooKeeper is needed to implement Hadoop HA. The Hadoop HA setup is the most complex build among all the Hadoop ecosystem components, so this section walks through in detail how to use ZooKeeper to set up a Hadoop HA environment.

Environment:

bigdata131 192.168.126.131
bigdata132 192.168.126.132
bigdata133 192.168.126.133
bigdata134 192.168.126.134

Installation package downloads:

zookeeper-3.4.10.tar.gz (extraction code: nvv2)
hadoop-2.7.3.tar.gz (extraction code: r8xo)

1. How ZooKeeper implements Hadoop HA

(figure: architecture of a Hadoop HA cluster coordinated by ZooKeeper)

As the diagram above shows, a minimal Hadoop HA cluster built on a ZooKeeper ensemble needs at least 4 machines:

ZooKeeper cluster:

bigdata131
bigdata132
bigdata133

Hadoop cluster:

bigdata131 NameNode1 ResourceManager1 Journalnode
bigdata132 NameNode2 ResourceManager2 Journalnode
bigdata133 DataNode1
bigdata134 DataNode2

2. Setting up the ZooKeeper cluster

See the earlier article "ZooKeeper from Beginner to Expert 9: ZooKeeper Setup in Cluster Mode".

3. Setting up the Hadoop HA cluster

Unless noted otherwise, the following steps are performed on bigdata131:

3.1 Upload the Hadoop installation package

Use WinSCP to upload the Hadoop package to the /root/tools/ directory on bigdata131 (the directory was created in advance).

# ls /root/tools/
hadoop-2.7.3.tar.gz

3.2 Extract the Hadoop package

Enter /root/tools/ and extract the package into /root/trainings/ (also created in advance):

# cd /root/tools/
# tar -zxvf hadoop-2.7.3.tar.gz -C /root/trainings/

3.3 Configure Hadoop environment variables (do this on all 4 hosts)

# cd /root/trainings/hadoop-2.7.3/
# pwd
/root/trainings/hadoop-2.7.3
# vim /root/.bash_profile
Append the following at the end of the file:
HADOOP_HOME=/root/trainings/hadoop-2.7.3
export HADOOP_HOME
PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export PATH

Press Esc, type :wq to save and exit, then use source to apply the changes immediately:

# source /root/.bash_profile
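
To confirm the new PATH takes effect, a quick sanity check on each host (the exact version banner may differ on your machine):

```shell
# the shell should now resolve Hadoop's binaries from $HADOOP_HOME/bin
which hadoop
hadoop version
```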

3.4 Configure the Hadoop HA parameters

Enter the Hadoop configuration directory:

# cd /root/trainings/hadoop-2.7.3/etc/hadoop/

(1) Edit hadoop-env.sh (set JAVA_HOME explicitly):

# echo $JAVA_HOME
/root/trainings/jdk1.8.0_144
# vim hadoop-env.sh
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/root/trainings/jdk1.8.0_144

(2) Edit hdfs-site.xml (defines the nameservice and the NameNodes it contains):

# vim hdfs-site.xml

<configuration>
    <!-- Logical name of the HDFS nameservice -->
    <property>
        <name>dfs.nameservices</name>
        <value>ns1</value>
    </property>
    <!-- The NameNodes belonging to nameservice ns1 -->
    <property>
        <name>dfs.ha.namenodes.ns1</name>
        <value>nn1,nn2</value>
    </property>
    <!-- RPC address of nn1 -->
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn1</name>
        <value>bigdata131:9000</value>
    </property>
    <!-- HTTP (web UI) address of nn1 -->
    <property>
        <name>dfs.namenode.http-address.ns1.nn1</name>
        <value>bigdata131:50070</value>
    </property>
    <!-- RPC address of nn2 -->
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn2</name>
        <value>bigdata132:9000</value>
    </property>
    <!-- HTTP (web UI) address of nn2 -->
    <property>
        <name>dfs.namenode.http-address.ns1.nn2</name>
        <value>bigdata132:50070</value>
    </property>
    <!-- JournalNode quorum that shares the NameNode edit log -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://bigdata131:8485;bigdata132:8485/ns1</value>
    </property>
    <!-- Local directory where each JournalNode stores its edits -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/root/trainings/hadoop-2.7.3/journal</value>
    </property>
    <!-- Enable automatic failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- Class clients use to locate the active NameNode -->
    <property>
        <name>dfs.client.failover.proxy.provider.ns1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- Fencing methods used to isolate the old active during failover -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
            sshfence
            shell(/bin/true)
        </value>
    </property>
    <!-- SSH private key used by the sshfence method -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
    <!-- sshfence connection timeout, in milliseconds -->
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
</configuration>

(3) Edit core-site.xml (create the tmp directory first):

# mkdir /root/trainings/hadoop-2.7.3/tmp
# vim core-site.xml

<configuration>
    <!-- The default filesystem points at the nameservice, not at a single host -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns1</value>
    </property>
    <!-- Hadoop temporary/working directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/root/trainings/hadoop-2.7.3/tmp</value>
    </property>
    <!-- ZooKeeper quorum used by the ZKFailoverControllers -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>bigdata131:2181,bigdata132:2181,bigdata133:2181</value>
    </property>
</configuration>
(4) Edit mapred-site.xml:
Copy the template file mapred-site.xml.template to mapred-site.xml, then edit it:

# cp mapred-site.xml.template mapred-site.xml
# vim mapred-site.xml

<configuration>
    <!-- Run MapReduce jobs on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
(5) Edit yarn-site.xml (enables ResourceManager HA):

# vim yarn-site.xml

<configuration>
    <!-- Enable ResourceManager HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <!-- Cluster id -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>yrc</value>
    </property>
    <!-- Logical ids of the two ResourceManagers -->
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <!-- Hostnames behind rm1 and rm2 -->
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>bigdata131</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>bigdata132</value>
    </property>
    <!-- ZooKeeper quorum used by the ResourceManagers -->
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>bigdata131:2181,bigdata132:2181,bigdata133:2181</value>
    </property>
    <!-- Shuffle service required by MapReduce -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
(6) Edit the slaves file (the worker nodes):

# vim slaves
bigdata133
bigdata134

3.5 Copy the configured Hadoop directory to the other nodes

[root@bigdata131 ~]# scp -r /root/trainings/hadoop-2.7.3/ root@bigdata132:/root/trainings/

[root@bigdata131 ~]# scp -r /root/trainings/hadoop-2.7.3/ root@bigdata133:/root/trainings/
[root@bigdata131 ~]# scp -r /root/trainings/hadoop-2.7.3/ root@bigdata134:/root/trainings/

3.6 Start the ZooKeeper cluster

[root@bigdata131 ~]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /root/trainings/zookeeper-3.4.10/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

[root@bigdata132 ~]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /root/trainings/zookeeper-3.4.10/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

[root@bigdata133 ~]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /root/trainings/zookeeper-3.4.10/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

3.7 Start the JournalNodes on bigdata131 and bigdata132

[root@bigdata131 ~]# hadoop-daemon.sh start journalnode
starting journalnode, logging to /root/trainings/hadoop-2.7.3/logs/hadoop-root-journalnode-bigdata131.out
[root@bigdata132 ~]# hadoop-daemon.sh start journalnode
starting journalnode, logging to /root/trainings/hadoop-2.7.3/logs/hadoop-root-journalnode-bigdata132.out

3.8 Format HDFS and ZooKeeper (run on bigdata131)

[root@bigdata131 ~]# hdfs namenode -format
18/12/02 00:08:47 INFO common.Storage: Storage directory /root/trainings/hadoop-2.7.3/tmp/dfs/name has been successfully formatted.
# copy the formatted NameNode metadata to the standby NameNode (bigdata132)
[root@bigdata131 ~]# scp -r /root/trainings/hadoop-2.7.3/tmp root@bigdata132:/root/trainings/hadoop-2.7.3/
# format ZooKeeper
[root@bigdata131 ~]# hdfs zkfc -formatZK
18/12/02 00:09:59 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/ns1 in ZK.
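
If the formatting succeeded, the HA parent znode now exists in ZooKeeper. A quick sketch of verifying it with the ZooKeeper command-line client:

```shell
# connect to any ensemble member
zkCli.sh -server bigdata131:2181
# then, inside the CLI prompt:
#   ls /hadoop-ha        -> should list [ns1]
```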

3.9 Start the Hadoop HA cluster

Start the Hadoop cluster from bigdata131:

[root@bigdata131 ~]# start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [bigdata131 bigdata132]
bigdata131: starting namenode, logging to /root/trainings/hadoop-2.7.3/logs/hadoop-root-namenode-bigdata131.out
bigdata132: starting namenode, logging to /root/trainings/hadoop-2.7.3/logs/hadoop-root-namenode-bigdata132.out
bigdata133: starting datanode, logging to /root/trainings/hadoop-2.7.3/logs/hadoop-root-datanode-bigdata133.out
bigdata134: starting datanode, logging to /root/trainings/hadoop-2.7.3/logs/hadoop-root-datanode-bigdata134.out
Starting journal nodes [bigdata131 bigdata132 ]
bigdata131: journalnode running as process 1275. Stop it first.
bigdata132: journalnode running as process 1265. Stop it first.
Starting ZK Failover Controllers on NN hosts [bigdata131 bigdata132]
bigdata131: starting zkfc, logging to /root/trainings/hadoop-2.7.3/logs/hadoop-root-zkfc-bigdata131.out
bigdata132: starting zkfc, logging to /root/trainings/hadoop-2.7.3/logs/hadoop-root-zkfc-bigdata132.out

starting yarn daemons
starting resourcemanager, logging to /root/trainings/hadoop-2.7.3/logs/yarn-root-resourcemanager-bigdata131.out
bigdata133: starting nodemanager, logging to /root/trainings/hadoop-2.7.3/logs/yarn-root-nodemanager-bigdata133.out
bigdata134: starting nodemanager, logging to /root/trainings/hadoop-2.7.3/logs/yarn-root-nodemanager-bigdata134.out

[root@bigdata131 ~]# jps
1232 QuorumPeerMain
1939 ResourceManager
1524 NameNode
2200 Jps
1275 JournalNode
1839 DFSZKFailoverController

[root@bigdata132 ~]# jps
1217 QuorumPeerMain
1265 JournalNode
1346 NameNode
1461 DFSZKFailoverController
1518 Jps

[root@bigdata133 ~]# jps
1365 NodeManager
1213 QuorumPeerMain
1469 Jps
1263 DataNode

[root@bigdata134 ~]# jps
1303 NodeManager
1435 Jps
1228 DataNode

On bigdata132, the second ResourceManager must be started separately:

[root@bigdata132 ~]# yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /root/trainings/hadoop-2.7.3/logs/yarn-root-resourcemanager-bigdata132.out
[root@bigdata132 ~]# jps
1217 QuorumPeerMain
1265 JournalNode
1346 NameNode
1603 Jps
1556 ResourceManager
1461 DFSZKFailoverController
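
With both ResourceManagers up, yarn rmadmin reports which one currently holds the active role (rm1 and rm2 are the ids defined in yarn-site.xml; which one is active depends on startup order):

```shell
yarn rmadmin -getServiceState rm1   # typically: active
yarn rmadmin -getServiceState rm2   # typically: standby
```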

4. Testing the Hadoop HA cluster

(1) Normal operation

(figures: the /hadoop-ha/ns1 znode in ZooKeeper and the NameNode web UIs of bigdata131 and bigdata132)

As shown: in ZooKeeper, the current active NameNode for ns1 is bigdata131; the Hadoop web UIs confirm that bigdata131 is in the active state and bigdata132 is in the standby state.
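
The same states can also be checked from the command line with hdfs haadmin (nn1 and nn2 are the NameNode ids defined in hdfs-site.xml):

```shell
hdfs haadmin -getServiceState nn1   # expected here: active
hdfs haadmin -getServiceState nn2   # expected here: standby
```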

(2) Kill the NameNode process on bigdata131, then refresh ZooKeeper and the web UIs to observe the change

[root@bigdata131 ~]# jps
1232 QuorumPeerMain
1939 ResourceManager
1524 NameNode
2266 Jps
1275 JournalNode
1839 DFSZKFailoverController
[root@bigdata131 ~]# kill -9 1524

(figures: the ZooKeeper znode and the NameNode web UIs after the failover)

As shown: in ZooKeeper, the active NameNode for ns1 has switched to bigdata132; the web UI on bigdata131 is no longer reachable, and bigdata132 has become active.

Hadoop HA therefore performs failover correctly and can serve Hadoop clients with much higher availability.
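
To restore redundancy after the test, the killed NameNode can simply be restarted on bigdata131; it rejoins the cluster as a standby rather than taking the active role back:

```shell
[root@bigdata131 ~]# hadoop-daemon.sh start namenode
[root@bigdata131 ~]# hdfs haadmin -getServiceState nn1
# typically reports: standby (there is no automatic failback)
```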

This concludes the walkthrough of setting up a Hadoop HA environment with ZooKeeper. Have fun!
