Deployment Plan

| Host | Roles | IP |
| --- | --- | --- |
| rm01.hadoop.com | ResourceManager01 | 192.168.137.11 |
| nn01.hadoop.com | NameNode01, DFSZKFailoverController | 192.168.137.12 |
| rm02.hadoop.com (backup ResourceManager) | ResourceManager02 | 192.168.137.13 |
| nn02.hadoop.com (backup NameNode) | NameNode02, DFSZKFailoverController | 192.168.137.14 |
| dn01.hadoop.com | DataNode, NodeManager, QuorumPeerMain, JournalNode | 192.168.137.21 |
| dn02.hadoop.com | DataNode, NodeManager, QuorumPeerMain, JournalNode | 192.168.137.22 |
| dn03.hadoop.com | DataNode, NodeManager, QuorumPeerMain, JournalNode | 192.168.137.23 |
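Every node must be able to resolve the others' hostnames. If no DNS server is available (an assumption; use your DNS instead if you have one), a minimal sketch of the /etc/hosts entries implied by the table above, to be appended on every node as root:

192.168.137.11 rm01.hadoop.com
192.168.137.12 nn01.hadoop.com
192.168.137.13 rm02.hadoop.com
192.168.137.14 nn02.hadoop.com
192.168.137.21 dn01.hadoop.com
192.168.137.22 dn02.hadoop.com
192.168.137.23 dn03.hadoop.com

First, install and configure ZooKeeper on dn01, then copy it to dn02 and dn03: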
[hadoop@dn01 ~]$ tar -zxf /nfs_share/software/zookeeper-3.4.11.tar.gz -C ~
[hadoop@dn01 ~]$ vi .bashrc
export ZOOKEEPER_HOME=/home/hadoop/zookeeper-3.4.11
export PATH=$PATH:$ZOOKEEPER_HOME/bin
[hadoop@dn01 ~]$ source .bashrc
[hadoop@dn01 ~]$ cd zookeeper-3.4.11/conf
[hadoop@dn01 conf]$ mv zoo_sample.cfg zoo.cfg
[hadoop@dn01 conf]$ vi zoo.cfg
dataLogDir=/home/hadoop/zookeeper-3.4.11/log
dataDir=/home/hadoop/zookeeper-3.4.11/data
server.1=192.168.137.21:2888:3888
server.2=192.168.137.22:2888:3888
server.3=192.168.137.23:2888:3888
[hadoop@dn01 conf]$ cd ..
[hadoop@dn01 zookeeper-3.4.11]$ mkdir data log && echo "1" > data/myid
[hadoop@dn01 zookeeper-3.4.11]$ cd
[hadoop@dn01 ~]$ scp -r zookeeper-3.4.11 dn02.hadoop.com:/home/hadoop
[hadoop@dn01 ~]$ scp -r zookeeper-3.4.11 dn03.hadoop.com:/home/hadoop
[hadoop@dn01 ~]$ ssh hadoop@dn02.hadoop.com 'cd /home/hadoop/zookeeper-3.4.11/data && echo "2" > myid'
[hadoop@dn01 ~]$ ssh hadoop@dn03.hadoop.com 'cd /home/hadoop/zookeeper-3.4.11/data && echo "3" > myid'
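Each myid value must match the server.N id that zoo.cfg assigns to that host (1 for dn01, 2 for dn02, 3 for dn03). A quick check, assuming passwordless SSH is already set up between the data nodes:

[hadoop@dn01 ~]$ for h in dn01 dn02 dn03; do ssh $h.hadoop.com 'hostname; cat zookeeper-3.4.11/data/myid'; done

Then start the ensemble on all three nodes and check each server's role: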
[hadoop@dn01 ~]$ zkServer.sh start
[hadoop@dn02 ~]$ zkServer.sh start
[hadoop@dn03 ~]$ zkServer.sh start
[hadoop@dn01 ~]$ zkServer.sh status
[hadoop@dn02 ~]$ zkServer.sh status
[hadoop@dn03 ~]$ zkServer.sh status
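In a healthy three-node ensemble, exactly one server reports Mode: leader and the other two report Mode: follower. Client connectivity can also be confirmed with the bundled CLI:

[hadoop@dn01 ~]$ zkCli.sh -server dn01.hadoop.com:2181
[zk: dn01.hadoop.com:2181(CONNECTED) 0] ls /

Next, create the JournalNode storage directory on each data node: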
[hadoop@dn01 ~]$ cd hadoop-2.9.0 && mkdir journal
[hadoop@dn02 ~]$ cd hadoop-2.9.0 && mkdir journal
[hadoop@dn03 ~]$ cd hadoop-2.9.0 && mkdir journal
[hadoop@nn01 ~]$ cd hadoop-2.9.0/etc/hadoop/
[hadoop@nn01 hadoop]$ vi core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ns1</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop/hadoop-2.9.0/tmp</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>dn01.hadoop.com:2181,dn02.hadoop.com:2181,dn03.hadoop.com:2181</value>
  </property>
  <property>
    <name>ha.zookeeper.session-timeout.ms</name>
    <value>1000</value>
  </property>
</configuration>
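A quick sanity check that the default filesystem resolves to the logical nameservice rather than a single host:

[hadoop@nn01 hadoop]$ hdfs getconf -confKey fs.defaultFS
hdfs://ns1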
[hadoop@nn01 hadoop]$ vi hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hadoop/hadoop-2.9.0/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hadoop/hadoop-2.9.0/dfs/data</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>64m</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>ns1</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.ns1</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn1</name>
    <value>nn01.hadoop.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn2</name>
    <value>nn02.hadoop.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-address.ns1.nn1</name>
    <value>nn01.hadoop.com:53310</value>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-address.ns1.nn2</name>
    <value>nn02.hadoop.com:53310</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.nn1</name>
    <value>nn01.hadoop.com:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.nn2</name>
    <value>nn02.hadoop.com:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir.ns1.nn1</name>
    <value>qjournal://dn01.hadoop.com:8485;dn02.hadoop.com:8485;dn03.hadoop.com:8485/ns1</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir.ns1.nn2</name>
    <value>qjournal://dn01.hadoop.com:8485;dn02.hadoop.com:8485;dn03.hadoop.com:8485/ns1</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled.ns1</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.ns1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/hadoop/hadoop-2.9.0/journal</value>
  </property>
  <property>
    <name>ha.failover-controller.cli-check.rpc-timeout.ms</name>
    <value>50000</value>
  </property>
  <property>
    <name>ipc.client.connect.timeout</name>
    <value>60000</value>
  </property>
  <property>
    <name>dfs.image.transfer.bandwidthPerSec</name>
    <value>4194304</value>
  </property>
</configuration>
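Note that the sshfence method only works if each NameNode can reach the other over passphraseless SSH with the key configured above. A quick check from nn01, assuming the key has already been distributed:

[hadoop@nn01 hadoop]$ ssh -i /home/hadoop/.ssh/id_rsa nn02.hadoop.com hostname
nn02.hadoop.com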
[hadoop@nn01 hadoop]$ vi mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>rm01.hadoop.com:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>rm01.hadoop.com:19888</value>
  </property>
</configuration>
[hadoop@nn01 hadoop]$ vi yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.connect.retry-interval.ms</name>
    <value>2000</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
    <value>5000</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>dn01.hadoop.com:2181,dn02.hadoop.com:2181,dn03.hadoop.com:2181</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-state-store.parent-path</name>
    <value>/rmstore</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-num-retries</name>
    <value>500</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-retry-interval-ms</name>
    <value>2000</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-timeout-ms</name>
    <value>10000</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-acl</name>
    <value>world:anyone:rwcda</value>
  </property>
  <property>
    <name>yarn.resourcemanager.am.max-attempts</name>
    <value>2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarn-cluster</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>rm01.hadoop.com</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>rm02.hadoop.com</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>rm01.hadoop.com:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>rm01.hadoop.com:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm1</name>
    <value>rm01.hadoop.com:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
    <value>rm01.hadoop.com:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>rm01.hadoop.com:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.https.address.rm1</name>
    <value>rm01.hadoop.com:8090</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>rm02.hadoop.com:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>rm02.hadoop.com:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm2</name>
    <value>rm02.hadoop.com:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
    <value>rm02.hadoop.com:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>rm02.hadoop.com:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.https.address.rm2</name>
    <value>rm02.hadoop.com:8090</value>
  </property>
</configuration>
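A malformed tag in any of these files will make the daemons fail at startup, so it is worth validating the XML before distributing it (assuming xmllint from libxml2 is installed):

[hadoop@nn01 hadoop]$ xmllint --noout core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml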
[hadoop@nn01 hadoop]$ vi slaves
dn01.hadoop.com
dn02.hadoop.com
dn03.hadoop.com
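These files were edited on nn01 only; every other node needs the same configuration. A minimal distribution sketch, assuming Hadoop is unpacked at the same path on all hosts:

[hadoop@nn01 hadoop]$ for h in rm01 rm02 nn02 dn01 dn02 dn03; do scp core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml slaves $h.hadoop.com:/home/hadoop/hadoop-2.9.0/etc/hadoop/; done

With the configuration distributed and ZooKeeper running, format the HA state znode: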
[hadoop@nn01 ~]$ hdfs zkfc -formatZK
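If formatting succeeded, ZooKeeper now contains a /hadoop-ha/ns1 znode, which can be confirmed from any ZooKeeper node:

[hadoop@dn01 ~]$ zkCli.sh -server dn01.hadoop.com:2181
[zk: dn01.hadoop.com:2181(CONNECTED) 0] ls /hadoop-ha
[ns1]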
Start the JournalNodes, which synchronize edit-log data between the active and standby NameNodes:
[hadoop@dn01 ~]$ hadoop-daemon.sh start journalnode
[hadoop@dn02 ~]$ hadoop-daemon.sh start journalnode
[hadoop@dn03 ~]$ hadoop-daemon.sh start journalnode
Format HDFS and start the primary NameNode:
[hadoop@nn01 ~]$ hdfs namenode -format -clusterId c1
[hadoop@nn01 ~]$ hadoop-daemon.sh start namenode
Bootstrap and start the standby NameNode:
[hadoop@nn02 ~]$ hdfs namenode -bootstrapStandby
[hadoop@nn02 ~]$ hadoop-daemon.sh start namenode
Start the NameNode failover controllers (ZKFC) on both NameNodes:
[hadoop@nn01 ~]$ hadoop-daemon.sh start zkfc
[hadoop@nn02 ~]$ hadoop-daemon.sh start zkfc
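At this point one NameNode should have been elected active (typically nn1) and the other left in standby:

[hadoop@nn01 ~]$ hdfs haadmin -getServiceState nn1
active
[hadoop@nn01 ~]$ hdfs haadmin -getServiceState nn2
standby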
Start the DataNodes:
[hadoop@dn01 ~]$ hadoop-daemon.sh start datanode
[hadoop@dn02 ~]$ hadoop-daemon.sh start datanode
[hadoop@dn03 ~]$ hadoop-daemon.sh start datanode
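Confirm that all three DataNodes have registered with the active NameNode:

[hadoop@nn01 ~]$ hdfs dfsadmin -report | grep 'Live datanodes'
Live datanodes (3):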
Start the primary ResourceManager (start-yarn.sh also brings up the NodeManagers on the hosts listed in slaves):
[hadoop@rm01 ~]$ start-yarn.sh
Start the standby ResourceManager:
[hadoop@rm02 ~]$ yarn-daemon.sh start resourcemanager
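Check which ResourceManager holds the active role:

[hadoop@rm01 ~]$ yarn rmadmin -getServiceState rm1
active
[hadoop@rm01 ~]$ yarn rmadmin -getServiceState rm2
standby

Finally, verify each daemon through its web UI: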
http://nn01.hadoop.com:50070/dfshealth.html#tab-overview
http://nn02.hadoop.com:50070/dfshealth.html#tab-overview
http://rm01.hadoop.com:8088/cluster/cluster
http://rm02.hadoop.com:8088/cluster/cluster
HDFS HA Failover Test
[hadoop@nn01 ~]$ jps
2352 DFSZKFailoverController
2188 NameNode
3105 Jps
Kill the active NameNode process:
[hadoop@nn01 ~]$ kill -9 2188
Refresh the nn02 web page: it now reports itself as the active NameNode, which shows the failover succeeded.
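The same can be confirmed from the command line, after which the killed NameNode can be restarted as the new standby:

[hadoop@nn01 ~]$ hdfs haadmin -getServiceState nn2
active
[hadoop@nn01 ~]$ hadoop-daemon.sh start namenode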
ResourceManager HA Failover Test
[hadoop@rm01 ~]$ jps
1599 ResourceManager
1927 Jps
Launch a wordcount job, as sketched below:
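A minimal sketch using the examples jar that ships with Hadoop 2.9.0 (the /input and /output paths are assumptions; upload some text files to /input first):

[hadoop@nn01 ~]$ hdfs dfs -mkdir -p /input
[hadoop@nn01 ~]$ hdfs dfs -put ~/hadoop-2.9.0/etc/hadoop/*.xml /input
[hadoop@nn01 ~]$ yarn jar ~/hadoop-2.9.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.0.jar wordcount /input /output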
While the job is running, kill the active ResourceManager process:
[hadoop@rm01 ~]$ kill -9 1599
In the job's console output you can see the client retry and reconnect to the standby ResourceManager, which takes over; this shows the failover succeeded.
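As a final check, rm2 should now report itself as active:

[hadoop@rm01 ~]$ yarn rmadmin -getServiceState rm2
active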