1. Unpack the tarball: tar -zxf zookeeper-3.4.5.tar.gz
2. In the conf directory, rename zoo_sample.cfg to zoo.cfg: # mv zoo_sample.cfg zoo.cfg
3. Edit zoo.cfg to the following (this file is identical on every host):
# tickTime (ms) is required by zoo.cfg; 2000 is the zoo_sample.cfg default
tickTime=2000
dataDir=/export/crawlspace/mahadev/zookeeper/server1/data
clientPort=2181
initLimit=5
syncLimit=2
server.0=172.17.138.67:4888:5888
server.1=172.17.138.68:4888:5888
server.2=172.17.138.69:4888:5888
server.3=172.17.138.70:4888:5888
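Each host also needs a myid file in its dataDir containing the numeric id from its server.N line above; this step is required by ZooKeeper but not shown in the original notes. For example, on the host listed as server.0:
echo 0 > /export/crawlspace/mahadev/zookeeper/server1/data/myid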
4. Start the service: run bin/zkServer.sh start on the three machines one after another. Do not leave too long an interval between machines (nodes that are already up will log connection errors until a quorum forms). If no errors remain, the startup succeeded.
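After all nodes are up, you can verify that the ensemble elected a leader (a sanity check, assuming you run it from the ZooKeeper install directory):
bin/zkServer.sh status
One node should report Mode: leader and the others Mode: follower.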
5. Test: run the following on one machine and make sure the connection succeeds; otherwise the Hadoop NameNodes configured later may all end up in standby mode.
#bin/zkCli.sh -server 127.0.0.1:2181
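Once connected, a quick way to confirm the session works is to list the root znodes and exit (standard zkCli commands, suggested here as a check):
ls /
quit
ls / should return at least [zookeeper] if the server is answering client requests.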
#vi core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoopData/tmp</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>EBPPTEST01:2181,EBPPTEST02:2181,EBPPTEST03:2181</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>172.17.138.67</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
  </property>
</configuration>
#vi mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>EBPPTEST01:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>EBPPTEST01:19888</value>
  </property>
  <property>
    <name>mapreduce.tasktracker.map.tasks.maximum</name>
    <value>3</value>
  </property>
  <property>
    <name>mapreduce.tasktracker.reduce.tasks.maximum</name>
    <value>3</value>
  </property>
</configuration>
#vi hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hadoop/hadoopData/filesystem/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/data,/home/hadoop/hadoopData/filesystem/data</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>EBPPTEST01:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>EBPPTEST01:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>EBPPTEST02:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>EBPPTEST02:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://EBPPTEST01:8485;EBPPTEST02:8485;EBPPTEST03:8485/mycluster</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_dsa</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/hadoop/hadoopData/journalData</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>
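Because dfs.ha.fencing.methods is sshfence, the failover controller must be able to SSH from one NameNode host to the other using the key listed above. A minimal sketch of preparing that key as the hadoop user (assumed setup, not shown in the original notes):
ssh-keygen -t dsa -f /home/hadoop/.ssh/id_dsa -N ''
ssh-copy-id -i /home/hadoop/.ssh/id_dsa.pub hadoop@EBPPTEST01
ssh-copy-id -i /home/hadoop/.ssh/id_dsa.pub hadoop@EBPPTEST02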
#vi yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
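Note that this yarn-site.xml only registers the shuffle service; NodeManagers usually also need to know where the ResourceManager runs. A minimal addition, assuming the ResourceManager lives on EBPPTEST01 (this property is not in the original text):
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>EBPPTEST01</value>
  </property>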
1. Start the JournalNode cluster
On every node that should run a JournalNode, execute: hadoop-daemon.sh start journalnode
2. Initialize the shared edits (QJM) storage from the existing NameNode's metadata; run this on the current NameNode host:
./hdfs namenode -initializeSharedEdits
3. Start the existing NameNode
hadoop-daemon.sh start namenode
4. Format the second NameNode from the first one's metadata; this must be run on the other machine:
./hdfs namenode -bootstrapStandby
5. Stop HDFS
stop-dfs.sh
6. Initialize the HA state znode in ZooKeeper
hdfs zkfc -formatZK
7. Start the cluster (with automatic failover enabled, start-dfs.sh also starts the ZKFC daemons):
start-dfs.sh
start-yarn.sh
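To verify the HA setup came up correctly, query each NameNode's state (nn1 and nn2 are the ids defined in hdfs-site.xml; these checks are a suggested addition):
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
One should report active and the other standby.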
References: http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html
http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/