Setting Up a Hadoop 2.7.7 High-Availability (HA) Cluster

I. Preparation

1. Cluster plan

49.235.8.131   master  NameNode, DataNode, NodeManager, QuorumPeerMain, ResourceManager, DFSZKFailoverController, JournalNode
180.76.149.30  slave1  NameNode, DataNode, NodeManager, QuorumPeerMain, ResourceManager, DFSZKFailoverController, JournalNode
183.76.179.221 slave2  DataNode, NodeManager, JournalNode, QuorumPeerMain

2. Install the JDK on all three machines
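
The JDK installation itself is not shown in this post; a minimal sketch, assuming an Oracle JDK 8u161 tarball (the file name and upload location are assumptions) and the /java/jdk1.8.0_161 install path that hadoop-env.sh uses later:

# Run on master, slave1 and slave2
[root@master java]# tar -xzvf jdk-8u161-linux-x64.tar.gz -C /java/
# Make JAVA_HOME match the value that will be set in hadoop-env.sh
[root@master java]# echo 'export JAVA_HOME=/java/jdk1.8.0_161' >> /etc/profile
[root@master java]# echo 'export PATH=$PATH:$JAVA_HOME/bin' >> /etc/profile
[root@master java]# source /etc/profile
[root@master java]# java -version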

3. Configure passwordless SSH login between the nodes
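
A minimal sketch of the passwordless-SSH setup (run as root; hostnames follow the cluster plan above, and the resulting /root/.ssh/id_rsa key is the one referenced later by the sshfence configuration):

# Generate a key pair (accept the defaults)
[root@master ~]# ssh-keygen -t rsa
# Copy the public key to every node, including the local one; repeat on slave1 and slave2
[root@master ~]# ssh-copy-id root@master
[root@master ~]# ssh-copy-id root@slave1
[root@master ~]# ssh-copy-id root@slave2
# Verify: this should log in without a password prompt
[root@master ~]# ssh slave1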

4. Set up the ZooKeeper cluster
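
The ZooKeeper build-out is assumed to be done already; as a sketch, conf/zoo.cfg on all three nodes could look like the following (the dataDir path is an assumption; the install path matches the start commands in Part III):

# /home/zookeeper/zookeeper-3.4.14/conf/zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/zookeeper/data
clientPort=2181
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888

# Each node also needs a myid file under dataDir: 1 on master, 2 on slave1, 3 on slave2
[root@master ~]# echo 1 > /home/zookeeper/data/myid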

II. Installing the Hadoop Cluster

1. Upload the Hadoop package hadoop-2.7.7.tar.gz and extract it into a newly created hadoop directory.

# Upload the package
[root@master hadoop]# rz -y
# Extract it into the target directory
[root@master software]# tar -xzvf hadoop-2.7.7.tar.gz -C /home/hadoop/

2. Create an hdfs directory under the hadoop directory, and data, name, and tmp directories under hdfs.

[root@master hdfs]# mkdir -p /home/hadoop/hdfs
[root@master hdfs]# mkdir -p /home/hadoop/hdfs/data
[root@master hdfs]# mkdir -p /home/hadoop/hdfs/name
[root@master hdfs]# mkdir -p /home/hadoop/hdfs/tmp

3. Edit hadoop-env.sh and set JAVA_HOME as follows

[root@master hdfs]# vim /home/hadoop/hadoop-2.7.7/etc/hadoop/hadoop-env.sh
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/java/jdk1.8.0_161

4. Edit core-site.xml



<configuration>
  <!-- Default file system: the HDFS nameservice defined in hdfs-site.xml -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <!-- Working directory for temporary and metadata files -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hdfs/tmp</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <!-- Allow the root user to act as a proxy user from any host and group -->
  <property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
  </property>
  <!-- ZooKeeper quorum used by the ZKFC for automatic failover -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>master:2181,slave1:2181,slave2:2181</value>
  </property>
</configuration>


5. Edit hdfs-site.xml



<configuration>
  <!-- Logical name of the HDFS nameservice -->
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <!-- The two NameNodes that make up the nameservice -->
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <!-- RPC addresses of the two NameNodes -->
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>master:9000</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>slave1:9000</value>
  </property>
  <!-- HTTP (web UI) addresses of the two NameNodes -->
  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>master:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>slave1:50070</value>
  </property>
  <!-- JournalNode quorum that stores the shared edit log -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://master:8485;slave1:8485;slave2:8485/mycluster</value>
  </property>
  <!-- Local directory where each JournalNode keeps its edits -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/hadoop/journal/data/</value>
  </property>
  <!-- Enable automatic failover via ZKFC -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!-- Proxy provider clients use to locate the active NameNode -->
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- Fencing methods: try sshfence first, fall back to shell(/bin/true) -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>
      sshfence
      shell(/bin/true)
    </value>
  </property>
  <!-- SSH private key used by sshfence -->
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <!-- sshfence connection timeout in milliseconds -->
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
</configuration>

6. Edit mapred-site.xml (copy it from mapred-site.xml.template first if it does not exist yet)



    
<configuration>
  <!-- Run MapReduce jobs on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

7. Edit yarn-site.xml





    
<configuration>
  <!-- Enable ResourceManager HA -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- Cluster id shared by the two ResourceManagers -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
  </property>
  <!-- Logical ids of the two ResourceManagers -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <!-- Hosts that run rm1 and rm2 -->
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>slave1</value>
  </property>
  <!-- ZooKeeper quorum used for ResourceManager leader election and state -->
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>master:2181,slave1:2181,slave2:2181</value>
  </property>
  <!-- Auxiliary service required by the MapReduce shuffle -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

8. Edit the slaves file (the hosts that run DataNode/NodeManager)

master
slave1
slave2
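
The start commands in Part III are run without full paths, which assumes the configured Hadoop directory has been copied to the other two nodes and that its bin and sbin directories are on PATH everywhere. A sketch of both steps, assuming the same layout on every node:

# Distribute the configured Hadoop directory to slave1 and slave2
[root@master hadoop]# scp -r /home/hadoop/hadoop-2.7.7 root@slave1:/home/hadoop/
[root@master hadoop]# scp -r /home/hadoop/hadoop-2.7.7 root@slave2:/home/hadoop/
# On all three nodes, append to /etc/profile and re-source it
export HADOOP_HOME=/home/hadoop/hadoop-2.7.7
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin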

III. Starting the Hadoop Cluster

1. Start the ZooKeeper cluster (on master, slave1, and slave2)

[root@master hadoop]# /home/zookeeper/zookeeper-3.4.14/bin/zkServer.sh start
[root@slave1 hadoop]# /home/zookeeper/zookeeper-3.4.14/bin/zkServer.sh start
[root@slave2 hadoop]# /home/zookeeper/zookeeper-3.4.14/bin/zkServer.sh start
# Check the status: one leader, two followers
[root@master hadoop]# /home/zookeeper/zookeeper-3.4.14/bin/zkServer.sh status
[root@slave1 hadoop]# /home/zookeeper/zookeeper-3.4.14/bin/zkServer.sh status
[root@slave2 hadoop]# /home/zookeeper/zookeeper-3.4.14/bin/zkServer.sh status

2. Manually start the JournalNodes (run on master, slave1, and slave2)

[root@master hadoop]# hadoop-daemon.sh start journalnode
[root@slave1 hadoop]# hadoop-daemon.sh start journalnode
[root@slave2 hadoop]# hadoop-daemon.sh start journalnode
# Verify with jps: master, slave1, and slave2 should each now show a JournalNode process

3. Format the NameNode (on master only)

# Formatting generates the initial HDFS metadata under the directory configured by hadoop.tmp.dir in core-site.xml
[root@master hadoop]# hdfs namenode -format

4. Copy everything under the hadoop.tmp.dir directory to the machine that hosts the other NameNode (slave1).

[root@master hadoop]# scp -r /home/hadoop/hdfs/ root@slave1:/home/hadoop/
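
Alternatively, instead of copying the metadata by hand, the standby NameNode can be initialized with Hadoop's bootstrap command (run on slave1 while the JournalNodes are up):

[root@slave1 hadoop]# hdfs namenode -bootstrapStandby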

5. Format ZKFC (running it once on the active NameNode, i.e. master, is enough)

[root@master hadoop]# hdfs zkfc -formatZK

6. Start HDFS

[root@master hadoop]# start-dfs.sh

7. Start YARN

[root@master hadoop]# start-yarn.sh

8. The standby ResourceManager still has to be started manually on slave1

[root@slave1 hdfs]# yarn-daemon.sh start resourcemanager

9. Check the started processes with jps

# On master
[root@master hadoop]# jps
28625 Jps
16258 DataNode
16472 JournalNode
5385 QuorumPeerMain
16793 ResourceManager
16906 NodeManager
16666 DFSZKFailoverController
17565 NameNode

# On slave1
[root@slave1 hdfs]# jps
11328 NodeManager
11011 DataNode
10358 ResourceManager
11240 DFSZKFailoverController
7784 QuorumPeerMain
11113 JournalNode
11933 Jps
# On slave2 (per the cluster plan, slave2 runs no NameNode, ResourceManager, or ZKFC)
[root@slave2 hdfs]# jps
11328 NodeManager
11011 DataNode
7784 QuorumPeerMain
11113 JournalNode
11933 Jps


IV. Verifying the Cluster

http://49.235.8.131:50070/dfshealth.html#tab-overview

The overview page should report master:9000 (active).

http://180.76.149.30:50070/dfshealth.html#tab-overview

The overview page should report slave1:9000 (standby).

Some commands for checking the cluster's working state:

# Report the status of each HDFS node
[root@master hadoop]# hdfs dfsadmin -report
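
The HA state of each NameNode and ResourceManager can also be queried directly; the ids nn1/nn2 and rm1/rm2 are the ones defined in hdfs-site.xml and yarn-site.xml above. Killing the active NameNode is a simple way to confirm that automatic failover works:

# Which NameNode is active / standby?
[root@master hadoop]# hdfs haadmin -getServiceState nn1
[root@master hadoop]# hdfs haadmin -getServiceState nn2
# Which ResourceManager is active / standby?
[root@master hadoop]# yarn rmadmin -getServiceState rm1
[root@master hadoop]# yarn rmadmin -getServiceState rm2
# Failover test: kill the active NameNode, confirm the other becomes active, then restart it
[root@master hadoop]# kill -9 <NameNode pid>
[root@master hadoop]# hadoop-daemon.sh start namenode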
