Hadoop Environment Setup -- Building a Hadoop HA Cluster with ZooKeeper

1. Install the ZooKeeper cluster

For detailed installation steps, see the earlier post: https://blog.csdn.net/liyyzz33/article/details/88689594

2. Install the Hadoop cluster

For detailed installation steps, see the earlier post: https://blog.csdn.net/liyyzz33/article/details/88397249

From here on, you only need to modify the configuration of the cluster installed above.

3. Modify the Hadoop cluster configuration

Edit core-site.xml

vi core-site.xml


Add or modify the following properties inside <configuration>:

<!-- In an HA setup, fs.defaultFS must point to the nameservice
     (defined in hdfs-site.xml below), not to a single NameNode host -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://myha01</value>
</property>

<property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop/hddata/</value>
</property>

<property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
</property>

<property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
</property>

<!-- ZooKeeper quorum used by the ZKFailoverController -->
<property>
    <name>ha.zookeeper.quorum</name>
    <value>node1:2181,node2:2181,node3:2181</value>
</property>

<property>
    <name>ha.zookeeper.session-timeout.ms</name>
    <value>1000</value>
    <description>ms</description>
</property>
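After saving the file, a quick sanity check (assuming the Hadoop bin directory is on PATH) confirms that clients will resolve the nameservice rather than a fixed host:

hdfs getconf -confKey fs.defaultFS
# expected output: hdfs://myha01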


Edit hdfs-site.xml

vi hdfs-site.xml 


Add or modify the following properties inside <configuration>:

<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>

<!-- Not needed once HA is enabled (the standby NameNode does the
     checkpointing), but harmless to keep -->
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>node2:50090</value>
</property>

<!-- Logical name of the HDFS nameservice -->
<property>
    <name>dfs.nameservices</name>
    <value>myha01</value>
</property>

<!-- The two NameNodes that make up the nameservice -->
<property>
    <name>dfs.ha.namenodes.myha01</name>
    <value>nn1,nn2</value>
</property>

<!-- RPC address of nn1 -->
<property>
    <name>dfs.namenode.rpc-address.myha01.nn1</name>
    <value>node1:9000</value>
</property>

<!-- HTTP (web UI) address of nn1 -->
<property>
    <name>dfs.namenode.http-address.myha01.nn1</name>
    <value>node1:50070</value>
</property>

<!-- RPC address of nn2 -->
<property>
    <name>dfs.namenode.rpc-address.myha01.nn2</name>
    <value>node2:9000</value>
</property>

<!-- HTTP (web UI) address of nn2 -->
<property>
    <name>dfs.namenode.http-address.myha01.nn2</name>
    <value>node2:50070</value>
</property>

<!-- JournalNode quorum where the NameNodes write shared edit logs -->
<property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://node1:8485;node2:8485;node3:8485/myha01</value>
</property>

<!-- Local directory where each JournalNode stores its edits -->
<property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/data/hadoop/data/journaldata</value>
</property>

<!-- Enable automatic failover via ZKFC -->
<property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
</property>

<!-- Proxy class clients use to locate the active NameNode -->
<property>
    <name>dfs.client.failover.proxy.provider.myha01</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

<!-- Fencing: try sshfence first, then fall back to a no-op shell so
     failover can still proceed if SSH fencing fails -->
<property>
    <name>dfs.ha.fencing.methods</name>
    <value>
        sshfence
        shell(/bin/true)
    </value>
</property>

<!-- Private key sshfence uses to log in to the other NameNode -->
<property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
</property>

<property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
</property>

<property>
    <name>ha.failover-controller.cli-check.rpc-timeout.ms</name>
    <value>60000</value>
</property>
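Note that sshfence can only do its job if each NameNode can reach the other over passwordless SSH with the key configured above. A quick check from node1, assuming the NameNode processes run as root as elsewhere in this setup:

ssh -i /home/hadoop/.ssh/id_rsa -o BatchMode=yes root@node2 hostname
# should print "node2" without a password prompt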


Edit mapred-site.xml

vi mapred-site.xml



Add or modify the following properties inside <configuration>:

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

<property>
    <name>mapreduce.jobhistory.address</name>
    <value>node1:10020</value>
</property>

<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>node1:19888</value>
</property>


Edit yarn-site.xml

 vi yarn-site.xml 

    
    
Add or modify the following properties inside <configuration>:

<!-- Enable ResourceManager HA -->
<property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
</property>

<!-- Cluster id for the ResourceManager pair -->
<property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
</property>

<!-- Logical ids of the two ResourceManagers -->
<property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
</property>

<!-- Hosts running rm1 and rm2 -->
<property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>node2</value>
</property>

<property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>node3</value>
</property>

<!-- ZooKeeper quorum used for RM leader election and state storage -->
<property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>node1:2181,node2:2181,node3:2181</value>
</property>

<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>

<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>

<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>86400</value>
</property>

<!-- Let the RM recover running applications after a failover -->
<property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
</property>

<!-- Store RM state in ZooKeeper -->
<property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
    

4. Redistribute the Hadoop installation to the other cluster nodes

scp -r /data/hadoop/hadoop-2.7.7 root@node2:/data/hadoop/
scp -r /data/hadoop/hadoop-2.7.7 root@node3:/data/hadoop/
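If the installation directory already exists on every node and only the configuration changed, syncing just the config directory is faster. A minimal sketch, assuming rsync is available on all nodes:

for h in node2 node3; do
    rsync -av /data/hadoop/hadoop-2.7.7/etc/hadoop/ root@$h:/data/hadoop/hadoop-2.7.7/etc/hadoop/
done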

Initializing the Hadoop HA cluster

1. Start ZooKeeper

node1

[root@node1]# /data/hadoop/zookeeper/bin/zkServer.sh start
[root@node1]# jps
2674 Jps
2647 QuorumPeerMain
[root@node1 bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /data/hadoop/zookeeper/bin/../conf/zoo.cfg
Mode: follower

node2

[root@node2]# /data/hadoop/zookeeper/bin/zkServer.sh start
[root@node2]# jps
2674 Jps
2647 QuorumPeerMain
[root@node2 bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /data/hadoop/zookeeper/bin/../conf/zoo.cfg
Mode: follower

node3

[root@node3]# /data/hadoop/zookeeper/bin/zkServer.sh start
[root@node3]# jps
2674 Jps
2647 QuorumPeerMain
[root@node3 bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /data/hadoop/zookeeper/bin/../conf/zoo.cfg
Mode: leader

2. Start the JournalNode process on each configured journalnode node

Per the earlier plan, mine run on node1, node2, and node3. The start commands are as follows:

cd /data/hadoop/hadoop-2.7.7/sbin/
./hadoop-daemon.sh start journalnode
[root@node1 bin]# jps
2739 JournalNode
2788 Jps
2647 QuorumPeerMain
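To confirm a JournalNode is actually listening, its HTTP endpoint can be probed as well (port 8480 is the default, since dfs.journalnode.http-address was not overridden above):

curl -s -o /dev/null -w '%{http_code}\n' http://node1:8480
# 200 means the JournalNode web endpoint is up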

3. Format the NameNode

Run this on node1 only, after the JournalNodes are up:

hdfs namenode -format

4. Copy the metadata generated on node1 to the other NameNode node (node2)

scp -r /data/hadoop/hddata root@node2:/data/hadoop/
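Alternatively, HDFS can copy the metadata itself: running the following on node2 pulls the freshly formatted namespace from the active NameNode over RPC, instead of copying the directory by hand:

hdfs namenode -bootstrapStandby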

5. Format ZKFC

Important: this must be run on a NameNode node only (here, node1):

hdfs zkfc -formatZK
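To verify the format worked, list the HA znode with the ZooKeeper CLI; formatZK creates a child named after the nameservice (myha01):

/data/hadoop/zookeeper/bin/zkCli.sh -server node1:2181 ls /hadoop-ha
# expected output includes: [myha01]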

Starting the cluster

1. Start HDFS

cd /data/hadoop/hadoop-2.7.7/sbin/
./start-dfs.sh
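If everything is wired up correctly, jps on node1 should show roughly the following processes (PIDs will vary, and DataNode appears only if node1 is also listed as a worker):

jps
# NameNode
# DataNode
# JournalNode
# DFSZKFailoverController
# QuorumPeerMain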

2. Start YARN

Run this on either of the two ResourceManager nodes (node2 or node3):

cd /data/hadoop/hadoop-2.7.7/sbin/
./start-yarn.sh

If the standby ResourceManager did not come up, start it manually on node2 or node3:

./yarn-daemon.sh start resourcemanager
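The ResourceManager web UI listens on port 8088 by default (not overridden in yarn-site.xml above), and the standby RM redirects to the active one, which makes for a quick liveness check:

curl -sI http://node2:8088/cluster | head -n 1
# the active RM answers 200 OK; the standby answers with a 3xx redirect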

3. Start the MapReduce job history server

./mr-jobhistory-daemon.sh start historyserver
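Once it is up, the history server web UI should answer on the address configured in mapred-site.xml:

curl -s -o /dev/null -w '%{http_code}\n' http://node1:19888
# expected output: 200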

4. Check the state of each master node

HDFS

[root@node1]#  hdfs haadmin -getServiceState nn1
standby
[root@node1]# hdfs haadmin -getServiceState nn2
active

YARN

[root@node1]# yarn rmadmin -getServiceState rm1
standby
[root@node1]# yarn rmadmin -getServiceState rm2
active
[root@node1]#
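To exercise automatic failover end to end, kill the active NameNode and watch ZKFC promote the standby. A rough test, assuming nn2 on node2 is currently active; <pid> stands for the NameNode pid reported by jps:

ssh root@node2 jps             # note the NameNode pid
ssh root@node2 kill -9 <pid>   # kill the active NameNode
sleep 10
hdfs haadmin -getServiceState nn1
# should now report: active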

5. Check via the web UIs

HDFS

node1 (screenshot: NameNode web UI at http://node1:50070)
node2 (screenshot: NameNode web UI at http://node2:50070)

YARN

The standby node's web UI automatically redirects to the active node.
(screenshot: ResourceManager web UI)

MapReduce job history server web UI
(screenshot: JobHistory web UI at http://node1:19888)
