Hadoop 3.1.3 Cluster Setup (HA + YARN)

Environment:

CentOS 6.5, JDK 8

Prerequisites:

1. Passwordless SSH between the servers

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
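The commands above only authorize logins to the local machine; for the servers to reach each other without a password, each node's public key also has to be appended to authorized_keys on the other nodes. A minimal sketch, assuming the host names bonree01/bonree02/bonree03 used later in this guide:

$ ssh-copy-id -i ~/.ssh/id_dsa.pub root@bonree02
$ ssh-copy-id -i ~/.ssh/id_dsa.pub root@bonree03
$ ssh root@bonree02 hostname   # should print bonree02 without asking for a password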

2. Time synchronization between the servers
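On CentOS 6.5, one simple option (assuming the nodes can reach a public NTP pool; the server below is only an example) is a periodic ntpdate from cron:

$ yum install -y ntpdate
$ echo '*/10 * * * * /usr/sbin/ntpdate pool.ntp.org' >> /var/spool/cron/root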

3. Install a ZooKeeper cluster
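Once ZooKeeper is installed on all three nodes, it can be started and checked roughly like this (assuming zkServer.sh is on the PATH, as arranged by the environment variables below):

$ zkServer.sh start
$ zkServer.sh status   # one node should report Mode: leader, the others Mode: follower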

Setup steps:

1. Download Apache Hadoop 3.1.3, upload it to the servers, and extract it

https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-3.1.3/hadoop-3.1.3.tar.gz
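For example, assuming the install directory /data/br/base used by the environment variables in the next step:

$ tar -zxvf hadoop-3.1.3.tar.gz -C /data/br/base/
$ mv /data/br/base/hadoop-3.1.3 /data/br/base/hadoop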

2. Set the environment variables:

export BASE_DIR=/data/br
export JAVA_HOME=$BASE_DIR/base/jdk1.8.0_181
export HADOOP_HOME=$BASE_DIR/base/hadoop
export ZOOKEEPER_HOME=/data/br/base/zookeeper

export HDFS_JOURNALNODE_USER=root
export HDFS_ZKFC_USER=root
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_NODEMANAGER_USER=root
export YARN_RESOURCEMANAGER_USER=root

export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin:$KAFKA_HOME/bin
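These exports are assumed to live in a shell profile (for example /etc/profile) on every node; after appending them, reload the profile and verify:

$ source /etc/profile
$ hadoop version   # should report Hadoop 3.1.3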

3. hadoop-env.sh settings

export JAVA_HOME=/data/br/base/jdk1.8.0_181
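The HDFS_*_USER and YARN_*_USER exports from step 2 can also be placed here instead, since the Hadoop 3 start scripts source hadoop-env.sh; for example:

export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_JOURNALNODE_USER=root
export HDFS_ZKFC_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root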

4. core-site.xml configuration

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/br/cache/hadoop/ha</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>bonree01:2181,bonree02:2181,bonree03:2181</value>
  </property>
</configuration>

5. hdfs-site.xml configuration

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>bonree01:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>bonree02:8020</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://bonree01:8485;bonree02:8485;bonree03:8485/mycluster</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_dsa</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>

6. mapred-site.xml configuration

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>$HADOOP_HOME/share/hadoop/mapreduce/*:$HADOOP_HOME/share/hadoop/mapreduce/lib/*</value>
  </property>
</configuration>

7. yarn-site.xml configuration

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_HOME</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>rmhacluster1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>bonree02</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>bonree03</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>bonree02:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>bonree03:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>bonree01:2181,bonree02:2181,bonree03:2181</value>
  </property>
</configuration>

8. workers: list the DataNode hosts in $HADOOP_HOME/etc/hadoop/workers

bonree01
bonree02
bonree03

Copy the configuration above to every node in the cluster; the configuration is then complete. If you need additional parameters, refer to the official documentation: https://hadoop.apache.org/docs/r3.1.3/hadoop-project-dist/hadoop-common/ClusterSetup.html
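One way to distribute the files (a sketch, assuming the same install path on every node and passwordless SSH from the node where they were edited) is:

$ for host in bonree02 bonree03; do scp $HADOOP_HOME/etc/hadoop/* $host:$HADOOP_HOME/etc/hadoop/; done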

 

II. Formatting and initialization (prerequisite: the ZooKeeper cluster is already running)

1. On every designated JournalNode machine, run the following to start the JournalNode:

   hdfs --daemon start journalnode

2. On one of the NameNode machines (which in this setup also runs a ZooKeeper process), format the NameNode:

   hdfs namenode -format

3. On that same NameNode machine, start the NameNode:

   hdfs --daemon start namenode

4. On the other NameNode machine, sync the metadata from the active NameNode so that it becomes the standby NameNode:

   hdfs namenode -bootstrapStandby

5. On the node hosting the active NameNode, initialize the NameNode HA state in ZooKeeper:

  hdfs zkfc -formatZK

6. Run start-dfs.sh to start the HA HDFS daemons, then run start-yarn.sh on one of the designated ResourceManager machines to start the ResourceManagers and NodeManagers.
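For example (assuming bonree01 for HDFS and bonree02 for YARN, matching the configuration above):

   start-dfs.sh     # on bonree01: starts the NameNodes, DataNodes, JournalNodes and ZKFCs
   start-yarn.sh    # on bonree02: starts the ResourceManagers and NodeManagers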

Check once more that the expected processes are running on every machine; the setup is then complete.
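A quick way to verify (the exact list depends on each node's roles) is:

   jps
   # expected processes, depending on the node: NameNode, DataNode, JournalNode,
   # DFSZKFailoverController, ResourceManager, NodeManager, QuorumPeerMain (ZooKeeper)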

HDFS web UI: http://bonree01:9870

ResourceManager web UI: http://bonree02:8088

Note: this guide and its configuration are kept minimal and are only intended to produce a working cluster; use them for reference only.
