Hadoop Installation Tutorial: HA (High Availability) Mode


System Preparation

  • Three machines in total: hadoop-01, hadoop-02, hadoop-03
  • hadoop-01 runs the NameNode (active); hadoop-02 runs the NameNode (standby)
  • hadoop-02 runs the ResourceManager (active); hadoop-03 runs the ResourceManager (standby)
  • All three machines act as DataNode and NodeManager
  • All three machines act as JournalNode
  • Run yum install -y snappy on all three machines

Hadoop Installation and Configuration

  • Hadoop installation

    Upload the software and extract it into the ~/app directory:
    tar -zxvf hadoop-{versionid}.tar.gz -C ~/app
    
  • Hadoop configuration

    After Hadoop is installed, the main configuration files need to be set up:
    ${HADOOP_HOME}/etc/hadoop/core-site.xml
    ${HADOOP_HOME}/etc/hadoop/hdfs-site.xml
    ${HADOOP_HOME}/etc/hadoop/yarn-site.xml
    ${HADOOP_HOME}/etc/hadoop/slaves
    [Note: the data directory paths referenced in these settings must be created manually beforehand.]
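The directories can be pre-created on every node with a one-liner. This is only a sketch: the BASE root and the layout below are assumptions mirroring the placeholder paths used in the config files later; adjust them to your actual storage layout.

```shell
# Sketch: pre-create the data directories referenced by the config files.
# BASE is an assumed root -- substitute your real storage path.
BASE="${HADOOP_DATA_BASE:-$HOME/hadoop-data}"
mkdir -p "$BASE/hadoop/tmp" \
         "$BASE/hadoop/data/namenode" \
         "$BASE/hadoop/data/datanode" \
         "$BASE/data/ha/journal" \
         "$BASE/yarn/yarnlocal" \
         "$BASE/yarn/yarnlog"
```

Run this on all three machines so that DataNode, JournalNode, and NodeManager directories exist everywhere.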
    

    For core-site.xml:

    <configuration>

      <property>
        <name>hadoop.tmp.dir</name>
        <value>/path/to/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
      </property>

      <!-- "hyz" is the logical nameservice name defined in hdfs-site.xml -->
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hyz</value>
      </property>

      <property>
        <name>io.compression.codecs</name>
        <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec</value>
      </property>

      <property>
        <name>ipc.client.connect.timeout</name>
        <value>120000</value>
      </property>

      <property>
        <name>ha.failover-controller.cli-check.rpc-timeout.ms</name>
        <value>120000</value>
      </property>

    </configuration>

For hdfs-site.xml:
Hadoop officially provides two HDFS HA setups: one based on QJM, and one based on NFS.

QJM: the Quorum Journal Manager. In this scheme the EditLog is shared through a group of JournalNodes, using a Paxos-like quorum protocol (similar in spirit to the ZAB protocol that ZooKeeper uses) to keep the EditLogs of the active and standby NameNodes consistent.
NFS: Network File System, or conventional shared storage. A network share (for example a NAS) is mounted on both NameNode servers; the active NameNode writes EditLog changes to the share, and the standby NameNode reads them as it detects modifications, keeping the two NameNodes' data consistent.

Deployment: QJM only requires starting a few JournalNode processes; NFS requires mounting a shared storage volume.
Configuration: the only difference in hdfs-site.xml is the value of the dfs.namenode.shared.edits.dir property.
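For comparison, here is that single differing property in both flavors. The QJM value matches this guide's cluster; the NFS mount path is purely hypothetical.

```xml
<!-- QJM flavor (used in this guide): a quorum of JournalNodes -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://hadoop-01:8485;hadoop-02:8485;hadoop-03:8485/hyz</value>
</property>

<!-- NFS flavor: a directory on shared storage mounted on both NameNodes
     (the path below is a hypothetical example) -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>file:///mnt/shared-storage/ha/shared-edits</value>
</property>
```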

This article uses the QJM approach to configure HDFS HA. The hdfs-site.xml content is as follows:

    <configuration>

      <!-- DataNode storage -->
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///path/to/hadoop/data/datanode</value>
      </property>

      <!-- HDFS HA (QJM) -->
      <property>
        <name>dfs.nameservices</name>
        <value>hyz</value>
      </property>

      <property>
        <name>dfs.ha.namenodes.hyz</name>
        <value>nn1,nn2</value>
      </property>

      <property>
        <name>dfs.namenode.rpc-address.hyz.nn1</name>
        <value>hadoop-01:8020</value>
      </property>

      <property>
        <name>dfs.namenode.rpc-address.hyz.nn2</name>
        <value>hadoop-02:8020</value>
      </property>

      <property>
        <name>dfs.namenode.http-address.hyz.nn1</name>
        <value>hadoop-01:50070</value>
      </property>

      <property>
        <name>dfs.namenode.http-address.hyz.nn2</name>
        <value>hadoop-02:50070</value>
      </property>

      <property>
        <name>dfs.namenode.servicerpc-address.hyz.nn1</name>
        <value>hadoop-01:53310</value>
      </property>

      <property>
        <name>dfs.namenode.servicerpc-address.hyz.nn2</name>
        <value>hadoop-02:53310</value>
      </property>

      <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop-01:8485;hadoop-02:8485;hadoop-03:8485/hyz</value>
      </property>

      <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/path/to/data/ha/journal</value>
      </property>

      <property>
        <name>dfs.journalnode.rpc-address</name>
        <value>0.0.0.0:8485</value>
      </property>

      <property>
        <name>dfs.journalnode.http-address</name>
        <value>0.0.0.0:8482</value>
      </property>

      <property>
        <name>dfs.client.failover.proxy.provider.hyz</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
      </property>

      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
      </property>

      <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/path/to/.ssh/id_rsa</value>
      </property>

      <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>10000</value>
      </property>

      <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
      </property>

      <property>
        <name>ha.zookeeper.quorum</name>
        <value>zkF01:2181,zkF02:2181,zkF03:2181</value>
      </property>

      <!-- NameNode -->
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///path/to/hadoop/data/namenode</value>
      </property>

      <property>
        <name>dfs.namenode.acls.enabled</name>
        <value>true</value>
      </property>

      <property>
        <name>dfs.namenode.handler.count</name>
        <value>50</value>
        <description>The number of server threads for the namenode.</description>
      </property>

      <!-- DataNode -->
      <property>
        <name>dfs.datanode.handler.count</name>
        <value>20</value>
        <description>The number of server threads for the datanode.</description>
      </property>

      <property>
        <name>dfs.datanode.max.xcievers</name>
        <value>8192</value>
      </property>

      <property>
        <name>dfs.datanode.socket.write.timeout</name>
        <value>480000</value>
      </property>

      <property>
        <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
        <value>true</value>
      </property>

      <property>
        <name>dfs.datanode.failed.volumes.tolerated</name>
        <value>0</value>
      </property>

      <!-- Permissions and replication -->
      <property>
        <name>dfs.permissions.enabled</name>
        <value>true</value>
        <description>If "true", enable permission checking in HDFS. If "false", permission checking is turned off, but all other behavior is unchanged. Switching from one parameter value to the other does not change the mode, owner or group of files or directories.</description>
      </property>

      <property>
        <name>dfs.permissions</name>
        <value>false</value>
      </property>

      <property>
        <name>dfs.replication.min</name>
        <value>3</value>
        <description>Minimal block replication.</description>
      </property>

      <property>
        <name>dfs.replication</name>
        <value>3</value>
      </property>

      <property>
        <name>dfs.support.append</name>
        <value>true</value>
        <description>Set whether HDFS supports append.</description>
      </property>

      <property>
        <name>fs.checkpoint.period</name>
        <value>60</value>
        <description>The number of seconds between two periodic checkpoints.</description>
      </property>

      <property>
        <name>dfs.balance.bandwidthPerSec</name>
        <value>10485760</value>
        <description>Specifies the maximum bandwidth that each datanode can utilize for the balancing purpose in term of the number of bytes per second.</description>
      </property>

      <property>
        <name>fs.hdfs.impl.disable.cache</name>
        <value>false</value>
      </property>

      <property>
        <name>dfs.socket.timeout</name>
        <value>1800000</value>
      </property>

      <property>
        <name>fs.trash.interval</name>
        <value>1440</value>
        <description>Number of minutes between trash checkpoints. If zero, the trash feature is disabled.</description>
      </property>

      <property>
        <name>dfs.blocksize</name>
        <value>134217728</value>
      </property>

    </configuration>
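As a quick sanity check on the sized values used in hdfs-site.xml above, the units work out as follows:

```shell
# Confirm the units behind the raw byte/minute values in hdfs-site.xml.
echo "dfs.blocksize:              $((134217728 / 1024 / 1024)) MB"   # -> 128 MB
echo "fs.trash.interval:          $((1440 / 60)) hours"              # -> 24 hours
echo "dfs.balance.bandwidthPerSec: $((10485760 / 1024 / 1024)) MB/s" # -> 10 MB/s
```

So the configuration uses a 128 MB block size, keeps trash for one day, and caps balancer traffic at 10 MB/s per DataNode.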

For yarn-site.xml:

    <configuration>

      <!-- NodeManager resources and directories -->
      <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>[number of CPU cores]</value>
      </property>

      <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>[memory size in MB]</value>
      </property>

      <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/path/to/yarn/yarnlocal</value>
      </property>

      <property>
        <name>yarn.nodemanager.log-dirs</name>
        <value>/path/to/yarn/yarnlog</value>
      </property>

      <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/path/to/yarn/remote-app-logs</value>
      </property>

      <!-- ResourceManager HA -->
      <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
      </property>

      <property>
        <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
        <value>true</value>
      </property>

      <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>yarn-cluster</value>
      </property>

      <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
      </property>

      <property>
        <name>yarn.resourcemanager.ha.id</name>
        <value>rm1</value>
      </property>

      <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
      </property>

      <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>zkF01:2181,zkF02:2181,zkF03:2181</value>
      </property>

      <property>
        <name>yarn.resourcemanager.zk.state-store.address</name>
        <value>zkF01:2181,zkF02:2181,zkF03:2181</value>
      </property>

      <property>
        <name>ha.zookeeper.quorum</name>
        <value>zkF01:2181,zkF02:2181,zkF03:2181</value>
      </property>

      <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
      </property>

      <property>
        <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
        <value>5000</value>
      </property>

      <!-- JobHistory server (these two properties are normally placed in mapred-site.xml) -->
      <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop-02:10020</value>
      </property>

      <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop-02:19888</value>
      </property>

      <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
      </property>

      <!-- rm1 addresses (hadoop-02) -->
      <property>
        <name>yarn.resourcemanager.address.rm1</name>
        <value>hadoop-02:8032</value>
      </property>

      <property>
        <name>yarn.resourcemanager.scheduler.address.rm1</name>
        <value>hadoop-02:8030</value>
      </property>

      <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>hadoop-02:8088</value>
      </property>

      <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
        <value>hadoop-02:8031</value>
      </property>

      <property>
        <name>yarn.resourcemanager.admin.address.rm1</name>
        <value>hadoop-02:8033</value>
      </property>

      <property>
        <name>yarn.resourcemanager.ha.admin.address.rm1</name>
        <value>hadoop-02:23142</value>
      </property>

      <!-- rm2 addresses (hadoop-03) -->
      <property>
        <name>yarn.resourcemanager.address.rm2</name>
        <value>hadoop-03:8032</value>
      </property>

      <property>
        <name>yarn.resourcemanager.scheduler.address.rm2</name>
        <value>hadoop-03:8030</value>
      </property>

      <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>hadoop-03:8088</value>
      </property>

      <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
        <value>hadoop-03:8031</value>
      </property>

      <property>
        <name>yarn.resourcemanager.admin.address.rm2</name>
        <value>hadoop-03:8033</value>
      </property>

      <property>
        <name>yarn.resourcemanager.ha.admin.address.rm2</name>
        <value>hadoop-03:23142</value>
      </property>

      <!-- NodeManager services and log retention -->
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>

      <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>

      <property>
        <description>Address where the localizer IPC is.</description>
        <name>yarn.nodemanager.localizer.address</name>
        <value>0.0.0.0:8040</value>
      </property>

      <property>
        <description>NM Webapp address.</description>
        <name>yarn.nodemanager.webapp.address</name>
        <value>0.0.0.0:8042</value>
      </property>

      <property>
        <name>yarn.nodemanager.log.retain-seconds</name>
        <value>10800</value>
      </property>

      <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>43200</value>
      </property>

      <property>
        <name>yarn.log-aggregation.retain-check-interval-seconds</name>
        <value>7200</value>
      </property>

    </configuration>
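Note that yarn.resourcemanager.ha.id identifies the local ResourceManager, so its value must differ per RM host: the file above (rm1) belongs on hadoop-02, while on hadoop-03 the same property should read:

```xml
<!-- yarn-site.xml on hadoop-03 only: identify this ResourceManager as rm2 -->
<property>
  <name>yarn.resourcemanager.ha.id</name>
  <value>rm2</value>
</property>
```

All other yarn-site.xml content can be identical on both ResourceManager hosts.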

For the slaves file (one worker hostname per line):
hadoop-01
hadoop-02
hadoop-03
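The file can be generated in one command. A sketch (SLAVES_FILE defaults to a temporary path here; on a real cluster point it at ${HADOOP_HOME}/etc/hadoop/slaves):

```shell
# Sketch: write one worker hostname per line into the slaves file.
SLAVES_FILE="${SLAVES_FILE:-/tmp/slaves}"
printf '%s\n' hadoop-01 hadoop-02 hadoop-03 > "$SLAVES_FILE"
cat "$SLAVES_FILE"
```

Remember to distribute the same file to all three machines along with the rest of the configuration.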


Starting Hadoop

  • In ${HADOOP_HOME}/etc/hadoop/hadoop-env.sh, manually set export JAVA_HOME=/opt/jdk (an absolute path is required)
  • hadoop-01 (nn1)
    • ${HADOOP_HOME}/bin/hdfs zkfc -formatZK
  • hadoop-01 / hadoop-02 / hadoop-03
    • ${HADOOP_HOME}/sbin/hadoop-daemon.sh start journalnode
  • hadoop-01 (nn1)
    • ${HADOOP_HOME}/bin/hdfs namenode -format
    • ${HADOOP_HOME}/sbin/hadoop-daemon.sh start namenode
  • hadoop-02 (nn2)
    • ${HADOOP_HOME}/bin/hdfs namenode -bootstrapStandby
    • ${HADOOP_HOME}/sbin/hadoop-daemon.sh start namenode
  • hadoop-01
    • ${HADOOP_HOME}/sbin/hadoop-daemons.sh start datanode
  • hadoop-01 / hadoop-02
    • ${HADOOP_HOME}/sbin/hadoop-daemon.sh start zkfc
  • hadoop-02 / hadoop-03
    • ${HADOOP_HOME}/sbin/start-yarn.sh
  • hadoop-02
    • ${HADOOP_HOME}/sbin/mr-jobhistory-daemon.sh start historyserver

Verifying Hadoop

  • Use the jps command to check the running processes
    [screenshot: jps output on hadoop-01]

    [screenshot: jps output on hadoop-02]

    [screenshot: jps output on hadoop-03]
  • Open http://hadoop-01:50070 in a browser to check the NameNode state
    [screenshot: NameNode status page]
  • From the command line, hdfs haadmin -getServiceState nn1 and yarn rmadmin -getServiceState rm1 report whether each HA peer is active or standby
