Hadoop HA on Yarn: Cluster Configuration

Cluster Setup

Because the number of servers is limited, each machine here runs quite a few processes:

Machine      Installed software   Running processes
hadoop001    Hadoop, Zookeeper    NameNode, DFSZKFailoverController, ResourceManager, DataNode, NodeManager, QuorumPeerMain, JournalNode
hadoop002    Hadoop, Zookeeper    NameNode, DFSZKFailoverController, ResourceManager, DataNode, NodeManager, QuorumPeerMain, JournalNode
hadoop003    Hadoop, Zookeeper    DataNode, NodeManager, QuorumPeerMain, JournalNode

Notes [2]:

In Hadoop 2.x, HDFS HA typically consists of two NameNodes, one in active state and one in standby. The active NameNode serves all client requests, while the standby NameNode serves none; it only synchronizes the active NameNode's state so that it can take over quickly if the active NameNode fails.

Hadoop 2.0 officially provides two HDFS HA solutions: NFS and QJM (proposed by Cloudera; the principle is similar to ZooKeeper). I use QJM here. The active and standby NameNodes synchronize metadata through a group of JournalNodes; a write is considered successful once it has been persisted to a majority of the JournalNodes, so an odd number of JournalNodes is usually configured.

The installation of the JDK, Hadoop and Zookeeper, along with the environment variable setup, is omitted here.
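
Once the cluster is configured and running, the active/standby roles described above can be verified with hdfs haadmin (nn1 and nn2 are the NameNode IDs defined in hdfs-site.xml below):

hdfs haadmin -getServiceState nn1   # prints "active" or "standby"
hdfs haadmin -getServiceState nn2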
 

Passwordless Login

Pay close attention to the passwordless login setup:

ssh-keygen -t rsa

This generates two files, id_rsa and id_rsa.pub, in the ~/.ssh/ directory.

To log in from hadoop001 to hadoop002 without a password, run the following on hadoop001:

ssh-copy-id -i ~/.ssh/id_rsa.pub [username]@hadoop002

To allow passwordless login between any two machines, run the command above on hadoop001 two more times (changing the host name after the @ to hadoop001 and hadoop003 respectively), and finally copy the resulting authorized_keys file to every node. A sketch of the whole procedure follows.
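
A minimal sketch of the full key distribution, assuming the same account exists on all three machines (the user name hadoop below is a placeholder):

# 1. On EVERY node, generate a key pair (accept the defaults):
ssh-keygen -t rsa

# 2. On EVERY node, append its public key to hadoop001's authorized_keys:
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop001

# 3. On hadoop001, push the aggregated authorized_keys to the other nodes:
for host in hadoop002 hadoop003; do
    scp ~/.ssh/authorized_keys hadoop@$host:~/.ssh/authorized_keys
done

# 4. Verify from any node; this should not prompt for a password:
ssh hadoop@hadoop002 hostname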

 

Hadoop Configuration

core-site.xml

<configuration>

<!-- The default file system; appcluster is the HDFS nameservice ID defined in hdfs-site.xml -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://appcluster</value>
</property>

<!-- Base directory for Hadoop's temporary files -->
<property>
<name>hadoop.tmp.dir</name>
<value>/data/hadoop/storage/tmp</value>
</property>

<!-- ZooKeeper ensemble used by the ZKFailoverController -->
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
</property>

<property>
<name>ha.zookeeper.session-timeout.ms</name>
<value>2000</value>
</property>
</configuration>
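
Before going further, it is worth confirming that the ZooKeeper ensemble named in ha.zookeeper.quorum is actually up (zkServer.sh ships with Zookeeper):

zkServer.sh status   # run on each of hadoop001-003; expect one "leader" and two "follower"s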

 

hdfs-site.xml

<configuration>

<!-- Local storage for NameNode metadata and DataNode blocks -->
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///data/hadoop/storage/hdfs/name</value>
</property>

<property>
<name>dfs.datanode.data.dir</name>
<value>file:///data/hadoop/storage/hdfs/data</value>
</property>

<!-- Only a few DataNodes, so keep replication at 2 -->
<property>
<name>dfs.replication</name>
<value>2</value>
</property>

<!-- Logical name of the HA nameservice, referenced by fs.defaultFS -->
<property>
<name>dfs.nameservices</name>
<value>appcluster</value>
</property>

<!-- The two NameNodes behind the nameservice -->
<property>
<name>dfs.ha.namenodes.appcluster</name>
<value>nn1,nn2</value>
</property>

<property>
<name>dfs.namenode.rpc-address.appcluster.nn1</name>
<value>hadoop001:8020</value>
</property>

<property>
<name>dfs.namenode.rpc-address.appcluster.nn2</name>
<value>hadoop002:8020</value>
</property>

<property>
<name>dfs.namenode.http-address.appcluster.nn1</name>
<value>hadoop001:50070</value>
</property>

<property>
<name>dfs.namenode.http-address.appcluster.nn2</name>
<value>hadoop002:50070</value>
</property>

<!-- JournalNode quorum through which the NameNodes share edit logs -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop001:8485;hadoop002:8485;hadoop003:8485/appcluster</value>
</property>

<!-- Without this property the ZKFC refuses to start; see References -->
<property>
<name>dfs.ha.automatic-failover.enabled.appcluster</name>
<value>true</value>
</property>

<property>
<name>dfs.client.failover.proxy.provider.appcluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

<!-- Fence a failed NameNode over SSH, using the key set up earlier -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>

<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/[username]/.ssh/id_rsa</value>
</property>

<property>
<name>dfs.journalnode.edits.dir</name>
<value>/data/hadoop/tmp/journal</value>
</property>
</configuration>
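
With hdfs-site.xml in place, the HA pieces have to be initialized in a specific order on first startup. A sketch of that sequence (standard Hadoop 2.6 commands; not covered by the references below, so treat it as a sketch):

# 1. Start a JournalNode on hadoop001, hadoop002 and hadoop003:
hadoop-daemon.sh start journalnode

# 2. On hadoop001 (nn1): format HDFS and start the first NameNode:
hdfs namenode -format
hadoop-daemon.sh start namenode

# 3. On hadoop002 (nn2): copy nn1's metadata, then start the standby NameNode:
hdfs namenode -bootstrapStandby
hadoop-daemon.sh start namenode

# 4. On hadoop001: initialize the HA state in ZooKeeper (once per cluster):
hdfs zkfc -formatZK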

 

mapred-site.xml

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

<property>
<name>mapreduce.jobhistory.address</name>
<value>0.0.0.0:10020</value>
</property>

<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>0.0.0.0:19888</value>
</property>
</configuration>
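
The two jobhistory addresses above only matter if the JobHistory Server is actually running; start-yarn.sh does not start it. The standard Hadoop 2.x script:

mr-jobhistory-daemon.sh start historyserver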

 

yarn-site.xml

<?xml version="1.0"?>
<configuration>

<property>
<name>yarn.resourcemanager.connect.retry-interval.ms</name>
<value>2000</value>
</property>

<!-- Enable ResourceManager HA -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>

<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>

<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
</property>

<property>
<name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
<value>true</value>
</property>

<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>hadoop001</value>
</property>

<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>hadoop002</value>
</property>

<!-- Per-host setting: rm1 on hadoop001, rm2 on hadoop002 -->
<property>
<name>yarn.resourcemanager.ha.id</name>
<value>rm1</value>
<description>If we want to launch more than one RM in a single node, we need this configuration</description>
</property>

<!-- Recover running applications from ZooKeeper after an RM restart -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>

<property>
<name>yarn.resourcemanager.zk-state-store.address</name>
<value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
</property>

<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>

<property>
<name>yarn.resourcemanager.zk-address</name>
<value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
</property>

<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>appcluster-yarn</value>
</property>

<property>
<name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
<value>5000</value>
</property>

<!-- rm1 service addresses -->
<property>
<name>yarn.resourcemanager.address.rm1</name>
<value>hadoop001:8032</value>
</property>

<property>
<name>yarn.resourcemanager.scheduler.address.rm1</name>
<value>hadoop001:8030</value>
</property>

<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>hadoop001:8088</value>
</property>

<property>
<name>yarn.resourcemanager.resource-tracker.address.rm1</name>
<value>hadoop001:8031</value>
</property>

<property>
<name>yarn.resourcemanager.admin.address.rm1</name>
<value>hadoop001:8033</value>
</property>

<property>
<name>yarn.resourcemanager.ha.admin.address.rm1</name>
<value>hadoop001:23142</value>
</property>

<!-- rm2 service addresses -->
<property>
<name>yarn.resourcemanager.address.rm2</name>
<value>hadoop002:8032</value>
</property>

<property>
<name>yarn.resourcemanager.scheduler.address.rm2</name>
<value>hadoop002:8030</value>
</property>

<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>hadoop002:8088</value>
</property>

<property>
<name>yarn.resourcemanager.resource-tracker.address.rm2</name>
<value>hadoop002:8031</value>
</property>

<property>
<name>yarn.resourcemanager.admin.address.rm2</name>
<value>hadoop002:8033</value>
</property>

<property>
<name>yarn.resourcemanager.ha.admin.address.rm2</name>
<value>hadoop002:23142</value>
</property>

<!-- NodeManager settings -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>

<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/data/hadoop/yarn/local</value>
</property>

<property>
<name>yarn.nodemanager.log-dirs</name>
<value>/data/hadoop/yarn/log</value>
</property>

<property>
<name>mapreduce.shuffle.port</name>
<value>23080</value>
</property>

<property>
<name>yarn.client.failover-proxy-provider</name>
<value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
</property>

<property>
<name>yarn.resourcemanager.ha.automatic-failover.zk-base-path</name>
<value>/yarn-leader-election</value>
<description>Optional setting. The default value is /yarn-leader-election</description>
</property>
</configuration>
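
Remember that yarn.resourcemanager.ha.id above is per host: the file shown is for hadoop001 (rm1), and the copy on hadoop002 must set it to rm2. Once both ResourceManagers are up, their roles can be checked with the standard yarn rmadmin command:

yarn rmadmin -getServiceState rm1   # prints "active" or "standby"
yarn rmadmin -getServiceState rm2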

 

 

hadoop-env.sh & mapred-env.sh & yarn-env.sh

export JAVA_HOME=/usr/java/jdk1.7.0_60 
export CLASSPATH=$JAVA_HOME/lib:$JAVA_HOME/jre/lib 
  
export HADOOP_HOME=/data/hadoop-2.6.0
export HADOOP_PID_DIR=/data/hadoop/pids 
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native 
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native" 
  
export HADOOP_PREFIX=$HADOOP_HOME 
  
export HADOOP_MAPRED_HOME=$HADOOP_HOME 
export HADOOP_COMMON_HOME=$HADOOP_HOME 
export HADOOP_HDFS_HOME=$HADOOP_HOME 
export YARN_HOME=$HADOOP_HOME 
  
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop 
export HDFS_CONF_DIR=$HADOOP_HOME/etc/hadoop 
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop 
  
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native 
  
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
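
Since HADOOP_OPTS and JAVA_LIBRARY_PATH both point at lib/native, it is worth confirming that the native libraries actually load (hadoop checknative ships with Hadoop 2.x):

hadoop checknative -a   # each library should report "true" with a path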

  

References

[1] hdfs-site.xml: http://www.21ops.com/front-tech/10744.html

[2] yarn-site.xml: http://www.aboutyun.com/thread-10572-1-1.html (the comments there are also worth reading)

After configuring from these two articles alone, startup failed with:

15/07/17 13:58:55 FATAL ha.ZKFailoverController: Automatic failover is not enabled for NameNode at hadoop001/**.**.**.**:8020. Please ensure that automatic failover is enabled in the configuration before running the ZK failover controller.

I then consulted

[3] http://www.cnblogs.com/meiyuanbao/p/3545929.html (which does not cover YARN HA)

and found that the following property has to be added to hdfs-site.xml:

<property>
<name>dfs.ha.automatic-failover.enabled.appcluster</name>
<value>true</value>
</property>
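
After adding the property, the corrected file has to reach every node before the failover controllers will start. A sketch of applying the fix (same paths and environment variables as above):

# Copy the corrected hdfs-site.xml to the other nodes:
scp $HADOOP_CONF_DIR/hdfs-site.xml hadoop002:$HADOOP_CONF_DIR/
scp $HADOOP_CONF_DIR/hdfs-site.xml hadoop003:$HADOOP_CONF_DIR/

# Initialize the HA state in ZooKeeper, then start a ZKFC on each NameNode host:
hdfs zkfc -formatZK
hadoop-daemon.sh start zkfc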

 

Reposted from: https://www.cnblogs.com/captainlucky/p/4654923.html
