Hadoop 2.6.5 HDFS HA and YARN HA Configuration


     Prerequisites: Java, passwordless SSH, and /etc/hosts are already configured on every node.

     master:  namenode  resourcemanager  zookeeper  zkfc

     slave1:  datanode  journalnode  nodemanager  zookeeper

     slave2:  datanode  journalnode  nodemanager  zookeeper

     node1:   namenode  resourcemanager  journalnode  zkfc
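
Since the prerequisites include hostname resolution, here is a minimal /etc/hosts sketch for the four nodes. Only master's address (192.168.0.110) is confirmed by the transcripts below; the other three IPs are placeholders to replace with your own:

192.168.0.110  master
192.168.0.111  node1     # placeholder address
192.168.0.112  slave1    # placeholder address
192.168.0.113  slave2    # placeholder address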


1. hdfs-site.xml

<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>myha</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.myha</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.myha.nn1</name>
    <value>master:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.myha.nn2</name>
    <value>node1:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.myha.nn1</name>
    <value>master:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.myha.nn2</name>
    <value>node1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://node1:8485;slave1:8485;slave2:8485/myha</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.myha</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>shell(/bin/true)</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/usr/hadoop/dfs/journalnode</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
</configuration>
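
Note that dfs.ha.fencing.methods is set to shell(/bin/true), which makes fencing always report success; the dfs.ha.fencing.ssh.private-key-files entry only takes effect if the method is changed to sshfence. After distributing hdfs-site.xml to all nodes, a quick way to confirm a node picked it up is hdfs getconf, querying the keys defined above:

hdfs getconf -confKey dfs.nameservices        # should print: myha
hdfs getconf -confKey dfs.ha.namenodes.myha   # should print: nn1,nn2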


2. core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://myha</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>master:2181,slave1:2181,slave2:2181</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/hadoop/tmp</value>
  </property>
</configuration>
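
The ha.zookeeper.quorum ensemble must be up before hdfs zkfc -formatZK is run later. A quick liveness check, as a sketch (assumes nc is installed and ZooKeeper's four-letter-word commands are enabled, which is the default in ZooKeeper 3.4):

for h in master slave1 slave2; do echo -n "$h: "; echo ruok | nc $h 2181; echo; done   # each host should answer imok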

3. yarn-site.xml

<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>myyarn</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>node1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>master:2181,slave1:2181,slave2:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.id</name>
    <value>rm1</value>
  </property>
</configuration>
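
One caveat: yarn.resourcemanager.ha.id names the local ResourceManager, so the value above (rm1) is only correct in master's copy of yarn-site.xml; in node1's copy it must be changed to rm2, otherwise node1's ResourceManager will not identify itself correctly. On node1:

<property>
  <name>yarn.resourcemanager.ha.id</name>
  <value>rm2</value>
</property>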
 

4. mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>


5. Starting HDFS

     1. Start ZooKeeper on the quorum nodes first.

     2. Start the JournalNodes (node1, slave1, slave2):   hadoop-daemon.sh start journalnode

     3. Format the NameNode on master:                    hdfs namenode -format

     4. Initialize the ZKFC state in ZooKeeper:           hdfs zkfc -formatZK

     5. Start the NameNode on master:                     hadoop-daemon.sh start namenode

     6. Start the DataNodes from master:                  hadoop-daemons.sh start datanode

     7. Start zkfc on master and node1:                   hadoop-daemon.sh start zkfc

     8. Bootstrap the standby NameNode on node1:          hdfs namenode -bootstrapStandby

     9. Start the NameNode on node1:                      hadoop-daemon.sh start namenode

     

      After this first-time setup, the whole sequence above can be replaced by start-dfs.sh; a consolidated sketch of the bootstrap follows below.
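
For reference, here is the first-time bootstrap gathered into one script, run from master. It assumes passwordless ssh to every node and that zkServer.sh and the Hadoop scripts are on each host's PATH in non-interactive shells; adjust to your installation:

# 1. ZooKeeper on the quorum hosts
for h in master slave1 slave2; do ssh $h zkServer.sh start; done
# 2. JournalNodes (the hosts named in dfs.namenode.shared.edits.dir)
for h in node1 slave1 slave2; do ssh $h hadoop-daemon.sh start journalnode; done
# 3-4. format the namespace, then the HA znode in ZooKeeper
hdfs namenode -format
hdfs zkfc -formatZK
# 5-6. NameNode on master, DataNodes on the slaves
hadoop-daemon.sh start namenode
hadoop-daemons.sh start datanode
# 7. zkfc on both NameNode hosts
hadoop-daemon.sh start zkfc
ssh node1 hadoop-daemon.sh start zkfc
# 8-9. copy the namespace to the standby, then start it
ssh node1 'hdfs namenode -bootstrapStandby && hadoop-daemon.sh start namenode'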

       Verification:

[hadoop@master tmp]$ hdfs haadmin -getServiceState nn1
active
[hadoop@master tmp]$ hdfs haadmin -getServiceState nn2
standby
       Now kill the NameNode on master and check whether the standby NameNode becomes active.

[hadoop@master tmp]$ jps
4833 Jps
4199 ResourceManager
3304 NameNode
2377 QuorumPeerMain
3581 DFSZKFailoverController
[hadoop@master tmp]$ kill -9 3304
[hadoop@master tmp]$ hdfs haadmin -getServiceState nn2
active
[hadoop@master tmp]$ hdfs haadmin -getServiceState nn1
17/08/01 20:30:57 INFO ipc.Client: Retrying connect to server: master/192.168.0.110:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
Operation failed: Call From master/192.168.0.110 to master:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
[hadoop@master tmp]$ 
     The output above shows that automatic failover works. The active NameNode can also be switched manually with the hdfs haadmin command:
[hadoop@master tmp]$ hadoop-daemon.sh start namenode
starting namenode, logging to /usr/hadoop/logs/hadoop-hadoop-namenode-master.out
[hadoop@master tmp]$ jps
4199 ResourceManager
2377 QuorumPeerMain
5002 Jps
3581 DFSZKFailoverController
4926 NameNode
[hadoop@master tmp]$ hdfs haadmin -getServiceState nn1
standby
[hadoop@master tmp]$ hdfs haadmin -getServiceState nn2
active
[hadoop@master tmp]$ hdfs haadmin -failover --forcefence --forceactive nn2 nn1
forcefence and forceactive flags not supported with auto-failover enabled.
[hadoop@master tmp]$ hdfs haadmin -getServiceState nn2
active
[hadoop@master tmp]$ hdfs haadmin -failover  nn2 nn1
Failover to NameNode at master/192.168.0.110:8020 successful
[hadoop@master tmp]$ hdfs haadmin -getServiceState nn2
standby
[hadoop@master tmp]$ hdfs haadmin -getServiceState nn1
active
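
To watch both NameNodes at once, a small convenience loop over the same haadmin subcommand used above:

for nn in nn1 nn2; do printf '%s: ' $nn; hdfs haadmin -getServiceState $nn; done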

6. Starting YARN

      1. Start the ResourceManager on master:   yarn-daemon.sh start resourcemanager

      2. Start the ResourceManager on node1:    yarn-daemon.sh start resourcemanager

      3. Start the NodeManagers from master:    yarn-daemons.sh start nodemanager
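
Both ResourceManager web UIs listen on port 8088 unless yarn.resourcemanager.webapp.address is overridden (it is not set above, so the default applies); the standby typically redirects to the active one, which gives a quick sanity check:

curl -sI http://master:8088/cluster | head -1   # active RM answers 200
curl -sI http://node1:8088/cluster | head -1    # standby answers with a redirect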


      Verification:

[hadoop@master tmp]$ yarn rmadmin -getServiceState rm1
active
[hadoop@master tmp]$ yarn rmadmin -getServiceState rm2
standby
[hadoop@master tmp]$ jps
5728 Jps
5399 ResourceManager
2377 QuorumPeerMain
3581 DFSZKFailoverController
4926 NameNode
[hadoop@master tmp]$ kill -9 5399
[hadoop@master tmp]$ yarn rmadmin -getServiceState rm2
standby
[hadoop@master tmp]$ yarn rmadmin -getServiceState rm1
17/08/01 20:40:01 INFO ipc.Client: Retrying connect to server: master/192.168.0.110:8033. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
Operation failed: Call From master/192.168.0.110 to master:8033 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
[hadoop@master tmp]$ yarn rmadmin -getServiceState rm2
active
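
As a final end-to-end check, submit a small MapReduce job; with mapreduce.framework.name set to yarn and the cluster-id above, the client locates whichever ResourceManager is currently active. The jar path below assumes Hadoop lives under /usr/hadoop (the log paths in the transcripts suggest this; adjust to your layout):

yarn jar /usr/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar pi 2 10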

     
