Common Errors and Fixes When Configuring Hadoop HA (High-Availability Cluster)

 

   Nothing is settled yet; you and I are both dark horses.

 

 

While learning Hadoop, I ran into the following errors when configuring an HA cluster.

 

1. An error occurs when starting the NameNode on the second node.

 

InconsistentFSStateException: Directory /opt/modules/hadoopha/hadoop-2.5.2/data/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.

Cause: the version information stored under the location defined in core-site.xml does not match. For example, suppose the location is configured like this:


  
    
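  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ns1</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/modules/hadoopha/hadoop-2.5.2/data/tmp</value>
  </property>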
  

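Go into the name directory under tmp and delete the VERSION file. Then, from the hadoop-2.5.2/bin directory, reformat the NameNode:

hadoop namenode -format

Then restart the NameNode, and the problem is solved.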
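
Before deleting anything, it can help to look at what the VERSION file actually records. The following is a minimal sketch, not part of Hadoop itself: version_field is a hypothetical helper name, and the example path assumes the file sits in a current/ subdirectory of the name directory under hadoop.tmp.dir, which is the usual layout when dfs.namenode.name.dir is left at its default.

```shell
#!/bin/sh
# version_field FILE KEY
# Print the value of KEY from a Hadoop VERSION file, which is a
# Java-properties style file with lines such as "clusterID=CID-...".
version_field() {
    # Split each line on '=' and print the value of the first matching key.
    awk -F= -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# Example call (the current/ subdirectory is an assumption):
# version_field /opt/modules/hadoopha/hadoop-2.5.2/data/tmp/dfs/name/current/VERSION clusterID
```

Running it for clusterID and namespaceID on each NameNode host gives a quick way to see whether the two nodes disagree.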

 

2. The NameNode fails to start with the following error

.QuorumException: Unable to check if JNs are ready for formatting. 1 exceptions thrown:

Check the following setting in hdfs-site.xml and make sure the JournalNode hosts and ports listed in it are correct:

  
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3/ns1</value>
  

Then shut down the Hadoop cluster with

 sbin/stop-all.sh

and restart the NameNode.
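
When checking this value by eye it is easy to miss a host or a port. The sketch below is a plain POSIX shell function with the hypothetical name list_journalnodes. It splits a qjournal:// URI into its JournalNode entries and marks any entry written without an explicit port so that you can review it. It only parses the string and does not contact the cluster.

```shell
#!/bin/sh
# list_journalnodes URI
# Print one JournalNode entry per line from a qjournal:// URI and
# mark entries that carry no explicit port.
list_journalnodes() {
    hosts=${1#qjournal://}   # drop the scheme
    hosts=${hosts%%/*}       # drop the trailing /journalId
    printf '%s\n' "$hosts" | tr ';' '\n' | while read -r entry; do
        case $entry in
            *:*) printf '%s\n' "$entry" ;;
            *)   printf '%s (no explicit port)\n' "$entry" ;;
        esac
    done
}
```

Called with the value shown above, it prints hadoop1:8485, hadoop2:8485 and "hadoop3 (no explicit port)" on separate lines. On a configured node the argument could also come from bin/hdfs getconf -confKey dfs.namenode.shared.edits.dir.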

 

3. An error occurs when switching the NameNode state to Active


Operation failed: Failed on local exception: java.io.EOFException; Host Details : local host is:destination host is

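At first I deleted the hadoop folder, extracted it again, and reconfigured the environment variables, but the problem was still there.

The error comes from formatting the node multiple times; I am not sure of the exact cause.

Solution:

Go to the NameNode directory you configured, enter the data/dfs/... directory under it, and delete the name folder.

Then reformat: bin/hdfs namenode -format

After formatting, restart the virtual machine with reboot, then start the NameNode again. There is no need to format a second time; just

run bin/hdfs haadmin -transitionToActive nn1 to change the state.

Check port 50070: success!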
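
The steps after the reboot can be sketched as a small script. This is only an illustration under assumptions: run and activate_namenode are hypothetical helper names, the relative bin/ and sbin/ paths assume the script is run from the Hadoop install directory, and starting the NameNode with sbin/hadoop-daemon.sh is assumed by analogy with the journalnode command used in this post. Setting DRY_RUN=1 only prints the commands, which is a safe way to review them first.

```shell
#!/bin/sh
# run CMD...
# Print the command, and execute it unless DRY_RUN is set to 1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# activate_namenode NN_ID
# Start the local NameNode, switch NN_ID to active, then print its state.
# NN_ID is the NameNode ID from the HA configuration (nn1 in this post).
activate_namenode() {
    run sbin/hadoop-daemon.sh start namenode &&
    run bin/hdfs haadmin -transitionToActive "$1" &&
    run bin/hdfs haadmin -getServiceState "$1"
}

# Example: DRY_RUN=1; activate_namenode nn1
```

The last command asks the cluster for the state of the NameNode, which complements checking the web page on port 50070.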

 

4. Formatting the NameNode fails with a connection refused (拒绝连接) error, as shown below

org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 1 exceptions thrown:
192.168.129.128:8485: Call From bigdata-senior01/192.168.129.128 to bigdata-senior01:8485 failed on connection exception: java.net.ConnectException: 拒绝连接; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
	at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
	at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
	at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.hasSomeData(QuorumJournalManager.java:232)
	at org.apache.hadoop.hdfs.server.common.Storage.confirmFormat(Storage.java:875)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.confirmFormat(FSImage.java:171)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:922)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1354)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1473)
19/04/04 19:28:36 INFO ipc.Client: Retrying connect to server: bigdata-senior02/192.168.129.130:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:28:36 INFO ipc.Client: Retrying connect to server: bigdata-senior03/192.168.129.133:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:28:36 FATAL namenode.NameNode: Exception in namenode join
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 1 exceptions thrown:
192.168.129.128:8485: Call From bigdata-senior01/192.168.129.128 to bigdata-senior01:8485 failed on connection exception: java.n

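   I first deleted the data files in the tmp directory under the Hadoop directory, then deleted the directory configured as hadoop.tmp.dir in core-site.xml. (I also cleared the logs; in my opinion that part is optional.) I then ran bin/hdfs namenode -format again and still got the error. After reading up on it, I found that the JournalNode must be started before formatting: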

 sbin/hadoop-daemon.sh start journalnode

Then run the format command again; this time the format succeeds.
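
To see which daemons refused the connection, the endpoints can be pulled out of the saved output. The function below is a rough sketch with the hypothetical name refused_endpoints; it assumes the output of the format command was saved to a file and that the lines have the "Call From ... to host:port failed on connection exception" shape seen in the log above.

```shell
#!/bin/sh
# refused_endpoints LOGFILE
# Print each distinct host:port that a "failed on connection exception"
# line names as the call target.
refused_endpoints() {
    awk '/failed on connection exception/ {
             for (i = 1; i < NF; i++)
                 if ($i == "to" && $(i + 2) == "failed")
                     print $(i + 1)
         }' "$1" | sort -u
}
```

For the log in this section the result is bigdata-senior01:8485, meaning the JournalNode on that host was not listening. Each host printed with port 8485 is one where sbin/hadoop-daemon.sh start journalnode still needs to be run.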



5. An error occurs when formatting the second NameNode with -bootstrapStandby

  

:57:35 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
19/04/04 19:57:35 INFO namenode.NameNode: createNameNode [-bootstrapStandby]
19/04/04 19:57:37 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:38 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:39 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:40 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:41 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:42 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:43 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:44 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:45 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:46 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:46 FATAL ha.BootstrapStandby: Unable to fetch namespace information from active NN at bigdata-senior01/192.168.129.128:8020: Call From bigdata-senior02/192.168.129.130 to bigdata-senior01:8020 failed on connection exception: java.net.ConnectException: 拒绝连接; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
19/04/04 19:57:46 INFO util.ExitUtil: Exiting with status 2
19/04/04 19:57:46 INFO namenode.NameNode: SHUTDOWN_MSG: 

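  First check that the firewall is turned off (it is usually configured to stay disabled at boot). Another possible cause is that the NameNode on the primary machine has not been started.

 So start the NameNode on the first node first, then run the format on the second node: bin/hdfs namenode -bootstrapStandby

When you see "Storage directory /home/xxx/xxx/name has been successfully formatted", the format succeeded!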
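
If the output scrolls by too quickly to spot that line, it can be saved and checked afterwards. This is a small sketch only: bootstrap_ok is a hypothetical helper name and bootstrap.log is an arbitrary file name chosen for the example.

```shell
#!/bin/sh
# bootstrap_ok LOGFILE
# Succeed (exit status 0) if the saved output contains the line that
# reports a successfully formatted storage directory.
bootstrap_ok() {
    grep -q 'has been successfully formatted' "$1"
}

# Example, run on the second NameNode from the Hadoop install directory:
# bin/hdfs namenode -bootstrapStandby 2>&1 | tee bootstrap.log
# bootstrap_ok bootstrap.log && echo "standby bootstrapped"
```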
