HDFS block corruption

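At this point it was fairly clear that the problem lay with a block: when an HDFS block is corrupt or offline, HDFS turns on its self-protection mechanism, that is, it enters safe mode.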

I tried restarting the master so that HDFS could repair the bad block on its own, but that failed, and it still reported one abnormal block:

 

2019-01-02 16:21:40,081 INFO  ipc.Server (Server.java:logException(2394)) - IPC Server handler 583 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.renewLease from 72.118.0.12:36554 Call#1477 Retry#0: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot renew lease for DFSClient_NONMAPREDUCE_-2130476810_1. Name node is in safe mode.
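The state shown in the log above (NameNode in safe mode, at least one bad block) can also be checked from the command line. The following is only a minimal sketch, assuming it runs as the hdfs user on a node with a configured HDFS client. The function names `safemode_state` and `list_corrupt_files` are made up for illustration; the `hdfs dfsadmin -safemode get` and `hdfs fsck / -list-corruptfileblocks` commands are standard.

```shell
#!/usr/bin/env bash
# Sketch: inspect safe mode and corrupt blocks before changing anything.
# safemode_state and list_corrupt_files are hypothetical helper names.

# Prints ON, OFF, or UNKNOWN based on the output of `hdfs dfsadmin -safemode get`.
safemode_state() {
    local out
    out=$(hdfs dfsadmin -safemode get 2>&1)
    case "$out" in
        *"Safe mode is ON"*)  echo "ON" ;;
        *"Safe mode is OFF"*) echo "OFF" ;;
        *)                    echo "UNKNOWN" ;;
    esac
}

# Prints the files that currently have corrupt blocks (read-only check).
list_corrupt_files() {
    hdfs fsck / -list-corruptfileblocks
}
```

After sourcing the file, calling `safemode_state` followed by `list_corrupt_files` shows whether safe mode is still on and which files are affected, without modifying the filesystem.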

 

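Next I tried turning off safe mode and then starting the HDFS master. After startup, the Ambari collector still reported that it could not fetch the data it needed: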

 

2019-01-02 16:01:05,596 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=10, retries=35, started=68259 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.NotServingRegionException: Region SYSTEM.CATALOG,,1545030793895.709d49c78a9511adb4082e4177ca23e0. is not online on 1.snamenode1,61320,1546415985105

        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3077)

        at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1015)

        at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1955)

        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32389)

        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2150)

        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)

        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)

        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)

 row '' on table 'SYSTEM.CATALOG' at region=SYSTEM.CATALOG,,1545030793895.709d49c78a9511adb4082e4177ca23e0., hostname=1.snamenode1,61320,1545716568014, seqNum=43

 

 

In the end the only option was to repair it manually:

su - hdfs

hdfs dfsadmin -safemode leave

hdfs fsck / -delete   # -delete removes the files that contain corrupt blocks; the damaged data was not important here, so it was simply deleted
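Because `hdfs fsck / -delete` is destructive, it can be worth recording which files are affected first. Here is a hedged sketch of the same manual repair wrapped in a function. `repair_by_delete` and the default report path are hypothetical and are not part of the original procedure.

```shell
#!/usr/bin/env bash
# Sketch: save the list of corrupt files, leave safe mode, then delete them.
# repair_by_delete and /tmp/corrupt-files.txt are hypothetical names.
repair_by_delete() {
    local report="${1:-/tmp/corrupt-files.txt}"

    # Record the affected files first. fsck may exit non-zero when it finds
    # corruption, so its exit status is not treated as fatal here.
    hdfs fsck / -list-corruptfileblocks > "$report" || true

    # Deleting requires a writable namespace, so leave safe mode first.
    hdfs dfsadmin -safemode leave || return 1

    # Remove the files that contain corrupt blocks.
    hdfs fsck / -delete
}
```

Run as the hdfs user, `repair_by_delete /tmp/corrupt-files.txt` leaves a record of what was removed, in case the data has to be restored from another source later.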

 

After that I restarted the master and the collector once more, and the services returned to normal.
