Fixing HRegionServer failing to start in HBase with org.apache.hadoop.hbase.ClockOutOfSyncException


Problem

  • After starting the HBase services, some of the HRegionServers either never come up, or come up and then die a short while later.

Troubleshooting

  • Method: the output printed at startup shows where each node's startup log is stored.
  • The failure was on min1, so I looked at min1's startup log.
[root@min1 ~]# cat /data/hbase/logs/hbase-root-regionserver-min1.log
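
The log can be long, so instead of reading all of it, one option is to search for the exception name directly. This is a minimal sketch: `find_clock_errors` is a hypothetical helper name, and the path in the usage comment is the one from this post and will differ on other clusters.

```shell
#!/bin/sh
# Print every line (with its line number) that mentions the clock-skew
# exception in the given log file. Helper name is illustrative only.
find_clock_errors() {
  grep -n 'ClockOutOfSyncException' "$1"
}

# Usage on the node from this post:
#   find_clock_errors /data/hbase/logs/hbase-root-regionserver-min1.log
```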

The error reported was:

Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.ClockOutOfSyncException): org.apache.hadoop.hbase.ClockOutOfSyncException: Server min1,16020,1553633572848 has been rejected; Reported time is too far out of sync with master.  Time difference of 34101ms > max allowed of 30000ms
        at org.apache.hadoop.hbase.master.ServerManager.checkClockSkew(ServerManager.java:366)
        at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:243)
        at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:480)
        at org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:11085)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)

        at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:387)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:95)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:410)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:406)
        at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:103)
        at org.apache.hadoop.hbase.ipc.Call.setException(Call.java:118)
        at org.apache.hadoop.hbase.ipc.NettyRpcDuplexHandler.readResponse(NettyRpcDuplexHandler.java:161)
        at org.apache.hadoop.hbase.ipc.NettyRpcDuplexHandler.channelRead(NettyRpcDuplexHandler.java:191)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
        at org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at org.apache.hbase.thirdparty.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
        at org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:801)
        at org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:404)
        at org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:304)
        at org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
        at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
        ... 1 more
2019-03-27 04:52:58,067 ERROR [regionserver/min1:16020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: []
2019-03-27 04:52:58,102 INFO  [regionserver/min1:16020] regionserver.HRegionServer: Dump of metrics as JSON on abort: {
  "beans" : [ {
    "name" : "java.lang:type=Memory",
    "modelerType" : "sun.management.MemoryImpl",
    "Verbose" : false,
    "ObjectPendingFinalizationCount" : 0,
    "NonHeapMemoryUsage" : {
      "committed" : 52895744,
      "init" : 2555904,
      "max" : -1,
      "used" : 51286120
    },
    "HeapMemoryUsage" : {
      "committed" : 46661632,
      "init" : 48234496,
      "max" : 735772672,
      "used" : 21758824
    },
    "ObjectName" : "java.lang:type=Memory"
  } ],
  "beans" : [ {
    "name" : "Hadoop:service=HBase,name=RegionServer,sub=IPC",
    "modelerType" : "RegionServer,sub=IPC",
    "tag.Context" : "regionserver",
    "tag.Hostname" : "min1"
  } ],
  "beans" : [ ],
  "beans" : [ ]
}

Cause:

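  • The message above is not really a timeout but a clock-skew rejection: the clock on the failing node is not synchronized with the master node. The resulting time difference (34101 ms here) exceeds the maximum allowed (30000 ms), so the master rejects the node and it drops out of the cluster.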
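
To make the comparison in the message concrete, the two numbers can be pulled out of the log line and subtracted. The sketch below is only illustrative: `parse_clock_skew` is a hypothetical helper, and the regular expression assumes the message wording shown in the log above. The 30000 ms limit appears to correspond to the master's `hbase.master.maxclockskew` setting, though that is worth confirming against the documentation for your HBase version.

```shell
#!/bin/sh
# Print "<difference_ms> <max_allowed_ms>" for each clock-skew rejection
# read on stdin. Assumes the message wording shown in the log above.
parse_clock_skew() {
  sed -nE 's/.*Time difference of ([0-9]+)ms > +max allowed of ([0-9]+)ms.*/\1 \2/p'
}

# Sample message copied from the log in this post.
line='Reported time is too far out of sync with master.  Time difference of 34101ms > max allowed of 30000ms'

result=$(printf '%s\n' "$line" | parse_clock_skew)
diff_ms=${result% *}   # text before the space
max_ms=${result#* }    # text after the space
echo "difference=${diff_ms}ms max=${max_ms}ms over_by=$((diff_ms - max_ms))ms"
```

For the line in this post, the difference is 34101 ms against a limit of 30000 ms, so min1 was over by 4101 ms, roughly four seconds.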

Solution

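  • Re-synchronize the clock on that node so the time difference shrinks back under the limit. Time synchronization is covered by plenty of tutorials, so look up the procedure for your system.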
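
As one possible way to do this, the sketch below measures the skew between two clocks and then steps the node's clock from an NTP server. It is written under several assumptions that are not from the original post: `ntpdate` and `hwclock` are installed, the commands run as root, `date` is GNU date (for `%3N`), and `pool.ntp.org` is just a placeholder server. The helper names `skew_ms` and `resync_clock` are hypothetical. On systems that run chrony or systemd-timesyncd, the equivalent tool for that service would be used instead.

```shell
#!/bin/sh
# Absolute difference between two epoch timestamps given in milliseconds.
skew_ms() {
  d=$(( $1 - $2 ))
  [ "$d" -lt 0 ] && d=$(( -d ))
  echo "$d"
}

# Step this node's clock from an NTP server, then save the corrected time
# to the hardware clock. Assumes ntpdate and hwclock exist and caller is root.
resync_clock() {
  ntp_server=${1:?usage: resync_clock <ntp-server>}
  ntpdate -u "$ntp_server" && hwclock --systohc
}

# Example session (hostname and NTP server are placeholders):
#   skew_ms "$(date +%s%3N)" "$(ssh min1 date +%s%3N)"   # run on the master
#   resync_clock pool.ntp.org                             # run on min1
```

The ssh round trip adds some latency to the measured value, so treat it as a rough estimate. Once the skew is back under the limit, the RegionServer on that node still has to be started again before it rejoins the cluster.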
