HBase .META. Region启动不成功

启动region server的时候报如下错误:

 

2013-09-09 11:23:05,863 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2013-09-09 11:23:08,874 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2013-09-09 11:23:11,898 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2013-09-09 11:24:15,344 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=2.05 MB, free=247.44 MB, max=249.48 MB, blocks=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0, cachingHits=0, cachingHitsRatio=0, evictions=0, evicted=0, evictedPerRun=NaN
2013-09-09 11:24:19,977 ERROR org.apache.hadoop.hbase.regionserver.wal.HLog: Can't open after 300 attempts and 300518ms  for hdfs://opentsdb:8020/hbase/.logs/opentsdb,60020,1378358082016-splitting/opentsdb%2C60020%2C1378358082016.1378397697610
2013-09-09 11:24:19,978 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Processed 0 edits across 0 regions threw away edits for 0 regions; log file=hdfs://opentsdb:8020/hbase/.logs/opentsdb,60020,1378358082016-splitting/opentsdb%2C60020%2C1378358082016.1378397697610 is corrupted = false progress failed = false
2013-09-09 11:24:19,978 WARN org.apache.hadoop.hbase.regionserver.SplitLogWorker: log splitting of hdfs://opentsdb:8020/hbase/.logs/opentsdb,60020,1378358082016-splitting/opentsdb%2C60020%2C1378358082016.1378397697610 failed, returning error
java.io.IOException: Cannot obtain block length for LocatedBlock{BP-17274449-192.168.0.75-1376541308222:blk_4420133534962983319_1645; getBlockSize()=0; corrupt=false; offset=0; locs=[192.168.0.75:50010]}
    at org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:319)
    at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:263)
    at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:205)
    at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:198)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1117)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:249)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:82)
    at org.apache.hadoop.io.SequenceFile$Reader.openFile(SequenceFile.java:1787)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.openFile(SequenceFileLogReader.java:62)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1707)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1728)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:55)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:177)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:713)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getReader(HLogSplitter.java:825)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getReader(HLogSplitter.java:738)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:382)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:350)
    at org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:115)
    at org.apache.hadoop.hbase.regionserver.SplitLogWorker.grabTask(SplitLogWorker.java:283)
    at org.apache.hadoop.hbase.regionserver.SplitLogWorker.taskLoop(SplitLogWorker.java:214)
    at org.apache.hadoop.hbase.regionserver.SplitLogWorker.run(SplitLogWorker.java:182)
    at java.lang.Thread.run(Thread.java:662)

 

从错误上可以看出,ROOT Region没在线, Region Server的拆分HLog的时候,由于获取HLog的长度时,发生错误,导致失败.查询Region Server状态的时候,发现确实没有.META. Region,猜测该HLog文件损坏了.

hbase hlog /hbase/.logs/opentsdb,60020,1378358082016-splitting/opentsdb%2C60020%2C1378358082016.1378397697610

查看HLog,和上面报的错误时一样的,删除该log文件,hadoop fs -rmr /hbase/.logs/opentsdb,60020,1378358082016-splitting/opentsdb%2C60020%2C1378358082016.1378397697610

重新启动Region Server就可以了,问题的原因是,用户在向Hbase插入数据时,强制停掉了RS,使HLog文件出错.

在网上查看Region is not online: -ROOT-,,0相关的错误,也没有得到正确的答案,后来看了一下源码,报这个错误的地方是在:

protected HRegion getRegion(final byte[] regionName)
      throws NotServingRegionException {
    HRegion region = null;
    region = getOnlineRegion(regionName);
    if (region == null) {
      throw new NotServingRegionException("Region is not online: " +
        Bytes.toStringBinary(regionName));
    }
    return region;
  }

也就是说,regionName不再Map中,就会报这个错误,具体问题还得具体分析

你可能感兴趣的:(hbase)