HBase Pitfalls: RegionServers Exiting Abnormally

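I have run into many problems while using HBase. This one was the hardest to make sense of and had the biggest impact, so I am writing it down to save others from falling into the same pit.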

1. Symptoms

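The RegionServer log reports an NPE (NullPointerException), and then the process exits. This happens at unpredictable times every day, and more frequently at night when the job load is heavy.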

2019-11-18 18:24:55,049 ERROR [regionserver/hdpv-014/125.94.213.41:16020-shortCompactions-1574050678424] coprocessor.CoprocessorHost: The coprocessor org.apache.hadoop.hbase.regionserver.IndexHalfStoreFileReaderGenerator threw java.lang.NullPointerException
java.lang.NullPointerException
at org.apache.hadoop.hbase.KeyValue.createKeyValueFromKey(KeyValue.java:2414)
at org.apache.phoenix.util.RepairUtil.isLocalIndexStoreFilesConsistent(RepairUtil.java:32)
at org.apache.hadoop.hbase.regionserver.IndexHalfStoreFileReaderGenerator.preCompactScannerOpen(IndexHalfStoreFileReaderGenerator.java:196)
at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$6.call(RegionCoprocessorHost.java:499)
at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1660)
at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1734)
at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperationWithResult(RegionCoprocessorHost.java:1699)
at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preCompactScannerOpen(RegionCoprocessorHost.java:494)
at org.apache.hadoop.hbase.regionserver.compactions.Compactor.preCreateCoprocScanner(Compactor.java:362)
at org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:299)
at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:69)
at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:133)
at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1256)
at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1972)
at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:536)
at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:573)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
……
2019-11-18 18:24:58,581 ERROR [main] regionserver.HRegionServerCommandLine: Region server exiting
java.lang.RuntimeException: HRegionServer Aborted
at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:68)
at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:87)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2801)
2019-11-18 18:24:58,649 INFO [pool-4-thread-1] regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@344344fa
2019-11-18 18:24:58,649 INFO [pool-4-thread-1] regionserver.ShutdownHook: Starting fs shutdown hook thread.
2019-11-18 18:24:58,650 INFO [pool-4-thread-1] regionserver.ShutdownHook: Shutdown hook finished.

The useful piece of information in the log is:

coprocessor.CoprocessorHost: The coprocessor org.apache.hadoop.hbase.regionserver.IndexHalfStoreFileReaderGenerator threw java.lang.NullPointerException
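To check whether other RegionServers are dying for the same reason, one option is to scan their logs for this signature. The following is only a minimal sketch: the two marker strings are copied from the log excerpt above, while the function name and the assumption that each log entry starts with a log4j-style timestamp are illustrative choices, not part of HBase.

```python
import re
from typing import Iterable, List, Tuple

# Marker strings copied from the log excerpt in this post.
NPE_MARKER = (
    "The coprocessor org.apache.hadoop.hbase.regionserver."
    "IndexHalfStoreFileReaderGenerator threw java.lang.NullPointerException"
)
ABORT_MARKER = "HRegionServer Aborted"

# Assumed log4j-style timestamp prefix, e.g. "2019-11-18 18:24:55,049 ".
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ")


def find_local_index_npe_aborts(lines: Iterable[str]) -> List[Tuple[str, bool]]:
    """Hypothetical helper: list each coprocessor NPE with its timestamp.

    The boolean is True when an "HRegionServer Aborted" line appears after
    that NPE and before the next one (only the most recent NPE is marked).
    """
    hits = []
    for line in lines:
        if NPE_MARKER in line:
            match = TIMESTAMP.match(line)
            hits.append([match.group(1) if match else "", False])
        elif ABORT_MARKER in line and hits:
            hits[-1][1] = True
    return [(timestamp, aborted) for timestamp, aborted in hits]
```

Passing an open log file (for example `find_local_index_npe_aborts(open(path, encoding="utf-8", errors="replace"))`, where `path` is wherever your RegionServer log lives) returns one tuple per occurrence, which makes it easy to see how often the crash happens and at what times.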

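Thinking back over recent operations, the only notable change was that I had dropped all of the indexes. Could the table have been corrupted? But the scheduled daily imports and exports were still running normally, and searching online did not turn up an explanation.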

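The problem persisted for a long time, and every day I had to restart the crashed RegionServers by hand. Fortunately there were quite a few RegionServers, so losing several did not affect the business, but the problem still had to be solved. Eventually I found the matching issue in the official bug tracker.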

https://issues.apache.org/jira/browse/PHOENIX-3759 Dropping a local index causes NPE

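When a local index is created, a column family named "L#0" is created on the table. If all local indexes are then dropped, the HFile for the "L#0" column family is left empty. During compaction, the IndexHalfStoreFileReaderGenerator coprocessor still inspects that column family (the preCompactScannerOpen and RepairUtil.isLocalIndexStoreFilesConsistent frames in the stack trace), and the empty HFile yields a null key, which is what throws the NPE in KeyValue.createKeyValueFromKey.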

2. Solution

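After dropping all local indexes, you need to delete the table's "L#0" column family manually. The commands are as follows (replace TABLE with the actual table name):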

hbase shell
alter 'TABLE','delete'=>'L#0'
