I managed to crash all of the HBase slave nodes

Test environment:
HBase cluster with 5 nodes.
Each server has 14 CPUs and 60 GB of RAM.


This post mainly records how all of the HBase slave nodes went down, along with a few of my own takeaways. Luckily, this time all of the HRegionServers died in the test environment and we caught it early.

The main error: java.net.SocketException: Too many open files
I suspect the volume of data being written was too large, and the continuous compactions that had to run afterwards eventually brought the RegionServers down.
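A quick way to check for this kind of exhaustion is to compare a RegionServer's open file descriptor count with its "Max open files" limit. Below is a minimal sketch of that check in Scala; it assumes a Linux /proc filesystem and that you pass in the HRegionServer pid (e.g. from jps), so treat it as illustrative rather than production monitoring.

import java.nio.file.{Files, Paths}
import scala.io.Source

// Minimal sketch: compare a process's open fd count against its soft "Max open files" limit.
// Run as the same user as the HRegionServer (or root) so /proc/<pid>/fd is readable.
object FdCheck {
  def main(args: Array[String]): Unit = {
    val pid = args(0) // HRegionServer pid, e.g. taken from `jps`

    // Count currently open file descriptors.
    val fdStream = Files.list(Paths.get(s"/proc/$pid/fd"))
    val openFds = try fdStream.count() finally fdStream.close()

    // Parse the soft limit from /proc/<pid>/limits ("Max open files  <soft>  <hard>  files").
    val softLimit = Source.fromFile(s"/proc/$pid/limits")
      .getLines()
      .find(_.startsWith("Max open files"))
      .map(_.split("\\s+")(3))
      .getOrElse("unknown")

    println(s"pid $pid: open fds = $openFds, soft limit = $softLimit")
  }
}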

Yesterday a new project was added: a Spark Streaming job that lands data into HBase using bulk load mode, storing all of its log records in HBase.
When I came in this morning, every HBase slave node was down; the master node was still up.
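For context, the write path of that job follows the standard Spark-on-HBase bulk load pattern: each micro-batch is written out as HFiles under a staging directory and then handed to the RegionServers through LoadIncrementalHFiles (the bulkLoadHFile RPC that shows up in the stack traces below). The sketch below uses the HBase 1.x API; the table migu_datalog, the column family m and the staging path /pub_stat_migu/hbasetmp come from the logs, while the qualifier name and the per-batch directory layout are assumptions for illustration only.

import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.{HBaseConfiguration, KeyValue, TableName}
import org.apache.hadoop.hbase.client.ConnectionFactory
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.{HFileOutputFormat2, LoadIncrementalHFiles}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.rdd.RDD

// Per-batch bulk load sketch (HBase 1.x API, matching the stack traces in this post).
// rows: (rowkey, value) pairs already parsed from the streaming batch.
def bulkLoadBatch(rows: RDD[(String, String)], batchTime: Long): Unit = {
  val conf = HBaseConfiguration.create()
  val tableName = TableName.valueOf("migu_datalog")       // table from the logs
  val family = Bytes.toBytes("m")                         // store 'm' from the logs
  val stagingDir = s"/pub_stat_migu/hbasetmp/$batchTime"  // assumed per-batch staging dir

  // 1. Sort by rowkey and write the batch out as HFiles under the staging dir.
  val kvs = rows.sortByKey().map { case (rowkey, value) =>
    val kv = new KeyValue(Bytes.toBytes(rowkey), family, Bytes.toBytes("v"), Bytes.toBytes(value))
    (new ImmutableBytesWritable(Bytes.toBytes(rowkey)), kv)
  }

  val connection = ConnectionFactory.createConnection(conf)
  val table = connection.getTable(tableName)
  val regionLocator = connection.getRegionLocator(tableName)
  val job = Job.getInstance(conf)
  HFileOutputFormat2.configureIncrementalLoad(job, table, regionLocator)

  kvs.saveAsNewAPIHadoopFile(
    stagingDir,
    classOf[ImmutableBytesWritable],
    classOf[KeyValue],
    classOf[HFileOutputFormat2],
    job.getConfiguration)

  // 2. Ask the RegionServers to adopt the HFiles (the bulkLoadHFile RPC seen in the logs).
  new LoadIncrementalHFiles(conf).doBulkLoad(
    new Path(stagingDir), connection.getAdmin, table, regionLocator)

  table.close()
  connection.close()
}

Each batch that goes through doBulkLoad makes the RegionServer open and validate every staged HFile (the assertBulkLoadHFileOk calls in the traces), so frequent small batches presumably add a steady stream of open file handles on top of compaction I/O.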

Normal bulk-loaded HFile log lines, writing to the HBase table migu_datalog:
regionserver.HStore: Loaded HFile hdfs://migumaster/hbase/data/default/migu_datalog/

2020-01-15 18:01:46,961 INFO  [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=6201] regionserver.HStore: Successfully loaded store file hdfs://migumaster/pub_stat_migu/hbasetmp/m/.tmp/b83b75ec2f494bbeb93db9e4b5b77bec.top into store m (new location: hdfs://migumaster/hbase/data/default/migu_datalog/02c94cf56f6dc07bc72fdd033c235fc5/m/95e3fc79bc754b12892c1510f1b1e35a_SeqId_4_)
2020-01-15 18:01:46,962 INFO  [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=6201] regionserver.HStore: Loaded HFile hdfs://migumaster/pub_stat_migu/hbasetmp/m/.tmp/d3da72f0f34c4f5c98750d997dd69992.top into store 'm' as hdfs://migumaster/hbase/data/default/migu_datalog/02c94cf56f6dc07bc72fdd033c235fc5/m/843f97405a96455b9338d1006a190afd_SeqId_4_ - updating store file list.
2020-01-15 18:01:46,962 INFO  [RpcServer.FifoWFPBQ.priority.handler=15,queue=1,port=6201] regionserver.HStore: Loaded HFile hdfs://migumaster/hbase/data/default/migu_datalog/ee65b2e31a0fbd82e6c663ce45023b8f/m/fa2ab04fecca46859151c95156fa8b6c_SeqId_4_ into store 'm

The Spark Streaming job was started at around 18:00.


At 18:23 the bulk load writes into HBase started to report errors. At this point the HBase slave nodes had not yet died and were still running, and the subsequent Spark Streaming writes into HBase were still working normally.

The error: java.io.IOException: Got error for OP_READ_BLOCK, status=ERROR, self=/10.186.59.94:46626, remote=/10.186.59.106:50010

2020-01-15 18:23:48,611 INFO  [RpcServer.FifoWFPBQ.priority.handler=14,queue=0,port=6201] regionserver.HStore: Loaded HFile hdfs://migumaster/pub_stat_migu/hbasetmp/m/.tmp/3193c7040dc4482fa268ad0af9f19380.bottom into store 'm' as hdfs://migumaster/hbase/data/default/migu_datalog/ee65b2e31a0fbd82e6c663ce45023b8f/m/cbed14d88bc64a65b9871d7e5c181d5e_SeqId_22_ - updating store file list.
2020-01-15 18:23:48,611 WARN  [RpcServer.FifoWFPBQ.priority.handler=13,queue=1,port=6201] hdfs.BlockReaderFactory: I/O error constructing remote block reader.
java.io.IOException: Got error for OP_READ_BLOCK, status=ERROR, self=/10.186.59.94:46626, remote=/10.186.59.106:50010, for file /pub_stat_migu/hbasetmp/m/.tmp/4e684f9ace4e4681ac6ad01a5d0fd914.top, for pool BP-266398130-10.186.59.129-1574389974472 block 1085670735_11936620
        at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:467)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:432)
        at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:881)
        at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:759)
        at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:376)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:652)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:879)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:932)
        at java.io.DataInputStream.readFully(DataInputStream.java:195)
        at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:391)
        at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:482)
        at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:540)
        at org.apache.hadoop.hbase.regionserver.HStore.assertBulkLoadHFileOk(HStore.java:734)
        at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:5350)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.bulkLoadHFile(RSRpcServices.java:1950)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33650)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2171)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:185)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:165)
2020-01-15 18:23:48,613 WARN  [RpcServer.FifoWFPBQ.priority.handler=13,queue=1,port=6201] hdfs.DFSClient: Failed to connect to /10.186.59.106:50010 for block, add to deadNodes and continue. java.io.IOException: Got error for OP_READ_BLOCK, status=ERROR, self=/10.186.59.94:46626, remote=/10.186.59.106:50010, for file /pub_stat_migu/hbasetmp/m/.tmp/4e684f9ace4e4681ac6ad01a5d0fd914.top, for pool BP-266398130-10.186.59.129-1574389974472 block 1085670735_11936620
java.io.IOException: Got error for OP_READ_BLOCK, status=ERROR, self=/10.186.59.94:46626, remote=/10.186.59.106:50010, for file /pub_stat_migu/hbasetmp/m/.tmp/4e684f9ace4e4681ac6ad01a5d0fd914.top, for pool BP-266398130-10.186.59.129-1574389974472 block 1085670735_11936620
        at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:467)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:432)
        at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:881)
        at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:759)
        at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:376)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:652)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:879)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:932)
        at java.io.DataInputStream.readFully(DataInputStream.java:195)
        at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:391)
        at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:482)
        at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:540)
        at org.apache.hadoop.hbase.regionserver.HStore.assertBulkLoadHFileOk(HStore.java:734)
        at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:5350)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.bulkLoadHFile(RSRpcServices.java:1950)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33650)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2171)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:185)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:165)

After 2020-01-15 19:24:49,415 the Spark Streaming job was no longer writing to the table migu_datalog.
HBase kept running compactions to merge files.
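One way to verify that a table is still busy compacting is to poll the compaction state through the Admin API. A minimal sketch against the HBase 1.x client, using the table name that appears in the logs below:

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.ConnectionFactory

// Minimal sketch: report whether the table is currently compacting
// (returns NONE / MINOR / MAJOR / MAJOR_AND_MINOR for the whole table).
val conf = HBaseConfiguration.create()
val connection = ConnectionFactory.createConnection(conf)
val admin = connection.getAdmin
println(admin.getCompactionState(TableName.valueOf("migu", "download_log20200115")))
admin.close()
connection.close()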


2020-01-15 19:25:02,667 INFO  [regionserver/mg011.tigard.com/10.186.59.94:6201-shortCompactions-1575878052705] regionserver.HStore: Completed compaction of 3 file(s) in m of migu:download_log20200115,66,1579017595940.521cf9c3e5b59db4e20b5e6b841db190. into 532ab9392f6b4f41b4ce728e37d0fa85(size=340.4 K), total size for store is 1.7 G. This selection was in queue for 0sec, and took 0sec to execute.
2020-01-15 19:25:02,667 INFO  [regionserver/mg011.tigard.com/10.186.59.94:6201-shortCompactions-1575878052705] regionserver.CompactSplitThread: Completed compaction: Request = regionName=migu:download_log20200115,66,1579017595940.521cf9c3e5b59db4e20b5e6b841db190., storeName=m, fileCount=3, fileSize=346.6 K, priority=-447, time=5371180358519352; duration=0sec
2020-01-15 19:25:02,704 INFO  [regionserver/mg011.tigard.com/10.186.59.94:6201-shortCompactions-1575878052705] regionserver.HRegion: Starting compaction on m in region migu:download_log20200115,66,1579017595940.521cf9c3e5b59db4e20b5e6b841db190.
2020-01-15 19:25:02,704 INFO  [regionserver/mg011.tigard.com/10.186.59.94:6201-shortCompactions-1575878052705] regionserver.HStore: Starting compaction of 3 file(s) in m of migu:download_log20200115,66,1579017595940.521cf9c3e5b59db4e20b5e6b841db190. into tmpdir=hdfs://migumaster/hbase/data/migu/download_log20200115/521cf9c3e5b59db4e20b5e6b841db190/.tmp, totalSize=330.4 K
2020-01-15 19:25:02,712 INFO  [regionserver/mg011.tigard.com/10.186.59.94:6201-shortCompactions-1575878052705] hfile.CacheConfig: blockCache=LruBlockCache{blockCount=6624, currentSize=11647490360, freeSize=1086023368, maxSize=12733513728, heapSize=11647490360, minSize=12096837632, minFactor=0.95, multiSize=6048418816, multiFactor=0.5, singleSize=3024209408, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
2020-01-15 19:25:02,731 INFO  [regionserver/mg011.tigard.com/10.186.59.94:6201-shortCompactions-1575878052705] regionserver.HStore: Completed compaction of 3 file(s) in m of migu:download_log20200115,66,1579017595940.521cf9c3e5b59db4e20b5e6b841db190. into c35c4fa4c81c43d2ae54d181159a70c3(size=331.5 K), total size for store is 1.7 G. This selection was in queue for 0sec, and took 0sec to execute.
2020-01-15 19:25:02,731 INFO  [regionserver/mg011.tigard.com/10.186.59.94:6201-shortCompactions-1575878052705] regionserver.CompactSplitThread: Completed compaction: Request = regionName=migu:download_log20200115,66,1579017595940.521cf9c3e5b59db4e20b5e6b841db190., storeName=m, fileCount=3, fileSize=330.4 K, priority=-445, time=5371180451211312; duration=0sec

Another Spark Streaming project, which writes to the HBase table download_log20200115, also started to report errors. From that point on the logs were full of similar errors, one after another.

The error: java.io.IOException: Got error for OP_READ_BLOCK, remote=/10.186.59.106:50010, for file /hbase/data/migu/download_log20200115/

2020-01-15 19:25:09,550 WARN  [regionserver/mg011.tigard.com/10.186.59.94:6201-shortCompactions-1575878052705] hdfs.DFSClient: Failed to connect to /10.186.59.106:50010 for block, add to deadNodes and continue. java.io.IOException: Got error for OP_READ_BLOCK, status=ERROR, self=/10.186.59.94:53468, remote=/10.186.59.106:50010, for file /hbase/data/migu/download_log20200115/521cf9c3e5b59db4e20b5e6b841db190/m/a9e33322849e45819277779b94199499, for pool BP-266398130-10.186.59.129-1574389974472 block 1085795181_12062989
java.io.IOException: Got error for OP_READ_BLOCK, status=ERROR, self=/10.186.59.94:53468, remote=/10.186.59.106:50010, for file /hbase/data/migu/download_log20200115/521cf9c3e5b59db4e20b5e6b841db190/m/a9e33322849e45819277779b94199499, for pool BP-266398130-10.186.59.129-1574389974472 block 1085795181_12062989
        at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:467)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:432)
        at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:881)
        at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:759)
        at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:376)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:652)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:879)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:932)
        at java.io.DataInputStream.readFully(DataInputStream.java:195)
        at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:391)
        at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:482)
        at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:525)
        at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1107)
        at org.apache.hadoop.hbase.regionserver.StoreFileInfo.open(StoreFileInfo.java:265)
        at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:404)
        at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:509)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.getScannersForStoreFiles(StoreFileScanner.java:144)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.getScannersForStoreFiles(StoreFileScanner.java:162)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.getScannersForStoreFiles(StoreFileScanner.java:127)
        at org.apache.hadoop.hbase.regionserver.compactions.Compactor.createFileScanners(Compactor.java:206)
        at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:71)
        at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:131)
        at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1245)
        at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1860)
        at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:529)
        at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:566)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

After 2020-01-15 19:25, the Spark Streaming writes to the HBase table download_log20200115 went back to running normally...

2020-01-15 20:24:36,025 INFO  [RpcServer.FifoWFPBQ.priority.handler=18,queue=0,port=6201] regionserver.HStore: Successfully loaded store file hdfs://migumaster/pub_stat_migu/hbasetmp/m/.tmp/e2dcec6f4f1c460c90c4a97b2ce14701.bottom into store m (new location: hdfs://migumaster/hbase/data/default/migu_datalog/b97ca1522c5f096036975ba2fc6dcae7/m/5d477cdbeff94ba9a3331d9d30a5f624_SeqId_121_)
2020-01-15 20:24:36,026 INFO  [RpcServer.FifoWFPBQ.priority.handler=18,queue=0,port=6201] regionserver.HStore: Loaded HFile hdfs://migumaster/pub_stat_migu/hbasetmp/m/.tmp/f637225222a84c0abd20f37af252ae57.bottom into store 'm' as hdfs://migumaster/hbase/data/default/migu_datalog/b97ca1522c5f096036975ba2fc6dcae7/m/da81db4697f143aa8251c209f8bb30a3_SeqId_121_ - updating store file list.
2020-01-15 20:24:36,028 INFO  [RpcServer.FifoWFPBQ.priority.handler=18,queue=0,port=6201] regionserver.HStore: Loaded HFile hdfs://migumaster/hbase/data/default/migu_datalog/b97ca1522c5f096036975ba2fc6dcae7/m/da81db4697f143aa8251c209f8bb30a3_SeqId_121_ into store 'm

Then, at 2020-01-15 21:51:57, the errors started again and never stopped. My guess is that this is when the HBase slave nodes all went down.

The errors were all: java.net.SocketException: Too many open files

2020-01-15 21:51:57,782 WARN  [RpcServer.FifoWFPBQ.priority.handler=14,queue=0,port=6201] hdfs.BlockReaderFactory: I/O error constructing remote block reader.
java.net.SocketException: Too many open files
        at sun.nio.ch.Net.socket0(Native Method)
        at sun.nio.ch.Net.socket(Net.java:423)
        at sun.nio.ch.Net.socket(Net.java:416)
        at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:104)
        at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:60)
        at java.nio.channels.SocketChannel.open(SocketChannel.java:142)
        at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:62)
        at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3526)
        at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:840)
        at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:755)
        at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:376)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:652)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:879)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:932)
        at java.io.DataInputStream.readFully(DataInputStream.java:195)
        at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:391)
        at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:482)
        at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:540)
        at org.apache.hadoop.hbase.regionserver.HStore.assertBulkLoadHFileOk(HStore.java:734)
        at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:5350)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.bulkLoadHFile(RSRpcServices.java:1950)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33650)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2171)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:185)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:165)
2020-01-15 21:51:57,782 WARN  [RpcServer.FifoWFPBQ.priority.handler=14,queue=0,port=6201] hdfs.DFSClient: Failed to connect to /10.186.59.34:50010 for block, add to deadNodes and continue. java.net.SocketException: Too many open files
java.net.SocketException: Too many open files
        at sun.nio.ch.Net.socket0(Native Method)
        at sun.nio.ch.Net.socket(Net.java:423)
        at sun.nio.ch.Net.socket(Net.java:416)
        at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:104)
        at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:60)
        at java.nio.channels.SocketChannel.open(SocketChannel.java:142)
        at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:62)
        at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3526)
        at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:840)
        at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:755)
        at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:376)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:652)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:879)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:932)
        at java.io.DataInputStream.readFully(DataInputStream.java:195)
        at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:391)
        at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:482)
        at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:540)
        at org.apache.hadoop.hbase.regionserver.HStore.assertBulkLoadHFileOk(HStore.java:734)
        at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:5350)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.bulkLoadHFile(RSRpcServices.java:1950)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33650)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2171)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:185)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:165)
2020-01-15 21:51:57,783 INFO  [RpcServer.FifoWFPBQ.priority.handler=18,queue=0,port=6201] regionserver.HStore: Validating hfile at hdfs://migumaster/pub_stat_migu/hbasetmp/m/.tmp/2ea31005a32f4abda6a460438cf86673.bottom for inclusion in store m region migu:download_log20200115,,1579067591346.b3265104d0bdc93014b70ed6e88c6e81.

2020-01-15 22:00:42,582 INFO  [RpcServer.FifoWFPBQ.priority.handler=14,queue=0,port=6201] regionserver.HStore: Validating hfile at hdfs://migumaster/pub_stat_migu/hbasetmp/m/.tmp/c8b29ca517d249c0924cb846ff6517b5.bottom for inclusion in store m region migu_datalog,10,1579082358075.b97ca1522c5f096036975ba2fc6dcae7.
2020-01-15 22:00:42,583 WARN  [RpcServer.FifoWFPBQ.priority.handler=14,queue=0,port=6201] ipc.Client: Address change detected. Old: www.migu-cdn-hadoop01.migu01.mmtrix.com/10.186.59.128:9000 New: www.migu-cdn-hadoop01.migu01.mmtrix.com:9000
2020-01-15 22:00:42,583 INFO  [RpcServer.FifoWFPBQ.priority.handler=14,queue=0,port=6201] retry.RetryInvocationHandler: Exception while invoking getBlockLocations of class ClientNamenodeProtocolTranslatorPB over www.migu-cdn-hadoop01.migu01.mmtrix.com/10.186.59.128:9000. Trying to fail over immediately.
java.io.IOException: Failed on local exception: java.net.SocketException: Too many open files; Host Details : local host is: "java.net.UnknownHostException: mg011.tigard.com: mg011.tigard.com: System error"; destination host is: "www.migu-cdn-hadoop01.migu01.mmtrix.com":9000;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
        at org.apache.hadoop.ipc.Client.call(Client.java:1506)
        at org.apache.hadoop.ipc.Client.call(Client.java:1439)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
        at com.sun.proxy.$Proxy19.getBlockLocations(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:256)
        at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
        at com.sun.proxy.$Proxy20.getBlockLocations(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:279)
        at com.sun.proxy.$Proxy21.getBlockLocations(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1279)
        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1266)
        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1254)
        at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:305)
        at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:271)
        at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:263)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1585)
        at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:309)
        at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:305)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:305)
        at org.apache.hadoop.fs.FilterFileSystem.open(FilterFileSystem.java:162)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:781)
        at org.apache.hadoop.hbase.io.FSDataInputStreamWrapper.<init>(FSDataInputStreamWrapper.java:106)
        at org.apache.hadoop.hbase.io.FSDataInputStreamWrapper.<init>(FSDataInputStreamWrapper.java:78)
        at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:539)
        at org.apache.hadoop.hbase.regionserver.HStore.assertBulkLoadHFileOk(HStore.java:734)
        at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:5350)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.bulkLoadHFile(RSRpcServices.java:1950)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33650)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2171)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:185)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:165)
Caused by: java.net.SocketException: Too many open files
        at sun.nio.ch.Net.socket0(Native Method)
        at sun.nio.ch.Net.socket(Net.java:423)
        at sun.nio.ch.Net.socket(Net.java:416)
        at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:104)
        at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:60)
        at java.nio.channels.SocketChannel.open(SocketChannel.java:142)
        at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:62)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:623)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:744)
        at org.apache.hadoop.ipc.Client$Connection.access$3000(Client.java:396)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1555)
        at org.apache.hadoop.ipc.Client.call(Client.java:1478)
        ... 39 more
