HBase pitfalls I have run into


1. Running the list command in the hbase shell fails:
hbase(main):001:0> list
TABLE

ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2293)
at org.apache.hadoop.hbase.master.MasterRpcServices.getTableNames(MasterRpcServices.java:900)
at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:55650)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2180)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
at java.lang.Thread.run(Thread.java:748)

Here is some help for this command:
List all tables in hbase. Optional regular expression parameter could
be used to filter the output. Examples:

hbase> list
hbase> list 'abc.*'
hbase> list 'ns:abc.*'
hbase> list 'ns:.*'

Solution:
1) Stop the HBase cluster: ./stop-hbase.sh
2) Start the RegionServer on every slave node separately: ./hbase-daemon.sh start regionserver
3) Start the master node: ./hbase-daemon.sh start master
Problem solved:
hbase(main):001:0> list
TABLE
bigdata_demo
myHbase
t1
3 row(s) in 0.2850 seconds
=> ["bigdata_demo", "myHbase", "t1"]
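
The restart order from the fix above (stop the cluster, start every RegionServer, then start the Master) can be wrapped in a small script. This is only a sketch under assumptions: HBASE_BIN, SLAVES and DRY_RUN are illustrative variables that do not appear in my setup notes, the default path and host names are hypothetical, and it assumes passwordless ssh to each slave.

```shell
#!/usr/bin/env bash
# Sketch: restart HBase in the order "stop all, RegionServers first, Master last".
# HBASE_BIN, SLAVES and DRY_RUN are hypothetical names chosen for this example.

HBASE_BIN="${HBASE_BIN:-/opt/hbase/bin}"      # hypothetical install path
SLAVES="${SLAVES:-ddt-server2 ddt-server3}"   # hypothetical slave host list

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

restart_hbase_in_order() {
  # 1) Stop the whole cluster.
  run "$HBASE_BIN/stop-hbase.sh"
  # 2) Start a RegionServer on every slave node.
  for host in $SLAVES; do
    run ssh "$host" "$HBASE_BIN/hbase-daemon.sh start regionserver"
  done
  # 3) Start the Master on the local node.
  run "$HBASE_BIN/hbase-daemon.sh" start master
}
```

Calling restart_hbase_in_order with DRY_RUN=1 only prints the commands, which is a cheap way to check the host list and paths before touching a live cluster.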

2. A put on table t1 fails with this error:
ERROR: Failed 1 action: No server address listed in hbase:meta for region t1,,1536659773616.09db0b8b3b7f8cd81dde86c9f1e41306. containing row rowkey001: 1 time,

A scan of the table fails with the same error:
hbase(main):004:0> scan 't1'
ROW COLUMN+CELL

ERROR: No server address listed in hbase:meta for region t1,,1536659773616.09db0b8b3b7f8cd81dde86c9f1e41306. containing row

Troubleshooting:
1) Run scan 'hbase:meta', {LIMIT=>10, FILTER=>"PrefixFilter('hbase')"} to check the system tables:

hbase(main):005:0> scan 'hbase:meta', {LIMIT=>10, FILTER=>"PrefixFilter('hbase')"}
ROW COLUMN+CELL
hbase:namespace,,1536546667301.02 column=info:regioninfo, timestamp=1536546668630, value={ENCODED => 022f9ae2f72c020bbe2bac702065cd18,
2f9ae2f72c020bbe2bac702065cd18. NAME => 'hbase:namespace,,1536546667301.022f9ae2f72c020bbe2bac702065cd18.', STARTKEY => '', ENDKEY
=> ''}
hbase:namespace,,1536546667301.02 column=info:seqnumDuringOpen, timestamp=1536659508761, value=\x00\x00\x00\x00\x00\x00\x00\x8F
2f9ae2f72c020bbe2bac702065cd18.
hbase:namespace,,1536546667301.02 column=info:server, timestamp=1536659508761, value=ddt-server3:16020
2f9ae2f72c020bbe2bac702065cd18.
hbase:namespace,,1536546667301.02 column=info:serverstartcode, timestamp=1536659508761, value=1536659470184
2f9ae2f72c020bbe2bac702065cd18.
1 row(s) in 0.0250 seconds
The hbase:namespace system table looks normal: its region has an info:server entry.
2) Run the same scan for table t1: scan 'hbase:meta', {LIMIT=>10, FILTER=>"PrefixFilter('t1')"}
hbase(main):004:0> scan 'hbase:meta', {LIMIT=>10, FILTER=>"PrefixFilter('t1')"}
ROW COLUMN+CELL
t1,,1536659773616.09db0b8b3b7f8cd column=info:regioninfo, timestamp=1536659775077, value={ENCODED => 09db0b8b3b7f8cd81dde86c9f1e41306,
81dde86c9f1e41306. NAME => 't1,,1536659773616.09db0b8b3b7f8cd81dde86c9f1e41306.', STARTKEY => '', ENDKEY => ''}
1 row(s) in 0.0860 seconds

The row has only info:regioninfo and no info:server, which means the region of table t1 has not been assigned to any RegionServer.
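
The check above (is there an info:server cell for the region?) can be automated by filtering the scan output. The sketch below is an assumption on my part rather than an HBase tool: meta_assignment_state is a made-up helper name, and it treats any info:server cell in the input as "assigned", which is only good enough for a single-region table like t1.

```shell
# Read hbase:meta scan output on stdin and report whether any row carries an
# info:server cell. The trailing comma in the pattern keeps the
# info:serverstartcode cell from being counted as info:server.
# meta_assignment_state is a hypothetical helper name.
meta_assignment_state() {
  awk '
    /column=info:server,/ { found = 1 }
    END { print (found ? "assigned" : "unassigned") }
  '
}

# Possible usage, assuming the hbase shell accepts commands on stdin:
#   echo "scan 'hbase:meta', {FILTER=>\"PrefixFilter('t1')\"}" | hbase shell | meta_assignment_state
```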

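The cluster servers had been rebooted earlier, so I suspected that time synchronization had gone wrong and kept the region of t1 from being assigned. The time synchronization service was already set to start at boot, so I checked the node's time synchronization directly: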
[root@ddt-server3 bin]# ntpdate ddt-server1
12 Sep 09:31:26 ntpdate[10011]: the NTP socket is in use, exiting
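ntpdate could not run because the running ntpd service already holds the NTP socket, so syncing against ddt-server1 this way failed.
Stop the time synchronization service first with service ntpd stop: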
[root@ddt-server3 bin]# service ntpd stop
Redirecting to /bin/systemctl stop ntpd.service
Run ntpdate ntp.api.bz to resynchronize:
[root@ddt-server3 bin]# ntpdate ntp.api.bz
12 Sep 09:34:24 ntpdate[10028]: adjust time server 120.25.108.11 offset 0.288812 sec
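
To judge whether a reported offset is large enough to matter, the number can be pulled out of the ntpdate line and compared with a limit. This is a rough sketch: ntpdate_offset and offset_within are made-up helper names, and the 30-second limit in the usage comment is my assumption based on what I believe is the default of HBase's hbase.master.maxclockskew setting (30000 ms), so check it against your own configuration.

```shell
# Extract the offset in seconds from a line of ntpdate output read on stdin.
# ntpdate_offset and offset_within are hypothetical helper names.
ntpdate_offset() {
  awk '{
    for (i = 1; i < NF; i++) {
      if ($i == "offset") {
        v = $(i + 1)
        sub(/,$/, "", v)   # drop a trailing comma if the field has one
        print v
        exit
      }
    }
  }'
}

# Succeed (exit status 0) when the absolute offset is at most max_seconds.
# Usage: offset_within <offset_seconds> <max_seconds>
offset_within() {
  awk -v off="$1" -v max="$2" 'BEGIN {
    if (off < 0) off = -off
    exit !(off <= max)
  }'
}

# Possible usage with the line shown above (30 is an assumed limit):
#   off="$(echo '... adjust time server 120.25.108.11 offset 0.288812 sec' | ntpdate_offset)"
#   offset_within "$off" 30 && echo "clock skew is within the limit"
```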
Restart the Hadoop cluster (HDFS):
./stop-dfs.sh
./start-dfs.sh
Restart the HBase cluster:
./stop-hbase.sh
./start-hbase.sh

If ./start-hbase.sh fails to start the cluster, start all the slaves first and then the master:
Start the slaves: ./hbase-daemon.sh start regionserver
Start the master: ./hbase-daemon.sh start master

Verification:
Run scan 'hbase:meta', {LIMIT=>10, FILTER=>"PrefixFilter('t1')"}
hbase(main):004:0> scan 'hbase:meta', {LIMIT=>10, FILTER=>"PrefixFilter('t1')"}
ROW COLUMN+CELL
t1,,1536659773616.09db0b8b3b7f8cd81d column=info:regioninfo, timestamp=1536659775077, value={ENCODED => 09db0b8b3b7f8cd81dde86c9f1e4130
de86c9f1e41306. 6, NAME => 't1,,1536659773616.09db0b8b3b7f8cd81dde86c9f1e41306.', STARTKEY => '', ENDKEY => ''}
t1,,1536659773616.09db0b8b3b7f8cd81d column=info:seqnumDuringOpen, timestamp=1536716214137, value=\x00\x00\x00\x00\x00\x00\x00\x02
de86c9f1e41306.
t1,,1536659773616.09db0b8b3b7f8cd81d column=info:server, timestamp=1536716214137, value=ddt-server3:16020
de86c9f1e41306.
t1,,1536659773616.09db0b8b3b7f8cd81d column=info:serverstartcode, timestamp=1536716214137, value=1536716203679
de86c9f1e41306.
1 row(s) in 0.0750 seconds
Run the put command:
hbase(main):003:0> put 't1','rowkey001','f1:col1','value01'
0 row(s) in 0.1640 seconds

Run the scan command:
hbase(main):004:0> scan 't1'
ROW COLUMN+CELL
rowkey001 column=f1:col1, timestamp=1536733356294, value=value01
1 row(s) in 0.0160 seconds

3. One HBase table is abnormal
hbase(main):008:0> scan 'bigdata_demo'
ROW COLUMN+CELL

ERROR: org.apache.hadoop.hbase.NotServingRegionException: Region bigdata_demo,,1536716834694.56aa00285fdd1d7d05d6843187556cf3. is not online on ddt-server3,16020,1536733192855
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2922)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1059)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2393)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33648)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2180)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
at java.lang.Thread.run(Thread.java:748)
Run the hbase hbck command:
[root@ddt-server2 bin]# hbase hbck
OpenJDK 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
……
Table t_dict is okay.
Number of regions: 1
Deployed on: ddt-server3,16020,1536748681319
Table hbase:meta is okay.
Number of regions: 1
Deployed on: ddt-server3,16020,1536748681319
Table t1 is okay.
Number of regions: 1
Deployed on: ddt-server3,16020,1536748681319
Table hbase:namespace is okay.
Number of regions: 1
Deployed on: ddt-server3,16020,1536748681319
Table bigdata_demo is okay.
Number of regions: 0
Deployed on:
Table myHbase is okay.
Number of regions: 1
Deployed on: ddt-server3,16020,1536748681319
1 inconsistencies detected.

hbck shows that the region of bigdata_demo is down (0 regions deployed):
Table bigdata_demo is okay.
Number of regions: 0
Deployed on:
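
On a cluster with many tables it is easy to miss a "Number of regions: 0" line in the hbck report. A small filter can list the affected tables. This is a sketch that assumes the report keeps the "Table <name> ..." / "Number of regions: <n>" layout shown above; tables_without_regions is a made-up helper name.

```shell
# Read an hbck report on stdin and print every table that has 0 regions.
# Relies on each "Number of regions:" line following its "Table <name> ..." line.
# tables_without_regions is a hypothetical helper name.
tables_without_regions() {
  awk '
    $1 == "Table" { table = $2 }
    $1 == "Number" && $2 == "of" && $3 == "regions:" && $4 == 0 && table != "" {
      print table
    }
  '
}

# Possible usage:
#   hbase hbck 2>/dev/null | tables_without_regions
```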
Solution:
1) Stop the HBase cluster: ./stop-hbase.sh
2) Start the RegionServer on every slave node separately: ./hbase-daemon.sh start regionserver
3) Start the master node: ./hbase-daemon.sh start master
