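I wanted to find out whether the data in the hbase:meta table (which records the RegionServer hosting each region of every table) is regenerated every time the cluster restarts.
I deleted all the data in hbase:meta and restarted the cluster. Only the entry for hbase:namespace was regenerated and inserted into hbase:meta; the entries for all user tables (tables created by users) were not restored.
This raises another question: how to repair the data in hbase:meta. The user tables' data is complete and the files stored on HDFS are undamaged, but hbase:meta no longer holds the tables' region information, so the user tables cannot be accessed. HBase provides a command to repair hbase:meta. It reads the .regioninfo files stored on HDFS for each table and writes their content back into hbase:meta: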
$ hbase hbck -fixMeta // Try to fix meta problems. This assumes HDFS region info is good.
After running this command, the region information for each table is inserted into hbase:meta:
test,,1437751474993.80af563e6a9d557b7274a20d6a5d1260. column=info:regioninfo, timestamp=1438252485548, value={ENCODED => 80af563e6a9d557b7274a20d6a5d1260, NAME => 'test,,1437751474993.80af563e6a9d557b7274a20d6a5d1260.', STARTKEY => '', ENDKEY => ''}
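However, the user tables still cannot be queried at this point. Each region of each user table normally has four columns in hbase:meta: info:regioninfo, info:seqnumDuringOpen, info:server and info:serverstartcode. The info:server column records which RegionServer hosts the region, and after -fixMeta only info:regioninfo is present. Restarting only the RegionServer nodes added nothing to hbase:meta. After restarting the whole cluster, the data in hbase:meta became complete, that is, every region of every user table had all four columns, as follows: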
test,,1437751474993.80af563e6a9d557b7274a20d6a5d1260. column=info:regioninfo, timestamp=1438252485548, value={ENCODED => 80af563e6a9d557b7274a20d6a5d1260, NAME => 'test,,1437751474993.80af563e6a9d557b7274a20d6a5d1260.', STARTKEY => '', ENDKEY => ''}
test,,1437751474993.80af563e6a9d557b7274a20d6a5d1260. column=info:seqnumDuringOpen, timestamp=1438253579793, value=\x00\x00\x00\x00\x00\x00\x00\x0A
test,,1437751474993.80af563e6a9d557b7274a20d6a5d1260. column=info:server, timestamp=1438253579793, value=hadoop-pseudo.com.cn:60020
test,,1437751474993.80af563e6a9d557b7274a20d6a5d1260. column=info:serverstartcode, timestamp=1438253579793, value=1438253573258
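The raw values in this output are easier to read once decoded. The following is a minimal Python sketch, not part of HBase: it assumes info:seqnumDuringOpen is an 8-byte big-endian long printed by the shell with \xHH escapes, and that info:serverstartcode is a millisecond epoch timestamp (the start time of the RegionServer process). The helper names parse_shell_bytes, decode_seqnum and decode_startcode are made up for illustration.

```python
import struct
from datetime import datetime, timezone


def parse_shell_bytes(text):
    """Turn shell-style escaped output such as '\\x00\\x0A' back into raw bytes.

    Assumption: non-printable bytes appear as \\xHH and everything else is a
    literal single-byte character.
    """
    out = bytearray()
    i = 0
    while i < len(text):
        if text.startswith("\\x", i) and i + 4 <= len(text):
            out.append(int(text[i + 2:i + 4], 16))  # two hex digits -> one byte
            i += 4
        else:
            out.append(ord(text[i]))  # literal printable character
            i += 1
    return bytes(out)


def decode_seqnum(text):
    """Decode an info:seqnumDuringOpen value (assumed 8-byte big-endian long)."""
    return struct.unpack(">q", parse_shell_bytes(text))[0]


def decode_startcode(text):
    """Interpret info:serverstartcode as milliseconds since the Unix epoch (UTC)."""
    return datetime.fromtimestamp(int(text) / 1000, tz=timezone.utc)


if __name__ == "__main__":
    # Values copied from the scan output above.
    print(decode_seqnum(r"\x00\x00\x00\x00\x00\x00\x00\x0A"))
    print(decode_startcode("1438253573258").isoformat())
```

Under these assumptions the sequence number decodes to 10, and the start code falls a few seconds before the cell timestamp 1438253579793, which fits a RegionServer that started shortly before it opened the region.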
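After restarting the HBase cluster once more, the timestamps of info:seqnumDuringOpen, info:server and info:serverstartcode had all changed: a new version of each had been written.
In summary, of the four columns that hbase:meta stores for each region, info:seqnumDuringOpen, info:server and info:serverstartcode are updated when the cluster restarts, while info:regioninfo is not.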
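These observations suggest a simple completeness check for a region's row. The sketch below is a hypothetical illustration in Python: it works on a plain dict standing in for hbase:meta rather than on a real cluster, and the names missing_columns and unlocated_regions are invented here. It reports which of the four columns a row lacks and which regions have no info:server.

```python
# The three columns that were rewritten on cluster restart in the experiment.
LOCATION_COLUMNS = ("info:seqnumDuringOpen", "info:server", "info:serverstartcode")
# The full set of columns seen for a healthy user-table region.
EXPECTED_COLUMNS = ("info:regioninfo",) + LOCATION_COLUMNS


def missing_columns(row):
    """Return the expected columns that are absent from one meta row (a dict)."""
    return [col for col in EXPECTED_COLUMNS if col not in row]


def unlocated_regions(meta):
    """Return the sorted names of regions whose row has no info:server column."""
    return sorted(name for name, row in meta.items() if "info:server" not in row)


if __name__ == "__main__":
    region = "test,,1437751474993.80af563e6a9d557b7274a20d6a5d1260."
    # State right after `hbase hbck -fixMeta`: only info:regioninfo exists.
    meta = {region: {"info:regioninfo": "{ENCODED => 80af563e6a9d557b7274a20d6a5d1260}"}}
    print(missing_columns(meta[region]))
    print(unlocated_regions(meta))
```

A row in the state left by -fixMeta fails this check on all three location columns, which matches the tables being unreadable until the full restart.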
Appendix: how regions are assigned when HBase starts
http://hbase.apache.org/book.html#regions.arch.assignment
When HBase starts regions are assigned as follows (short version):
The Master invokes the AssignmentManager upon startup.
The AssignmentManager looks at the existing region assignments in hbase:meta.
If the region assignment is still valid (i.e., if the RegionServer is still online) then the assignment is kept.
If the assignment is invalid, then the LoadBalancerFactory is invoked to assign the region. The load balancer (StochasticLoadBalancer by default in HBase 1.0) assign the region to a RegionServer.
hbase:meta is updated with the RegionServer assignment (if needed) and the RegionServer start codes (start time of the RegionServer process) upon region opening by the RegionServer.
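The steps quoted above can be modelled with a toy sketch. This is not HBase's AssignmentManager or its load balancer. It is a simplified Python illustration under stated assumptions: hbase:meta is a dict, a server is identified only by its host:port string, and the default choose function (pick the first server in sorted order) is a stand-in for the real balancer. All names are hypothetical.

```python
def assign_regions(meta, online, choose=None):
    """Toy model of startup assignment.

    meta:   dict of region name -> dict of meta columns (stand-in for hbase:meta)
    online: dict of 'host:port' -> start code of the running RegionServer process
    choose: function(region, online) -> 'host:port'; stand-in for the load balancer
    Returns the list of regions that had to be (re)assigned.
    """
    if choose is None:
        # Placeholder policy, NOT the StochasticLoadBalancer.
        choose = lambda region, servers: sorted(servers)[0]

    reassigned = []
    for region, cols in meta.items():
        if cols.get("info:server") in online:
            continue  # existing assignment is still valid: keep it
        if not online:
            raise RuntimeError("no RegionServer is online")
        target = choose(region, online)
        # On region open, the location and the server's start code are recorded.
        cols["info:server"] = target
        cols["info:serverstartcode"] = str(online[target])
        reassigned.append(region)
    return reassigned


if __name__ == "__main__":
    meta = {
        "r1": {"info:regioninfo": "...", "info:server": "hadoop-pseudo.com.cn:60020"},
        "r2": {"info:regioninfo": "..."},  # restored by -fixMeta, no location yet
    }
    online = {"hadoop-pseudo.com.cn:60020": 1438253573258}
    print(assign_regions(meta, online))
    print(meta["r2"])
```

In this model a region whose row has no info:server, such as one restored by -fixMeta, always counts as invalid and gets assigned. That is consistent with the location columns appearing only after the Master ran its startup assignment.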