Region splits are triggered in two scenarios: (1) manually by the user (see HRegionServer's splitRegion method), or (2) by the background flush thread, which, after flushing a region's memstore, checks whether that region needs to split (see MemStoreFlusher's flushRegion method). The two paths differ little in their code.
1. Let's take the manual split as the example. A manual split starts from HRegionServer's splitRegion:
    @Override
    public void splitRegion(HRegionInfo regionInfo, byte[] splitPoint)
        throws NotServingRegionException, IOException {
      checkOpen();
      HRegion region = getRegion(regionInfo.getRegionName());
      region.flushcache();
      region.forceSplit(splitPoint);
      compactSplitThread.requestSplit(region, region.checkSplit());
    }
In compactSplitThread.requestSplit(region, region.checkSplit()), the call region.checkSplit() computes the region's split point. Its code:
    public byte[] checkSplit() {
      if (!splitPolicy.shouldSplit()) {
        return null;
      }
      // TODO: how the midkey is obtained deserves a closer look
      byte[] ret = splitPolicy.getSplitPoint();
      return ret;
    }
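The division of labor behind checkSplit() can be sketched in plain Java: a size-based policy decides *whether* to split, and the split point is the "midkey" of the largest store. This is a minimal illustration, not the real HBase classes; the field names and the fixed-size policy here are assumptions modeled loosely on ConstantSizeRegionSplitPolicy.

```java
import java.util.Arrays;

// Hypothetical sketch of the policy behind checkSplit(): split once any store
// grows past a configured maximum, and use that store's midkey as the split point.
class SimpleSplitPolicy {
    private final long maxStoreSize;     // analogous to hbase.hregion.max.filesize
    private final long largestStoreSize; // size of the biggest store in the region
    private final byte[] midkey;         // midkey reported by that store's file index

    SimpleSplitPolicy(long maxStoreSize, long largestStoreSize, byte[] midkey) {
        this.maxStoreSize = maxStoreSize;
        this.largestStoreSize = largestStoreSize;
        this.midkey = midkey;
    }

    // Split only when the biggest store exceeds the configured maximum.
    boolean shouldSplit() {
        return largestStoreSize > maxStoreSize;
    }

    // Mirrors checkSplit(): null means "no split", otherwise the split point.
    byte[] checkSplit() {
        if (!shouldSplit()) {
            return null;
        }
        return Arrays.copyOf(midkey, midkey.length);
    }
}
```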
2. Next, the main part of SplitRequest's run method:
    SplitTransaction st = new SplitTransaction(parent, midKey);
    if (!st.prepare()) return;
    st.execute(this.server, this.server);
Let's look at what SplitTransaction's execute does:
    public PairOfSameType<HRegion> execute(final Server server,
        final RegionServerServices services)
        throws IOException {
      PairOfSameType<HRegion> regions = createDaughters(server, services);
      openDaughters(server, services, regions.getFirst(), regions.getSecond());
      transitionZKNode(server, services, regions.getFirst(), regions.getSecond());
      return regions;
    }
First, createDaughters:
    PairOfSameType<HRegion> createDaughters(final Server server,
        final RegionServerServices services) throws IOException {
      this.fileSplitTimeout = testing ? this.fileSplitTimeout :
        server.getConfiguration().getLong("hbase.regionserver.fileSplitTimeout",
          this.fileSplitTimeout);
      // Mark the region as splitting in ZooKeeper
      if (server != null && server.getZooKeeper() != null) {
        try {
          createNodeSplitting(server.getZooKeeper(),
            this.parent.getRegionInfo(), server.getServerName());
        } catch (KeeperException e) {
          throw new IOException("Failed creating SPLITTING znode on " +
            this.parent.getRegionNameAsString(), e);
        }
      }
      // Create the .splits directory under the parent region's dir
      createSplitDir(this.parent.getFilesystem(), this.splitdir);
      this.journal.add(JournalEntry.CREATE_SPLIT_DIR);

      List<StoreFile> hstoreFilesToSplit = null;
      Exception exceptionToThrow = null;
      try {
        // Close the parent region; returns its store files
        hstoreFilesToSplit = this.parent.close(false);
      } catch (Exception e) {
        exceptionToThrow = e;
      }
      if (exceptionToThrow != null) {
        throw exceptionToThrow instanceof IOException ?
          (IOException) exceptionToThrow : new IOException(exceptionToThrow);
      }

      if (!testing) {
        services.removeFromOnlineRegions(this.parent.getRegionInfo().getEncodedName());
      }
      this.journal.add(JournalEntry.OFFLINED_PARENT);

      // Write one reference file per parent store file into the .splits dir
      splitStoreFiles(this.splitdir, hstoreFilesToSplit);

      this.journal.add(JournalEntry.STARTED_REGION_A_CREATION);
      HRegion a = createDaughterRegion(this.hri_a, this.parent.rsServices);
      this.journal.add(JournalEntry.STARTED_REGION_B_CREATION);
      HRegion b = createDaughterRegion(this.hri_b, this.parent.rsServices);

      if (!testing) {
        // Mark the parent offline in .META. and record the two daughters
        MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
          this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
      }
      return new PairOfSameType<HRegion>(a, b);
    }
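Notice the journal.add(...) calls above: SplitTransaction records each completed step so that, on failure, rollback() can undo them in reverse order. The following is a hypothetical sketch of that journal/rollback pattern; the class and method names are illustrative, not the real HBase API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative journal/rollback pattern: completed steps are recorded, and on
// failure they are undone newest-first (like SplitTransaction.rollback()).
class JournaledSplit {
    enum Step { CREATE_SPLIT_DIR, OFFLINED_PARENT, STARTED_REGION_A_CREATION }

    private final Deque<Step> journal = new ArrayDeque<>();

    void record(Step step) {
        journal.push(step); // most recent step on top
    }

    // Undo completed steps in reverse order; returns the order they were undone in.
    List<Step> rollback() {
        List<Step> undone = new ArrayList<>();
        while (!journal.isEmpty()) {
            Step step = journal.pop();
            // Real code would e.g. delete the split dir or re-online the parent here.
            undone.add(step);
        }
        return undone;
    }
}
```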
StoreFile's split method:
    static Path split(final FileSystem fs, final Path splitDir, final StoreFile f,
        final byte[] splitRow, final Reference.Range range)
        throws IOException {
      if (range == Reference.Range.bottom) {
        // If the split row sorts before this file's first key, the bottom
        // half would be empty -- no reference file is needed.
        KeyValue splitKey = KeyValue.createLastOnRow(splitRow);
        byte[] firstKey = f.createReader().getFirstKey();
        if (firstKey == null) return null;
        if (f.getReader().getComparator().compare(splitKey.getBuffer(),
            splitKey.getKeyOffset(), splitKey.getKeyLength(),
            firstKey, 0, firstKey.length) < 0) {
          return null;
        }
      } else {
        // If the split row sorts after this file's last key, the top
        // half would be empty -- no reference file is needed.
        KeyValue splitKey = KeyValue.createFirstOnRow(splitRow);
        byte[] lastKey = f.createReader().getLastKey();
        if (lastKey == null) return null;
        if (f.getReader().getComparator().compare(splitKey.getBuffer(),
            splitKey.getKeyOffset(), splitKey.getKeyLength(),
            lastKey, 0, lastKey.length) > 0) {
          return null;
        }
      }
      // A reference file records only the split row and which half it covers;
      // its name is the parent file's name plus the parent region's encoded name.
      Reference r = new Reference(splitRow, range);
      String parentRegionName = f.getPath().getParent().getParent().getName();
      Path p = new Path(splitDir, f.getPath().getName() + "." + parentRegionName);
      return r.write(fs, p);
    }
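The "skip the empty half" check above can be distilled into plain Java, using unsigned lexicographic byte comparison in place of HBase's KeyValue comparator. This is a sketch with illustrative names, not HBase code: a bottom reference is pointless if every key in the file is already at or above the split row, and a top reference is pointless if every key is below it.

```java
// Sketch of StoreFile.split()'s empty-half check, with illustrative names.
class ReferenceCheck {
    enum Range { top, bottom }

    // Unsigned lexicographic comparison, like Bytes.compareTo in HBase.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) return diff;
        }
        return a.length - b.length;
    }

    // Would a reference for this half of the file cover at least one key?
    static boolean coversKeys(byte[] firstKey, byte[] lastKey,
                              byte[] splitRow, Range range) {
        if (range == Range.bottom) {
            // bottom half = keys below splitRow; empty when the file starts at or after it
            return compare(firstKey, splitRow) < 0;
        }
        // top half = keys at or above splitRow; empty when the file ends below it
        return compare(lastKey, splitRow) >= 0;
    }
}
```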
Now let's look at openDaughters:
    void openDaughters(final Server server,
        final RegionServerServices services, HRegion a, HRegion b)
        throws IOException {
      // Open the two daughter regions in parallel
      DaughterOpener aOpener = new DaughterOpener(server, a);
      DaughterOpener bOpener = new DaughterOpener(server, b);
      aOpener.start();
      bOpener.start();
      // (the full method waits for both openers to finish here)
      if (services != null) {
        try {
          // Update .META. with the daughters' info and bring them online
          services.postOpenDeployTasks(b, server.getCatalogTracker(), true);
          services.addToOnlineRegions(b);
          services.postOpenDeployTasks(a, server.getCatalogTracker(), true);
          services.addToOnlineRegions(a);
        } catch (KeeperException ke) {
          throw new IOException(ke);
        }
      }
    }
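The fork/join shape of openDaughters can be sketched with plain threads. This is a minimal illustration, not the real DaughterOpener: the "open" work is simulated with a counter, and the important point is that both opens complete before the post-open deploy work runs.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of openDaughters' fork/join pattern: each daughter is
// "opened" on its own thread, and the caller waits for both before doing
// the post-open deploy tasks.
class DaughterOpenSketch {
    static int openBoth() {
        AtomicInteger opened = new AtomicInteger();
        Thread aOpener = new Thread(opened::incrementAndGet); // opens daughter a
        Thread bOpener = new Thread(opened::incrementAndGet); // opens daughter b
        aOpener.start();
        bOpener.start();
        try {
            // both opens must complete before postOpenDeployTasks would run
            aOpener.join();
            bOpener.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return opened.get();
    }
}
```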
Finally, a walkthrough of the whole flow:
Check whether the region needs to split. If it meets the split conditions, obtain the midkey via region.checkSplit() and submit a SplitRequest to the background CompactSplitThread thread pool. Internally, SplitRequest creates a SplitTransaction to perform the split, which proceeds as follows:
* Construct the SplitTransaction from the region and its midkey
* Create an ephemeral node in ZooKeeper (a znode named "/hbase/region-in-transition/region-name") in case the regionserver dies mid-split, storing the split state RS_ZK_REGION_SPLITTING to mark the start of the split. Because the master watches the znode /hbase/region-in-transition, it learns of the region's change and will not, for example, move the region while it is splitting
* Create a .splits directory under the region's directory in HDFS
* Close the region and remove it from the regionserver's list of online regions
* Generate storefiles of type reference. For example, when a region with encoded name a and column family cf (containing a storefile named hfile) splits into daughters b and c, the directory layout in HDFS becomes:
/hbase/tableName/a/cf/hfile
/hbase/tableName/a/.splits/b/cf/hfile.a
/hbase/tableName/a/.splits/c/cf/hfile.a
Each reference file records only the parent storefile's split rowkey and a range (top or bottom half); the number of reference files equals the number of storefiles in the parent region
* Open the two daughter regions in parallel. The background CompactSplitThread will later compact storefiles that contain references; that compaction rewrites the actual data into the daughter regions and finally cleans up the reference files. The daughters' regioninfo and location are also put into the .META. table
* Add the daughter regions to the regionserver's online list; at last they can serve requests
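The role of a reference file in the flow above can be modeled in a few lines. This is an illustrative model, not HBase code: a reference stores only a split row plus a range, and a daughter's view of the parent's data is simply the half on its side of the split row. Here the parent store is modeled as a sorted map.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative model of how a reference resolves reads: the daughter sees
// only the half of the parent's sorted data on its side of the split row.
class RefView {
    enum Range { top, bottom }

    static SortedMap<String, String> view(TreeMap<String, String> parentStore,
                                          String splitRow, Range range) {
        // bottom half: rows strictly below the split row; top half: the rest
        return range == Range.bottom
            ? parentStore.headMap(splitRow)
            : parentStore.tailMap(splitRow);
    }
}
```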
Please credit the source when reposting: http://blog.csdn.net/odailidong/article/details/42217439
References:
http://blog.csdn.net/c77_cn/article/details/38758545
http://www.cnblogs.com/foxmailed/p/3970050.html