HBase Region Operations in Practice: Store Compaction

In the previous post, HBase Region Operations in Practice: Mem Flush, I walked through the region memstore flush logic from both the code and the logs.
At the end of that post I noted that repeated flushes leave more and more storefiles under a store's directory, which eventually triggers a (minor/major) compaction of the region.


Configuration options


hbase.hstore.compaction.max    the maximum number of files merged in one minor compaction
hbase.hstore.compactionThreshold    the minimum number of files in a store before a minor compaction is requested
hbase.hstore.blockingStoreFiles    if the file count exceeds this value, flushes are blocked
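To double-check what these resolve to on a running cluster, a small sketch like the following reads them through the client Configuration; the class name and the fallback defaults in it are my own assumptions, not values taken from this post:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class CompactionConfigCheck {
  public static void main(String[] args) {
    // loads hbase-default.xml / hbase-site.xml from the classpath
    Configuration conf = HBaseConfiguration.create();

    // cap on the number of storefiles merged in one minor compaction
    // (the fallback values below are assumptions; actual defaults depend on the HBase version)
    int maxFiles = conf.getInt("hbase.hstore.compaction.max", 10);
    // minimum number of storefiles before a minor compaction is requested
    int threshold = conf.getInt("hbase.hstore.compactionThreshold", 3);
    // once a store has more than this many storefiles, flushes are blocked
    int blocking = conf.getInt("hbase.hstore.blockingStoreFiles", 7);

    System.out.println("compaction.max=" + maxFiles
        + ", compactionThreshold=" + threshold
        + ", blockingStoreFiles=" + blocking);
  }
}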


We know that a region flushes because the condition memstoreSize > hbase.hregion.memstore.flush.size holds, which flushes every store (cf) in the region. As each store flushes, it checks whether the number of storefiles after the flush has reached hbase.hstore.compactionThreshold (call this condition c-1). If any store satisfies (c-1) during its flush, then once the whole region has finished flushing, a compaction of the entire region is requested.


The code:
for (StoreFlusher flusher : storeFlushers) {
  boolean needsCompaction = flusher.commit();
  if (needsCompaction) {
    compactionRequested = true;
  }
}

flusher.commit() calls down through several layers of other functions; the key check is return this.storefiles.size() >= this.compactionThreshold;
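Simplified down to the point of this post (a sketch, not the actual HBase source), the check amounts to:

// After the flush has committed its new storefile, the store reports whether
// it now holds enough files to warrant a compaction request for the whole region.
static boolean needsCompaction(java.util.List<?> storefiles, int compactionThreshold) {
  // compactionThreshold is read from hbase.hstore.compactionThreshold
  return storefiles.size() >= compactionThreshold;
}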

To sum up: when any single store needs compacting, a compaction of the whole region is triggered. Let's now look at how the compaction itself is done.

---------------------------------------------------------------------------------------------------------------------------------------------

1)

After MemStoreFlusher::flushRegion completes the flush, if a compaction is needed it notifies compactSplitThread by calling requestCompaction:


if (region.flushcache()) {
  server.compactSplitThread.requestCompaction(region, getName());
}


2)

The compactSplitThread thread keeps pulling regions that need compacting inside its run() method, and calls HRegion::compactStores on each one to compact the region:
byte [] midKey = r.compactStores();
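For intuition, here is a minimal self-contained sketch of that producer/consumer pattern; CompactableRegion is a hypothetical stand-in for HRegion, and the queue and split handling are greatly simplified compared with the real CompactSplitThread:

import java.io.IOException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the region handle the real thread works with.
interface CompactableRegion {
  // returns the mid key if the region has grown large enough to split, else null
  byte[] compactStores() throws IOException;
}

class CompactionWorker extends Thread {
  private final BlockingQueue<CompactableRegion> queue =
      new LinkedBlockingQueue<CompactableRegion>();

  void requestCompaction(CompactableRegion r) {
    queue.add(r);  // called from the flusher side (step 1)
  }

  @Override
  public void run() {
    while (!isInterrupted()) {
      try {
        CompactableRegion r = queue.take();   // block until a region is queued
        byte[] midKey = r.compactStores();    // step 3 runs inside this call
        if (midKey != null) {
          // the region is big enough to split; a split request would be
          // issued here with midKey as the split row
        }
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
        return;
      } catch (IOException ioe) {
        // the real thread logs the failure and moves on to the next region
      }
    }
  }
}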

3)
Inside HRegion::compactStores, Store::compact is called in a loop so that every store is compacted:
for (Store store: stores.values()) {
  final Store.StoreSize ss = store.compact(majorCompaction);
  lastCompactSize += store.getLastCompactSize();
  // remember the largest store; its mid key becomes the candidate split row
  if (ss != null && ss.getSize() > maxSize) {
    maxSize = ss.getSize();
    splitRow = ss.getSplitRow();
  }
}

4)
In Store::compact, isMajorCompaction is called first. It checks the oldest file among the storefiles: if that file's timestamp is earlier than the current time minus the major compaction interval (i.e. the oldest file has outlived the interval), the compaction is promoted to a major compaction (a major compaction removes deleted and expired keyvalues while compacting; a minor compaction just merges files):
lowTimestamp < (now - this.majorCompactionTime)
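A simplified sketch of that promotion check (again an illustration, not the actual HBase code):

// lowTimestamp: modification time of the oldest storefile in the store
// majorCompactionTime: configured major compaction interval in milliseconds
static boolean promoteToMajor(long lowTimestamp, long majorCompactionTime) {
  long now = System.currentTimeMillis();
  // the oldest file has outlived the interval -> run a major compaction
  return lowTimestamp > 0 && lowTimestamp < (now - majorCompactionTime);
}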

5)
Next, at most hbase.hstore.compaction.max storefiles are selected for compaction (the selection algorithm is fairly involved and is skipped here; only the cap is sketched after this step). The selected filesToCompact are then passed to
private StoreFile.Writer compact(final List<StoreFile> filesToCompact, final boolean majorCompaction, final long maxId)
which performs the compaction; the result is written under the .tmp directory.
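As a placeholder for the skipped selection logic, here is only the cap itself, written generically; this is not the real algorithm:

// At most maxFilesToCompact (hbase.hstore.compaction.max) files go into a
// single pass, keeping the files at the front of the candidate list.
static <T> java.util.List<T> capCandidates(java.util.List<T> candidates, int maxFilesToCompact) {
  return candidates.size() <= maxFilesToCompact
      ? candidates
      : candidates.subList(0, maxFilesToCompact);
}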

6)

Finally, the file under .tmp is moved into the corresponding store directory, which completes the compaction of that store.
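As an illustration of that final move, here is a standalone sketch that renames the compacted file with the HDFS FileSystem API; the paths are taken from the log excerpt further below, and this is not the HBase code that performs the step:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CommitCompactedFile {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    // the merged file is first written under the region's .tmp directory,
    // then renamed into the log_01 column family directory
    Path tmp  = new Path("hdfs://peckermaster:9000/hbase-pecker/acookie_log_201111/"
        + "72ba7278e7fdc44dcd0cd2d33ea64e21/.tmp/3031490329706398476");
    Path dest = new Path("hdfs://peckermaster:9000/hbase-pecker/acookie_log_201111/"
        + "72ba7278e7fdc44dcd0cd2d33ea64e21/log_01/3031490329706398476");

    FileSystem fs = tmp.getFileSystem(conf);
    // on HDFS a rename is a cheap metadata-only operation, which is why the
    // compaction output is built in .tmp and then moved into place
    if (!fs.rename(tmp, dest)) {
      throw new java.io.IOException("rename failed: " + tmp + " -> " + dest);
    }
  }
}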


---------------------------------------------------------------------------------------------------------------------------------------------

Now let's trace the same flow through the logs:

1) After requestCompaction is called, this line is logged:

2011-11-01 22:08:13,395 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread:Compaction requested for acookie_log_201111,,1320112381265.72ba7278e7fdc44dcd0cd2d33ea64e21. because regionserver60020.cacheFlusher; priority=12, compaction queue size=11


2) Calling HRegion::compactStores logs this line:
2011-11-01 22:09:06,253 INFO org.apache.hadoop.hbase.regionserver.HRegion:Starting compaction on region acookie_log_201111,,1320112381265.72ba7278e7fdc44dcd0cd2d33ea64e21.


3) As the loop goes over all stores and enters each store's compact function, the following lines are logged; here it enters the log_01 store (cf):

2011-11-01 22:09:06,257 INFO org.apache.hadoop.hbase.regionserver.Store:Started compaction of 5 file(s) in cf=log_01  into hdfs://peckermaster:9000/hbase-pecker/acookie_log_201111/72ba7278e7fdc44dcd0cd2d33ea64e21/.tmp, seqid=4502194, totalSize=63.5m
2011-11-01 22:09:06,257 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://peckermaster:9000/hbase-pecker/acookie_log_201111/72ba7278e7fdc44dcd0cd2d33ea64e21/log_01/2245326706716741897, keycount=65931, bloomtype=NONE, size=19.5m
2011-11-01 22:09:06,257 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://peckermaster:9000/hbase-pecker/acookie_log_201111/72ba7278e7fdc44dcd0cd2d33ea64e21/log_01/6002518747579084070, keycount=44261, bloomtype=NONE, size=13.4m
2011-11-01 22:09:06,257 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://peckermaster:9000/hbase-pecker/acookie_log_201111/72ba7278e7fdc44dcd0cd2d33ea64e21/log_01/5027939740336284309, keycount=11, bloomtype=NONE, size=5.4k
2011-11-01 22:09:06,257 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://peckermaster:9000/hbase-pecker/acookie_log_201111/72ba7278e7fdc44dcd0cd2d33ea64e21/log_01/8369520228608794009, keycount=65366, bloomtype=NONE, size=15.5m
2011-11-01 22:09:06,257 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://peckermaster:9000/hbase-pecker/acookie_log_201111/72ba7278e7fdc44dcd0cd2d33ea64e21/log_01/1400675554949016167, keycount=65377, bloomtype=NONE, size=15.1m
2011-11-01 22:09:15,287 INFO org.apache.hadoop.hbase.regionserver.Store: Completed compaction of 5 file(s), new file=hdfs://peckermaster:9000/hbase-pecker/acookie_log_201111/72ba7278e7fdc44dcd0cd2d33ea64e21/log_01/3031490329706398476, size=46.2m; total size for store is 2.0g
2011-11-01 22:09:15,288 INFO org.apache.hadoop.hbase.regionserver.HRegion: completed compaction on region acookie_log_201111,,1320112381265.72ba7278e7fdc44dcd0cd2d33ea64e21. after 9sec


