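HBase has a data operation called compaction which, as the name suggests, merges data files. This article introduces HBase compaction from several angles: its purpose, how to control it, and how it is carried out.
Compaction merges multiple HFiles into a single HFile.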
There are two kinds of compaction: minor compaction and major compaction.
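As a rough illustration of what merging several files into one means, here is a toy model rather than HBase code. It assumes each file is a sorted map from row key to value, that a null value stands for a delete marker, and that only the newest value per key is kept (real HBase tracks versions and timestamps). The class and method names are hypothetical. The `dropDeletes` flag mimics the usual difference between the two kinds: a major compaction discards deleted data, while a minor compaction carries delete markers along.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical, simplified model: a "file" is a sorted map of row key -> value,
// and a null value stands in for a delete marker (tombstone).
class CompactionMergeSketch {

    // Merge files given oldest first; entries from newer files overwrite older ones.
    // With dropDeletes == true (the "major" case in this sketch) tombstones are removed.
    static TreeMap<String, String> compact(List<TreeMap<String, String>> files,
                                           boolean dropDeletes) {
        TreeMap<String, String> merged = new TreeMap<>();
        for (TreeMap<String, String> file : files) {
            merged.putAll(file); // later (newer) files win on key collisions
        }
        if (dropDeletes) {
            merged.values().removeIf(v -> v == null);
        }
        return merged;
    }

    public static void main(String[] args) {
        TreeMap<String, String> older = new TreeMap<>();
        older.put("row1", "a");
        older.put("row2", "b");
        TreeMap<String, String> newer = new TreeMap<>();
        newer.put("row2", null); // tombstone for row2
        newer.put("row3", "c");

        List<TreeMap<String, String>> files = new ArrayList<>();
        files.add(older);
        files.add(newer);

        System.out.println("minor-like: " + compact(files, false));
        System.out.println("major-like: " + compact(files, true));
    }
}
```

In the minor-like result the tombstone for `row2` is still present in the merged file; in the major-like result `row2` is gone entirely.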
Compaction behavior can be controlled by editing the HBase configuration file.
Key | Default | Meaning |
hbase.regionserver.thread.splitcompactcheckfrequency | 20s | Interval between compaction checks |
hbase.hstore.compactionThreshold | 3 | Minimum number of files for a minor compaction |
hbase.hstore.blockingStoreFiles | 7 | Number of store files in a Store at which flushes are blocked |
hbase.hstore.blockingWaitTime | 90s | How long a blocked flush waits |
hbase.hstore.compaction.max | 10 | Maximum number of files in a minor compaction |
hbase.hregion.majorcompaction | 1 day | Period between major compactions |
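To make the interplay of these thresholds concrete, the following sketch hard-codes the default values from the table and applies them to a store-file count. It is only an illustration: the class, its method names, and the exact boundary conditions (`>=` versus `>`) are assumptions, not HBase's own selection logic.

```java
// Hypothetical helper that mirrors the default values in the table above.
// It is not HBase code; the boundary conditions are assumptions.
class CompactionPolicySketch {
    static final int COMPACTION_THRESHOLD = 3;  // hbase.hstore.compactionThreshold
    static final int COMPACTION_MAX = 10;       // hbase.hstore.compaction.max
    static final int BLOCKING_STORE_FILES = 7;  // hbase.hstore.blockingStoreFiles

    // A minor compaction is worth running once enough files have piled up.
    static boolean needsMinorCompaction(int storeFileCount) {
        return storeFileCount >= COMPACTION_THRESHOLD;
    }

    // Number of files one minor compaction would pick up, capped by the maximum.
    static int filesToCompact(int storeFileCount) {
        if (!needsMinorCompaction(storeFileCount)) {
            return 0;
        }
        return Math.min(storeFileCount, COMPACTION_MAX);
    }

    // Flushes are held back while a store has too many files.
    static boolean flushBlocked(int storeFileCount) {
        return storeFileCount > BLOCKING_STORE_FILES;
    }

    public static void main(String[] args) {
        for (int count : new int[] {2, 3, 8, 12}) {
            System.out.println(count + " files: compact=" + filesToCompact(count)
                    + " flushBlocked=" + flushBlocked(count));
        }
    }
}
```

With these defaults, a store with 12 files would compact 10 of them in one pass, and flushes stay blocked until the count drops or the blocking wait time (90s in the table) runs out.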
Compaction is an asynchronous process. It can be initiated by a client, or the server can initiate it after its own checks.
Client side:
HBaseAdmin::compaction or majorCompaction
==>HMaster modifyTable
==>RegionManager::startAction
==> put into map regionsToCompact and regionsToMajorCompact
==>Send to HRegionServer
Server side:
HRegionServer::run forwards the request to CompactionSplitThread
==>CompactionSplitThread handles the request from its queue
==>HRegion::compactStores
==>Do compaction preparations, create the compaction folder
==>HStore::compaction
==>Create a HFile.Writer for writing
==>Create a StoreScanner for major compaction
==>Create a MinorCompactionStoreScanner for minor compaction
==>Scan the scanner and write to the hfile
==>Complete the compaction, delete the old files, and move the new file to the store folder
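The asynchronous hand-off in the flow above (a request is queued, and a dedicated thread picks it up later) can be sketched with a plain blocking queue. This is a minimal model, not the CompactionSplitThread implementation: the class name, the method names, and the stop sentinel are hypothetical, and the "compaction" is just recording the region name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical model of a request queue served by one background worker thread.
class CompactionQueueSketch {
    private static final String STOP = "__stop__"; // sentinel that ends the worker

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> compacted =
            Collections.synchronizedList(new ArrayList<>());
    private final Thread worker = new Thread(this::drain, "compaction-worker");

    void start() {
        worker.start();
    }

    // Returns immediately; the caller does not wait for the compaction itself.
    void requestCompaction(String regionName) {
        queue.add(regionName);
    }

    // Ask the worker to finish the queued requests, then wait for it to exit.
    List<String> stopAndWait() {
        queue.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new ArrayList<>(compacted);
    }

    private void drain() {
        try {
            while (true) {
                String region = queue.take(); // blocks until a request arrives
                if (STOP.equals(region)) {
                    return;
                }
                compacted.add(region); // stand-in for the real compaction work
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        CompactionQueueSketch sketch = new CompactionQueueSketch();
        sketch.start();
        sketch.requestCompaction("regionA");
        sketch.requestCompaction("regionB");
        System.out.println("compacted in order: " + sketch.stopAndWait());
    }
}
```

Using a single worker keeps requests in arrival order and means a slow compaction delays later ones instead of competing with them for I/O.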
Major compaction:
Major compaction is triggered by a periodic check on the region server
==>HRegionServer::MajorCompactionChecker
==>Send the request to CompactionSplitThread
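A periodic check of this kind boils down to comparing elapsed time against the configured period. The sketch below assumes the 1 day default from the configuration table and assumes the check is based on the timestamp of the oldest store file; both the class and the method are hypothetical and only illustrate the comparison.

```java
import java.time.Duration;

// Hypothetical periodic check: is a major compaction due for a store?
class MajorCompactionCheckSketch {
    // Default period taken from hbase.hregion.majorcompaction in the table (1 day).
    static final Duration PERIOD = Duration.ofDays(1);

    // Due once the oldest file is at least one full period old (assumed rule).
    static boolean isDue(long oldestFileMillis, long nowMillis) {
        return nowMillis - oldestFileMillis >= PERIOD.toMillis();
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long twoDaysAgo = now - Duration.ofDays(2).toMillis();
        long oneHourAgo = now - Duration.ofHours(1).toMillis();
        System.out.println("two days old: " + isDue(twoDaysAgo, now));
        System.out.println("one hour old: " + isDue(oneHourAgo, now));
    }
}
```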
Minor compaction:
Minor compaction is checked before the MemStore is flushed to HDFS
==>MemStoreFlusher::flushRegion
==>Send the request to CompactionSplitThread
Original link: http://www.spnguru.com/?p=271