HBase Compaction in Detail

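1. What is compaction?

Compaction merges multiple HFiles into a single HFile.

There are two kinds of compaction:

  • Minor compaction (merges a subset of the files)
  • Major compaction (merges all of the files)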
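To make the two kinds concrete, here is a minimal Python sketch of the merge itself. It is a toy model rather than HBase code: each HFile is assumed to be a list of `(row_key, timestamp, value)` cells already sorted by row key with newer timestamps first, and the names `merge_hfiles`, `minor_compact` and `major_compact` are made up for illustration.

```python
import heapq


def merge_hfiles(hfiles):
    """Merge several sorted cell lists into one sorted cell list.

    Assumption: every input list is sorted by row key ascending and,
    within a row, by timestamp descending (newest first).
    """
    return list(heapq.merge(*hfiles, key=lambda cell: (cell[0], -cell[1])))


def minor_compact(store_files, count):
    """Merge only the first `count` files; the remaining files are left as they are."""
    return [merge_hfiles(store_files[:count])] + store_files[count:]


def major_compact(store_files):
    """Merge every file of the store into a single file."""
    return [merge_hfiles(store_files)]
```

With three input files, `minor_compact(files, 2)` leaves two files behind, while `major_compact(files)` leaves exactly one.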

2. Why compact?

  • Reduce the number of HFiles
  • Improve read performance, since each read has fewer files to consult
  • Purge expired and deleted data
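The last point can be illustrated with a small sketch of how a merge might drop deleted and expired cells. This is a simplified model under stated assumptions, not HBase's actual logic: a cell is `(row_key, timestamp, kind, value)` with `kind` being `"put"` or `"delete"`, a delete marker hides every put of the same row with an equal or older timestamp, and `ttl` is a single time-to-live in the same unit as the timestamps. The function name `purge` is hypothetical.

```python
def purge(cells, now, ttl):
    """Return only the cells that should survive the compaction."""
    # First pass: remember the newest delete marker seen for each row.
    newest_delete = {}
    for row, ts, kind, _ in cells:
        if kind == "delete":
            newest_delete[row] = max(ts, newest_delete.get(row, ts))

    # Second pass: keep puts that are neither masked by a delete nor expired.
    survivors = []
    for row, ts, kind, value in cells:
        if kind == "delete":
            continue  # the delete marker itself is not rewritten
        if ts <= newest_delete.get(row, float("-inf")):
            continue  # masked by a delete marker
        if now - ts > ttl:
            continue  # older than the time-to-live
        survivors.append((row, ts, kind, value))
    return survivors
```

Because the dropped cells are never written to the output file, the space they occupied is reclaimed once the old files are removed.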

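3. Configuration

Compaction behavior can be tuned through the HBase configuration file.

Key                                                    Default   Meaning
hbase.regionserver.thread.splitcompactcheckfrequency   20s       How often the region server checks whether a compaction is needed
hbase.hstore.compactionThreshold                       3         Minimum number of files for a minor compaction
hbase.hstore.blockingStoreFiles                        7         Number of store files in a Store above which flushes are blocked
hbase.hstore.blockingWaitTime                          90s       How long a flush stays blocked
hbase.hstore.compaction.max                            10        Maximum number of files in one minor compaction
hbase.hregion.majorcompaction                          1 day     Interval between major compactions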
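The following sketch shows one way these settings could interact. It is a hedged model built only from the defaults and meanings listed above: the class `CompactionConfig`, the function names, and the exact comparison operators (`>=` versus `>`) are assumptions made for illustration and are not taken from the HBase source.

```python
from dataclasses import dataclass


@dataclass
class CompactionConfig:
    compaction_threshold: int = 3      # hbase.hstore.compactionThreshold
    compaction_max: int = 10           # hbase.hstore.compaction.max
    blocking_store_files: int = 7      # hbase.hstore.blockingStoreFiles
    blocking_wait_seconds: int = 90    # hbase.hstore.blockingWaitTime
    major_period_seconds: int = 86400  # hbase.hregion.majorcompaction (1 day)


def files_for_minor_compaction(files, cfg):
    """Pick the files for one minor compaction, or [] if there are too few."""
    if len(files) < cfg.compaction_threshold:
        return []
    return files[:cfg.compaction_max]  # never more than the configured maximum


def flush_is_blocked(file_count, waited_seconds, cfg):
    """A flush waits while the store has too many files, but only up to the wait limit."""
    too_many = file_count > cfg.blocking_store_files
    return too_many and waited_seconds < cfg.blocking_wait_seconds


def major_compaction_due(last_major, now, cfg):
    """A major compaction is due once the configured period has elapsed."""
    return now - last_major >= cfg.major_period_seconds
```

With the defaults, a store holding 12 files would have 10 of them selected for a minor compaction, and a flush against a store with 8 files would wait for at most 90 seconds.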

4. Flow

Compaction is an asynchronous process. It can be initiated by a client, or the server can start it itself after a check.
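The asynchronous hand-off can be sketched as a request queue drained by a background thread. This is only an analogy for the pattern: the class `CompactionWorker` and its methods are hypothetical, and the "work" here is just recording the request.

```python
import queue
import threading


class CompactionWorker:
    """Accepts compaction requests and processes them on a background thread."""

    def __init__(self):
        self.requests = queue.Queue()
        self.completed = []
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def request(self, region, major=False):
        # Returns immediately; the caller does not wait for the compaction.
        self.requests.put((region, major))

    def _run(self):
        while True:
            region, major = self.requests.get()
            # A real worker would compact the region's stores here.
            self.completed.append((region, "major" if major else "minor"))
            self.requests.task_done()

    def wait_idle(self):
        """Block until every queued request has been processed."""
        self.requests.join()
```

Both a client request and a server-side check would end in the same `request` call, which is why the two entry points can share one processing path.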

1) Client-initiated

Client side:

HBaseAdmin::compact or HBaseAdmin::majorCompact

==>HMaster modifyTable

==>RegionManager::startAction

==> put into map regionsToCompact and regionsToMajorCompact

==>Send to HRegionServer

Server side:

HRegionServer::run forwards the request to CompactionSplitThread

==>CompactionSplitThread handles the request from its queue

==>HRegion::compactStores

==>Do compaction preparations, create the compaction folder

==>HStore::compaction

==>Create a HFile.Writer for writing

==>Create a StoreScanner for major compaction

==>Create a MinorCompactionStoreScanner for minor compaction

==>Scan the scanner and write to the hfile

==>Complete the compaction: delete the old files and move the new file into the store folder
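The store-level steps (prepare a compaction folder, write the merged file, delete the old files, move the result into the store folder) can be sketched on a local file system. This is an illustrative model, not HBase code: files are assumed to be plain text with one sorted, newline-terminated line per record, and the function name `compact_store` and the output name `merged` are made up.

```python
import heapq
import os
import shutil
import tempfile


def compact_store(store_dir):
    """Merge every file in store_dir into one file, via a temporary compaction folder."""
    old_files = sorted(os.path.join(store_dir, n) for n in os.listdir(store_dir))

    # Preparation: create the compaction folder next to the store folder.
    parent = os.path.dirname(os.path.abspath(store_dir))
    compaction_dir = tempfile.mkdtemp(prefix="compaction-", dir=parent)
    tmp_path = os.path.join(compaction_dir, "merged")

    # Scan all inputs in sorted order and write them to the new file.
    handles = [open(p) for p in old_files]
    try:
        with open(tmp_path, "w") as writer:
            writer.writelines(heapq.merge(*handles))
    finally:
        for h in handles:
            h.close()

    # Completion: delete the old files, then move the new file into the store folder.
    for p in old_files:
        os.remove(p)
    final_path = os.path.join(store_dir, "merged")
    shutil.move(tmp_path, final_path)
    os.rmdir(compaction_dir)
    return final_path
```

Writing into a separate folder first means readers never see a half-written file in the store folder; the new file only appears there once it is complete.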

2) Server-initiated

Major compaction:

A major compaction is triggered by a periodic check on the region server

==>HRegionServer::MajorCompactionChecker

==>Send the request to CompactionSplitThread

Minor compaction:

A minor compaction is checked for as part of flushing a MemStore to HDFS

==>MemStoreFlusher::flushRegion

==>Send the request to CompactionSplitThread
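The two server-side triggers can be modeled side by side. In this sketch the periodic checker and the flush path both append to a `pending` list, which stands in for the compaction thread's queue. The classes `StoreState` and `RegionServerModel`, the thresholds used, and the assumption that each flush adds exactly one file are illustrative only.

```python
class StoreState:
    def __init__(self, name, now):
        self.name = name
        self.file_count = 0
        self.last_major = now  # time of the last major compaction


class RegionServerModel:
    def __init__(self, threshold=3, major_period=86400):
        self.threshold = threshold
        self.major_period = major_period
        self.pending = []  # stands in for the compaction request queue

    def major_compaction_check(self, store, now):
        """Periodic check: request a major compaction once the period has elapsed."""
        if now - store.last_major >= self.major_period:
            self.pending.append(("major", store.name))

    def flush_region(self, store):
        """Flush path: each flush adds a file, then checks for a minor compaction."""
        store.file_count += 1
        if store.file_count >= self.threshold:
            self.pending.append(("minor", store.name))
```

Three flushes against a fresh store produce one minor request, and a check made a full day after the last major compaction produces a major request.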
