Analysis of the scan operation during minor compaction

The scan performed during a minor compaction merges several StoreFiles of a store into one; it normally does not delete any data.

A compaction is started through the following call chain:

CompactSplitThread.requestCompactionInternal
--> CompactSplitThread.CompactionRunner.run
--> region.compact
--> HStore.compact
--> DefaultStoreEngine.DefaultCompactionContext.compact
--> DefaultCompactor.compact



Creating the StoreScanner for a compaction

1. A StoreFileScanner is created for each StoreFile to be compacted. The method calls involved are listed below.

Part of DefaultCompactor.compact, which obtains a StoreFileScanner instance for every StoreFile:

List<StoreFileScanner> scanners = createFileScanners(request.getFiles());

protected List<StoreFileScanner> createFileScanners(
    final Collection<StoreFile> filesToCompact) throws IOException {
  return StoreFileScanner.getScannersForStoreFiles(filesToCompact, false, false, true);
}

public static List<StoreFileScanner> getScannersForStoreFiles(
    Collection<StoreFile> files, boolean cacheBlocks, boolean usePread,
    boolean isCompaction) throws IOException {
  return getScannersForStoreFiles(files, cacheBlocks, usePread, isCompaction,
      null);
}

When this overload is called, the ScanQueryMatcher argument is null.

public static List<StoreFileScanner> getScannersForStoreFiles(
    Collection<StoreFile> files, boolean cacheBlocks, boolean usePread,
    boolean isCompaction, ScanQueryMatcher matcher) throws IOException {
  List<StoreFileScanner> scanners = new ArrayList<StoreFileScanner>(
      files.size());
  for (StoreFile file : files) {
    // Iterate over every StoreFile, create a StoreFile.Reader for it, and
    // create a StoreFileScanner from that reader.
    // Reader creation: HFile.createReader --> HFileReaderV2 --> StoreFile.Reader
    StoreFile.Reader r = file.createReader();

    // Every StoreFileScanner wraps an HFileScanner, created by
    // HFileReaderV2.getScanner. That method checks whether the table's column
    // family sets DATA_BLOCK_ENCODING, i.e. whether an encoding is specified
    // (see DataBlockEncoding for the possible values, e.g. the prefix tree).
    // If the encoding is not NONE, the HFileScanner is an
    // HFileReaderV2.EncodedScannerV2; otherwise it is an
    // HFileReaderV2.ScannerV2.
    // The StoreFileScanner then holds references to the StoreFile.Reader and
    // the HFileScanner.
    // isCompaction is true in the call below.
    StoreFileScanner scanner = r.getStoreFileScanner(cacheBlocks, usePread,
        isCompaction);
    // matcher is null at this point.
    scanner.setScanQueryMatcher(matcher);
    scanners.add(scanner);
  }
  return scanners;
}


Part of DefaultCompactor.compact, which creates the StoreScanner instance.

For a minor compaction the ScanType is the one that retains delete markers: scanType = COMPACT_RETAIN_DELETES.

ScanType scanType =
    request.isMajor() ? ScanType.COMPACT_DROP_DELETES
        : ScanType.COMPACT_RETAIN_DELETES;
scanner = preCreateCoprocScanner(request, scanType, fd.earliestPutTs, scanners);
if (scanner == null) {
  // Create a Scan that reads all versions (maxVersions is the maximum
  // configured on the column family), then create the StoreScanner.
  scanner = createScanner(store, scanners, scanType, smallestReadPoint, fd.earliestPutTs);
}
scanner = postCreateCoprocScanner(request, scanType, scanner);
if (scanner == null) {
  // NULL scanner returned from coprocessor hooks means skip normal processing.
  return newFiles;
}


What the StoreScanner constructor does and how it is reached. The call hierarchy is as follows:

protected InternalScanner createScanner(Store store, List<StoreFileScanner> scanners,
    ScanType scanType, long smallestReadPoint, long earliestPutTs) throws IOException {
  Scan scan = new Scan();
  scan.setMaxVersions(store.getFamily().getMaxVersions());
  return new StoreScanner(store, store.getScanInfo(), scan, scanners,
      scanType, smallestReadPoint, earliestPutTs);
}

public StoreScanner(Store store, ScanInfo scanInfo, Scan scan,
    List<? extends KeyValueScanner> scanners, ScanType scanType,
    long smallestReadPoint, long earliestPutTs) throws IOException {
  this(store, scanInfo, scan, scanners, scanType, smallestReadPoint, earliestPutTs, null, null);
}


private StoreScanner(Store store, ScanInfo scanInfo, Scan scan,
    List<? extends KeyValueScanner> scanners, ScanType scanType, long smallestReadPoint,
    long earliestPutTs, byte[] dropDeletesFromRow, byte[] dropDeletesToRow)
    throws IOException {
  // Call the base constructor, which derives the TTL expiry time, the
  // minimum number of versions, and so on.
  // It also checks whether hbase.storescanner.parallel.seek.enable is true
  // (true enables parallel seeking); if so, it obtains the executor thread
  // pool from the region server.
  this(store, false, scan, null, scanInfo.getTtl(),
      scanInfo.getMinVersions());
  if (dropDeletesFromRow == null) {
    // For this compaction the ScanQueryMatcher is created here.
    matcher = new ScanQueryMatcher(scan, scanInfo, null, scanType,
        smallestReadPoint, earliestPutTs, oldestUnexpiredTS);
  } else {
    matcher = new ScanQueryMatcher(scan, scanInfo, null, smallestReadPoint,
        earliestPutTs, oldestUnexpiredTS, dropDeletesFromRow, dropDeletesToRow);
  }


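  // Drop the StoreFileScanners that the Bloom filter rules out, that fall
  // outside the time range, or whose data has expired by TTL.
  // If the newest timestamp in a StoreFile is already older than the TTL
  // cutoff, the file is of no use and need not take part in the scan.
  // Filter the list of scanners using Bloom filters, time range, TTL, etc.
  scanners = selectScannersFrom(scanners);

  // If parallel seeking is not configured, seek each scanner in turn to the
  // start key. Because this is a compaction scan, the start key is the empty
  // default, so the seek does not skip any data.
  // Seek all scanners to the initial key
  if (!isParallelSeekEnabled) {
    for (KeyValueScanner scanner : scanners) {
      scanner.seek(matcher.getStartKey());
    }
  } else {
    // Use the thread pool to create ParallelSeekHandler instances that seek
    // to the start position in parallel.
    parallelSeek(scanners, matcher.getStartKey());
  }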

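  // Create the scanner that does the actual scanning and add to it every
  // StoreFileScanner that has to be read. Each next() must find the smallest
  // KV among the different scanners.
  // KeyValueHeap maintains a PriorityQueue. By default the queue decides
  // which StoreFileScanner comes first as follows:
  // 1. Compare the first KV of the two StoreFileScanners:
  //    a. if the row keys differ, order by row key;
  //    b. if the row keys are equal, compare cf/column/type;
  //    c. see KeyValue.KVComparator.compare.
  // 2. If the smallest KVs of the two StoreFileScanners are equal, the one
  //    whose StoreFile has the larger seqid comes first.
  // 3. The StoreFileScanner holding the smallest KV among all scanners
  //    becomes the value of KeyValueHeap.current.

  // Combine all seeked scanners with a heap
  heap = new KeyValueHeap(scanners, store.getComparator());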

}
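The ordering rules listed in the constructor can be illustrated with a small, self-contained sketch. It is not HBase code: FileCursor and ScannerHeapSketch are hypothetical names, keys are plain strings instead of KeyValues, and the comparator only models "smallest key first, larger seqid first on a tie".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical stand-in for a StoreFileScanner: a sorted key list plus a sequence id. */
final class FileCursor {
  final long seqId;                 // a larger value means a newer file
  private final List<String> keys;  // keys of this file, already sorted
  private int pos = 0;

  FileCursor(long seqId, List<String> sortedKeys) {
    this.seqId = seqId;
    this.keys = sortedKeys;
  }

  /** Current key without advancing, or null when the file is exhausted. */
  String peek() { return pos < keys.size() ? keys.get(pos) : null; }

  /** Current key, then advance by one. */
  String next() { return pos < keys.size() ? keys.get(pos++) : null; }
}

final class ScannerHeapSketch {
  // Smallest head key first; on equal keys the cursor with the larger seqId wins.
  static final Comparator<FileCursor> ORDER = (a, b) -> {
    int c = a.peek().compareTo(b.peek());
    return c != 0 ? c : Long.compare(b.seqId, a.seqId);
  };

  /** Merges all cursors into one sorted stream; each key is tagged with its source seqId. */
  static List<String> merge(List<FileCursor> cursors) {
    PriorityQueue<FileCursor> heap = new PriorityQueue<>(ORDER);
    for (FileCursor c : cursors) {
      if (c.peek() != null) {
        heap.add(c);
      }
    }
    List<String> out = new ArrayList<>();
    while (!heap.isEmpty()) {
      FileCursor top = heap.poll();
      out.add(top.next() + "@" + top.seqId);
      if (top.peek() != null) {
        heap.add(top);  // re-insert so the cursor is ordered by its new head key
      }
    }
    return out;
  }
}
```

With two files that both contain key "a", the copy from the file with the larger seqid is emitted first, which matches rule 2 above.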


KeyValueScanner.seek flow:

The KeyValueScanner instance here is a StoreFileScanner, so StoreFileScanner.seek is called. The call hierarchy:

public boolean seek(KeyValue key) throws IOException {
  if (seekCount != null) seekCount.incrementAndGet();

  try {
    try {
      if (!seekAtOrAfter(hfs, key)) {
        close();
        return false;
      }

      cur = hfs.getKeyValue();

      return !hasMVCCInfo ? true : skipKVsNewerThanReadpoint();
    } finally {
      realSeekDone = true;
    }
  } catch (IOException ioe) {
    throw new IOException("Could not seek " + this + " to key " + key, ioe);
  }
}

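seekAtOrAfter calls the seekTo method of the HFileScanner implementation, HFileReaderV2.EncodedScannerV2 or HFileReaderV2.ScannerV2.

public static boolean seekAtOrAfter(HFileScanner s, KeyValue k)
    throws IOException {
  // Calls HFileReaderV2.AbstractScannerV2.seekTo, described below.
  // A return value of 0 means an exact match: return true directly, no
  // next() is needed because the current KV is already the right one.
  int result = s.seekTo(k.getBuffer(), k.getKeyOffset(), k.getKeyLength());
  if (result < 0) {
    // An optimization of the keys stored in the index, contributed by
    // Xiaomi in HBASE-7845: the stored index key is a value between the
    // last key of the previous block and the first key of the current
    // block. When such a key is hit the return value is -2.
    // Unlike the other negative case, the scanner does not have to be
    // repositioned, because it already sits on the first KV of the block.
    if (result == HConstants.INDEX_KEY_MAGIC) {
      // using faked key
      return true;
    }
    // Move to the start of the first block of the file. This code path is
    // usually not taken.
    // Passed KV is smaller than first KV in file, work from start of file
    return s.seekTo();
  } else if (result > 0) {
    // The scan's start key is larger than the current key in the block,
    // so move to the next entry.
    // Passed KV is larger than current KV in file, if there is a next
    // it is the "after", if not then this scanner is done.
    return s.next();
  }
  // Seeked to the exact key
  return true;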
}
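The three return cases of seekTo and how seekAtOrAfter reacts to them can be sketched over a plain sorted array. This is a simplified illustration, not the HBase implementation: SeekAtOrAfterSketch is a hypothetical class, keys are strings, and the INDEX_KEY_MAGIC (-2) case is left out.

```java
import java.util.Arrays;

final class SeekAtOrAfterSketch {
  private final String[] keys;  // sorted keys of one file
  private int pos = -1;         // index of the current key, -1 while unpositioned

  SeekAtOrAfterSketch(String... sortedKeys) {
    this.keys = sortedKeys;
  }

  /**
   * Mimics the seekTo contract described above: 0 on an exact match, 1 when
   * positioned on the largest key below the target, -1 when the target sorts
   * before the first key (the position is left unchanged in that case).
   */
  int seekTo(String target) {
    int idx = Arrays.binarySearch(keys, target);
    if (idx >= 0) {
      pos = idx;
      return 0;
    }
    int insertion = -idx - 1;
    if (insertion == 0) {
      return -1;
    }
    pos = insertion - 1;
    return 1;
  }

  /** Positions on the first key; false if the file is empty. */
  boolean seekToFirst() {
    pos = 0;
    return keys.length > 0;
  }

  /** Advances by one key; false when the end of the file is passed. */
  boolean next() {
    pos++;
    return pos < keys.length;
  }

  String current() {
    return keys[pos];
  }

  /** Positions on the first key that is greater than or equal to target. */
  boolean seekAtOrAfter(String target) {
    int result = seekTo(target);
    if (result < 0) {
      return seekToFirst();  // target is before the file: start from the beginning
    } else if (result > 0) {
      return next();         // positioned just before the target: step forward once
    }
    return true;             // exact match
  }
}
```

For keys b, d, f, seeking to "c" lands on "d", seeking to "a" lands on "b", and seeking to "g" reports that the scanner is done.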

The HFileReaderV2.AbstractScannerV2.seekTo method:

public int seekTo(byte[] key, int offset, int length) throws IOException {
  // Always rewind to the first key of the block, because the given key
  // might be before or after the current key.
  return seekTo(key, offset, length, true);
}

The nested seekTo call:

protected int seekTo(byte[] key, int offset, int length, boolean rewind)
    throws IOException {
  // Get the block index reader of HFileReaderV2, an
  // HFileBlockIndex.BlockIndexReader.
  HFileBlockIndex.BlockIndexReader indexReader =
      reader.getDataBlockIndexReader();

  // Look up the HFileBlock for the key in the block index reader.
  // The first key of every block is stored in the block index, in the
  // reader's blockKeys. After the Xiaomi optimization in HBASE-7845 the
  // stored index key is a value between the last key of the previous block
  // and the first key of the current block.
  // The index also stores each block's offset (the reader's blockOffsets)
  // and size (the reader's blockDataSizes).
  // 1. Binary-search the index keys to find the index of the block that
  //    corresponds to the scan's start key.
  // 2. Use that index to get the block's start offset.
  // 3. Use that index to get the block's size.
  // 4. Load the block and return it wrapped in a BlockWithScanInfo.
  BlockWithScanInfo blockWithScanInfo =
      indexReader.loadDataBlockWithScanInfo(key, offset, length, block,
          cacheBlocks, pread, isCompaction);
  if (blockWithScanInfo == null || blockWithScanInfo.getHFileBlock() == null) {
    // This happens if the key e.g. falls before the beginning of the file.
    return -1;
  }
  // Call loadBlockAndSeekToKey of HFileReaderV2.EncodedScannerV2 or
  // HFileReaderV2.ScannerV2, which
  // 1. makes the block found by the seek the current block, and
  // 2. moves the pointer to the position of the given key.
  return loadBlockAndSeekToKey(blockWithScanInfo.getHFileBlock(),
      blockWithScanInfo.getNextIndexedKey(), rewind, key, offset, length, false);
}
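Steps 1 to 3 of the index lookup amount to a binary search over the first keys of the blocks. The sketch below is a simplified illustration with a single-level index and string keys; BlockIndexSketch and its method names are hypothetical, only the field names blockKeys, blockOffsets and blockDataSizes are taken from the description above.

```java
import java.util.Arrays;

final class BlockIndexSketch {
  private final String[] blockKeys;    // first key of each block, sorted
  private final long[] blockOffsets;   // file offset of each block
  private final int[] blockDataSizes;  // on-disk size of each block

  BlockIndexSketch(String[] blockKeys, long[] blockOffsets, int[] blockDataSizes) {
    this.blockKeys = blockKeys;
    this.blockOffsets = blockOffsets;
    this.blockDataSizes = blockDataSizes;
  }

  /**
   * Returns the index of the block that may contain key, i.e. the last block
   * whose first key is less than or equal to key, or -1 if key sorts before
   * the first block.
   */
  int blockContainingKey(String key) {
    int idx = Arrays.binarySearch(blockKeys, key);
    if (idx >= 0) {
      return idx;     // key is exactly the first key of a block
    }
    return -idx - 2;  // insertion point minus one; -1 when key < blockKeys[0]
  }

  long offsetOf(int block) {
    return blockOffsets[block];
  }

  int sizeOf(int block) {
    return blockDataSizes[block];
  }
}
```

A result of -1 corresponds to the "key falls before the beginning of the file" branch in the code above.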


Processing in StoreScanner.next

Back in DefaultCompactor.compact: once the scanner has been obtained, the data is written to a new StoreFile.

writer = store.createWriterInTmp(fd.maxKeyCount, this.compactionCompression, true,
    fd.maxMVCCReadpoint >= smallestReadPoint);
boolean finished = performCompaction(scanner, writer, smallestReadPoint);

performCompaction reads KVs through StoreScanner.next(kvlist, limit).
The limit is set by hbase.hstore.compaction.kv.max and defaults to 10; a value that is too large may cause an OOM.
Each KV is added to the new StoreFile through HFileWriterV2.append.
hbase.hstore.close.check.interval sets how much data is written between checks that the store is still writable;
the default is 10*1000*1000 (10 MB).


StoreScanner.next(kvlist, limit)

public boolean next(List<Cell> outResult, int limit) throws IOException {
  lock.lock();
  try {
    if (checkReseek()) {
      return true;
    }

    // if the heap was left null, then the scanners had previously run out anyways, close and
    // return.
    if (this.heap == null) {
      close();
      return false;
    }
    // KeyValueHeap.peek --> StoreFileScanner.peek returns the KeyValue the
    // scanner is positioned on after the seek.
    // If that KeyValue is null there is no more data to read, so the scan ends.
    KeyValue peeked = this.heap.peek();
    if (peeked == null) {
      close();
      return false;
    }


    // only call setRow if the row changes; avoids confusing the query matcher
    // if scanning intra-row
    byte[] row = peeked.getBuffer();
    int offset = peeked.getRowOffset();
    short length = peeked.getRowLength();
    // This check succeeds on the first call, or when the scan is no longer
    // inside the same row; matcher.row is then set to the current row key.
    if (limit < 0 || matcher.row == null || !Bytes.equals(row, offset, length, matcher.row,
        matcher.rowOffset, matcher.rowLength)) {
      this.countPerRow = 0;
      matcher.setRow(row, offset, length);
    }

    KeyValue kv;
    KeyValue prevKV = null;

    // Only do a sanity-check if store and comparator are available.
    KeyValue.KVComparator comparator =
        store != null ? store.getComparator() : null;

    int count = 0;
    LOOP: while ((kv = this.heap.peek()) != null) {
      ++kvsScanned;
      // Check that the heap gives us KVs in an increasing order.
      assert prevKV == null || comparator == null || comparator.compare(prevKV, kv) <= 0 :
        "Key " + prevKV + " followed by a " + "smaller key " + kv + " in cf " + store;
      prevKV = kv;

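      // matcher.match checks the KV as follows:
      // 1. If filter.filterAllRemaining() is true, the query is over:
      //    return DONE_SCAN.
      // 2. Compare against the matcher's row key (the row attribute, which
      //    means all KVs currently looked at belong to the same row):
      //    - if matcher.row is smaller than the peeked KV's row, the current
      //      row is finished (the current KV is already in the next row):
      //      return DONE;
      //    - if matcher.row is larger than the peeked KV's row, the peeked
      //      KV is before matcher.row and a seek to the next row is needed:
      //      return SEEK_NEXT_ROW.
      // 3. Check whether the TTL has expired; if so return SEEK_NEXT_COL.
      // 4. For a minor compaction scan the scan type is
      //    COMPACT_RETAIN_DELETES, so a delete marker is returned as INCLUDE.
      // 5. If the KV is not a delete marker and deletes (the
      //    ScanDeleteTracker) covers it:
      //    - for FAMILY_DELETED / COLUMN_DELETED return SEEK_NEXT_COL;
      //    - for VERSION_DELETED / FAMILY_VERSION_DELETED return SKIP.
      // 6. Check whether the timestamp lies inside the TimeRange. If it is
      //    above the maximum return SKIP, otherwise return SEEK_NEXT_COL.
      // 7. Run filter.filterKeyValue():
      //    - SKIP returns SKIP;
      //    - NEXT_COL returns SEEK_NEXT_COL;
      //    - NEXT_ROW returns SEEK_NEXT_ROW;
      //    - SEEK_NEXT_USING_HINT returns SEEK_NEXT_USING_HINT;
      //    - otherwise the filter returned INCLUDE or INCLUDE_AND_NEXT_COL
      //      and processing continues with step 8.
      // 8. For a KV that is not a delete marker, check whether maxVersions is
      //    exceeded. If it is, or the data has expired by TTL, return
      //    SEEK_NEXT_ROW. If the data has not expired, maxVersions is not
      //    exceeded and the filter returned INCLUDE_AND_NEXT_COL, return
      //    INCLUDE_AND_SEEK_NEXT_COL. Otherwise return INCLUDE.
      ScanQueryMatcher.MatchCode qcode = matcher.match(kv);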

      switch (qcode) {
        case INCLUDE:
        case INCLUDE_AND_SEEK_NEXT_ROW:
        case INCLUDE_AND_SEEK_NEXT_COL:
          // Apply the filter's transformCell. A filter can use this hook to
          // make the KV value as small as possible and so reduce the size of
          // the returned data.
          Filter f = matcher.getFilter();
          if (f != null) {
            // TODO convert Scan Query Matcher to be Cell instead of KV based ?
            kv = KeyValueUtil.ensureKeyValue(f.transformCell(kv));
          }

          this.countPerRow++;
          // For a compaction scan storeLimit is -1 and storeOffset is 0, so
          // this branch is never taken.
          if (storeLimit > -1 &&
              this.countPerRow > (storeLimit + storeOffset)) {
            // do what SEEK_NEXT_ROW does.
            if (!matcher.moreRowsMayExistAfter(kv)) {
              return false;
            }
            reseek(matcher.getKeyForNextRow(kv));
            break LOOP;
          }
          // Add the KV to the result list. storeLimit and storeOffset can be
          // used to page through the results of each store, provided there
          // is only one cf and only one KV.
          // add to results only if we have skipped #storeOffset kvs
          // also update metric accordingly
          if (this.countPerRow > storeOffset) {
            outResult.add(kv);
            count++;
          }


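          if (qcode == ScanQueryMatcher.MatchCode.INCLUDE_AND_SEEK_NEXT_ROW) {
            // Check whether a next row may exist, i.e. whether the current
            // KV has reached the stop row.
            if (!matcher.moreRowsMayExistAfter(kv)) {
              return false;
            }
            // Seek past the current row. The seek key is built from the KV's
            // row key with a timestamp of Long.MIN_VALUE and with cf and
            // column set to null, which makes it the largest key of that
            // row. For the ordering see KeyValue.KVComparator.
            reseek(matcher.getKeyForNextRow(kv));
          } else if (qcode == ScanQueryMatcher.MatchCode.INCLUDE_AND_SEEK_NEXT_COL) {
            // For a compaction scan no column hint is available, so the seek
            // key is the last possible key of the current row and column.
            // Otherwise the name of the next requested column is used and
            // the scanner moves to just before that column's data.
            // See ScanQueryMatcher.getKeyForNextColumn.
            reseek(matcher.getKeyForNextColumn(kv));
          } else {
            // Plain INCLUDE: advance to the next KV.
            this.heap.next();
          }

          if (limit > 0 && (count == limit)) {
            // The limit has been reached; leave the while loop.
            break LOOP;
          }
          continue;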

        case DONE:
          // The current row is finished.
          return true;

        case DONE_SCAN:
          // End this scan.
          close();
          return false;

        case SEEK_NEXT_ROW:
          // Compute the position just after the current row, i.e. a key
          // larger than the current KV and smaller than any KV of the next
          // row, then use reseek --> StoreFileScanner.reseek --> HFile.seekTo
          // to move to the first KV past this row.
          // This is just a relatively simple end of scan fix, to short-cut end
          // us if there is an endKey in the scan.
          if (!matcher.moreRowsMayExistAfter(kv)) {
            return false;
          }

          reseek(matcher.getKeyForNextRow(kv));
          break;

        case SEEK_NEXT_COL:
          // Compute the key of the next column after the current KV and
          // move to that KV.
          reseek(matcher.getKeyForNextColumn(kv));
          break;

        case SKIP:
          // Call KeyValueHeap.next on the StoreScanner's heap.
          this.heap.next();
          break;

        case SEEK_NEXT_USING_HINT:
          // If the filter supplies a next-key hint, seek to that KV;
          // otherwise call KeyValueHeap.next on the StoreScanner's heap.
          // TODO convert reseek to Cell?
          KeyValue nextKV = KeyValueUtil.ensureKeyValue(matcher.getNextKeyHint(kv));
          if (nextKV != null) {
            reseek(nextKV);
          } else {
            heap.next();
          }
          break;

        default:
          throw new RuntimeException("UNEXPECTED");
      }

}


    if (count > 0) {
      return true;
    }

    // No more keys
    close();
    return false;
  } finally {
    lock.unlock();
  }
}


KeyValueHeap.next flow:

public KeyValue next() throws IOException {
  if (this.current == null) {
    return null;
  }
  // Take the current KV of the top StoreFileScanner in the queue and
  // advance that scanner to its next KV.
  KeyValue kvReturn = this.current.next();
  // Peek the top scanner's new current KV (the KV that follows kvReturn).
  KeyValue kvNext = this.current.peek();
  // If that KV is null the top scanner has reached the end of its file:
  // close it and recompute the top of the queue.
  if (kvNext == null) {
    this.current.close();
    this.current = pollRealKV();
  } else {
    // Otherwise recompute which scanner is the current top.
    KeyValueScanner topScanner = this.heap.peek();
    if (topScanner == null ||
        this.comparator.compare(kvNext, topScanner.peek()) >= 0) {
      this.heap.add(this.current);
      this.current = pollRealKV();
    }
  }
  return kvReturn;
}
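The point of keeping a separate current scanner outside the PriorityQueue is that the heap is only touched when the current scanner stops being the smallest. A minimal sketch of that idea, using integers instead of KeyValues and hypothetical class names (CurrentPlusHeapSketch, Cursor), might look like this:

```java
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

final class CurrentPlusHeapSketch {
  /** Cursor over one sorted list; a hypothetical stand-in for a StoreFileScanner. */
  static final class Cursor {
    private final Iterator<Integer> it;
    private Integer head;

    Cursor(List<Integer> sorted) {
      this.it = sorted.iterator();
      this.head = it.hasNext() ? it.next() : null;
    }

    Integer peek() { return head; }

    Integer next() {
      Integer result = head;
      head = it.hasNext() ? it.next() : null;
      return result;
    }
  }

  // Cursors other than the current one, ordered by their head value.
  private final PriorityQueue<Cursor> heap =
      new PriorityQueue<>((Cursor a, Cursor b) -> a.peek().compareTo(b.peek()));
  private Cursor current;  // cursor holding the smallest head, kept outside the heap

  CurrentPlusHeapSketch(List<Cursor> cursors) {
    for (Cursor c : cursors) {
      if (c.peek() != null) {
        heap.add(c);
      }
    }
    current = heap.poll();
  }

  /** Returns the next smallest value across all cursors, or null when all are exhausted. */
  Integer next() {
    if (current == null) {
      return null;
    }
    Integer result = current.next();
    Integer following = current.peek();
    if (following == null) {
      current = heap.poll();  // current is exhausted: promote the best remaining cursor
    } else {
      Cursor top = heap.peek();
      if (top != null && following.compareTo(top.peek()) >= 0) {
        heap.add(current);    // current is no longer the smallest: swap through the heap
        current = heap.poll();
      }
    }
    return result;
  }
}
```

As long as the current cursor keeps producing the smallest values, next() costs one peek at the heap and no re-ordering.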


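Writing the merged StoreFile during compaction

Back in DefaultCompactor.compact --> performCompaction (defined in Compactor, the parent class of DefaultCompactor).

As described earlier, performCompaction reads KVs in batches through StoreScanner.next(kvlist, limit), where the limit comes from hbase.hstore.compaction.kv.max, and appends each KV to the new StoreFile through HFileWriterV2.append, checking every hbase.hstore.close.check.interval bytes that the store is still writable.

Each call to next returns the data of one record. A record contains several columns, so the result holds several KVs. They are written to the new StoreFile by the following code: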

do {
  // Read the next batch of KVs.
  hasMore = scanner.next(kvs, compactionKVMax);
  // output to writer:
  for (Cell c : kvs) {
    KeyValue kv = KeyValueUtil.ensureKeyValue(c);
    if (kv.getMvccVersion() <= smallestReadPoint) {
      kv.setMvccVersion(0);
    }
    // Perform the write.
    writer.append(kv);
    ++progress.currentCompactedKVs;
    // ... (some code omitted here) ...
  }
  kvs.clear();
} while (hasMore);
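The shape of this loop (read a bounded batch, write it out, clear the list, repeat while the scanner reports more data) can be tried out in isolation. The sketch below is an illustration only: BatchCopySketch, nextBatch and copyAll are hypothetical names, strings stand in for KVs, and the row-boundary handling of the real StoreScanner.next is not modelled.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class BatchCopySketch {
  /**
   * Stand-in for scanner.next(kvs, limit): appends up to limit items to out
   * and returns whether the source still has more data.
   */
  static boolean nextBatch(Iterator<String> source, List<String> out, int limit) {
    int count = 0;
    while (source.hasNext() && count < limit) {
      out.add(source.next());
      count++;
    }
    return source.hasNext();
  }

  /** Copies everything from source to sink in batches; returns the number of batches read. */
  static int copyAll(Iterator<String> source, List<String> sink, int batchMax) {
    List<String> batch = new ArrayList<>();
    int batches = 0;
    boolean hasMore;
    do {
      hasMore = nextBatch(source, batch, batchMax);
      sink.addAll(batch);  // corresponds to writer.append(kv) for each KV
      batch.clear();       // reuse the same list so memory stays bounded by batchMax
      batches++;
    } while (hasMore);
    return batches;
  }
}
```

Because the batch list is cleared after every round, memory use is bounded by the batch size, which is why a very large hbase.hstore.compaction.kv.max can lead to the OOM mentioned above.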


The KVs are appended to the new StoreFile through the writer instance, which is created by the following code in DefaultCompactor.compact:

writer = store.createWriterInTmp(fd.maxKeyCount, this.compactionCompression, true,
    fd.maxMVCCReadpoint >= smallestReadPoint);

HStore.createWriterInTmp --> StoreFile.WriterBuilder.build creates the StoreFile.Writer instance;
the concrete writer this instance refers to is an HFileWriterV2.
hfile.format.version selects the concrete writer/reader version; at present it can only be set to 2.


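StoreFile.Writer.append(kv) flow:

public void append(final KeyValue kv) throws IOException {
  // Write to the Bloom filter. If the row (or row+column) of this KV equals
  // that of the previously written KV, nothing is written, so every entry
  // added to the Bloom filter is a different row or row+column.
  // io.storefile.bloom.block.size sets the Bloom block size; the default is
  // 128*1024.
  appendGeneralBloomfilter(kv);

  // If the KV is a delete KV, write its row to the delete family Bloom
  // filter block. Several KVs of the same row are added only once. To be
  // added to this Bloom filter the KV's delete type must satisfy
  // kv.isDeleteFamily() == true or kv.isDeleteFamilyVersion() == true.
  appendDeleteFamilyBloomFilter(kv);

  // Write the data to the output of HFileWriterV2, which also tracks the
  // largest MVCC value among all appended KVs of this StoreFile.
  // HFile v2 record format: klen(int) vlen(int) key value
  // HFile v2 key format:
  //   rowlen(short) row cflen(byte) cf column timestamp(long) type(byte)
  // Every append checks whether the block has reached its flush size.
  // Once it reaches the BLOCKSIZE configured on the cf (default 65536),
  // finishBlock is executed to write the data, the Bloom filter of this
  // block is written as well, and a new block is started.
  writer.append(kv);

  // Update the timestamp range covered by this StoreFile, i.e. update the
  // maximum and minimum values.
  trackTimestamps(kv);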
}
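The record layout described in the comments (klen, vlen, then the key fields and the value) can be made concrete with a small encoder. This is a sketch built only from the field list above, not the HBase KeyValue class: KeyValueLayoutSketch and its methods are hypothetical, and the type code passed in the usage example is just an illustrative byte.

```java
import java.nio.ByteBuffer;

final class KeyValueLayoutSketch {
  /**
   * Encodes one record as: klen(int) vlen(int) key value, where key is
   * rowlen(short) row cflen(byte) cf column timestamp(long) type(byte).
   */
  static byte[] encode(byte[] row, byte[] family, byte[] qualifier,
      long timestamp, byte type, byte[] value) {
    // 2 bytes row length, 1 byte family length, 8 bytes timestamp, 1 byte type.
    int keyLen = 2 + row.length + 1 + family.length + qualifier.length + 8 + 1;
    ByteBuffer buf = ByteBuffer.allocate(4 + 4 + keyLen + value.length);
    buf.putInt(keyLen);
    buf.putInt(value.length);
    buf.putShort((short) row.length);
    buf.put(row);
    buf.put((byte) family.length);
    buf.put(family);
    buf.put(qualifier);  // no explicit length: it is what remains of the key
    buf.putLong(timestamp);
    buf.put(type);
    buf.put(value);
    return buf.array();
  }

  /** Reads klen back from an encoded record. */
  static int keyLength(byte[] encoded) {
    return ByteBuffer.wrap(encoded).getInt(0);
  }

  /** Reads vlen back from an encoded record. */
  static int valueLength(byte[] encoded) {
    return ByteBuffer.wrap(encoded).getInt(4);
  }
}
```

For example, a row "r1", family "cf", qualifier "q" and value "v" give a key of 2 + 2 + 1 + 2 + 1 + 8 + 1 = 17 bytes and a record of 4 + 4 + 17 + 1 = 26 bytes. The qualifier carries no length field of its own in this layout; its length follows from klen minus the other key fields.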


Once reading and writing are complete, control returns to DefaultCompactor.compact, which closes the writer instance:

if (writer != null) {
  writer.appendMetadata(fd.maxSeqId, request.isMajor());
  writer.close();
  newFiles.add(writer.getPath());
}

The largest seqid of this StoreFile is added to the file info. The methods below are in StoreFile.Writer:

public void appendMetadata(final long maxSequenceId, final boolean majorCompaction)
    throws IOException {
  writer.appendFileInfo(MAX_SEQ_ID_KEY, Bytes.toBytes(maxSequenceId));
  // Whether this was a major compaction.
  writer.appendFileInfo(MAJOR_COMPACTION_KEY,
      Bytes.toBytes(majorCompaction));
  appendTrackedTimestampsToMetadata();
}

public void appendTrackedTimestampsToMetadata() throws IOException {
  appendFileInfo(TIMERANGE_KEY, WritableUtils.toByteArray(timeRangeTracker));
  appendFileInfo(EARLIEST_PUT_TS, Bytes.toBytes(earliestPutTs));
}


public void close() throws IOException {
  // The next two lines add the related information to the file info; see
  // the two methods below, which are not explained further.
  boolean hasGeneralBloom = this.closeGeneralBloomFilter();
  boolean hasDeleteFamilyBloom = this.closeDeleteFamilyBloomFilter();

  writer.close();

  // Log final Bloom filter statistics. This needs to be done after close()
  // because compound Bloom filters might be finalized as part of closing.
  if (StoreFile.LOG.isTraceEnabled()) {
    StoreFile.LOG.trace((hasGeneralBloom ? "" : "NO ") + "General Bloom and " +
        (hasDeleteFamilyBloom ? "" : "NO ") + "DeleteFamily" + " was added to HFile " +
        getPath());
  }

}


private boolean closeGeneralBloomFilter() throws IOException {
  boolean hasGeneralBloom = closeBloomFilter(generalBloomFilterWriter);

  // add the general Bloom filter writer and append file info
  if (hasGeneralBloom) {
    writer.addGeneralBloomFilter(generalBloomFilterWriter);
    writer.appendFileInfo(BLOOM_FILTER_TYPE_KEY,
        Bytes.toBytes(bloomType.toString()));
    if (lastBloomKey != null) {
      writer.appendFileInfo(LAST_BLOOM_KEY, Arrays.copyOfRange(
          lastBloomKey, lastBloomKeyOffset, lastBloomKeyOffset
              + lastBloomKeyLen));
    }
  }
  return hasGeneralBloom;
}


private boolean closeDeleteFamilyBloomFilter() throws IOException {
  boolean hasDeleteFamilyBloom = closeBloomFilter(deleteFamilyBloomFilterWriter);

  // add the delete family Bloom filter writer
  if (hasDeleteFamilyBloom) {
    writer.addDeleteFamilyBloomFilter(deleteFamilyBloomFilterWriter);
  }

  // append file info about the number of delete family kvs
  // even if there is no delete family Bloom.
  writer.appendFileInfo(DELETE_FAMILY_COUNT,
      Bytes.toBytes(this.deleteFamilyCnt));

  return hasDeleteFamilyBloom;
}


HFileWriterV2.close() flow:

Write the user data and the Bloom filter data, write the data block index, update and write the file info, and finally write the FixedFileTrailer at the end of the file.

