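The previous post, "HBase Source Code Analysis: BlockCache, Part 1: Overview and the LruBlockCache Implementation", covered the various BlockCache implementations and their characteristics. This post analyzes the implementation of BucketCache, and of CombinedBlockCache, which combines LruBlockCache and BucketCache.
Let's start with some of the important fields in BucketCache.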
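// BucketCache.java
// Store/read block data.
// The IOEngine implementations handle the actual storage; this is where cached blocks finally live.
final IOEngine ioEngine;
// Store the block in this map before writing it to cache.
// It is a temporary holding area; the block is written to the cache later.
final ConcurrentMap<BlockCacheKey, RAMQueueEntry> ramCache;
// In this map, store the block's meta data like offset, length
ConcurrentMap<BlockCacheKey, BucketEntry> backingMap;
// Several WriterThreads write blocks into the cache; each WriterThread has its own BlockingQueue.
final ArrayList<BlockingQueue<RAMQueueEntry>> writerQueues =
    new ArrayList<BlockingQueue<RAMQueueEntry>>();
@VisibleForTesting
final WriterThread[] writerThreads;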
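When a BucketCache is instantiated, all of these fields are initialized and the configured number of background WriterThreads is started. A statistics thread is started as well; every 5 minutes it prints cache statistics to the log, which we can use to understand and analyze how the cache is behaving.
First, the cacheBlock path: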
public void cacheBlockWithWait(BlockCacheKey cacheKey, Cacheable cachedItem, boolean inMemory,
    boolean wait) {
  if (!cacheEnabled) {
    return;
  }
  if (backingMap.containsKey(cacheKey)) {
    return;
  }
  /*
   * Stuff the entry into the RAM cache so it can get drained to the persistent store
   */
  RAMQueueEntry re =
      new RAMQueueEntry(cacheKey, cachedItem, accessCount.incrementAndGet(), inMemory);
  if (ramCache.putIfAbsent(cacheKey, re) != null) {
    return;
  }
  // Pick a writer queue from the key's hash; the mask keeps the index non-negative.
  int queueNum = (cacheKey.hashCode() & 0x7FFFFFFF) % writerQueues.size();
  BlockingQueue<RAMQueueEntry> bq = writerQueues.get(queueNum);
  boolean successfulAddition = false;
  if (wait) {
    try {
      successfulAddition = bq.offer(re, DEFAULT_CACHE_WAIT_TIME, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  } else {
    successfulAddition = bq.offer(re);
  }
  if (!successfulAddition) {
    ramCache.remove(cacheKey);
    failedBlockAdditions.incrementAndGet();
  } else {
    this.blockNumber.incrementAndGet();
    this.heapSize.addAndGet(cachedItem.heapSize());
    blocksByHFile.put(cacheKey.getHfileName(), cacheKey);
  }
}
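The queue selection and the two flavors of offer used above can be tried in isolation. The sketch below is a simplified illustration, not HBase code: the class WriterQueueSketch, its methods, and the use of String entries in place of RAMQueueEntry are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the writer queues of BucketCache; entries are plain Strings.
class WriterQueueSketch {
    private final List<BlockingQueue<String>> queues = new ArrayList<>();

    WriterQueueSketch(int queueCount, int queueLength) {
        for (int i = 0; i < queueCount; i++) {
            queues.add(new ArrayBlockingQueue<>(queueLength));
        }
    }

    // Masking with 0x7FFFFFFF clears the sign bit, so the modulo result is never negative.
    int queueIndex(Object key) {
        return (key.hashCode() & 0x7FFFFFFF) % queues.size();
    }

    // Returns true if the entry was queued. With wait=true it waits up to waitMillis for space;
    // with wait=false it gives up immediately when the queue is full.
    boolean enqueue(String key, boolean wait, long waitMillis) {
        BlockingQueue<String> bq = queues.get(queueIndex(key));
        if (!wait) {
            return bq.offer(key);
        }
        try {
            return bq.offer(key, waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A failed enqueue corresponds to the branch above that removes the entry from ramCache and increments failedBlockAdditions: when the writers fall behind, the block is simply not cached.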
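A few notes on the related classes used here:
cacheBlockWithWait only puts the block to be cached into a writerQueue; a WriterThread then writes these blocks into the cache. Next, let's look at how the WriterThread does its writing.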
// BucketCache.java
// WriterThread
public void run() {
  List<RAMQueueEntry> entries = new ArrayList<RAMQueueEntry>();
  try {
    while (cacheEnabled && writerEnabled) {
      try {
        try {
          // Blocks
          entries = getRAMQueueEntries(inputQueue, entries);
        } catch (InterruptedException ie) {
          if (!cacheEnabled) break;
        }
        doDrain(entries);
      } catch (Exception ioe) {
        LOG.error("WriterThread encountered error", ioe);
      }
    }
  } catch (Throwable t) {
    LOG.warn("Failed doing drain", t);
  }
  LOG.info(this.getName() + " exiting, cacheEnabled=" + cacheEnabled);
}
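The listing above does not show getRAMQueueEntries. A minimal sketch of one way such a batching read could work, assuming (hypothetically) that it blocks for the first entry and then drains whatever else is already queued; DrainSketch and nextBatch are names invented for this example:

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;

// Hypothetical helper showing a "block for one, then drain the rest" batch read.
class DrainSketch {
    // Clears the reusable list, blocks until at least one entry is available,
    // then moves every other queued entry into the list without blocking.
    // If the thread is interrupted while waiting, the interrupt flag is restored
    // and an empty batch is returned.
    static <T> List<T> nextBatch(BlockingQueue<T> queue, List<T> reuse) {
        reuse.clear();
        try {
            reuse.add(queue.take());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return reuse;
        }
        queue.drainTo(reuse);
        return reuse;
    }
}
```

Reusing one list across iterations, as run() does with entries, avoids allocating a new list for every batch.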
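The steps are as follows: the thread blocks in getRAMQueueEntries until its inputQueue has entries, then hands the batch to doDrain, and repeats this for as long as the cache and the writer are enabled.
Let's now look at how a block is written into the cache. The actual write is handled by writeToCache in RAMQueueEntry.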
public BucketEntry writeToCache(final IOEngine ioEngine,
    final BucketAllocator bucketAllocator,
    final UniqueIndexMap<Integer> deserialiserMap,
    final AtomicLong realCacheSize) throws CacheFullException, IOException,
    BucketAllocatorException {
  int len = data.getSerializedLength();
  // This cacheable thing can't be serialized...
  if (len == 0) return null;
  long offset = bucketAllocator.allocateBlock(len);
  BucketEntry bucketEntry = new BucketEntry(offset, len, accessCounter, inMemory);
  bucketEntry.setDeserialiserReference(data.getDeserializer(), deserialiserMap);
  try {
    if (data instanceof HFileBlock) {
      HFileBlock block = (HFileBlock) data;
      ByteBuffer sliceBuf = block.getBufferReadOnlyWithHeader();
      sliceBuf.rewind();
      assert len == sliceBuf.limit() + HFileBlock.EXTRA_SERIALIZATION_SPACE ||
          len == sliceBuf.limit() + block.headerSize() + HFileBlock.EXTRA_SERIALIZATION_SPACE;
      ByteBuffer extraInfoBuffer = ByteBuffer.allocate(HFileBlock.EXTRA_SERIALIZATION_SPACE);
      block.serializeExtraInfo(extraInfoBuffer);
      // Block bytes go at the start of the slot, the extra info at its tail.
      ioEngine.write(sliceBuf, offset);
      ioEngine.write(extraInfoBuffer, offset + len - HFileBlock.EXTRA_SERIALIZATION_SPACE);
    } else {
      ByteBuffer bb = ByteBuffer.allocate(len);
      data.serialize(bb);
      ioEngine.write(bb, offset);
    }
  } catch (IOException ioe) {
    // free it in bucket allocator
    bucketAllocator.freeBlock(offset);
    throw ioe;
  }
  realCacheSize.addAndGet(len);
  return bucketEntry;
}
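To make the offsets in the HFileBlock branch concrete, here is a small sketch of the same layout on a plain ByteBuffer: the body is written at offset and the trailer fills the tail of the slot. SlotLayoutSketch and its methods are hypothetical, the ByteBuffer merely stands in for an IOEngine, and unlike the real code the sketch simply requires body and trailer to fill the slot exactly.

```java
import java.nio.ByteBuffer;

// Hypothetical in-memory "engine": a single ByteBuffer addressed by absolute offsets.
class SlotLayoutSketch {
    // Copies src into store at the given absolute offset. Duplicates are used so that
    // neither the store's nor the source's own position is changed.
    static void write(ByteBuffer store, ByteBuffer src, long offset) {
        ByteBuffer view = store.duplicate();
        view.position((int) offset);
        view.put(src.duplicate());
    }

    // Writes body at the start of the slot [offset, offset + len) and the trailer
    // into the last trailer.remaining() bytes of that slot.
    static void writeSlot(ByteBuffer store, long offset, int len,
                          ByteBuffer body, ByteBuffer trailer) {
        int trailerLen = trailer.remaining();
        if (body.remaining() + trailerLen != len) {
            throw new IllegalArgumentException("body + trailer must fill the slot exactly");
        }
        write(store, body, offset);
        write(store, trailer, offset + len - trailerLen);
    }
}
```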
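BucketAllocator is responsible for managing the cache space; let's look at its internals.
When a BucketAllocator is initialized, it divides the cache space into multiple fixed-size Buckets. Each Bucket is 4 times the largest configured bucket size, and each Bucket keeps track of its offset within the cache space. In addition, the field private BucketSizeInfo[] bucketSizeInfos maintains, for each bucket size, the list of Buckets of that size.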
// BucketAllocator.java
final class BucketSizeInfo {
  // Free bucket means it has space to allocate a block;
  // Completely free bucket means it has no block.
  private LinkedMap bucketList, freeBuckets, completelyFreeBuckets;
  private int sizeIndex;
}
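As a rough illustration of the sizing described above, the arithmetic can be sketched as follows. BucketLayoutSketch and its methods are hypothetical helpers, and the back-to-back placement of buckets in baseOffset is an assumption of this sketch, not something the listing shows.

```java
// Hypothetical helper reproducing only the sizing arithmetic from the text.
class BucketLayoutSketch {
    // "Each Bucket is 4 times the largest configured bucket size", per the text.
    static final int ITEMS_IN_LARGEST_BUCKET = 4;

    // Capacity in bytes of every bucket, derived from the largest size class.
    static long bucketCapacity(int[] bucketSizes) {
        int largest = 0;
        for (int size : bucketSizes) {
            largest = Math.max(largest, size);
        }
        return (long) ITEMS_IN_LARGEST_BUCKET * largest;
    }

    // Number of whole buckets that fit into the cache space.
    static long bucketCount(long totalCacheBytes, int[] bucketSizes) {
        return totalCacheBytes / bucketCapacity(bucketSizes);
    }

    // Assumed layout: bucket i starts right after bucket i - 1.
    static long baseOffset(int bucketIndex, int[] bucketSizes) {
        return bucketIndex * bucketCapacity(bucketSizes);
    }

    // How many blocks of one size class fit into a single bucket.
    static long itemsPerBucket(int sizeClassBytes, int[] bucketSizes) {
        return bucketCapacity(bucketSizes) / sizeClassBytes;
    }
}
```

With a largest size class of 513 KB, each bucket holds 2052 KB, so a bucket assigned to the 5 KB class can hold 410 blocks while one assigned to the 513 KB class holds exactly 4.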
Next, the block allocation process:
// BucketAllocator.java
public synchronized long allocateBlock(int blockSize) throws CacheFullException,
    BucketAllocatorException {
  assert blockSize > 0;
  BucketSizeInfo bsi = roundUpToBucketSizeInfo(blockSize);
  if (bsi == null) {
    throw new BucketAllocatorException("Allocation too big size=" + blockSize +
        "; adjust BucketCache sizes " + CacheConfig.BUCKET_CACHE_BUCKETS_KEY +
        " to accomodate if size seems reasonable and you want it cached.");
  }
  long offset = bsi.allocateBlock();
  // Ask caller to free up space and try again!
  if (offset < 0)
    throw new CacheFullException(blockSize, bsi.sizeIndex());
  usedSize += bucketSizes[bsi.sizeIndex()];
  return offset;
}

// BucketSizeInfo
public long allocateBlock() {
  Bucket b = null;
  if (freeBuckets.size() > 0) {
    // Use up an existing one first...
    b = (Bucket) freeBuckets.lastKey();
  }
  if (b == null) {
    b = grabGlobalCompletelyFreeBucket();
    if (b != null) instantiateBucket(b);
  }
  if (b == null) return -1;
  long result = b.allocate();
  blockAllocated(b);
  return result;
}
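First, a suitable bucket size is chosen for the requested size (the smallest bucket size that can hold the request). The block is then allocated through the BucketSizeInfo for that bucket size.
The steps are as follows:
1. If freeBuckets is not empty, use a bucket from it, that is, a bucket of this size that still has room.
2. Otherwise, grab a completely free bucket globally and instantiate it for this bucket size.
3. If no bucket is available, return -1, which makes the outer allocateBlock throw CacheFullException.
4. Otherwise, allocate a block from the bucket, update the bookkeeping through blockAllocated, and return the offset.
The freeSpace method is called when allocateBlock throws CacheFullException, or when the used space exceeds acceptableSize (total space × 0.95). BucketCache also divides its entries into single, multi, and memory priorities, and its space-freeing logic is the same as that of LruBlockCache.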
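The size-class lookup and the 0.95 threshold mentioned above can be sketched as follows. AllocationSketch and its method names are hypothetical, the size array is assumed to be sorted in ascending order, and a request that exactly equals a bucket size is assumed to fit into that bucket.

```java
// Hypothetical helper illustrating the size-class choice and the free-space trigger.
class AllocationSketch {
    // Returns the index of the smallest size class that can hold blockSize,
    // or -1 if the block is larger than every configured size.
    static int roundUpToSizeIndex(int[] sortedBucketSizes, int blockSize) {
        for (int i = 0; i < sortedBucketSizes.length; i++) {
            if (blockSize <= sortedBucketSizes[i]) {
                return i;
            }
        }
        return -1;
    }

    // 95% of the total space, computed in integer arithmetic to avoid rounding surprises.
    static long acceptableSize(long totalBytes) {
        return totalBytes * 95 / 100;
    }

    // True once the used space has grown past the acceptable size.
    static boolean needsFreeSpace(long usedBytes, long totalBytes) {
        return usedBytes > acceptableSize(totalBytes);
    }
}
```

Note that a block always occupies a full slot of its size class, which is why allocateBlock adds bucketSizes[bsi.sizeIndex()] rather than blockSize to usedSize.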
Next, the getBlock process.
After that, let's discuss how LruBlockCache and BucketCache are combined to handle blocks.
// CombinedBlockCache.java
public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
    final boolean cacheDataInL1) {
  boolean isMetaBlock = buf.getBlockType().getCategory() != BlockCategory.DATA;
  if (isMetaBlock || cacheDataInL1) {
    lruCache.cacheBlock(cacheKey, buf, inMemory, cacheDataInL1);
  } else {
    l2Cache.cacheBlock(cacheKey, buf, inMemory, false);
  }
}
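1. On cacheBlock, meta blocks (META, INDEX, BLOOM, and any other non-DATA blocks), as well as blocks for which the table definition sets cacheDataInL1 (CACHE_DATA_IN_L1 => 'true'), are put into LruBlockCache; everything else is put into BucketCache.
2. When the block cache is initialized, BucketCache is configured as the victimHandler of LruBlockCache, so that BucketCache can serve as a second-level cache for LruBlockCache.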
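Finally, a summary of the configuration parameters related to BucketCache:
hbase.bucketcache.ioengine: the storage implementation used by the bucket cache. Default: null.
hbase.bucketcache.size: the size of the bucket cache. A value below 1 is a fraction of total memory; a value of 1 or more is the actual cache capacity in MB. Default: 0.
hbase.bucketcache.bucket.sizes: the list of bucket sizes. Default: 14 values ranging from 5 KB to 513 KB.
hbase.bucketcache.writer.threads: the number of writer threads. Default: 3.
hbase.bucketcache.writer.queuelength: the length of each writer queue. Default: 64.
hbase.bucketcache.persistent.path: the persistence path. No default.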
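The two readings of hbase.bucketcache.size can be sketched as a small helper. This is an illustration under stated assumptions, not the HBase configuration code: BucketCacheSizeSketch and resolveBytes are hypothetical names, and the caller is assumed to pass in the total memory that a fractional value refers to.

```java
// Hypothetical helper showing how a configured size value could be turned into bytes.
class BucketCacheSizeSketch {
    // configured < 1  : fraction of totalMemoryBytes
    // configured >= 1 : capacity in megabytes
    // configured <= 0 : the default of 0, meaning no bucket cache space
    static long resolveBytes(double configured, long totalMemoryBytes) {
        if (configured <= 0) {
            return 0L;
        }
        if (configured < 1) {
            return (long) (totalMemoryBytes * configured);
        }
        return (long) (configured * 1024 * 1024);
    }
}
```

For example, a value of 4096 yields a 4 GB cache, while 0.5 with 1000 bytes of total memory yields 500 bytes.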