HBase Source Code Analysis of BlockCache, Part 2: BucketCache

The previous post, HBase Source Code Analysis of BlockCache, Part 1: Overview and the LruBlockCache Implementation, covered the different BlockCache implementations and their characteristics. This post looks at the implementation of BucketCache, and at CombinedBlockCache, which combines LruBlockCache and BucketCache.

BucketCache

Let's start with some of the important fields in BucketCache.

  // BucketCache.java
  // Store/read block data
  // The IOEngine is the actual backing store, i.e. where blocks ultimately get cached.
  final IOEngine ioEngine;

  // Store the block in this map before writing it to cache
  // Blocks are staged here temporarily before being written into the cache.
  final ConcurrentMap<BlockCacheKey, RAMQueueEntry> ramCache;
  // In this map, store the block's meta data like offset, length
  // Records each cached block's metadata (offset, length, ...).
  ConcurrentMap<BlockCacheKey, BucketEntry> backingMap;

  // Several writer threads are responsible for writing blocks into the cache;
  // each WriterThread drains its own BlockingQueue.
  final ArrayList<BlockingQueue<RAMQueueEntry>> writerQueues =
      new ArrayList<BlockingQueue<RAMQueueEntry>>();
  @VisibleForTesting
  final WriterThread[] writerThreads;

When a BucketCache is instantiated, these fields are initialized and the configured number of writer threads is started in the background. A statistics thread is also started, which logs cache statistics every 5 minutes; those log lines are a handy way to understand and analyze how the cache is behaving. A simplified sketch of this startup logic is shown below.
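The sketch below is not a verbatim excerpt of the constructor; names such as writerThreadNum, writerQLen and statThreadPeriod stand in for the configured values, but it shows how the queues, the writer threads and the statistics task are wired together.

// Simplified sketch of BucketCache's constructor-time setup (illustrative, not verbatim source).
this.writerThreads = new WriterThread[writerThreadNum];
for (int i = 0; i < writerThreadNum; ++i) {
  // One bounded queue per writer thread; each WriterThread drains its own queue.
  writerQueues.add(new ArrayBlockingQueue<RAMQueueEntry>(writerQLen));
  writerThreads[i] = new WriterThread(writerQueues.get(i));
  writerThreads[i].setDaemon(true);
  writerThreads[i].start();
}
// Periodic statistics logging, by default every 5 minutes.
this.scheduleThreadPool.scheduleAtFixedRate(new StatisticsThread(this),
    statThreadPeriod, statThreadPeriod, TimeUnit.SECONDS);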

Let's look at the cacheBlock path first.

public void cacheBlockWithWait(BlockCacheKey cacheKey, Cacheable cachedItem, boolean inMemory,
      boolean wait) {
    if (!cacheEnabled) {
      return;
    }

    if (backingMap.containsKey(cacheKey)) {
      return;
    }

    /*
     * Stuff the entry into the RAM cache so it can get drained to the persistent store
     */
    RAMQueueEntry re =
        new RAMQueueEntry(cacheKey, cachedItem, accessCount.incrementAndGet(), inMemory);
    if (ramCache.putIfAbsent(cacheKey, re) != null) {
      return;
    }
    int queueNum = (cacheKey.hashCode() & 0x7FFFFFFF) % writerQueues.size();
    BlockingQueue<RAMQueueEntry> bq = writerQueues.get(queueNum);
    boolean successfulAddition = false;
    if (wait) {
      try {
        successfulAddition = bq.offer(re, DEFAULT_CACHE_WAIT_TIME, TimeUnit.MILLISECONDS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    } else {
      successfulAddition = bq.offer(re);
    }
    if (!successfulAddition) {
      ramCache.remove(cacheKey);
      failedBlockAdditions.incrementAndGet();
    } else {
      this.blockNumber.incrementAndGet();
      this.heapSize.addAndGet(cachedItem.heapSize());
      blocksByHFile.put(cacheKey.getHfileName(), cacheKey);
    }
  }
  1. Check whether backingMap already contains the cacheKey (of type BlockCacheKey); if it does, return immediately, otherwise continue.
  2. Build a RAMQueueEntry from the arguments and put it into ramCache; if ramCache already holds the key, return.
  3. Pick a writer queue by taking the key's hash modulo the number of writerQueues, and offer the newly created RAMQueueEntry to that queue.

A few of the classes involved deserve a brief explanation:

  1. BlockCacheKey identifies a cached block; blocks are distinguished by the HFile name plus the offset within that file (a minimal stand-in class is sketched after this list).
  2. BucketEntry records where a block lives in the cache: its offset, length and priority.
  3. RAMQueueEntry is a block waiting to be cached; it holds the key, the data, the cache access counter, and so on.
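As a minimal illustration of the first point, a cache key is essentially just the HFile name plus the block's offset. The class below is a simplified, hypothetical stand-in written for this post, not the real BlockCacheKey from the HBase source.

// Simplified stand-in for BlockCacheKey: an HFile name plus an offset uniquely
// identifies a cached block.
public class SimpleBlockCacheKey {
  private final String hfileName;
  private final long offset;

  public SimpleBlockCacheKey(String hfileName, long offset) {
    this.hfileName = hfileName;
    this.offset = offset;
  }

  @Override
  public int hashCode() {
    // Mix the file name's hash with both halves of the offset.
    return hfileName.hashCode() * 127 + (int) (offset ^ (offset >>> 32));
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof SimpleBlockCacheKey)) return false;
    SimpleBlockCacheKey other = (SimpleBlockCacheKey) o;
    return offset == other.offset && hfileName.equals(other.hfileName);
  }
}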

cacheBlockWithWait only places the block onto a writer queue; the writer threads then write those blocks into the cache. Let's look at what a writer thread does.

//BucketCache.java
// WriterThread
public void run() {
      List<RAMQueueEntry> entries = new ArrayList<RAMQueueEntry>();
      try {
        while (cacheEnabled && writerEnabled) {
          try {
            try {
              // Blocks
              entries = getRAMQueueEntries(inputQueue, entries);
            } catch (InterruptedException ie) {
              if (!cacheEnabled) break;
            }
            doDrain(entries);
          } catch (Exception ioe) {
            LOG.error("WriterThread encountered error", ioe);
          }
        }
      } catch (Throwable t) {
        LOG.warn("Failed doing drain", t);
      }
      LOG.info(this.getName() + " exiting, cacheEnabled=" + cacheEnabled);
    }

The steps are:

  1. getRAMQueueEntries takes all RAMQueueEntry objects currently waiting on the queue.
  2. doDrain writes the entries obtained in step 1 into the cache, records their locations in backingMap, and then removes the corresponding keys from ramCache (see the sketch after this list).
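The following is a simplified sketch of the doDrain idea; the real method batches its writes, retries when the cache is full and updates several counters, all of which is omitted here.

// Simplified sketch of doDrain (batching, retries and metrics of the real method omitted).
void doDrainSketch(List<RAMQueueEntry> entries) {
  for (RAMQueueEntry re : entries) {
    try {
      // 1. Serialize the block into the IOEngine; get back its offset/length record.
      BucketEntry bucketEntry =
          re.writeToCache(ioEngine, bucketAllocator, deserialiserMap, realCacheSize);
      // 2. Publish the location so readers can find the block through backingMap.
      backingMap.put(re.getKey(), bucketEntry);
      // 3. The block now lives in the cache, so the RAM copy can be dropped.
      ramCache.remove(re.getKey());
    } catch (CacheFullException cfe) {
      // No bucket had room; trigger eviction (the real code retries the entry afterwards).
      freeSpace();
    } catch (Exception e) {
      // Could not write this block; give up on it and drop the RAM copy.
      ramCache.remove(re.getKey());
    }
  }
}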

Let's also look at how a block is actually written into the cache; the write itself is handled by RAMQueueEntry.writeToCache.

public BucketEntry writeToCache(final IOEngine ioEngine,
        final BucketAllocator bucketAllocator,
        final UniqueIndexMap<Integer> deserialiserMap,
        final AtomicLong realCacheSize) throws CacheFullException, IOException,
        BucketAllocatorException {
      int len = data.getSerializedLength();
      // This cacheable thing can't be serialized...
      if (len == 0) return null;
      long offset = bucketAllocator.allocateBlock(len);
      BucketEntry bucketEntry = new BucketEntry(offset, len, accessCounter, inMemory);
      bucketEntry.setDeserialiserReference(data.getDeserializer(), deserialiserMap);
      try {
        if (data instanceof HFileBlock) {
          HFileBlock block = (HFileBlock) data;
          ByteBuffer sliceBuf = block.getBufferReadOnlyWithHeader();
          sliceBuf.rewind();
          assert len == sliceBuf.limit() + HFileBlock.EXTRA_SERIALIZATION_SPACE ||
            len == sliceBuf.limit() + block.headerSize() + HFileBlock.EXTRA_SERIALIZATION_SPACE;
          ByteBuffer extraInfoBuffer = ByteBuffer.allocate(HFileBlock.EXTRA_SERIALIZATION_SPACE);
          block.serializeExtraInfo(extraInfoBuffer);
          ioEngine.write(sliceBuf, offset);
          ioEngine.write(extraInfoBuffer, offset + len - HFileBlock.EXTRA_SERIALIZATION_SPACE);
        } else {
          ByteBuffer bb = ByteBuffer.allocate(len);
          data.serialize(bb);
          ioEngine.write(bb, offset);
        }
      } catch (IOException ioe) {
        // free it in bucket allocator
        bucketAllocator.freeBlock(offset);
        throw ioe;
      }

      realCacheSize.addAndGet(len);
      return bucketEntry;
    }
  }

BucketAllocator is in charge of managing the cache space, so let's look at how it works internally.
When a BucketAllocator is initialized, the cache space is divided into many fixed-size Buckets; each Bucket's capacity is 4 times the largest configured bucket size, and each Bucket keeps track of the offset of its region of the cache space. In addition, a private BucketSizeInfo[] bucketSizeInfos maintains, for every configured bucket size, the list of Buckets assigned to that size (a small worked example follows).
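As a small worked example, assuming the default bucket size list, the largest size is 513 KB, so every Bucket gets a capacity of 4 x 513 KB; a Bucket assigned to, say, the 65 KB size class is then carved into 65 KB items:

// Worked example of the bucket capacity, assuming the default bucket size list.
long[] bucketSizes = { 5 * 1024, 9 * 1024, 17 * 1024, 33 * 1024, 41 * 1024, 49 * 1024,
    57 * 1024, 65 * 1024, 97 * 1024, 129 * 1024, 193 * 1024, 257 * 1024, 385 * 1024,
    513 * 1024 };
long bucketCapacity = 4 * bucketSizes[bucketSizes.length - 1]; // 4 x 513 KB = 2052 KB
long itemsPerBucket = bucketCapacity / (65 * 1024);            // 31 items of 65 KB each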

// BucketAllocator.java
final class BucketSizeInfo {
    // Free bucket means it has space to allocate a block;
    // Completely free bucket means it has no block.
    private LinkedMap bucketList, freeBuckets, completelyFreeBuckets;
    private int sizeIndex;
}

Now let's look at how a block gets allocated.

// BucketAllocator.java
public synchronized long allocateBlock(int blockSize) throws CacheFullException,
      BucketAllocatorException {
    assert blockSize > 0;
    BucketSizeInfo bsi = roundUpToBucketSizeInfo(blockSize);
    if (bsi == null) {
      throw new BucketAllocatorException("Allocation too big size=" + blockSize +
        "; adjust BucketCache sizes " + CacheConfig.BUCKET_CACHE_BUCKETS_KEY +
        " to accomodate if size seems reasonable and you want it cached.");
    }
    long offset = bsi.allocateBlock();

    // Ask caller to free up space and try again!
    if (offset < 0)
      throw new CacheFullException(blockSize, bsi.sizeIndex());
    usedSize += bucketSizes[bsi.sizeIndex()];
    return offset;
  }

  // BucketSizeInfo
  public long allocateBlock() {
      Bucket b = null;
      if (freeBuckets.size() > 0) {
        // Use up an existing one first...
        b = (Bucket) freeBuckets.lastKey();
      }
      if (b == null) {
        b = grabGlobalCompletelyFreeBucket();
        if (b != null) instantiateBucket(b);
      }
      if (b == null) return -1;
      long result = b.allocate();
      blockAllocated(b);
      return result;
    }

First, the appropriate bucket size is chosen for the request: the smallest bucketSize that is at least as large as the requested length (see the round-up sketch after the steps below). The BucketSizeInfo for that size then allocates the block.
The steps are:

  1. Take a bucket from freeBuckets; if there is none, grab a completely free bucket from the global pool (it is then instantiated for the current bucket size).
  2. The bucket performs the allocation and returns the allocated offset.
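The round-up step itself is essentially the following (a minimal sketch; bucketSizes and bucketSizeInfos are the allocator's own fields):

// Minimal sketch of the round-up: pick the smallest configured size that fits the block,
// or return null when the block is larger than every bucket size (the caller then throws).
BucketSizeInfo roundUpToBucketSizeInfo(int blockSize) {
  for (int i = 0; i < bucketSizes.length; ++i) {
    if (blockSize <= bucketSizes[i]) {
      return bucketSizeInfos[i];
    }
  }
  return null;
}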

When allocateBlock throws a CacheFullException, or when the used space exceeds acceptableSize (95% of the total capacity), the freeSpace method is invoked. BucketCache also distinguishes single, multi and memory priorities, and its space-reclaim logic works the same way as in LruBlockCache.

Next, the getBlock path:

  1. Look in ramCache first; on a hit, return the data held by the RAMQueueEntry (the value stored in ramCache).
  2. On a miss, look up the block's bucket information in backingMap; if found, read the bytes back from the cache at the recorded offset and deserialize them (a simplified sketch follows this list).
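Put together, the read path looks roughly like the sketch below; the real getBlock also takes a lock on the offset and updates cache metrics, which is omitted here.

// Simplified sketch of BucketCache.getBlock (locking and metrics omitted).
public Cacheable getBlockSketch(BlockCacheKey key) throws IOException {
  // 1. The block may still be staged in RAM, waiting to be drained by a writer thread.
  RAMQueueEntry re = ramCache.get(key);
  if (re != null) {
    return re.getData();
  }
  // 2. Otherwise look up where the block lives inside the IOEngine.
  BucketEntry bucketEntry = backingMap.get(key);
  if (bucketEntry == null) {
    return null; // cache miss
  }
  // 3. Read the raw bytes back from the cache and deserialize them into a block.
  ByteBuffer bb = ByteBuffer.allocate(bucketEntry.getLength());
  ioEngine.read(bb, bucketEntry.offset());
  return bucketEntry.deserializerReference(deserialiserMap).deserialize(bb, true);
}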

CombinedBlockCache

Now let's see how LruBlockCache and BucketCache are combined to handle blocks.

public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
      final boolean cacheDataInL1) {
    boolean isMetaBlock = buf.getBlockType().getCategory() != BlockCategory.DATA;
    if (isMetaBlock || cacheDataInL1) {
      lruCache.cacheBlock(cacheKey, buf, inMemory, cacheDataInL1);
    } else {
      l2Cache.cacheBlock(cacheKey, buf, inMemory, false);
    }
  }

1. When caching a block, meta blocks (META, INDEX, BLOOM and other non-DATA block types), as well as data blocks of tables configured with cacheDataInL1 (CACHE_DATA_IN_L1 => 'true'), go into the LruBlockCache; everything else goes into the BucketCache.
2. When the block cache is initialized, LruBlockCache's victimHandler is set to the BucketCache, so the BucketCache serves as a second-level cache behind the LruBlockCache (a rough sketch of this hand-off follows).
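The hand-off in point 2 works roughly as sketched below: when LruBlockCache evicts a block and a victimHandler is configured, the block is offered to the BucketCache instead of simply being dropped. The method shape is illustrative only, not the verbatim LruBlockCache source.

// Rough sketch of LruBlockCache's eviction hand-off to its victimHandler.
protected long evictBlock(LruCachedBlock block, boolean evictedByEvictionProcess) {
  map.remove(block.getCacheKey());
  if (evictedByEvictionProcess && victimHandler != null) {
    // Push the evicted block down into the L2 BucketCache.
    boolean inMemory = block.getPriority() == BlockPriority.MEMORY;
    victimHandler.cacheBlockWithWait(block.getCacheKey(), block.getBuffer(), inMemory, false);
  }
  return block.heapSize();
}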

Finally, let's go over the BucketCache-related configuration:
hbase.bucketcache.ioengine: how the bucket cache is backed. Default: null (BucketCache disabled).
hbase.bucketcache.size: size of the bucket cache; a value below 1 is treated as a fraction of total memory, a value of 1 or more as the cache capacity in MB. Default: 0.
hbase.bucketcache.bucket.sizes: the list of bucket sizes. Default: 14 values from 5 KB to 513 KB.
hbase.bucketcache.writer.threads: number of writer threads. Default: 3.
hbase.bucketcache.writer.queuelength: length of each writer queue. Default: 64.
hbase.bucketcache.persistent.path: path used for persisting the cache across restarts. No default.
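As an illustration only (the values are examples, not recommendations), enabling a 4 GB off-heap BucketCache programmatically, e.g. in a test, would look like this; in a real deployment the same keys go into hbase-site.xml:

// Illustrative configuration example; sizes should be tuned per deployment.
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.bucketcache.ioengine", "offheap");   // back the cache with off-heap memory
conf.setFloat("hbase.bucketcache.size", 4096f);      // 4096 MB of BucketCache
conf.setInt("hbase.bucketcache.writer.threads", 3);
conf.setInt("hbase.bucketcache.writer.queuelength", 64);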
