Guava Cache

Guava Cache offers the following features:

automatic loading of entries into the cache;
least-recently-used eviction when a maximum size is exceeded;
time-based expiration of entries, measured since last access or last write;
keys automatically wrapped in weak references;
values automatically wrapped in weak or soft references;
notification of evicted (or otherwise removed) entries;
accumulation of cache access statistics.

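In summary:

The essential interfaces for reading, writing, and removing entries;
A loading feature: if cached, return; otherwise create/load/compute, cache, and return;
Eviction policies (based on maximum size, and based on read/write time);
Keys and values held at different reference strengths;
A notification callback after an entry is evicted;
And finally, cache statistics.

All of the features above are optional; you build your own Cache through CacheBuilder. Here are two simple examples:

The first example builds a cache with a maximum size of 2 and inserts three entries. After the third insert, the entry with key b is evicted: under the LRU policy, a was accessed more recently than b.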

    Cache<String, Obj> cache = CacheBuilder.newBuilder().maximumSize(2).build();
    Obj obj_a = new Obj(1, 2);
    Obj obj_b = new Obj(2, 2);
    Obj obj_c = new Obj(3, 2);

    // first put count=1
    cache.put("a", obj_a);
    Assert.assertEquals(obj_a, cache.getIfPresent("a"));
    
    // 2nd put count=2
    cache.put("b", obj_b);
    
    // touch a last, so that b becomes the least recently used entry
    cache.getIfPresent("a");
    Assert.assertEquals(obj_b, cache.getIfPresent("b"));
    Assert.assertEquals(obj_a, cache.getIfPresent("a"));

    // 3rd put count=3 need remove one
    cache.put("c", obj_c);
    Assert.assertEquals(obj_c, cache.getIfPresent("c"));
    Assert.assertTrue(cache.getIfPresent("b") == null);
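
To make the eviction order easier to follow, here is a minimal sketch of the same LRU behaviour using only the JDK. `LruSketch` is a hypothetical helper, not part of Guava, and Guava's own eviction is implemented differently (per segment, with its own access queue); the sketch only mirrors the access pattern of the example above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Guava: a tiny LRU map built on the JDK.
class LruSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    LruSketch(int maxSize) {
        // accessOrder = true: iteration runs from least to most recently accessed
        super(16, 0.75f, true);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; returning true evicts the least recently used entry
        return size() > maxSize;
    }

    public static void main(String[] args) {
        LruSketch<String, Integer> cache = new LruSketch<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("b");
        cache.get("a");      // "a" is now the most recently used key
        cache.put("c", 3);   // exceeds maxSize, so "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

The point of the sketch is that recency decides the victim, not how many times a key was read.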

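The second example builds a LoadingCache that implements the load logic automatically. The first time the value for key d is requested, the cache calls the user-supplied load method and stores the returned value. After that, the value can be read directly from the cache.

    LoadingCache<String, Obj> cache = CacheBuilder.newBuilder().build(new CacheLoader<String, Obj>() {
        @Override
        public Obj load(String key) throws Exception {
            return new Obj(3, 3);
        }
    });
    Obj obj_d = new Obj(3, 3);

    // get(key) behaves like get(key, Callable), with the CacheLoader supplying the value
    // (the assertions assume Obj implements equals)
    Assert.assertEquals(obj_d, cache.get("d"));

    Assert.assertEquals(obj_d, cache.getIfPresent("d")); // read directly from the cache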
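
The "if cached, return; otherwise load, cache and return" contract can be sketched with the JDK alone. `GetOrLoadSketch` and its `loadCount` field are hypothetical names introduced for illustration; this is not how LoadingCache is implemented, only a small model of the behaviour the example relies on.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical helper, not Guava's LoadingCache: get-or-load on top of ConcurrentHashMap.
class GetOrLoadSketch<K, V> {
    private final ConcurrentHashMap<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    final AtomicInteger loadCount = new AtomicInteger(); // counts loader invocations

    GetOrLoadSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    // If cached, return; otherwise load, cache, and return.
    V get(K key) {
        return store.computeIfAbsent(key, k -> {
            loadCount.incrementAndGet();
            return loader.apply(k);
        });
    }

    // Plain lookup that never triggers a load, similar in spirit to getIfPresent.
    V getIfPresent(K key) {
        return store.get(key);
    }
}
```

Calling `get("d")` twice on such a cache runs the loader once; the second call is served from the map.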

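Structure of Guava Cache

Cache configuration
Cache, its implementations and extensions (including the different reference strengths), and the eviction notification callback
Cache statistics

I. Cache, its implementations and extensions

The class diagram shows a Cache interface; every cache must implement its methods and may add new ones. There is also a LoadingCache interface, which adds a get() method that is really get-or-load logic: if the key is in the cache, return it; otherwise run the user-specified load logic (covered in detail later). Cache and LoadingCache have two implementation classes: LocalManualCache, and LocalLoadingCache, which is built on LocalManualCache.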

[Figure: guava_cache_cache (class diagram of Cache, LoadingCache, and their implementations)]

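Proxy pattern

Neither class implements the caching itself. Both delegate to the methods of another class, LocalCache, for everything a cache needs; the field that holds it has package-level access.

Protection (Protect or Access) proxy: controls access to an object and can give different users different levels of access.

So how does LocalCache implement the cache?

LocalCache implements ConcurrentMap and borrows the Segment design (perhaps because of ConcurrentHashMap's influence, EhCache uses the same idea).

Note:

segment

ConcurrentHashMap (in its pre-Java 8 implementation) hashes twice: the first hash maps the key to a segment, and the second maps it to a bucket within that segment. The main reason for hashing twice is to build striped locks, so that a modification does not lock the whole container and concurrency improves.

// the map maintains an array of segments
final Segment<K, V>[] segments;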
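
As a rough illustration of lock striping, the sketch below picks a segment from the key's hash and locks only that segment. `StripedMapSketch` is a hypothetical class, not the code of ConcurrentHashMap or LocalCache: each segment here simply wraps a HashMap (which does the bucket-level hashing), and the hash mixing is an assumption chosen for brevity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Simplified illustration of lock striping; not the code of ConcurrentHashMap or LocalCache.
class StripedMapSketch<K, V> {
    // Each segment is its own lock, mirroring "Segment extends ReentrantLock".
    private static final class Segment<K, V> extends ReentrantLock {
        final Map<K, V> table = new HashMap<>(); // bucket-level hashing happens in here
    }

    private final Segment<K, V>[] segments;
    private final int segmentMask;

    @SuppressWarnings("unchecked")
    StripedMapSketch(int segmentCount) {
        if (segmentCount <= 0 || Integer.bitCount(segmentCount) != 1) {
            throw new IllegalArgumentException("segmentCount must be a power of two");
        }
        segments = (Segment<K, V>[]) new Segment[segmentCount];
        for (int i = 0; i < segmentCount; i++) {
            segments[i] = new Segment<>();
        }
        segmentMask = segmentCount - 1;
    }

    // First-level hash: choose the segment for a key.
    int segmentIndex(Object key) {
        int h = key.hashCode();
        h ^= (h >>> 16); // mix high bits into low bits (assumed mixing function)
        return h & segmentMask;
    }

    V put(K key, V value) {
        Segment<K, V> s = segments[segmentIndex(key)];
        s.lock(); // only this segment is locked; other segments stay available
        try {
            return s.table.put(key, value);
        } finally {
            s.unlock();
        }
    }

    V get(K key) {
        Segment<K, V> s = segments[segmentIndex(key)];
        s.lock();
        try {
            return s.table.get(key);
        } finally {
            s.unlock();
        }
    }
}
```

Two writers whose keys fall into different segments never contend for the same lock, which is the benefit the paragraph above describes.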

1. Segment

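Segment extends ReentrantLock, which means locking happens per segment within the cache's segments array. Essentially all cache functionality is implemented on Segment. Let's go through it step by step:

The put operation in Segment

First it acquires the lock and does some cleanup (Guava's cache relies on this lazy cleanup throughout, running it before or after operations such as put and get);

  long now = map.ticker.read(); // time of this call, in nanoseconds
  preWriteCleanup(now); // clean up based on the current time, e.g. for entries that expire 3 s after write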
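
The lazy cleanup idea can be sketched as follows: nothing runs in the background, and expired entries are purged only when a put or get happens. `LazyExpirySketch` is a hypothetical, single-threaded model with an injected clock standing in for `map.ticker`; it is not Guava's Segment code.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Simplified, single-threaded illustration of lazy cleanup; not Guava's Segment code.
class LazyExpirySketch<K, V> {
    private static final class Timed<V> {
        final V value;
        final long writeNanos;

        Timed(V value, long writeNanos) {
            this.value = value;
            this.writeNanos = writeNanos;
        }
    }

    // Insertion order doubles as write order, oldest write first.
    private final LinkedHashMap<K, Timed<V>> entries = new LinkedHashMap<>();
    private final long ttlNanos;
    private final LongSupplier ticker; // hypothetical stand-in for map.ticker

    LazyExpirySketch(long ttlNanos, LongSupplier ticker) {
        this.ttlNanos = ttlNanos;
        this.ticker = ticker;
    }

    void put(K key, V value) {
        long now = ticker.getAsLong();
        cleanUp(now);          // cleanup piggybacks on the write
        entries.remove(key);   // re-insert so the entry moves to the tail
        entries.put(key, new Timed<>(value, now));
    }

    V get(K key) {
        long now = ticker.getAsLong();
        cleanUp(now);          // cleanup piggybacks on the read
        Timed<V> t = entries.get(key);
        return t == null ? null : t.value;
    }

    // Deliberately does no cleanup, so it can report entries that are expired but not yet purged.
    int size() {
        return entries.size();
    }

    private void cleanUp(long now) {
        Iterator<Map.Entry<K, Timed<V>>> it = entries.entrySet().iterator();
        while (it.hasNext()) {
            if (now - it.next().getValue().writeNanos < ttlNanos) {
                break; // every later entry was written more recently
            }
            it.remove();
        }
    }
}
```

With a fake clock, an entry written at time 0 with a TTL of 3 is still counted by `size()` at time 3 until the next `get` or `put` triggers the purge.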

Next, it checks the count to decide whether to expand; expanding produces a newTable;

  if (newCount > this.threshold) { // ensure capacity
    expand();
    newCount = this.count + 1;
  }

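Then, following put semantics, if an entry already exists the old value must be returned. So the rehashed value locates the head of a chain in the segment, and the chain is walked to find an existing entry.
When an existing entry is found, the code must first check whether the value held by its ValueReference has in fact been collected, since it can be a WeakReference or SoftReference. For weak and soft references, valueReference.get() returns null once the referent is collected. If it has not been collected, the old value is replaced with the new one.
Then, for the removed value, a RemovalNotification is built with the appropriate cause (COLLECTED, REPLACED, and so on) and placed on a queue. Later, the notifications are processed in order by invoking the RemovalListener registered on the LocalCache. Because a callback may take a long time, this processing happens only after the lock is released. The hash in the code is the second (rehashed) hash value.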
```
  enqueueNotification(key, hash, valueReference, RemovalCause.COLLECTED);
  enqueueNotification(key, hash, valueReference, RemovalCause.REPLACED);
```
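
A minimal sketch of the "enqueue under the lock, notify after unlocking" pattern described above. `NotifyAfterUnlockSketch`, its `Cause` enum, and its `Notification` class are hypothetical stand-ins for RemovalCause and RemovalNotification; Guava's real classes carry more information and are wired differently.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

// Simplified illustration; the names below are hypothetical, not Guava's API.
class NotifyAfterUnlockSketch<K, V> {
    enum Cause { REPLACED, EXPLICIT }

    static final class Notification<K, V> {
        final K key;
        final V value;
        final Cause cause;

        Notification(K key, V value, Cause cause) {
            this.key = key;
            this.value = value;
            this.cause = cause;
        }
    }

    private final ReentrantLock lock = new ReentrantLock();
    private final Map<K, V> table = new HashMap<>();
    private final Queue<Notification<K, V>> pending = new ConcurrentLinkedQueue<>();
    private final Consumer<Notification<K, V>> listener;

    NotifyAfterUnlockSketch(Consumer<Notification<K, V>> listener) {
        this.listener = listener;
    }

    V put(K key, V value) {
        V old;
        lock.lock();
        try {
            old = table.put(key, value);
            if (old != null) {
                // While holding the lock, only record that something was replaced
                pending.add(new Notification<>(key, old, Cause.REPLACED));
            }
        } finally {
            lock.unlock();
        }
        drainNotifications(); // the possibly slow listener runs outside the lock
        return old;
    }

    V remove(K key) {
        V old;
        lock.lock();
        try {
            old = table.remove(key);
            if (old != null) {
                pending.add(new Notification<>(key, old, Cause.EXPLICIT));
            }
        } finally {
            lock.unlock();
        }
        drainNotifications();
        return old;
    }

    private void drainNotifications() {
        Notification<K, V> n;
        while ((n = pending.poll()) != null) {
            listener.accept(n);
        }
    }
}
```

Keeping the listener call outside the critical section means a slow callback delays only the calling thread, not every other thread waiting on the lock.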

If the chain has no entry for the key, a new entry is created and added at the head of the chain.

  // Create a new entry.
  ++modCount;
  ReferenceEntry newEntry = newEntry(key, hash, first);
  // Sets a new value of an entry. Adds newly created entries at the end of the access queue.
  setValue(newEntry, key, value, now);
  // store it in the table
  table.set(index, newEntry);
  newCount = this.count + 1;
  this.count = newCount; // write-volatile

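Note that a null value does not necessarily mean the entry should be removed

That is because one kind of value is a LoadingValueReference. When the value for a key has to be loaded on demand (the scenario in the second example: if the key is not in the cache, the load method is called), the implementation works like this:

First the value is wrapped in a LoadingValueReference, signalling that the value for this key is already being loaded;
If another thread queries the same key, it gets this reference and waits while the load method runs synchronously;
Once the value has loaded, the LoadingValueReference is replaced with another ValueReference type.

So, until load() completes, the value other threads see is necessarily an incomplete object, and it must not be treated as something to remove.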
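
The placeholder idea can be sketched with a FutureTask: the first thread installs the task as a "loading" marker and runs it, while later threads find the marker and block on it. `LoadingPlaceholderSketch` is a hypothetical class; Guava's LoadingValueReference works on its own entry and segment structures, so treat this only as a model of the behaviour.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

// Simplified illustration of a "loading" placeholder; not Guava's LoadingValueReference.
class LoadingPlaceholderSketch<K, V> {
    // A FutureTask plays the role of the placeholder for a key that is being loaded.
    private final ConcurrentHashMap<K, FutureTask<V>> entries = new ConcurrentHashMap<>();

    V get(K key, Callable<V> loader) throws ExecutionException, InterruptedException {
        FutureTask<V> task = entries.get(key);
        if (task == null) {
            FutureTask<V> created = new FutureTask<>(loader);
            task = entries.putIfAbsent(key, created);
            if (task == null) {
                // This thread installed the placeholder, so it performs the load itself
                task = created;
                task.run();
            }
        }
        // Threads that found an existing placeholder block here until the load completes
        return task.get();
    }

    // An in-flight placeholder is not a usable value, so report it as absent.
    V getIfPresent(K key) {
        FutureTask<V> task = entries.get(key);
        if (task == null || !task.isDone()) {
            return null;
        }
        try {
            return task.get();
        } catch (Exception e) {
            return null; // a failed load is treated as "no value" in this sketch
        }
    }
}
```

If several threads call `get` for the same key at once, the loader runs a single time and every caller receives the same result. Note how `getIfPresent` distinguishes "still loading" from "loaded", which is the same distinction isActive draws in the text that follows.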

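So how are the two cases told apart?

ValueReference has an isActive method that indicates whether the entry is properly present in the cache. A newly created LoadingValueReference starts with an internal value of UNSET, whose isActive is false, so isActive tells whether a ValueReference is in a complete state.

Here is the isActive check in put: if isActive is false, the value is simply assigned. It is quite possible, though, that once load finishes shortly afterwards, the value gets replaced again by the loaded one (count is decremented by 1 at that replacement).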

  if (valueReference.isActive()) {
      enqueueNotification(key, hash, valueReference, RemovalCause.COLLECTED);
      setValue(e, key, value, now);
      newCount = this.count; // count remains unchanged
  } else {
      setValue(e, key, value, now);
      newCount = this.count + 1;
  }

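The get operation in Segment

There are two kinds of get: a plain get, and a get with a CacheLoader. Their logic is:

a plain get returns the result from the cache;
a get with a CacheLoader, when the cache has no result, returns what the CacheLoader's load method returns and then writes it into a cache entry.

The get with a CacheLoader works as follows:

Fetch the entry's value reference (it may not be alive: it could be due for expiry, or still loading);
Check whether the reference is alive (an entry that is invalid, partially collected, loading, or due for expiry is not alive);
If it is alive and refresh is configured, refresh the value asynchronously and return the refreshed value if it is already available, otherwise the current one;
If it is not alive but is loading, wait for the load to complete (blocking);
If there is still no value at this point, call the loader to obtain it (blocking).

That is the logic of get; the code is as follows: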

V get(K key, int hash, CacheLoader<? super K, V> loader) throws ExecutionException {
  checkNotNull(key);
  checkNotNull(loader);
  try {
    if (count != 0) { // read-volatile
      // don't call getLiveEntry, which would ignore loading values
      ReferenceEntry<K, V> e = getEntry(key, hash);
      if (e != null) {
        // record the time of this call
        long now = map.ticker.read();
        // check whether the entry is alive (expiry is lazy: it is only checked on each get)
        V value = getLiveValue(e, now);
        if (value != null) {
          // record the read
          recordRead(e, now);
          // cache hit
          statsCounter.recordHits(1);
          // refresh if due
          return scheduleRefresh(e, key, hash, value, now, loader);
        }
        ValueReference<K, V> valueReference = e.getValueReference();
        if (valueReference.isLoading()) {
          // already loading: wait for it to finish and return the loaded value (blocks on the future's get)
          return waitForLoadingValue(e, key, valueReference);
        }
      }
    }
    // at this point the entry is either absent or already invalidated
    return lockedGetOrLoad(key, hash, loader);
  } catch (ExecutionException ee) {
    Throwable cause = ee.getCause();
    if (cause instanceof Error) {
      throw new ExecutionError((Error) cause);
    } else if (cause instanceof RuntimeException) {
      throw new UncheckedExecutionException(cause);
    }
    throw ee;
  } finally {
    postReadCleanup();
  }
}

The lockedGetOrLoad method

V lockedGetOrLoad(K key, int hash, CacheLoader<? super K, V> loader)
    throws ExecutionException {
  ReferenceEntry<K, V> e;
  ValueReference<K, V> valueReference = null;
  LoadingValueReference<K, V> loadingValueReference = null;
  boolean createNewEntry = true;

  // lock the segment
  lock();
  try {
    // re-read ticker once inside the lock
    long now = map.ticker.read();
    // under the lock, clean up references left behind by GC and expired entries (the lock is reentrant)
    preWriteCleanup(now);

    int newCount = this.count - 1;
    AtomicReferenceArray<ReferenceEntry<K, V>> table = this.table;
    int index = hash & (table.length() - 1);
    ReferenceEntry<K, V> first = table.get(index);

    for (e = first; e != null; e = e.getNext()) {
      K entryKey = e.getKey();
      if (e.getHash() == hash && entryKey != null
          && map.keyEquivalence.equivalent(key, entryKey)) {
        // found e in the chain
        valueReference = e.getValueReference();
        // already loading, no need to start a new load
        if (valueReference.isLoading()) {
          createNewEntry = false;
        } else {
          V value = valueReference.get();
          // null means the value was collected by GC
          if (value == null) {
            // enqueue the corresponding notification
            enqueueNotification(entryKey, hash, valueReference, RemovalCause.COLLECTED);
          } else if (map.isExpired(e, now)) {
            // This is a duplicate check, as preWriteCleanup already purged expired
            // entries, but let's accomodate an incorrect expiration queue.
            enqueueNotification(entryKey, hash, valueReference, RemovalCause.EXPIRED);
          } else {
            recordLockedRead(e, now);
            statsCounter.recordHits(1);
            // we were concurrent with loading; don't consider refresh
            return value;
          }

          // remove the invalid data (collected or expired)
          writeQueue.remove(e);
          accessQueue.remove(e);
          this.count = newCount; // write-volatile
        }
        break;
      }
    }

    if (createNewEntry) {
      // a LoadingValueReference marks the entry as loading
      loadingValueReference = new LoadingValueReference<K, V>();

      if (e == null) {
        // create a new entry
        e = newEntry(key, hash, first);
        e.setValueReference(loadingValueReference);
        // store it at index
        table.set(index, e);
      } else {
        // e was found but is invalid and its data has been removed; give e the new reference
        e.setValueReference(loadingValueReference);
      }
    }
  } finally {
    unlock();
    postWriteCleanup();
  }

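  // The locked section above created the new entry and set its valueReference (isActive is false, isLoading is true).
  // The lock has now been released, so other threads can obtain a reference in the loading state.
  // This matches the get logic: on seeing a loading reference, block until the load completes.
  if (createNewEntry) {
    try {
      // Lock only e here, not the segment, so other get operations can proceed.
      synchronized (e) {
        // This method puts the key and value into the cache synchronously and thread-safely.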
        return loadSync(key, hash, loadingValueReference, loader);
      }
    } finally {
      statsCounter.recordMisses(1);
    }
  } else {
    // The entry already exists. Wait for loading.
    return waitForLoadingValue(e, key, valueReference);
  }
}

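2. ReferenceEntry and ValueReference

As mentioned earlier, Guava Cache supports references of different strengths. First, a recap of the four kinds of reference in Java.

The four kinds of reference

Strong references
- StringBuffer buffer = new StringBuffer();
- An object that is strongly reachable, that is, reachable through a chain of strong references, is not collected.
Weak references
- When the garbage collector scans the memory it manages and finds an object that has only weak references, it reclaims that object whether or not memory is currently sufficient.
- However, because the collector runs at times the program does not control, objects with only weak references are not necessarily discovered quickly.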
```
  Car obj = new Car("red");
  WeakReference<Car> weakCar = new WeakReference<>(obj);
  //  obj=new Car("blue");
  while(true){
      String[] arr = new String[1000];
      if(weakCar.get()!=null){
          // do something
      }else{
          System.out.println("Object has been collected.");
          break;
      }
  }
```
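- In the code above, the weakly referenced object also has a strong reference (obj, the red Car), so it is not collected.
- But if the commented-out line is enabled, obj then points to a new object, the blue Car. The original object no longer has a strong reference, so the JVM can collect it and weakCar.get() eventually returns null.
Soft references
- Soft references are much like weak references, with one difference:
- the garbage collector reclaims softly reachable objects only when memory is running low.
Phantom references
- Unlike soft and weak references, a phantom reference has a very tenuous hold on its object: the object cannot be obtained through the get method, which always returns null.
- Its only purpose is that once the object it points to has been collected, the reference itself is added to a reference queue, recording that the object has been destroyed.

Reference queue (ReferenceQueue)

A reference queue makes it easy to track references that are no longer needed.
Once a weak reference starts returning null, the object it pointed to has been marked as garbage.
If you pass a ReferenceQueue when constructing a WeakReference, the reference object is automatically added to that queue when its referent is marked as garbage.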
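
To show how a reference queue can drive cleanup, here is a small sketch of a map with weak values. `WeakValueMapSketch` and its nested `WeakValue` class are hypothetical and single-threaded; they only model the idea that each reference remembers its key, so that an enqueued reference can be traced back to the stale map entry.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Simplified, single-threaded illustration; not Guava's reference-queue handling.
class WeakValueMapSketch<K, V> {
    // The reference remembers its key so the stale map entry can be found later.
    static final class WeakValue<K, V> extends WeakReference<V> {
        final K key;

        WeakValue(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, WeakValue<K, V>> table = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    // Returns the reference so that callers (and tests) can inspect it.
    WeakValue<K, V> put(K key, V value) {
        drainQueue();
        WeakValue<K, V> ref = new WeakValue<>(key, value, queue);
        table.put(key, ref);
        return ref;
    }

    V get(K key) {
        drainQueue();
        WeakValue<K, V> ref = table.get(key);
        return ref == null ? null : ref.get();
    }

    int size() {
        drainQueue();
        return table.size();
    }

    // Remove every entry whose reference has been placed on the queue.
    private void drainQueue() {
        Reference<? extends V> ref;
        while ((ref = queue.poll()) != null) {
            @SuppressWarnings("unchecked")
            WeakValue<K, V> stale = (WeakValue<K, V>) ref;
            // Remove only if the map still holds this exact reference object
            table.remove(stale.key, stale);
        }
    }
}
```

Because garbage collection timing is unpredictable, a demonstration can call `enqueue()` on the returned reference to simulate what the collector would do; the next `get` or `size` call then drops the entry.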

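Class diagram of ReferenceEntry

Every entry in the cache is an implementation of ReferenceEntry.
It holds the entry's hash, write time, and access time, its links in the write queue and the access queue, and the pointer to the next entry in the chain.
Each entry contains a ValueReference that represents the value.

[Figure: guava_cache_reference (the ReferenceEntry class diagram)]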

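Class diagram of ValueReference

ValueReference has three implementation classes: StrongValueReference, SoftValueReference, and WeakValueReference. To support on-demand loading there is also LoadingValueReference: when the value for a key has to be loaded, it is first wrapped in a LoadingValueReference to signal that loading is already in progress. If another thread queries the same key, it gets that reference and waits for the load to finish, which guarantees the value is loaded only once (it can be loaded again after eviction). Once the value has loaded, the LoadingValueReference is replaced with another ValueReference type. (More on this later.)
Every ValueReference records a weight, literally "the weight of the value", computed by the Weigher interface.
There is also a Strength enum that acts as the factory for ValueReference. It has three constants, STRONG, SOFT, and WEAK, each of which creates its own kind of ValueReference.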
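
The "enum as factory" idea can be sketched like this. `StrengthSketch` is a hypothetical enum that returns a plain `Supplier` instead of Guava's ValueReference, and it ignores weights and reference queues; it only shows how each constant can create its own kind of holder.

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.function.Supplier;

// Simplified illustration of an enum acting as a factory; not Guava's Strength enum.
enum StrengthSketch {
    STRONG {
        @Override
        <V> Supplier<V> referenceValue(V value) {
            return () -> value; // the lambda keeps a strong reference to the value
        }
    },
    SOFT {
        @Override
        <V> Supplier<V> referenceValue(V value) {
            SoftReference<V> ref = new SoftReference<>(value);
            return ref::get; // may return null once memory runs low
        }
    },
    WEAK {
        @Override
        <V> Supplier<V> referenceValue(V value) {
            WeakReference<V> ref = new WeakReference<>(value);
            return ref::get; // may return null after the value becomes weakly reachable
        }
    };

    // Each constant decides how strongly the returned holder keeps the value alive.
    abstract <V> Supplier<V> referenceValue(V value);
}
```

A caller picks the constant once, from configuration, and the rest of the code works against the returned holder without knowing which strength is in use.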

[Figure: guava_cache_value (the ValueReference class diagram)]

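WeakEntry as an example

Both the cache's put operation and the get with a CacheLoader call newEntry. newEntry produces references of different strengths according to the CacheBuilder configuration. In put, for example:

// Create a new entry.
++modCount;
// create the entry
ReferenceEntry<K, V> newEntry = newEntry(key, hash, first);
// set the value, i.e. the ValueReference
setValue(newEntry, key, value, now);

The newEntry method

Depending on how the cache was configured at creation (the factory chosen in the code), it produces a different kind of entry.

ReferenceEntry<K, V> newEntry(K key, int hash, @Nullable ReferenceEntry<K, V> next) {
  return map.entryFactory.newEntry(this, checkNotNull(key), hash, next);
}

This calls newEntry on WEAK, where segment.keyReferenceQueue is the reference queue for keys. There is also a reference queue for values, valueReferenceQueue, which appears shortly.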

WEAK {
  @Override
  <K, V> ReferenceEntry<K, V> newEntry(
      Segment<K, V> segment, K key, int hash, @Nullable ReferenceEntry<K, V> next) {
    return new WeakEntry<K, V>(segment.keyReferenceQueue, key, hash, next);
  }
},

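The setValue method

First a ValueReference is created, then it is set on the entry.

ValueReference<K, V> valueReference =
      map.valueStrength.referenceValue(this, entry, value, weight);
entry.setValueReference(valueReference);

WEAK for values looks much like WEAK for keys, except that it adds a weight (a Weigher configured through CacheBuilder can assign a weight to each key/value pair) and a method for comparing values.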

WEAK {
  @Override
  <K, V> ValueReference<K, V> referenceValue(
      Segment<K, V> segment, ReferenceEntry<K, V> entry, V value, int weight) {
    return (weight == 1)
        ? new WeakValueReference<K, V>(segment.valueReferenceQueue, value, entry)
        : new WeightedWeakValueReference<K, V>(
            segment.valueReferenceQueue, value, entry, weight);
  }

  @Override
  Equivalence
