聊聊Java WeakHashMap背后的事情

前言

本月还没有写过Java相关的东西,今天终于挤出点时间来了,弄一篇基础知识吧。

聊聊Java WeakHashMap背后的事情_第1张图片

WeakHashMap是平时常见的HashMap的变种,它是基于弱引用(WeakReference)的。我们已经知道,不管内存是否足够,弱引用对象都会随着GC被回收,所以WeakHashMap特别适用于内存敏感的局部缓存场景。本文简单探究一下它的部分实现细节。

属性

WeakHashMap的属性比HashMap来得简单,列举如下。

public class WeakHashMap
    extends AbstractMap
    implements Map {
    /**
     * The default initial capacity -- MUST be a power of two.
     */
    private static final int DEFAULT_INITIAL_CAPACITY = 16;
    /**
     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<30.
     */
    private static final int MAXIMUM_CAPACITY = 1 << 30;
    /**
     * The load factor used when none specified in constructor.
     */
    private static final float DEFAULT_LOAD_FACTOR = 0.75f;
    /**
     * The table, resized as necessary. Length MUST Always be a power of two.
     */
    Entry[] table;
    /**
     * The number of key-value mappings contained in this weak hash map.
     */
    private int size;
    /**
     * The next size value at which to resize (capacity * load factor).
     */
    private int threshold;
    /**
     * The load factor for the hash table.
     */
    private final float loadFactor;
    /**
     * Reference queue for cleared WeakEntries
     */
    private final ReferenceQueue queue = new ReferenceQueue<>();
    // ...... 
   

WeakHashMap的初始容量、最大容量和默认装载因子与HashMap相同,但是没有转化成红黑树的阈值(TREEIFY_THRESHOLD、MIN_TREEIFY_CAPACITY),说明WeakHashMap的哈希桶数组table是纯拉链的。另外一个不同点是WeakHashMap需要维护一个引用队列,弱引用会注册到这个引用队列,它的作用下面会看到。

Entry(键值对)结构

/**
 * The entries in this hash table extend WeakReference, using its main ref
 * field as the key.
 */
private static class Entry extends WeakReference implements Map.Entry {
    V value;
    final int hash;
    Entry next;

    /**
     * Creates new entry.
     */
    Entry(Object key, V value,
          ReferenceQueue queue,
          int hash, Entry next) {
        super(key, queue);
        this.value = value;
        this.hash  = hash;
        this.next  = next;
    }

    @SuppressWarnings("unchecked")
    public K getKey() {
        return (K) WeakHashMap.unmaskNull(get());
    }

    public V getValue() {
        return value;
    }

    // ......
} 
   

Entry内部类直接继承自WeakReference,在构造Entry时也会先调用WeakReference的构造方法WeakReference(T referent, ReferenceQueue q),传入被引用对象(Entry的key)和注册到的引用队列(queue),所以“弱Entry”的本质其实是“弱key”——即当key的可达性发生变化时(变为弱可达或者不可达),GC的同时会自动将key关联的Entry放入队列。

弱Entry的清除

WeakHashMap上的读写等操作就是HashMap对应逻辑的简化版,不再赘述,仅来看看弱Entry是如何被清除掉的。

聊聊Java WeakHashMap背后的事情_第2张图片

如上图所示,对WeakHashMap的几乎所有操作都会调用getTable()这个方法来取得哈希桶数组。但是在返回table之前,要先调用expungeStaleEntries()方法:

private Entry[] getTable() {
    expungeStaleEntries();
    return table;
}

/**
 * Expunges stale entries from the table.
 */
private void expungeStaleEntries() {
    for (Object x; (x = queue.poll()) != null; ) {
        synchronized (queue) {
            @SuppressWarnings("unchecked")
                Entry e = (Entry) x;
            int i = indexFor(e.hash, table.length);
            Entry prev = table[i];
            Entry p = prev;
            while (p != null) {
                Entry next = p.next;
                if (p == e) {
                    if (prev == e)
                        table[i] = next;
                    else
                        prev.next = next;
                    // Must not null out e.next;
                    // stale entries may be in use by a HashIterator
                    e.value = null; // Help GC
                    size--;
                    break;
                }
                prev = p;
                p = next;
            }
        }
    }
}

expungeStaleEntries()方法会加锁遍历引用队列,取出其中已经被回收掉key的Entry,调用indexFor()方法根据其哈希值定位到哈希桶数组中的桶,再遍历桶对应的单链表,删除对应的Entry(注意区分Entry是否在链表头的情况)。

可见,所谓“删除”只是将Entry的value置为空,因为key已经被回收掉了,这样做就能保证整个Entry在下一次GC时被彻底清理。通过在增删改查之前调用expungeStaleEntries()方法,GC的效果就可以及时地反映到table上了。

一个小问题

看官可能会有些疑惑:构造Entry时使用的弱引用referent明明是key,为什么从引用队列poll时,出来的却是Entry呢?不妨再看一眼WeakReference的基类Reference中的enqueue()方法:

/**
 * Adds this reference object to the queue with which it is registered,
 * if any.
 *
 * 

This method is invoked only by Java code; when the garbage collector * enqueues references it does so directly, without invoking this method. * * @return true if this reference object was successfully * enqueued; false if it was already enqueued or if * it was not registered with a queue when it was created */ public boolean enqueue() { return this.queue.enqueue(this); } // 这是ReferenceQueue.enqueue()方法 boolean enqueue(Reference r) { /* Called only by Reference class */ synchronized (lock) { // Check that since getting the lock this reference hasn't already been // enqueued (and even then removed) ReferenceQueue queue = r.queue; if ((queue == NULL) || (queue == ENQUEUED)) { return false; } assert queue == this; r.queue = ENQUEUED; r.next = (head == null) ? r : head; head = r; queueLength++; if (r instanceof FinalReference) { sun.misc.VM.addFinalRefCount(1); } lock.notifyAll(); return true; } }

从注释可知,当垃圾收集器将引用插入引用队列时,虽然不会真正调用上述enqueue()方法,但是逻辑是相同的,都是插入this。也就是说,实际进入引用队列的要么是WeakReference本身,要么是继承了WeakReference的对象实例——在WeakHashMap中自然就是Entry了。

测试一下

public class WeakHashMapExample {
  public static void main(String[] args) throws Exception {
    WeakHashMap map = new WeakHashMap<>();

    map.put("first", 1);
    map.put(new String("second"), 2);
    String third = new String("third");
    map.put(third, 3);

    System.gc();

    System.out.println(map.get("first"));  // 1
    System.out.println(map.get("second")); // null
    System.out.println(map.get("third"));  // 3
  }
}

注意到只有"second"会消失。"first"位于字符串常量池中,"third"保持有强引用,所以手动触发GC不会将它们回收掉。

The End

还有事情要做,民那晚安晚安。

你可能感兴趣的:(聊聊Java WeakHashMap背后的事情)