I have read plenty of ConcurrentHashMap (CHM) analyses online, but none of them gave me a real "aha" moment. Given how sweeping the changes from 1.7 to 1.8 were, there had to be some deeper algorithm or design behind them, so I decided to dig in myself. These are just my own observations; go easy on me.
The JDK 1.8 CHM source runs to 6000+ lines, probably the most complex class in the JDK. Headache material. Let's walk through the flow and the core APIs.
The upgrades from 1.7 to 1.8 show up first in the node types:
- Node: the ordinary data node, holding hash, key, value, and next.
- TreeNode: the red-black tree data node, holding hash, key, value, next, plus the tree links (parent, left, right, prev).
- TreeBin: the wrapper placed at the head of a tree bucket. Carries no real data; hash = -2.
- ForwardingNode: placed at the head of a bucket while the map is resizing. Carries no real data; hash = -1.
- ReservationNode: a placeholder reserved by computeIfAbsent and friends; hash = -3.
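For reference, these special hash encodings are defined as constants in the JDK 8 source itself:

```java
// Excerpt from the JDK 8 ConcurrentHashMap source: special node hash encodings
static final int MOVED     = -1; // hash for forwarding nodes
static final int TREEBIN   = -2; // hash for roots of trees
static final int RESERVED  = -3; // hash for transient reservations
static final int HASH_BITS = 0x7fffffff; // usable bits of normal node hash
```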
/**
* Table initialization and resizing control. When negative, the
* table is being initialized or resized: -1 for initialization,
* else -(1 + the number of active resizing threads). Otherwise,
* when table is null, holds the initial table size to use upon
* creation, or 0 for default. After initialization, holds the
* next element count value upon which to resize the table.
*/
private transient volatile int sizeCtl;
Its main job is to track the map's current state: -1 while the table is being initialized; another negative value while a resize is in progress (the exact encoding is unpacked in the resize section below); while the table is still null, a positive value holds the initial capacity; after initialization, it holds the next element count that triggers a resize.
/**
* A padded cell for distributing counts. Adapted from LongAdder
* and Striped64. See their internal docs for explanation.
*/
@sun.misc.Contended static final class CounterCell {
volatile long value;
CounterCell(long x) { value = x; }
}
One of the gems.
This is used to compute the total size. Under heavy concurrency you could funnel every increment through CAS on a single counter, which works, but is slow. Instead the size is computed from baseCount plus the CounterCell[] array, a divide-and-conquer idea applied to traffic: spread contended updates across cells, then sum.
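A minimal sketch of that idea (the class and field names here are mine, not the JDK's, and the per-thread hash is a stand-in for the JDK-internal ThreadLocalRandom.getProbe()):

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicLongArray;

// Sketch of the baseCount + CounterCell[] scheme: fast path on a single base
// counter, contended updates spread across an array of cells.
class StripedCounter {
    private final AtomicLong base = new AtomicLong();             // like baseCount
    private final AtomicLongArray cells = new AtomicLongArray(8); // like CounterCell[]

    void add(long x) {
        // Uncontended fast path: one CAS on the base counter.
        long b = base.get();
        if (base.compareAndSet(b, b + x))
            return;
        // Contended: pick a cell by a per-thread hash so threads diverge.
        int index = (int) (Thread.currentThread().getId() & (cells.length() - 1));
        cells.addAndGet(index, x);
    }

    long sum() { // like sumCount(): base plus every cell, a moment-in-time estimate
        long sum = base.get();
        for (int i = 0; i < cells.length(); i++)
            sum += cells.get(i);
        return sum;
    }
}
```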
Now for the put method. Source:
/**
* Maps the specified key to the specified value in this table.
* Neither the key nor the value can be null.
*
* The value can be retrieved by calling the {@code get} method
* with a key that is equal to the original key.
*
* @param key key with which the specified value is to be associated
* @param value value to be associated with the specified key
* @return the previous value associated with {@code key}, or
* {@code null} if there was no mapping for {@code key}
* @throws NullPointerException if the specified key or value is null
*/
public V put(K key, V value) {
return putVal(key, value, false);
}
/** Implementation for put and putIfAbsent */
final V putVal(K key, V value, boolean onlyIfAbsent) {
if (key == null || value == null) throw new NullPointerException();
int hash = spread(key.hashCode());
int binCount = 0;
for (Node<K,V>[] tab = table;;) {
Node<K,V> f; int n, i, fh;
if (tab == null || (n = tab.length) == 0)
tab = initTable();
else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
if (casTabAt(tab, i, null,
new Node<K,V>(hash, key, value, null)))
break; // no lock when adding to empty bin
}
else if ((fh = f.hash) == MOVED)
tab = helpTransfer(tab, f);
else {
V oldVal = null;
synchronized (f) {
if (tabAt(tab, i) == f) {
if (fh >= 0) {
binCount = 1;
for (Node<K,V> e = f;; ++binCount) {
K ek;
if (e.hash == hash &&
((ek = e.key) == key ||
(ek != null && key.equals(ek)))) {
oldVal = e.val;
if (!onlyIfAbsent)
e.val = value;
break;
}
Node<K,V> pred = e;
if ((e = e.next) == null) {
pred.next = new Node<K,V>(hash, key,
value, null);
break;
}
}
}
else if (f instanceof TreeBin) {
Node<K,V> p;
binCount = 2;
if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
value)) != null) {
oldVal = p.val;
if (!onlyIfAbsent)
p.val = value;
}
}
}
}
if (binCount != 0) {
if (binCount >= TREEIFY_THRESHOLD)
treeifyBin(tab, i);
if (oldVal != null)
return oldVal;
break;
}
}
}
addCount(1L, binCount);
return null;
}
The main method is under 100 lines, yet every branch of it carries one of the design's key ideas.
The first if: initialization.
/**
* Initializes table, using the size recorded in sizeCtl.
*/
private final Node<K,V>[] initTable() {
Node<K,V>[] tab; int sc;
while ((tab = table) == null || tab.length == 0) {
if ((sc = sizeCtl) < 0)
Thread.yield(); // lost initialization race; just spin
else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
try {
if ((tab = table) == null || tab.length == 0) {
int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
@SuppressWarnings("unchecked")
Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];
table = tab = nt;
sc = n - (n >>> 2);
}
} finally {
sizeCtl = sc;
}
break;
}
}
return tab;
}
Initialization hinges on checking sizeCtl plus a CAS: whichever thread wins the CAS on sizeCtl (a volatile field) gets to build the table, while losers spin with Thread.yield() until it appears. The CAS itself goes through the Unsafe class. Note that sc = n - (n >>> 2) sets the next resize threshold to 0.75 * n.
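The same CAS-guarded lazy-initialization pattern in miniature (a sketch of mine: AtomicInteger stands in for the Unsafe-based CAS on sizeCtl, and the class assumes the default-capacity case where the control field starts at 0):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of initTable's structure: win a CAS, build, publish the threshold.
class LazyTable {
    private volatile Object[] table;
    private final AtomicInteger ctl = new AtomicInteger(); // stands in for sizeCtl

    Object[] initTable() {
        Object[] tab;
        while ((tab = table) == null) {
            if (ctl.get() < 0)
                Thread.yield();                  // lost the race; wait for the winner
            else if (ctl.compareAndSet(0, -1)) { // claim the right to initialize
                try {
                    if ((tab = table) == null)   // re-check after winning
                        table = tab = new Object[16];
                } finally {
                    ctl.set(12);                 // next resize threshold: 0.75 * 16
                }
                break;
            }
        }
        return tab;
    }
}
```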
The second if: the slot at the computed index is null, so a new node is installed via CAS with no locking at all:
else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
if (casTabAt(tab, i, null,
new Node<K,V>(hash, key, value, null)))
break; // no lock when adding to empty bin
}
static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) {
return (Node<K,V>)U.getObjectVolatile(tab, ((long)i << ASHIFT) + ABASE);
}
There is a subtlety here: table is already volatile, so why read the element through Unsafe (and CAS it the same way)?
Because volatile only makes the array reference visible across threads, not the elements inside the array!
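An illustrative sketch of mine showing the standard-API alternative: with a plain `volatile Object[]`, `arr[i]` is an ordinary read, while `AtomicReferenceArray` (or a `VarHandle` on newer JDKs) gives the per-element volatile semantics that CHM gets from `Unsafe.getObjectVolatile`:

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Per-element volatile access, which a volatile array reference alone
// does not provide.
class VolatileElements {
    volatile String[] plain = new String[16];   // only the REFERENCE is volatile
    final AtomicReferenceArray<String> safe = new AtomicReferenceArray<>(16);

    void demo() {
        String s = plain[3];                 // ordinary (non-volatile) element read
        safe.set(3, "x");                    // volatile element write
        String t = safe.get(3);              // volatile element read
        safe.compareAndSet(3, "x", "y");     // CAS on one slot, like casTabAt
        System.out.println(s + " " + t);
    }
}
```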
The third if: the head node's hash is MOVED, meaning a resize is in progress, so the thread goes to help:
else if ((fh = f.hash) == MOVED)
tab = helpTransfer(tab, f);
Resizing is another of CHM's gems. Without further ado, the source:
/**
* Helps transfer if a resize is in progress.
*/
final Node<K,V>[] helpTransfer(Node<K,V>[] tab, Node<K,V> f) {
Node<K,V>[] nextTab; int sc;
if (tab != null && (f instanceof ForwardingNode) &&
(nextTab = ((ForwardingNode<K,V>)f).nextTable) != null) {
int rs = resizeStamp(tab.length);
while (nextTab == nextTable && table == tab &&
(sc = sizeCtl) < 0) {
if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
sc == rs + MAX_RESIZERS || transferIndex <= 0)
break;
if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1)) {
transfer(tab, nextTab);
break;
}
}
return nextTab;
}
return table;
}
Whether to help is decided by checking nextTable: if the transfer has already finished (nextTable is gone, the table has changed, or the stamp no longer matches), there is nothing to do; otherwise the thread CASes sizeCtl up by one and joins the transfer.
Background first: what is a thread's resize stamp? (the resizeStamp function)
/**
* The number of bits used for generation stamp in sizeCtl.
* Must be at least 6 for 32bit arrays.
*/
private static int RESIZE_STAMP_BITS = 16;
/**
* The maximum number of threads that can help resize.
* Must fit in 32 - RESIZE_STAMP_BITS bits.
*/
private static final int MAX_RESIZERS = (1 << (32 - RESIZE_STAMP_BITS)) - 1;
/**
* The bit shift for recording size stamp in sizeCtl.
*/
private static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;
/**
* Returns the stamp bits for resizing a table of size n.
* Must be negative when shifted left by RESIZE_STAMP_SHIFT.
*/
static final int resizeStamp(int n) {
return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
}
The resize stamp is derived from the length of the old table.
For example, n = 16, in binary: 0000 0000 0000 0000 0000 0000 0001 0000
Integer.numberOfLeadingZeros(16) = 27 (the count of leading zero bits before the highest set bit)
27 in binary: 0000 0000 0000 0000 0000 0000 0001 1011
resizeStamp(16) is therefore 27 | (1 << 15) = 0000 0000 0000 0000 1000 0000 0001 1011
With that, the resize stamp for this table size is ready.
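A throwaway snippet to check the arithmetic (it assumes only RESIZE_STAMP_BITS = 16, as in the source above):

```java
// Prints the stamp computation for n = 16; the output matches the walkthrough.
public class StampDemo {
    static final int RESIZE_STAMP_BITS = 16;

    static int resizeStamp(int n) {
        return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
    }

    public static void main(String[] args) {
        System.out.println(Integer.numberOfLeadingZeros(16));     // 27
        System.out.println(Integer.toBinaryString(resizeStamp(16)));
        // -> 1000000000011011  (0x801B: bit 15 set, plus 27 = 1 1011)
    }
}
```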
Background: how sizeCtl is updated during a resize.
The thread that initiates the resize CASes sizeCtl to (rs << RESIZE_STAMP_SHIFT) + 2: the stamp goes in the high 16 bits, and the low 16 bits hold 1 + the number of resizing threads, so 2 for the first thread:
else if (U.compareAndSwapInt(this, SIZECTL, sc,
(rs << RESIZE_STAMP_SHIFT) + 2))
Each thread that later joins to help adds 1:
if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
So for a table of length 16 the first update sets sizeCtl to 1000 0000 0001 1011 0000 0000 0000 0010 (a negative int); each additional helper increments it by 1.

High 16 bits | Low 16 bits |
---|---|
resize stamp | 1 + number of concurrently resizing threads |
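Extending the snippet above, a sketch of how the sizeCtl value is built and decoded during a resize (again using only constants shown in the source):

```java
// Encoding/decoding sizeCtl during a resize for a table of length 16.
public class SizeCtlDemo {
    static final int RESIZE_STAMP_BITS = 16;
    static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;

    static int resizeStamp(int n) {
        return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
    }

    public static void main(String[] args) {
        int rs = resizeStamp(16);
        int sc = (rs << RESIZE_STAMP_SHIFT) + 2;  // first resizer starts the transfer
        System.out.println(sc);                                 // negative: resizing
        System.out.println((sc >>> RESIZE_STAMP_SHIFT) == rs);  // true: stamp in high bits
        System.out.println((sc & 0xffff) - 1);                  // 1 active resizing thread
        sc = sc + 1;                                            // a helper joins
        System.out.println((sc & 0xffff) - 1);                  // 2 active threads
    }
}
```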
The transfer function
In short: the old table is partitioned into ranges through a shared index (transferIndex), and each thread claims and migrates its own range. Within a range, buckets are processed from high index to low; once a bucket has been migrated, its slot in the old table is replaced by a ForwardingNode to mark it done. Without further ado, the source (using table.length = 16 as the running example):
/** Number of CPUS, to place bounds on some sizings */
static final int NCPU = Runtime.getRuntime().availableProcessors();
private static final int MIN_TRANSFER_STRIDE = 16;
/**
* Moves and/or copies the nodes in each bin to new table. See
* above for explanation.
*/
private final void transfer(Node<K,V>[] tab, Node<K,V>[] nextTab) {
int n = tab.length, stride;
// compute the stride (range length per thread) from the CPU count; never below MIN_TRANSFER_STRIDE = 16
if ((stride = (NCPU > 1) ? (n >>> 3) / NCPU : n) < MIN_TRANSFER_STRIDE)
stride = MIN_TRANSFER_STRIDE; // subdivide range
// first resizer: allocate the new table at double (<< 1) the old length
if (nextTab == null) { // initiating
try {
@SuppressWarnings("unchecked")
Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n << 1];
nextTab = nt;
} catch (Throwable ex) { // try to cope with OOME
sizeCtl = Integer.MAX_VALUE;
return;
}
nextTable = nextTab;
transferIndex = n;
}
int nextn = nextTab.length;
// a ForwardingNode holding the new table; placed into migrated slots to signal that bucket is done
ForwardingNode<K,V> fwd = new ForwardingNode<K,V>(nextTab);
// advance: whether the current bucket is done and we may move on to the next one
boolean advance = true;
// finishing: whether the entire table has been migrated
boolean finishing = false; // to ensure sweep before committing nextTab
for (int i = 0, bound = 0;;) {
// claim work by CASing transferIndex; i is the bucket being processed, bound the lower edge of this thread's range; transferIndex starts at 16 here
Node<K,V> f; int fh;
while (advance) {
int nextIndex, nextBound;
// example trace below assumes n = 16, stride = 16, one thread
// --i walks the claimed range downward; on the first pass i = 0 gives --i = -1, so the claim branches below run
if (--i >= bound || finishing)
advance = false;
// example: nextIndex = transferIndex = 16, so this branch is skipped
else if ((nextIndex = transferIndex) <= 0) {
i = -1;
advance = false;
}
// example: CAS transferIndex from 16 down to nextBound = 0
else if (U.compareAndSwapInt
(this, TRANSFERINDEX, nextIndex,
nextBound = (nextIndex > stride ?
nextIndex - stride : 0))) {
// example: bound = 0, i = 15; this thread now owns buckets [0, 15]
bound = nextBound;
i = nextIndex - 1;
advance = false;
}
}
// after that claim, this thread handles [0, 15] and transferIndex = 0
// entered once i has moved past this thread's range (or on the finishing recheck)
if (i < 0 || i >= n || i + n >= nextn) {
int sc;
// whole transfer finished: publish nextTab and set the new threshold (n << 1) - (n >>> 1) = 0.75 * 2n
if (finishing) {
nextTable = null;
table = nextTab;
sizeCtl = (n << 1) - (n >>> 1);
return;
}
// this thread is done helping: decrement the active-resizer count in sizeCtl
if (U.compareAndSwapInt(this, SIZECTL, sc = sizeCtl, sc - 1)) {
// if other resizers are still running (sc - 2 is not back to the bare stamp), just exit; the last thread falls through
if ((sc - 2) != resizeStamp(n) << RESIZE_STAMP_SHIFT)
return;
// last resizer: mark finishing and recheck the whole table before committing
finishing = advance = true;
// example: i = n = 16, so the loop sweeps the table once more
i = n; // recheck before commit
}
}
// empty slot in the old table: just install the ForwardingNode
else if ((f = tabAt(tab, i)) == null)
advance = casTabAt(tab, i, null, fwd);
// already migrated by another thread; move on
else if ((fh = f.hash) == MOVED)
advance = true; // already processed
// non-empty bucket: lock the head node, then migrate its contents
else {
synchronized (f) {
if (tabAt(tab, i) == f) {
Node<K,V> ln, hn;
if (fh >= 0) {
int runBit = fh & n; // ln = low list (stays at i), hn = high list (moves to i + n); the code below splits the chain on this bit
Node<K,V> lastRun = f; // start of the trailing run whose bits all agree
for (Node<K,V> p = f.next; p != null; p = p.next) {
// n = 16 = 1 0000 binary, so p.hash & n tests the bit that decides high vs low
int b = p.hash & n;
if (b != runBit) {
runBit = b;
lastRun = p;
}
}
// Why find lastRun first instead of splitting directly into two lists? If the whole tail of the chain shares one runBit, that suffix can be reused as-is; only the nodes before lastRun need copying (see the next loop).
if (runBit == 0) {
ln = lastRun;
hn = null;
}
else {
hn = lastRun;
ln = null;
}
// copy the remaining nodes (those before lastRun) into the low/high lists
for (Node<K,V> p = f; p != lastRun; p = p.next) {
int ph = p.hash; K pk = p.key; V pv = p.val;
if ((ph & n) == 0)
ln = new Node<K,V>(ph, pk, pv, ln);
else
hn = new Node<K,V>(ph, pk, pv, hn);
}
// install both lists in the new table and mark slot i of the old table as migrated
setTabAt(nextTab, i, ln);
setTabAt(nextTab, i + n, hn);
setTabAt(tab, i, fwd);
advance = true;
}
else if (f instanceof TreeBin) {
TreeBin<K,V> t = (TreeBin<K,V>)f;
TreeNode<K,V> lo = null, loTail = null;
TreeNode<K,V> hi = null, hiTail = null;
int lc = 0, hc = 0;
for (Node<K,V> e = t.first; e != null; e = e.next) {
int h = e.hash;
TreeNode<K,V> p = new TreeNode<K,V>
(h, e.key, e.val, null, null);
if ((h & n) == 0) {
if ((p.prev = loTail) == null)
lo = p;
else
loTail.next = p;
loTail = p;
++lc;
}
else {
if ((p.prev = hiTail) == null)
hi = p;
else
hiTail.next = p;
hiTail = p;
++hc;
}
}
ln = (lc <= UNTREEIFY_THRESHOLD) ? untreeify(lo) :
(hc != 0) ? new TreeBin<K,V>(lo) : t;
hn = (hc <= UNTREEIFY_THRESHOLD) ? untreeify(hi) :
(lc != 0) ? new TreeBin<K,V>(hi) : t;
setTabAt(nextTab, i, ln);
setTabAt(nextTab, i + n, hn);
setTabAt(tab, i, fwd);
advance = true;
}
}
}
}
}
}
Illustrated:
1. Compute the stride each thread is responsible for (here 16, the minimum).
2. Each thread walks its claimed range, migrating the bucket at index i.
3. When a thread's range is exhausted, it reaches the exit block:
if (i < 0 || i >= n || i + n >= nextn) {
int sc;
if (finishing) {
nextTable = null;
table = nextTab;
sizeCtl = (n << 1) - (n >>> 1);
return;
}
if (U.compareAndSwapInt(this, SIZECTL, sc = sizeCtl, sc - 1)) {
if ((sc - 2) != resizeStamp(n) << RESIZE_STAMP_SHIFT)
return;
finishing = advance = true;
i = n; // recheck before commit
}
}
Summary:
transfer leans on CAS plus cooperative multi-threading, so the resize itself runs in parallel across threads and stays efficient.
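The range-claiming protocol can be sketched on its own, independent of the map (the class and names below are mine, for illustration): an atomic index is walked downward in stride-sized chunks, and each successful CAS hands the calling thread a disjoint range.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of transferIndex-style work claiming: each CAS moves the shared
// index down by STRIDE, and the winner owns the buckets in between.
public class RangeClaimDemo {
    static final AtomicInteger transferIndex = new AtomicInteger(16);
    static final int STRIDE = 4;

    public static void main(String[] args) throws InterruptedException {
        Runnable worker = () -> {
            int nextIndex;
            while ((nextIndex = transferIndex.get()) > 0) {
                int nextBound = Math.max(nextIndex - STRIDE, 0);
                if (transferIndex.compareAndSet(nextIndex, nextBound)) {
                    // This thread now exclusively owns [nextBound, nextIndex - 1],
                    // which it would migrate from high index to low, like transfer().
                    System.out.println(Thread.currentThread().getName()
                            + " claimed [" + nextBound + ", " + (nextIndex - 1) + "]");
                }
            }
        };
        Thread a = new Thread(worker, "t1"), b = new Thread(worker, "t2");
        a.start(); b.start();
        a.join(); b.join();
    }
}
```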
The remaining branch of putVal walks the bucket's list or red-black tree and sets the value:
else {
V oldVal = null;
synchronized (f) {
if (tabAt(tab, i) == f) {
if (fh >= 0) {
binCount = 1;
for (Node<K,V> e = f;; ++binCount) {
K ek;
if (e.hash == hash &&
((ek = e.key) == key ||
(ek != null && key.equals(ek)))) {
oldVal = e.val;
if (!onlyIfAbsent)
e.val = value;
break;
}
Node<K,V> pred = e;
if ((e = e.next) == null) {
pred.next = new Node<K,V>(hash, key,
value, null);
break;
}
}
}
else if (f instanceof TreeBin) {
Node<K,V> p;
binCount = 2;
if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
value)) != null) {
oldVal = p.val;
if (!onlyIfAbsent)
p.val = value;
}
}
}
}
if (binCount != 0) {
if (binCount >= TREEIFY_THRESHOLD)
treeifyBin(tab, i);
if (oldVal != null)
return oldVal;
break;
}
}
First the head of the bucket is locked, shutting other writers out of this one bin only. This is exactly where 1.7 and 1.8 differ in lock granularity: 1.7 locks a whole Segment spanning many buckets, while 1.8 locks a single bucket head, so 1.8 is more efficient.
After that, it is just a matter of walking the list or red-black tree and inserting the entry, much as HashMap does.
Finally, the chain length (binCount) is checked: if it has reached TREEIFY_THRESHOLD (8), the bucket is converted to a red-black tree.
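A quick sanity demo of this per-bin locking (my example, not from the source): threads writing to different buckets never block each other, and nothing is lost.

```java
import java.util.concurrent.ConcurrentHashMap;

// Many threads putting concurrently; per-bin locking keeps this correct
// without any global lock.
public class PutDemo {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            final int id = t;
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 10_000; i++)
                    map.put(id * 10_000 + i, i);   // distinct keys per thread
            });
            threads[t].start();
        }
        for (Thread th : threads) th.join();
        System.out.println(map.size());            // 40000: no lost updates
    }
}
```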
Last, and crucially: an element has been put, so the size must be bumped.
/**
* Adds to count, and if table is too small and not already
* resizing, initiates transfer. If already resizing, helps
* perform transfer if work is available. Rechecks occupancy
* after a transfer to see if another resize is already needed
* because resizings are lagging additions.
*
* @param x the count to add
* @param check if <0, don't check resize, if <= 1 only check if uncontended
*/
private final void addCount(long x, int check) {
CounterCell[] as; long b, s;
if ((as = counterCells) != null ||
!U.compareAndSwapLong(this, BASECOUNT, b = baseCount, s = b + x)) {
CounterCell a; long v; int m;
boolean uncontended = true;
if (as == null || (m = as.length - 1) < 0 ||
(a = as[ThreadLocalRandom.getProbe() & m]) == null ||
!(uncontended =
U.compareAndSwapLong(a, CELLVALUE, v = a.value, v + x))) {
fullAddCount(x, uncontended);
return;
}
if (check <= 1)
return;
s = sumCount();
}
if (check >= 0) {
Node<K,V>[] tab, nt; int n, sc;
while (s >= (long)(sc = sizeCtl) && (tab = table) != null &&
(n = tab.length) < MAXIMUM_CAPACITY) {
int rs = resizeStamp(n);
if (sc < 0) {
if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
sc == rs + MAX_RESIZERS || (nt = nextTable) == null ||
transferIndex <= 0)
break;
if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
transfer(tab, nt);
}
else if (U.compareAndSwapInt(this, SIZECTL, sc,
(rs << RESIZE_STAMP_SHIFT) + 2))
transfer(tab, null);
s = sumCount();
}
}
}
The flow:
Step 1: try to CAS the delta straight onto baseCount; if the CAS fails, there is contention, so fall into the second branch.
Step 2: use the thread's probe value to pick a slot in the CounterCell[] array and CAS the delta into that cell; if that fails too, contention is heavy and fullAddCount takes over.
Step 3: fullAddCount, which I admit I gave up on tracing line by line. The gist:
- CounterCell[] starts at size 2 and grows by doubling (<< 1).
- The method itself is one big spin (retry) loop.
- Creation and resizing of the cells are guarded by a spinlock:
/**
* Spinlock (locked via CAS) used when resizing and/or creating CounterCells.
*/
private transient volatile int cellsBusy;
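The spinlock pattern in isolation looks like this (a sketch of mine using AtomicInteger; the JDK CASes the plain int field through Unsafe):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the cellsBusy-style spinlock guarding CounterCell[] creation/resizing.
class CellLock {
    private final AtomicInteger cellsBusy = new AtomicInteger(); // 0 = free, 1 = held

    void withLock(Runnable criticalSection) {
        while (!cellsBusy.compareAndSet(0, 1)) {
            // spin until the lock is free
        }
        try {
            criticalSection.run();        // e.g. allocate or double the cell array
        } finally {
            cellsBusy.set(0);             // release
        }
    }
}
```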
Why design addCount with all this machinery? To sum up:
- synchronized is avoided outright, for efficiency.
- A single CAS'd counter is avoided too: under high concurrency, threads would burn cycles endlessly retrying failed CASes.
- The payoff of baseCount + CounterCell[]:
  - under low contention, baseCount alone handles everything;
  - under high contention, the thread probe (ThreadLocalRandom) spreads updates across CounterCell[], cutting out pointless locking and CAS retries, much like Nginx-style load balancing.
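The same trade-off ships as java.util.concurrent.atomic.LongAdder, which CounterCell is explicitly adapted from. A rough comparison sketch of mine (absolute timings depend entirely on the machine; the gap under contention is the point):

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

// Single CAS'd counter vs striped counter under 8-thread contention.
public class CounterBench {
    public static void main(String[] args) throws InterruptedException {
        AtomicLong atomic = new AtomicLong();
        LongAdder adder = new LongAdder();
        System.out.println("AtomicLong: " + time(atomic::incrementAndGet) + " ms");
        System.out.println("LongAdder:  " + time(adder::increment) + " ms");
    }

    static long time(Runnable inc) throws InterruptedException {
        Thread[] ts = new Thread[8];
        long start = System.currentTimeMillis();
        for (int i = 0; i < ts.length; i++) {
            ts[i] = new Thread(() -> { for (int j = 0; j < 1_000_000; j++) inc.run(); });
            ts[i].start();
        }
        for (Thread t : ts) t.join();
        return System.currentTimeMillis() - start;
    }
}
```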
get is comparatively simple: find the bucket by hash, then match by == or equals. Source:
/**
* Returns the value to which the specified key is mapped,
* or {@code null} if this map contains no mapping for the key.
*
* More formally, if this map contains a mapping from a key
* {@code k} to a value {@code v} such that {@code key.equals(k)},
* then this method returns {@code v}; otherwise it returns
* {@code null}. (There can be at most one such mapping.)
*
* @throws NullPointerException if the specified key is null
*/
public V get(Object key) {
Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
int h = spread(key.hashCode());
// locate the bucket for this hash
if ((tab = table) != null && (n = tab.length) > 0 &&
(e = tabAt(tab, (n - 1) & h)) != null) {
// head node's hash matches: compare keys directly
if ((eh = e.hash) == h) {
if ((ek = e.key) == key || (ek != null && key.equals(ek)))
return e.val;
}
// negative hash: a special node (ForwardingNode during a resize, or TreeBin); delegate to its find()
else if (eh < 0)
return (p = e.find(h, key)) != null ? p.val : null;
// otherwise walk the chain looking for the key
while ((e = e.next) != null) {
if (e.hash == h &&
((ek = e.key) == key || (ek != null && key.equals(ek))))
return e.val;
}
}
return null;
}
Having seen addCount in put, size() holds no surprises. Source:
/**
* {@inheritDoc}
*/
public int size() {
long n = sumCount();
return ((n < 0L) ? 0 :
(n > (long)Integer.MAX_VALUE) ? Integer.MAX_VALUE :
(int)n);
}
final long sumCount() {
CounterCell[] as = counterCells; CounterCell a;
long sum = baseCount;
if (as != null) {
for (int i = 0; i < as.length; ++i) {
if ((a = as[i]) != null)
sum += a.value;
}
}
return sum;
}
It simply walks the CounterCell values and adds them to baseCount. Bear in mind that under concurrent updates the result is only a moment-in-time estimate.