In the previous post we covered the background knowledge needed for the JDK 1.8 ConcurrentHashMap, as a primer. The JDK 1.8 ConcurrentHashMap changed so much that learning that material first is necessary. This is the second post on ConcurrentHashMap, and its goal is getting acquainted with the class: I will introduce its key member variables and a few key inner classes. You can read it alongside the earlier HashMap posts, since a lot of it is very similar. The posts I have been putting together all cover extremely common interview topics; you can find them at http://blog.csdn.net/u012403290
How thread safety is achieved:
We all know that the core of ConcurrentHashMap is thread safety, so what does it use to achieve it? In JDK 1.8 it mainly relies on the CAS algorithm. CAS and lock-free operations were introduced in the previous post, so I won't repeat them here. On top of CAS it also implements three atomic operations (the thread-safety guarantee rests on operations being atomic). Below I have pasted the source showing which member variables are accessed through CAS, and then which methods implement those atomic operations (a short usage example follows the three methods):
// Unsafe mechanics: the fields whose accesses CAS makes atomic
private static final sun.misc.Unsafe U;
private static final long LOCKSTATE;
static {
try {
U = sun.misc.Unsafe.getUnsafe();
Class<?> k = TreeBin.class; // operates on TreeBin, a class introduced later in this post
LOCKSTATE = U.objectFieldOffset
(k.getDeclaredField("lockState"));
} catch (Exception e) {
throw new Error(e);
}
}
--------------------------------------------------------------------------------------
private static final sun.misc.Unsafe U;
private static final long SIZECTL;
private static final long TRANSFERINDEX;
private static final long BASECOUNT;
private static final long CELLSBUSY;
private static final long CELLVALUE;
private static final long ABASE;
private static final int ASHIFT;
static {
try {
// the fields below are introduced later in this post
U = sun.misc.Unsafe.getUnsafe();
Class<?> k = ConcurrentHashMap.class;
SIZECTL = U.objectFieldOffset
(k.getDeclaredField("sizeCtl"));
TRANSFERINDEX = U.objectFieldOffset
(k.getDeclaredField("transferIndex"));
BASECOUNT = U.objectFieldOffset
(k.getDeclaredField("baseCount"));
CELLSBUSY = U.objectFieldOffset
(k.getDeclaredField("cellsBusy"));
Class<?> ck = CounterCell.class;
CELLVALUE = U.objectFieldOffset
(ck.getDeclaredField("value"));
Class<?> ak = Node[].class;
ABASE = U.arrayBaseOffset(ak);
int scale = U.arrayIndexScale(ak);
if ((scale & (scale - 1)) != 0)
throw new Error("data type scale not a power of two");
ASHIFT = 31 - Integer.numberOfLeadingZeros(scale);
} catch (Exception e) {
throw new Error(e);
}
}
// the three atomic access methods:
/* ---------------- Table element access -------------- */
/*
* Volatile access methods are used for table elements as well as
* elements of in-progress next table while resizing. All uses of
* the tab arguments must be null checked by callers. All callers
* also paranoically precheck that tab's length is not zero (or an
* equivalent check), thus ensuring that any index argument taking
* the form of a hash value anded with (length - 1) is a valid
* index. Note that, to be correct wrt arbitrary concurrency
* errors by users, these checks must operate on local variables,
* which accounts for some odd-looking inline assignments below.
* Note that calls to setTabAt always occur within locked regions,
* and so in principle require only release ordering, not
* full volatile semantics, but are currently coded as volatile
* writes to be conservative.
*/
@SuppressWarnings("unchecked")
static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) {
return (Node<K,V>)U.getObjectVolatile(tab, ((long)i << ASHIFT) + ABASE);
}
static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i,
Node<K,V> c, Node<K,V> v) {
return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v);
}
static final <K,V> void setTabAt(Node<K,V>[] tab, int i, Node<K,V> v) {
U.putObjectVolatile(tab, ((long)i << ASHIFT) + ABASE, v);
}
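To see these helpers in action, here is a simplified excerpt from putVal() (the full method is covered in the next post): when the target bin is empty, the new head node is published with a single casTabAt and no lock at all.
// simplified excerpt from putVal(): inserting into an empty bin needs no lock,
// a single CAS on the bucket slot is enough
else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
    if (casTabAt(tab, i, null,
                 new Node<K,V>(hash, key, value, null)))
        break;                   // no lock when adding to empty bin
}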
The above largely accounts for the thread safety. One more point is the result of a JDK 1.8 optimization: older versions of ConcurrentHashMap locked a whole Segment, but Segments were removed in JDK 1.8 and what gets locked now is a single bucket's head Node (note that synchronized locks the head node, which the source below makes clear). This shrinks the lock granularity, so both contention and the cost of conflicts go down. Here is how it shows up in the source:
// this snippet locks the head node during a resize; it happens in many other places too, which I won't list one by one.
synchronized (f) {
if (tabAt(tab, i) == f) {
Node<K,V> ln, hn;
if (fh >= 0) {
int runBit = fh & n;
Node<K,V> lastRun = f;
for (Node<K,V> p = f.next; p != null; p = p.next) {
int b = p.hash & n;
if (b != runBit) {
runBit = b;
lastRun = p;
}
}
if (runBit == 0) {
ln = lastRun;
hn = null;
}
else {
hn = lastRun;
ln = null;
}
for (Node<K,V> p = f; p != lastRun; p = p.next) {
int ph = p.hash; K pk = p.key; V pv = p.val;
if ((ph & n) == 0)
ln = new Node<K,V>(ph, pk, pv, ln);
else
hn = new Node<K,V>(ph, pk, pv, hn);
}
setTabAt(nextTab, i, ln);
setTabAt(nextTab, i + n, hn);
setTabAt(tab, i, fwd);
advance = true;
}
else if (f instanceof TreeBin) {
.....
}
}
How the data is stored:
Now that we know how ConcurrentHashMap achieves thread safety, we should at least also know how it stores its data. The storage structure is shown in the diagram below:
Some readers will look at that and think: isn't this just HashMap's storage structure? Since JDK 1.8 removed the Segments, the structure really is almost identical to HashMap's: an array of buckets, each holding a linked list or a red-black tree. On top of that HashMap-like layout it adds thread safety, and the head node of each bucket is what gets locked.
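Concretely, the underlying storage is nothing more than a volatile array of Node buckets plus a few volatile bookkeeping fields (the same fields whose Unsafe offsets were registered above). These declarations are lightly condensed from the JDK 1.8 source:
/** The array of bins. Lazily initialized upon first insertion. */
transient volatile Node<K,V>[] table;
/** The next table to use; non-null only while resizing. */
private transient volatile Node<K,V>[] nextTable;
private transient volatile long baseCount;     // base of the element counter
private transient volatile int sizeCtl;        // table initialization and resize control
private transient volatile int transferIndex;  // next bin to claim when splitting during a resize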
Important member variables:
1. capacity: the capacity, i.e. the size of the map's table. The source defines a default and a maximum: the default is used when no capacity is specified, and the maximum is the size beyond which the table will not grow.
/**
* The largest possible table capacity. This value must be
* exactly 1<<30 to stay within Java array allocation and indexing
* bounds for power of two table sizes, and is further required
* because the top two bits of 32bit hash fields are used for
* control purposes.
*/
private static final int MAXIMUM_CAPACITY = 1 << 30;
/**
* The default initial table capacity. Must be a power of 2
* (i.e., at least 1) and at most MAXIMUM_CAPACITY.
*/
private static final int DEFAULT_CAPACITY = 16;
2. loadFactor: the load factor, the same idea as in HashMap and with the same default of 0.75f. If this is unfamiliar, see the earlier post on HashMap. (A quick check of the n - (n >>> 2) shortcut mentioned in the comment below follows the code.)
/**
* The load factor for this table. Overrides of this value in
* constructors affect only the initial table capacity. The
* actual floating point value isn't normally used -- it is
* simpler to use expressions such as {@code n - (n >>> 2)} for
* the associated resizing threshold.
*/
private static final float LOAD_FACTOR = 0.75f;
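As promised, a tiny snippet confirming that for a power-of-two capacity n the expression n - (n >>> 2) equals 0.75 * n, i.e. the resize threshold you would expect from the default load factor:
int n = 16;                       // DEFAULT_CAPACITY
int threshold = n - (n >>> 2);    // 16 - (16 / 4) = 12, i.e. 16 * 0.75
System.out.println(threshold);    // prints 12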
3. TREEIFY_THRESHOLD and UNTREEIFY_THRESHOLD: mostly for awareness, these two control the conversion between a linked list and a red-black tree. The former means that once a bin grows past this value the list is converted to a red-black tree; the latter means that once a tree bin shrinks below this value it is converted back to a list. Why the list is converted to a red-black tree at all was explained in detail in the HashMap post.
/**
* The bin count threshold for using a tree rather than list for a
* bin. Bins are converted to trees when adding an element to a
* bin with at least this many nodes. The value must be greater
* than 2, and should be at least 8 to mesh with assumptions in
* tree removal about conversion back to plain bins upon
* shrinkage.
*/
static final int TREEIFY_THRESHOLD = 8;
/**
* The bin count threshold for untreeifying a (split) bin during a
* resize operation. Should be less than TREEIFY_THRESHOLD, and at
* most 6 to mesh with shrinkage detection under removal.
*/
static final int UNTREEIFY_THRESHOLD = 6;
4. The next three parameters are also mostly for awareness. They are used when resizing and when helping a resize (if a thread enters put and finds the map in the middle of a resize, it will pitch in and help), and they will be touched on in the next post; a quick look at the related resizeStamp method follows the constants below.
/**
* The number of bits used for generation stamp in sizeCtl.
* Must be at least 6 for 32bit arrays.
*/
private static int RESIZE_STAMP_BITS = 16;
/**
* The maximum number of threads that can help resize.
* Must fit in 32 - RESIZE_STAMP_BITS bits.
*/
private static final int MAX_RESIZERS = (1 << (32 - RESIZE_STAMP_BITS)) - 1;
/**
* The bit shift for recording size stamp in sizeCtl.
*/
private static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;
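For reference, these constants feed into the resizeStamp method in the same class. It derives a stamp from the table size n; while a resize is running the stamp sits in the high RESIZE_STAMP_BITS bits of sizeCtl and the low bits track the participating threads (more on this in the next post). From the JDK 1.8 source:
/**
 * Returns the stamp bits for resizing a table of size n.
 * Must be negative when shifted left by RESIZE_STAMP_SHIFT.
 */
static final int resizeStamp(int n) {
    return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
}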
5. The next two fields are important: threads use them to tell what state a bin is currently in. MOVED means the node is a ForwardingNode, i.e. some thread has already processed (migrated) this bin; TREEBIN means the node marks a tree bin.
static final int MOVED = -1; // hash for forwarding nodes
static final int TREEBIN = -2; // hash for roots of trees
6. sizeCtl, the control flag. This field is extremely important and shows up in every phase of ConcurrentHashMap; different values stand for different situations and serve different purposes:
① A negative value means initialization or a resize is in progress.
② -N means that N-1 threads are performing the resize (as mentioned above, a thread that comes in to add a value and finds a resize in progress will help with it).
③ Zero or a positive value means the hash table has not been initialized yet, in which case the value is the initial size to use, or, once the table exists, the size at which the next resize will be triggered, much like a resize threshold. It is kept at 0.75 times the current capacity, matching the load factor: once the actual number of elements reaches sizeCtl, a resize happens.
Note: in some situations this value plays exactly the role of HashMap's threshold: it controls when to resize.
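To make those states concrete, here is a minimal illustrative sketch (not JDK code; it glosses over the exact bit encoding of the negative resize value) of how a sizeCtl value could be read:
// Illustrative only: maps a sizeCtl value to the state described in ①-③ above.
// The real resize encoding also packs a resize stamp into the high bits.
static String describeSizeCtl(int sc, boolean tableInitialized) {
    if (sc == -1)
        return "some thread is initializing the table";
    if (sc < -1)
        return "a resize is in progress, with other threads able to help";
    if (!tableInitialized)
        return "initial capacity to use: " + (sc == 0 ? 16 : sc);
    return "next resize threshold (about 0.75 * capacity): " + sc;
}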
A few extremely important inner classes:
To understand ConcurrentHashMap's internals, you have to know the inner classes it relies on.
1. Node
/**
* Key-value entry. This class is never exported out as a
* user-mutable Map.Entry (i.e., one supporting setValue; see
* MapEntry below), but can be used for read-only traversals used
* in bulk tasks. Subclasses of Node with a negative hash field
* are special, and contain null keys and values (but are never
* exported). Otherwise, keys and vals are never null.
*/
static class Node<K,V> implements Map.Entry<K,V> {
final int hash;
final K key;
volatile V val; // declared volatile
volatile Node<K,V> next; // declared volatile
Node(int hash, K key, V val, Node<K,V> next) {
this.hash = hash;
this.key = key;
this.val = val;
this.next = next;
}
public final K getKey() { return key; }
public final V getValue() { return val; }
public final int hashCode() { return key.hashCode() ^ val.hashCode(); }
public final String toString(){ return key + "=" + val; }
public final V setValue(V value) {
throw new UnsupportedOperationException(); // values may not be set directly through setValue
}
public final boolean equals(Object o) {
Object k, v, u; Map.Entry<?,?> e;
return ((o instanceof Map.Entry) &&
(k = (e = (Map.Entry<?,?>)o).getKey()) != null &&
(v = e.getValue()) != null &&
(k == key || k.equals(key)) &&
(v == (u = val) || v.equals(u)));
}
/**
* Virtualized support for map.get(); overridden in subclasses.
*/
Node<K,V> find(int h, Object k) {
Node<K,V> e = this;
if (k != null) {
do {
K ek;
if (e.hash == h &&
((ek = e.key) == k || (ek != null && k.equals(ek))))
return e;
} while ((e = e.next) != null);
}
return null;
}
}
As the Node source above shows, its val and next fields are declared volatile. volatile was covered in the previous post: it gives val and next visibility and ordering guarantees, which helps keep things thread safe. If you read the code closely you will also notice that setValue() throws an exception, so setting a value directly through that method is forbidden. Node also provides a find() method, which is used to look up a particular node in a bin.
2. TreeNode and TreeBin
/**
* Nodes for use in TreeBins
*/
static final class TreeNode<K,V> extends Node<K,V> {
TreeNode<K,V> parent; // red-black tree links
TreeNode<K,V> left;
TreeNode<K,V> right;
TreeNode<K,V> prev; // needed to unlink next upon deletion
boolean red;
TreeNode(int hash, K key, V val, Node<K,V> next,
TreeNode<K,V> parent) {
super(hash, key, val, next);
this.parent = parent;
}
Node<K,V> find(int h, Object k) {
return findTreeNode(h, k, null);
}
/**
* Returns the TreeNode (or null if not found) for the given key
* starting at given root.
*/
final TreeNode<K,V> findTreeNode(int h, Object k, Class<?> kc) {
if (k != null) {
TreeNode<K,V> p = this;
do {
int ph, dir; K pk; TreeNode<K,V> q;
TreeNode<K,V> pl = p.left, pr = p.right;
if ((ph = p.hash) > h)
p = pl;
else if (ph < h)
p = pr;
else if ((pk = p.key) == k || (pk != null && k.equals(pk)))
return p;
else if (pl == null)
p = pr;
else if (pr == null)
p = pl;
else if ((kc != null ||
(kc = comparableClassFor(k)) != null) &&
(dir = compareComparables(kc, k, pk)) != 0)
p = (dir < 0) ? pl : pr;
else if ((q = pr.findTreeNode(h, k, kc)) != null)
return q;
else
p = pl;
} while (p != null);
}
return null;
}
}
// TreeBin is long, so I have only excerpted its constructor:
TreeBin(TreeNode<K,V> b) {
super(TREEBIN, null, null, null);
this.first = b;
TreeNode<K,V> r = null;
for (TreeNode<K,V> x = b, next; x != null; x = next) {
next = (TreeNode<K,V>)x.next;
x.left = x.right = null;
if (r == null) {
x.parent = null;
x.red = false;
r = x;
}
else {
K k = x.key;
int h = x.hash;
Class<?> kc = null;
for (TreeNode<K,V> p = r;;) {
int dir, ph;
K pk = p.key;
if ((ph = p.hash) > h)
dir = -1;
else if (ph < h)
dir = 1;
else if ((kc == null &&
(kc = comparableClassFor(k)) == null) ||
(dir = compareComparables(kc, k, pk)) == 0)
dir = tieBreakOrder(k, pk);
TreeNode<K,V> xp = p;
if ((p = (dir <= 0) ? p.left : p.right) == null) {
x.parent = xp;
if (dir <= 0)
xp.left = x;
else
xp.right = x;
r = balanceInsertion(r, x);
break;
}
}
}
}
this.root = r;
assert checkInvariants(root);
}
As the source above shows, ConcurrentHashMap does not put TreeNodes into the table directly; it wraps them in a TreeBin, so what actually sits in a bucket is a TreeBin object rather than a TreeNode. TreeNode extends Node precisely so that it carries the next pointer, and TreeBin can use that next pointer to walk to the following TreeNode; this is one of the bigger differences from HashMap.
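This is also why the TreeBin constructor passes the special hash TREEBIN (-2) to super: a lookup only needs to check for a negative hash on the head node and delegate to that node's own find(), which for a TreeBin performs a red-black tree search. A simplified excerpt from get() shows the dispatch:
// simplified excerpt from get(): e is the bucket's head node, eh its hash
else if (eh < 0)                   // negative hash: TreeBin, ForwardingNode, ...
    return (p = e.find(h, key)) != null ? p.val : null;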
3. ForwardingNode
/**
* A node inserted at head of bins during transfer operations.
*/
static final class ForwardingNode<K,V> extends Node<K,V> {
final Node<K,V>[] nextTable;
ForwardingNode(Node<K,V>[] tab) {
super(MOVED, null, null, null);
this.nextTable = tab;
}
Node<K,V> find(int h, Object k) {
// loop to avoid arbitrarily deep recursion on forwarding nodes
outer: for (Node<K,V>[] tab = nextTable;;) {
Node<K,V> e; int n;
if (k == null || tab == null || (n = tab.length) == 0 ||
(e = tabAt(tab, (n - 1) & h)) == null)
return null;
for (;;) {
int eh; K ek;
if ((eh = e.hash) == h &&
((ek = e.key) == k || (ek != null && k.equals(ek))))
return e;
if (eh < 0) {
if (e instanceof ForwardingNode) {
tab = ((ForwardingNode<K,V>)e).nextTable;
continue outer;
}
else
return e.find(h, k);
}
if ((e = e.next) == null)
return null;
}
}
}
}
This static inner class is quite ingenious. It is used mainly during a resize: it is the node class that links the two tables, with a nextTable field pointing at the new table. Be careful how you read that: it does not mean the map permanently keeps two tables. During a resize, when a thread finds a bin that is empty it sets that slot to a ForwardingNode, and when it finishes migrating a bin it likewise sets the old slot to a ForwardingNode; other threads that see the ForwardingNode simply move on and keep traversing, which neatly solves the multi-thread safety problem. Some readers will ask: what if one thread has started processing a bin but has not finished, so the bin is not yet a ForwardingNode, and another thread comes in? Then you missed the earlier part: while a bin is being processed its head node is locked, as I explained above.
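To tie this back to put: a writer that lands on a bin whose head node has hash MOVED knows that node is a ForwardingNode, so instead of waiting it joins the resize. A simplified excerpt from putVal():
// simplified excerpt from putVal(): f is the bin's head node, fh its hash
else if ((fh = f.hash) == MOVED)
    tab = helpTransfer(tab, f);    // help finish the resize, then retry on the new table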
That is it for the getting-acquainted stage. With this background in place, the next post, which is the last in the series, will go through the transfer() resize, the put() insert and the get() lookup methods line by line.
If you find any problems in this post, or have any thoughts about it, feel free to contact me; my WeChat QR code is below: