Understanding ConcurrentHashMap in JDK 1.8, Line by Line (Part 2)

Introduction:

In the previous post we covered the background knowledge needed for ConcurrentHashMap in JDK 1.8, as a basic entry point. The JDK 1.8 ConcurrentHashMap changed so much that this groundwork really is necessary. This second post is about getting to know ConcurrentHashMap: I will introduce its key fields and some of its key inner classes. You can read it alongside the earlier HashMap posts, since much of it is very similar. The posts I have collected so far all cover topics that come up extremely often in interviews. You can follow the link: http://blog.csdn.net/u012403290

How thread safety is achieved:
We all know that the defining property of ConcurrentHashMap is thread safety, so how does it achieve it? In JDK 1.8 it relies mainly on CAS. CAS and lock-free operations were introduced in the previous post, so I won't repeat that here. On top of CAS it implements three atomic operations (atomicity of operations is what guarantees thread safety). Below I have copied the source that shows which fields are accessed through CAS, and which methods implement those atomic operations:

  // Unsafe mechanics: the fields whose updates are made atomic via CAS

    private static final sun.misc.Unsafe U;
    private static final long LOCKSTATE;
    static {
        try {
            U = sun.misc.Unsafe.getUnsafe();
            Class<?> k = TreeBin.class; // this offset belongs to TreeBin, a class introduced later in this post
            LOCKSTATE = U.objectFieldOffset
                (k.getDeclaredField("lockState"));
        } catch (Exception e) {
            throw new Error(e);
        }
    }
--------------------------------------------------------------------------------------
    private static final sun.misc.Unsafe U;
    private static final long SIZECTL;
    private static final long TRANSFERINDEX;
    private static final long BASECOUNT;
    private static final long CELLSBUSY;
    private static final long CELLVALUE;
    private static final long ABASE;
    private static final int ASHIFT;

    static {
        try {
            // these fields are explained below
            U = sun.misc.Unsafe.getUnsafe();
            Class<?> k = ConcurrentHashMap.class;
            SIZECTL = U.objectFieldOffset
                (k.getDeclaredField("sizeCtl"));
            TRANSFERINDEX = U.objectFieldOffset
                (k.getDeclaredField("transferIndex"));
            BASECOUNT = U.objectFieldOffset
                (k.getDeclaredField("baseCount"));
            CELLSBUSY = U.objectFieldOffset
                (k.getDeclaredField("cellsBusy"));
            Class<?> ck = CounterCell.class;
            CELLVALUE = U.objectFieldOffset
                (ck.getDeclaredField("value"));
            Class<?> ak = Node[].class;
            ABASE = U.arrayBaseOffset(ak);
            int scale = U.arrayIndexScale(ak);
            if ((scale & (scale - 1)) != 0)
                throw new Error("data type scale not a power of two");
            ASHIFT = 31 - Integer.numberOfLeadingZeros(scale);
        } catch (Exception e) {
            throw new Error(e);
        }
    }




// The three atomic table-access methods:

    /* ---------------- Table element access -------------- */

    /*
     * Volatile access methods are used for table elements as well as
     * elements of in-progress next table while resizing.  All uses of
     * the tab arguments must be null checked by callers.  All callers
     * also paranoically precheck that tab's length is not zero (or an
     * equivalent check), thus ensuring that any index argument taking
     * the form of a hash value anded with (length - 1) is a valid
     * index.  Note that, to be correct wrt arbitrary concurrency
     * errors by users, these checks must operate on local variables,
     * which accounts for some odd-looking inline assignments below.
     * Note that calls to setTabAt always occur within locked regions,
     * and so in principle require only release ordering, not
     * full volatile semantics, but are currently coded as volatile
     * writes to be conservative.
     */

    @SuppressWarnings("unchecked")
    static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) {
        return (Node<K,V>)U.getObjectVolatile(tab, ((long)i << ASHIFT) + ABASE);
    }

    static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i,
                                        Node<K,V> c, Node<K,V> v) {
        return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v);
    }

    static final <K,V> void setTabAt(Node<K,V>[] tab, int i, Node<K,V> v) {
        U.putObjectVolatile(tab, ((long)i << ASHIFT) + ABASE, v);
    }
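
To make these helpers concrete: when the target bin is empty, the put path publishes a new node with a single CAS and no lock at all (in putVal this is roughly casTabAt(tab, i, null, new Node<K,V>(hash, key, value, null))). Below is a small, self-contained analogy of that idea, using AtomicReferenceArray instead of Unsafe; the class and names are my own, not the JDK's.

    import java.util.concurrent.atomic.AtomicReferenceArray;

    // Analogy only (not JDK source): publish a node into an empty bin with one CAS,
    // which is exactly what casTabAt does on the raw Node[] via Unsafe.
    public class CasBinSketch {
        // Simplified stand-in for ConcurrentHashMap.Node.
        static final class Node {
            final int hash; final Object key; volatile Object val;
            Node(int hash, Object key, Object val) { this.hash = hash; this.key = key; this.val = val; }
        }

        public static void main(String[] args) {
            AtomicReferenceArray<Node> table = new AtomicReferenceArray<>(16);
            int hash = "demo".hashCode();
            int i = (table.length() - 1) & hash;              // bin index = hash & (length - 1)
            Node n = new Node(hash, "demo", "value");
            // Succeeds only if the bin is still empty, mirroring casTabAt(tab, i, null, node).
            boolean inserted = table.compareAndSet(i, null, n);
            System.out.println("inserted without locking: " + inserted);
        }
    }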

The code above lays the groundwork for thread safety. There is one more change brought by JDK 1.8: the older ConcurrentHashMap locked a Segment, but Segments were removed in JDK 1.8, and what is locked now is a single bin's head Node (note that synchronized locks the head node, as the source below shows). This shrinks the lock granularity, so there is less contention and better performance. Here is how it appears in the source:

//This snippet locks the head node during a resize; the same pattern appears in many other places not listed here.
               synchronized (f) {
                    if (tabAt(tab, i) == f) {
                        Node<K,V> ln, hn;
                        if (fh >= 0) {
                            int runBit = fh & n;
                            Node<K,V> lastRun = f;
                            for (Node<K,V> p = f.next; p != null; p = p.next) {
                                int b = p.hash & n;
                                if (b != runBit) {
                                    runBit = b;
                                    lastRun = p;
                                }
                            }
                            if (runBit == 0) {
                                ln = lastRun;
                                hn = null;
                            }
                            else {
                                hn = lastRun;
                                ln = null;
                            }
                            for (Node<K,V> p = f; p != lastRun; p = p.next) {
                                int ph = p.hash; K pk = p.key; V pv = p.val;
                                if ((ph & n) == 0)
                                    ln = new Node<K,V>(ph, pk, pv, ln);
                                else
                                    hn = new Node<K,V>(ph, pk, pv, hn);
                            }
                            setTabAt(nextTab, i, ln);
                            setTabAt(nextTab, i + n, hn);
                            setTabAt(tab, i, fwd);
                            advance = true;
                        }
                        else if (f instanceof TreeBin) {
                            // ... (TreeBin handling omitted)
                        }
                    }
                }
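
For comparison with the CAS path shown earlier, here is a minimal sketch, in my own simplified code rather than JDK source, of the same head-node locking idea as the put path uses it: lock only the bin's head node, re-check that it is still the head, and only then modify the list.

    // Sketch only: per-bin locking with synchronized on the head node.
    public class BinLockSketch {
        // Simplified stand-in for a bin's linked-list node (names are mine, not the JDK's).
        static final class Node {
            final int hash; final Object key; volatile Object val; volatile Node next;
            Node(int hash, Object key, Object val, Node next) {
                this.hash = hash; this.key = key; this.val = val; this.next = next;
            }
        }

        final Node[] table = new Node[16];

        void appendToNonEmptyBin(int i, int hash, Object key, Object val) {
            Node f = table[i];                 // head node of the bin (assumed non-null here)
            synchronized (f) {                 // lock granularity: one bin, not the whole map
                if (table[i] == f) {           // re-check: the head may have changed before we got the lock
                    Node e = f;
                    while (e.next != null)
                        e = e.next;
                    e.next = new Node(hash, key, val, null);   // mutate the list only while holding the bin lock
                }
            }
        }

        public static void main(String[] args) {
            BinLockSketch m = new BinLockSketch();
            m.table[5] = new Node(5, "first", "v1", null);     // pre-existing head of bin 5
            m.appendToNonEmptyBin(5, 5, "second", "v2");
            System.out.println(m.table[5].next.key);           // prints: second
        }
    }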

How data is stored:
Now that we know how ConcurrentHashMap achieves thread safety, we should at least also know how it stores its data. Here is a diagram of the storage structure:
[Figure 1: ConcurrentHashMap storage structure]
Some readers will look at this and think: isn't that HashMap's storage structure? Since JDK 1.8 removed the Segment, the structure really is extremely similar to HashMap's; thread safety is layered on top of the HashMap-style layout, and it is the head node of each bin that gets locked when the bin is modified.

Important fields

1. capacity: the capacity, i.e. the size of the map's table. The source defines a default and a maximum: the default is used when no capacity is specified at construction time, and the maximum is the size beyond which the table can no longer grow.

    /**
     * The largest possible table capacity.  This value must be
     * exactly 1<<30 to stay within Java array allocation and indexing
     * bounds for power of two table sizes, and is further required
     * because the top two bits of 32bit hash fields are used for
     * control purposes.
     */
    private static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * The default initial table capacity.  Must be a power of 2
     * (i.e., at least 1) and at most MAXIMUM_CAPACITY.
     */
    private static final int DEFAULT_CAPACITY = 16;

2. loadFactor: the load factor, the same as in HashMap, with a default of 0.75f. If this is unfamiliar, see the earlier post on HashMap.

    /**
     * The load factor for this table. Overrides of this value in
     * constructors affect only the initial table capacity.  The
     * actual floating point value isn't normally used -- it is
     * simpler to use expressions such as {@code n - (n >>> 2)} for
     * the associated resizing threshold.
     */
    private static final float LOAD_FACTOR = 0.75f;
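
As the comment above notes, the 0.75 factor is normally applied with shift arithmetic rather than floating point. A tiny illustration:

    public class ThresholdDemo {
        public static void main(String[] args) {
            int n = 16;                       // table capacity
            int threshold = n - (n >>> 2);    // 16 - 16/4 = 12, i.e. 0.75 * 16 without floating point
            System.out.println(threshold);    // prints 12 — what sizeCtl holds after a 16-bucket table is initialized
        }
    }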

3. TREEIFY_THRESHOLD and UNTREEIFY_THRESHOLD: for reference, these two constants control conversion between linked lists and red-black trees. The former is the bin size at which a linked list is converted into a red-black tree (when adding to a bin that already holds at least this many nodes); the latter is the size below which a (split) tree is converted back into a linked list. Why lists are turned into red-black trees at all was explained in detail in the HashMap post.

    /**
     * The bin count threshold for using a tree rather than list for a
     * bin.  Bins are converted to trees when adding an element to a
     * bin with at least this many nodes. The value must be greater
     * than 2, and should be at least 8 to mesh with assumptions in
     * tree removal about conversion back to plain bins upon
     * shrinkage.
     */
    static final int TREEIFY_THRESHOLD = 8;

    /**
     * The bin count threshold for untreeifying a (split) bin during a
     * resize operation. Should be less than TREEIFY_THRESHOLD, and at
     * most 6 to mesh with shrinkage detection under removal.
     */
    static final int UNTREEIFY_THRESHOLD = 6;

4. The next three parameters are for reference; they are used during a resize and when threads help with a resize (when a thread entering put finds the map is resizing, it assists with the resize). They will be touched on briefly in the next post.

    /**
     * The number of bits used for generation stamp in sizeCtl.
     * Must be at least 6 for 32bit arrays.
     */
    private static int RESIZE_STAMP_BITS = 16;

    /**
     * The maximum number of threads that can help resize.
     * Must fit in 32 - RESIZE_STAMP_BITS bits.
     */
    private static final int MAX_RESIZERS = (1 << (32 - RESIZE_STAMP_BITS)) - 1;

    /**
     * The bit shift for recording size stamp in sizeCtl.
     */
    private static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;

5. The next two constants are important: they are the special hash values a thread uses to tell what a bin currently holds. MOVED means the node is a ForwardingNode, i.e. some thread has already migrated this bin during a resize; TREEBIN means the node is the root container of a tree bin (a TreeBin).

    static final int MOVED     = -1; // hash for forwarding nodes
    static final int TREEBIN   = -2; // hash for roots of trees

6. sizeCtl, the control flag. This field is extremely important and shows up in every phase of ConcurrentHashMap's life; different values indicate different states and serve different purposes:
①A negative value means initialization or a resize is in progress: -1 means the table is being initialized.
②Other negative values mean a resize is in progress. A common simplification is that -N means N-1 threads are resizing; in the JDK 1.8 code the value is actually a resize stamp in the high 16 bits plus (number of resizing threads + 1) in the low 16 bits, but the intent is the same: it records that a resize is running and how many threads are helping. (As mentioned earlier, a thread that tries to add a value and finds a resize in progress will help with it.)
③Zero or a positive value means either that the hash table has not been initialized yet (in which case the value is the initial capacity to use), or that it is the threshold at which the next resize will be triggered, i.e. 0.75 times the current capacity, which matches the load factor. Once the actual element count reaches sizeCtl, a resize starts.

Note: in this last case the value plays the same role as the threshold in HashMap: it controls when to resize.
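
To tie these states together, here is a small standalone sketch (my own helper, not JDK source) that decodes a sizeCtl value the way the JDK 1.8 code interprets it; the example stamp value assumes a 16-bucket table.

    public class SizeCtlDemo {
        private static final int RESIZE_STAMP_BITS = 16;
        private static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;

        static String describe(int sc) {
            if (sc == -1)
                return "table is being initialized by another thread";
            if (sc < 0) {
                int stamp = sc >>> RESIZE_STAMP_SHIFT;                      // resize stamp (high 16 bits)
                int resizers = (sc & ((1 << RESIZE_STAMP_SHIFT) - 1)) - 1;  // active resizing threads (low 16 bits minus 1)
                return "resize in progress: stamp=" + stamp + ", resizing threads=" + resizers;
            }
            return sc + " = initial capacity to use, or the next resize threshold (0.75 * capacity)";
        }

        public static void main(String[] args) {
            System.out.println(describe(-1));
            System.out.println(describe(12));                 // e.g. threshold of a freshly initialized 16-bucket table
            int duringResize = (32795 << 16) + 2;             // 32795 = resizeStamp(16); "+ 2" = one resizing thread
            System.out.println(describe(duringResize));
        }
    }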

Several critically important inner classes:
To understand ConcurrentHashMap's internals, you have to know its related inner classes.

1. Node

    /**
     * Key-value entry.  This class is never exported out as a
     * user-mutable Map.Entry (i.e., one supporting setValue; see
     * MapEntry below), but can be used for read-only traversals used
     * in bulk tasks.  Subclasses of Node with a negative hash field
     * are special, and contain null keys and values (but are never
     * exported).  Otherwise, keys and vals are never null.
     */
    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        volatile V val;          // declared volatile
        volatile Node<K,V> next; // declared volatile

        Node(int hash, K key, V val, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.val = val;
            this.next = next;
        }

        public final K getKey()       { return key; }
        public final V getValue()     { return val; }
        public final int hashCode()   { return key.hashCode() ^ val.hashCode(); }
        public final String toString(){ return key + "=" + val; }
        public final V setValue(V value) {
            throw new UnsupportedOperationException();  // values may not be set directly on a Node
        }

        public final boolean equals(Object o) {
            Object k, v, u; Map.Entry<?,?> e;
            return ((o instanceof Map.Entry) &&
                    (k = (e = (Map.Entry<?,?>)o).getKey()) != null &&
                    (v = e.getValue()) != null &&
                    (k == key || k.equals(key)) &&
                    (v == (u = val) || v.equals(u)));
        }

        /**
         * Virtualized support for map.get(); overridden in subclasses.
         */
        Node<K,V> find(int h, Object k) {
            Node<K,V> e = this;
            if (k != null) {
                do {
                    K ek;
                    if (e.hash == h &&
                        ((ek = e.key) == k || (ek != null && k.equals(ek))))
                        return e;
                } while ((e = e.next) != null);
            }
            return null;
        }
    }

As the Node source above shows, val and next are declared volatile. volatile was covered in the previous post; it gives reads of val and next visibility and ordering guarantees, which contributes to thread safety. If you read the code carefully you will also notice that setValue() throws an exception: setting a value directly through this method is forbidden. Node also provides a find() method, which is used to locate a particular node.

2. TreeNode and TreeBin

    /**
     * Nodes for use in TreeBins
     */
    static final class TreeNode<K,V> extends Node<K,V> {
        TreeNode<K,V> parent;  // red-black tree links
        TreeNode<K,V> left;
        TreeNode<K,V> right;
        TreeNode<K,V> prev;    // needed to unlink next upon deletion
        boolean red;

        TreeNode(int hash, K key, V val, Node<K,V> next,
                 TreeNode<K,V> parent) {
            super(hash, key, val, next);
            this.parent = parent;
        }

        Node<K,V> find(int h, Object k) {
            return findTreeNode(h, k, null);
        }

        /**
         * Returns the TreeNode (or null if not found) for the given key
         * starting at given root.
         */
        final TreeNode<K,V> findTreeNode(int h, Object k, Class<?> kc) {
            if (k != null) {
                TreeNode<K,V> p = this;
                do  {
                    int ph, dir; K pk; TreeNode<K,V> q;
                    TreeNode<K,V> pl = p.left, pr = p.right;
                    if ((ph = p.hash) > h)
                        p = pl;
                    else if (ph < h)
                        p = pr;
                    else if ((pk = p.key) == k || (pk != null && k.equals(pk)))
                        return p;
                    else if (pl == null)
                        p = pr;
                    else if (pr == null)
                        p = pl;
                    else if ((kc != null ||
                              (kc = comparableClassFor(k)) != null) &&
                             (dir = compareComparables(kc, k, pk)) != 0)
                        p = (dir < 0) ? pl : pr;
                    else if ((q = pr.findTreeNode(h, k, kc)) != null)
                        return q;
                    else
                        p = pl;
                } while (p != null);
            }
            return null;
        }
    }


//TreeBin is long, so only its constructor is shown here:

        TreeBin(TreeNode<K,V> b) {
            super(TREEBIN, null, null, null);
            this.first = b;
            TreeNode<K,V> r = null;
            for (TreeNode<K,V> x = b, next; x != null; x = next) {
                next = (TreeNode<K,V>)x.next;
                x.left = x.right = null;
                if (r == null) {
                    x.parent = null;
                    x.red = false;
                    r = x;
                }
                else {
                    K k = x.key;
                    int h = x.hash;
                    Class<?> kc = null;
                    for (TreeNode<K,V> p = r;;) {
                        int dir, ph;
                        K pk = p.key;
                        if ((ph = p.hash) > h)
                            dir = -1;
                        else if (ph < h)
                            dir = 1;
                        else if ((kc == null &&
                                  (kc = comparableClassFor(k)) == null) ||
                                 (dir = compareComparables(kc, k, pk)) == 0)
                            dir = tieBreakOrder(k, pk);
                        TreeNode<K,V> xp = p;
                        if ((p = (dir <= 0) ? p.left : p.right) == null) {
                            x.parent = xp;
                            if (dir <= 0)
                                xp.left = x;
                            else
                                xp.right = x;
                            r = balanceInsertion(r, x);
                            break;
                        }
                    }
                }
            }
            this.root = r;
            assert checkInvariants(root);
        }

As the source above shows, ConcurrentHashMap does not store TreeNodes in the table directly; a TreeNode is wrapped by a TreeBin. In other words, what actually sits in a ConcurrentHashMap bin is a TreeBin object, not a TreeNode. TreeNode extends Node so that it carries a next pointer, which is used inside TreeBin to find the next TreeNode; this is also one of the bigger differences from HashMap, which places TreeNode objects in the table directly.

3. ForwardingNode

    /**
     * A node inserted at head of bins during transfer operations.
     */
    static final class ForwardingNode<K,V> extends Node<K,V> {
        final Node<K,V>[] nextTable;
        ForwardingNode(Node<K,V>[] tab) {
            super(MOVED, null, null, null);
            this.nextTable = tab;
        }

        Node<K,V> find(int h, Object k) {
            // loop to avoid arbitrarily deep recursion on forwarding nodes
            outer: for (Node<K,V>[] tab = nextTable;;) {
                Node<K,V> e; int n;
                if (k == null || tab == null || (n = tab.length) == 0 ||
                    (e = tabAt(tab, (n - 1) & h)) == null)
                    return null;
                for (;;) {
                    int eh; K ek;
                    if ((eh = e.hash) == h &&
                        ((ek = e.key) == k || (ek != null && k.equals(ek))))
                        return e;
                    if (eh < 0) {
                        if (e instanceof ForwardingNode) {
                            tab = ((ForwardingNode<K,V>)e).nextTable;
                            continue outer;
                        }
                        else
                            return e.find(h, k);
                    }
                    if ((e = e.next) == null)
                        return null;
                }
            }
        }
    }

This static inner class is quite ingenious. It is used mainly during a resize: it is the node type that links the two tables, with a nextTable field pointing at the new table. Note that "two tables" does not mean the map permanently holds two tables; during a resize, when a thread finds a bin that is empty it sets that slot to a ForwardingNode, and after a thread finishes migrating a bin it also sets the old slot to a ForwardingNode. Another thread that encounters a ForwardingNode knows that bin is already handled and continues with the following bins (and a reader follows it into the new table), which neatly solves the multi-threading problem. Some readers will ask: what if one thread has started processing a bin but has not finished, so the bin is not yet a ForwardingNode when another thread arrives? If you read the earlier part carefully you already know the answer: while a bin is being processed, its first node (the head of the bucket) is locked, as explained above.
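
To see the forwarding idea in isolation, here is a self-contained analogy (my own classes, not JDK source): once a bin has been migrated, the old table holds a marker pointing at the new table, so a lookup that lands on a migrated bin simply follows the marker instead of seeing half-moved data.

    public class ForwardingSketch {
        static class Node {
            final Object key, val;
            Node(Object key, Object val) { this.key = key; this.val = val; }
        }
        static final class Forwarding extends Node {
            final Node[] nextTable;
            Forwarding(Node[] nextTable) { super(null, null); this.nextTable = nextTable; }
        }

        static Object find(Node[] tab, Object key, int i) {
            Node e = tab[i];
            while (e instanceof Forwarding)          // bin already moved: retry in the new table
                e = ((Forwarding) e).nextTable[i];   // (the real code recomputes the index for the new length)
            return (e != null && key.equals(e.key)) ? e.val : null;
        }

        public static void main(String[] args) {
            Node[] newTab = new Node[32];
            newTab[3] = new Node("k", "v");           // entry already migrated to the new table
            Node[] oldTab = new Node[16];
            oldTab[3] = new Forwarding(newTab);       // old bin 3 now forwards readers
            System.out.println(find(oldTab, "k", 3)); // prints: v
        }
    }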

That wraps up the "getting to know it" stage. With a reasonable grasp of these pieces, the next post, the final one in this series, will walk line by line through transfer() (resizing), put() (insertion) and get() (lookup).

If you find any problems with this post, or have any thoughts, feel free to contact me.
