CyclicBarrier Internals

Preface

If we want every thread to reach the same point before any of them continues, CyclicBarrier is a good choice.

Getting Started

In the example below, I want 10 threads to all reach the same point before any of them moves on.

    import java.util.concurrent.CyclicBarrier;

    public class CyclicBarrierDemo {

        public static void main(String[] args) {
            int num = 10;
            // The barrier trips once num threads have called await().
            CyclicBarrier cb = new CyclicBarrier(num);
            for (int i = 0; i < num; i++) {
                Thread t = new Thread(new Task(cb, i + 1));
                try {
                    // Stagger the starts by one second each.
                    Thread.sleep(1000L);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                t.start();
            }
        }

        static class Task implements Runnable {
            private final CyclicBarrier cb;
            private final int id;

            Task(CyclicBarrier cb, int id) {
                this.cb = cb;
                this.id = id;
            }

            @Override
            public void run() {
                try {
                    System.out.println("Task#" + id + " is waiting...");
                    // Blocks until all parties have arrived at the barrier.
                    cb.await();
                    System.out.println("Task#" + id + " is finished...");
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }
    }

The output looks like this:

Task#1 is waiting...
Task#2 is waiting...
Task#3 is waiting...
Task#4 is waiting...
Task#5 is waiting...
Task#6 is waiting...
Task#7 is waiting...
Task#8 is waiting...
Task#9 is waiting...
Task#10 is waiting...
Task#10 is finished...
Task#1 is finished...
Task#3 is finished...
Task#4 is finished...
Task#5 is finished...
Task#6 is finished...
Task#2 is finished...
Task#8 is finished...
Task#9 is finished...
Task#7 is finished...

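As you can see, the threads start one after another. As soon as the tenth thread reaches the point where the other nine are waiting, all of them are released, and from then on the order of the output is no longer predictable. This is a fairly simple CyclicBarrier example. Running jstack while the threads are waiting shows the state of a waiting thread: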

"Thread-0" #12 prio=5 os_prio=0 tid=0x00007f7438136800 nid=0x3dc4 waiting on condition [0x00007f7411de2000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000d6d7ce38> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.CyclicBarrier.dowait(CyclicBarrier.java:234)
        at java.util.concurrent.CyclicBarrier.await(CyclicBarrier.java:362)
        at com.jdk.CyclicBarrierDemo$Task.run(CyclicBarrierDemo.java:39)
        at java.lang.Thread.run(Thread.java:745)

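That makes things clear. Reading from the deepest frame outward, we have LockSupport's park method, the await method of ConditionObject in AQS, and CyclicBarrier's dowait method. These are the ones to study.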
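
The same WAITING state that jstack reports can also be observed from code. The sketch below is only an illustration: the class name WaitingStateProbe and the method observeWaiterState are made up for this example, and it assumes a two-party barrier so that one thread is guaranteed to park until the main thread arrives.

```java
import java.util.concurrent.CyclicBarrier;

final class WaitingStateProbe {

    // Returns the thread state observed while a thread is blocked in CyclicBarrier.await().
    static Thread.State observeWaiterState() {
        CyclicBarrier barrier = new CyclicBarrier(2);
        Thread waiter = new Thread(() -> {
            try {
                barrier.await();
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        }, "waiter");
        waiter.start();
        try {
            // Poll until the waiter has parked inside await().
            Thread.State observed;
            do {
                Thread.sleep(10);
                observed = waiter.getState();
            } while (observed != Thread.State.WAITING);
            barrier.await();   // the second party arrives, so the waiter is released
            waiter.join();
            return observed;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(observeWaiterState());
    }
}
```

Thread.getState() only reports WAITING; the "(parking)" detail and the stack frames are something only a thread dump such as jstack shows.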

CyclicBarrier's dowait method

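The name dowait is quite descriptive. It tells us right away that this is the method that makes the threads stop together as a group, so this is where to look.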

    // This is the public API that CyclicBarrier exposes to callers
    public int await() throws InterruptedException, BrokenBarrierException {
        try {
            return dowait(false, 0L);
        } catch (TimeoutException toe) {
            throw new Error(toe); // cannot happen
        }
    }

    /**
     * Main barrier code, covering the various policies.
     */
    private int dowait(boolean timed, long nanos)
        throws InterruptedException, BrokenBarrierException,
               TimeoutException {
        // This is a critical section, so it has to be guarded by a lock.
        // Every CyclicBarrier instance owns exactly one (final) ReentrantLock.
        final ReentrantLock lock = this.lock;
        lock.lock();
        try {
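            // A CyclicBarrier can be reused, and reuse is implemented through Generation.
            // Whenever the barrier trips, or reset() is called to start over,
            // the generation field is replaced with a new Generation object.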
            final Generation g = generation;

            // The barrier was broken before all threads reached the same point.
            // That is an abnormal state, so throw an exception.
            if (g.broken)
                throw new BrokenBarrierException();

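            // The current thread's interrupt flag is set, so it cannot wait here.
            // breakBarrier() marks this generation as broken and wakes the other waiters,
            // then InterruptedException is thrown to the caller.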
            if (Thread.interrupted()) {
                breakBarrier();
                throw new InterruptedException();
            }

            // Every thread that gets here while holding the lock decrements count.
            // --count is a three-step read-modify-write, but the lock makes that safe.
            int index = --count;
            if (index == 0) {  // tripped
                // Reaching this point means the last party has arrived and the barrier trips
                boolean ranAction = false;
                try {
                    // As documented, a Runnable can be supplied
                    // to be run when the barrier trips.
                    final Runnable command = barrierCommand;
                    if (command != null)
                        command.run();
                    ranAction = true;
                    // Install a new Generation so the barrier can be used again.
                    nextGeneration();
                    return 0;
                } finally {
                    if (!ranAction)
                        breakBarrier();
                }
            }

            // loop until tripped, broken, interrupted, or timed out
            for (;;) {
                try {
                    // Untimed waits call ConditionObject's await, timed waits call awaitNanos;
                    // either way the current thread blocks here.
                    // The lock it holds is released inside await or awaitNanos.
                    if (!timed)
                        trip.await();
                    else if (nanos > 0L)
                        nanos = trip.awaitNanos(nanos);
                } catch (InterruptedException ie) {
                    if (g == generation && ! g.broken) {
                        breakBarrier();
                        throw ie;
                    } else {
                        // We're about to finish waiting even if we had not
                        // been interrupted, so this interrupt is deemed to
                        // "belong" to subsequent execution.
                        Thread.currentThread().interrupt();
                    }
                }
                // After waking up, work out why:
                // if this generation was broken while we waited, throw BrokenBarrierException;
                // if generation has been replaced, the barrier tripped normally, so return our arrival index.
                if (g.broken)
                    throw new BrokenBarrierException();

                if (g != generation)
                    return index;
                // The wait timed out: break the barrier and throw TimeoutException
                if (timed && nanos <= 0L) {
                    breakBarrier();
                    throw new TimeoutException();
                }
            }
        } finally {
            // Unlocking in finally is a good habit
            lock.unlock();
        }
    }
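
The two ideas in dowait that are easiest to see from outside are the barrier action (barrierCommand) and reuse through a new generation. The following is a minimal sketch of both; the class name BarrierReuseDemo and the method countTrips are made up for this example, and only public CyclicBarrier API is used.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

final class BarrierReuseDemo {

    // Sends `parties` threads through the same barrier `rounds` times and
    // returns how many times the barrier action ran.
    static int countTrips(int parties, int rounds) {
        AtomicInteger trips = new AtomicInteger();
        // The Runnable plays the role of barrierCommand in dowait: it runs once
        // per trip, on the last thread to arrive, before the others are released.
        CyclicBarrier barrier = new CyclicBarrier(parties, trips::incrementAndGet);
        Thread[] workers = new Thread[parties];
        for (int i = 0; i < parties; i++) {
            workers[i] = new Thread(() -> {
                try {
                    for (int r = 0; r < rounds; r++) {
                        barrier.await();   // same barrier object, new generation each round
                    }
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return trips.get();
    }

    public static void main(String[] args) {
        System.out.println(countTrips(3, 2));   // prints 2: one trip per round
    }
}
```

No reset() call is needed between rounds, because a normal trip already installs the next generation.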

The await method of ConditionObject in AQS

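    // This line is still inside CyclicBarrier.
    // Threads wait through trip, which implements the Condition interface.
    // It is obtained from the ReentrantLock, which is why waiting on it can release that lock.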
    private final Condition trip = lock.newCondition();
    
        // The method below is the real protagonist
        public final void await() throws InterruptedException {
            if (Thread.interrupted())
                throw new InterruptedException();
            // First add a Node with waitStatus = CONDITION.
            // It is the same Node class that AQS uses for its CLH queue,
            // but it goes onto the condition queue, not the sync queue.
            Node node = addConditionWaiter();
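            // The lock is released here. ReentrantLock's internal Sync class extends AQS,
            // so this is the same synchronizer releasing its own state.
            // fullyRelease also returns how many times the thread held the lock,
            // i.e. the value of the AQS state field. It is kept in savedState
            // and used to reacquire the lock once the barrier has been passed.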
            int savedState = fullyRelease(node);
            int interruptMode = 0;
            // As long as the node is not on the sync queue, the current thread calls LockSupport's park method.
            // A thread that calls park blocks.
            while (!isOnSyncQueue(node)) {
                LockSupport.park(this);
                // 0 means the wait was not interrupted, so keep waiting
                if ((interruptMode = checkInterruptWhileWaiting(node)) != 0)
                    break;
            }
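            // The node is on the sync queue by now (signalled, or cancelled by an interrupt).
            // acquireQueued waits in that queue until the lock is reacquired with savedState holds.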
            if (acquireQueued(node, savedState) && interruptMode != THROW_IE)
                interruptMode = REINTERRUPT;
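            // The condition queue is a singly linked list chained through nextWaiter.
            // Waiters that have been cancelled are cleaned up here.
            // unlinkCancelledWaiters is the usual way of removing nodes from a singly linked list:
            // walk from the head and relink the nextWaiter pointers.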
            if (node.nextWaiter != null) // clean up if cancelled
                unlinkCancelledWaiters();
            // Non-zero means an interrupt really happened: either set the thread's interrupt flag
            // or throw an exception. The details are in reportInterruptAfterWait.
            if (interruptMode != 0)
                reportInterruptAfterWait(interruptMode);
        }
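
The role of savedState can be checked with a plain ReentrantLock and Condition. This is only a sketch under assumptions: the class AwaitReleasesLockDemo and its method are invented for this example, and the waiter locks twice on purpose so that the hold count is 2 before it waits.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

final class AwaitReleasesLockDemo {

    // Returns {hold count before await, hold count after await} as seen by the waiter.
    static int[] holdCountsAroundAwait() {
        ReentrantLock lock = new ReentrantLock();
        Condition cond = lock.newCondition();
        boolean[] signalled = {false};   // guarded by lock
        int[] holds = new int[2];

        Thread waiter = new Thread(() -> {
            lock.lock();
            lock.lock();                 // reentrant acquire: two holds
            try {
                holds[0] = lock.getHoldCount();
                while (!signalled[0]) {
                    cond.await();        // gives up both holds while parked
                }
                holds[1] = lock.getHoldCount();   // both holds are restored on return
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
                lock.unlock();
            }
        });
        waiter.start();

        try {
            boolean done = false;
            while (!done) {
                // This lock() can only succeed while the waiter is inside await(),
                // or before the waiter has taken the lock at all.
                lock.lock();
                try {
                    if (lock.hasWaiters(cond)) {
                        signalled[0] = true;
                        cond.signalAll();
                        done = true;
                    }
                } finally {
                    lock.unlock();
                }
                if (!done) {
                    Thread.sleep(10);
                }
            }
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return holds;
    }

    public static void main(String[] args) {
        int[] holds = holdCountsAroundAwait();
        System.out.println(holds[0] + " -> " + holds[1]);
    }
}
```

The main thread manages to lock while the waiter nominally holds the lock twice, which is the visible effect of fullyRelease; the waiter getting the same hold count back afterwards is the effect of acquireQueued(node, savedState).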
    

Waking the threads

    private void breakBarrier() {
        generation.broken = true;
        count = parties;
        trip.signalAll();
    }
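
From the caller's side, breakBarrier() shows up as isBroken() returning true and as BrokenBarrierException for every other party. Below is a small sketch that triggers it with a timed await; the class name BrokenBarrierDemo and the method observe are made up for this example, and a timeout is only one of the ways to break a barrier.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BrokenBarrierDemo {

    // Breaks a two-party barrier by timing out alone, then records what callers see.
    static String observe() {
        CyclicBarrier barrier = new CyclicBarrier(2);
        StringBuilder log = new StringBuilder();
        try {
            barrier.await(100, TimeUnit.MILLISECONDS);   // nobody else ever arrives
        } catch (TimeoutException e) {
            // dowait called breakBarrier() before throwing TimeoutException
            log.append("timeout,broken=").append(barrier.isBroken());
        } catch (InterruptedException | BrokenBarrierException e) {
            log.append("unexpected");
        }
        try {
            barrier.await();   // a later arrival fails at the g.broken check
        } catch (BrokenBarrierException e) {
            log.append(",late=BrokenBarrierException");
        } catch (InterruptedException e) {
            log.append(",unexpected");
        }
        barrier.reset();       // starts a fresh, unbroken generation
        log.append(",afterReset=").append(barrier.isBroken());
        return log.toString();
    }

    public static void main(String[] args) {
        // prints timeout,broken=true,late=BrokenBarrierException,afterReset=false
        System.out.println(observe());
    }
}
```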

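breakBarrier() ends by calling trip.signalAll(), and nextGeneration(), which runs when the barrier trips normally, calls it as well. So what remains to look at is ConditionObject's signalAll method.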

ConditionObject's signalAll method

        public final void signalAll() {
            // If the current thread does not hold the lock exclusively, throw an exception
            if (!isHeldExclusively())
                throw new IllegalMonitorStateException();
            Node first = firstWaiter;
            if (first != null)
                doSignalAll(first);
        }
        private void doSignalAll(Node first) {
            lastWaiter = firstWaiter = null;
            do {
                // Detach the node from the condition queue by clearing its nextWaiter pointer
                Node next = first.nextWaiter;
                first.nextWaiter = null;
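                // Move the detached nodes onto the sync queue one by one;
                // a thread is only unparked right away in the special cases handled in transferForSignal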
                transferForSignal(first);
                first = next;
            } while (first != null);
        }
        
        final boolean transferForSignal(Node node) {
            /*
             * Change the node's status from CONDITION to 0.
             * 0 is the initial status of a node on the sync queue: it is none of
             * CONDITION, CANCELLED or SIGNAL.
             * If the CAS fails, the node has already been cancelled, so return false.
             */
            if (!compareAndSetWaitStatus(node, Node.CONDITION, 0))
                return false;

            /*
             * Splice onto queue and try to set waitStatus of predecessor to
             * indicate that thread is (probably) waiting. If cancelled or
             * attempt to set waitStatus fails, wake up to resync (in which
             * case the waitStatus can be transiently and harmlessly wrong).
             */
            // This is the enqueue onto the sync queue mentioned above; enq returns the predecessor
            Node p = enq(node);
            int ws = p.waitStatus;
            // ws > 0 means the predecessor is CANCELLED.
            // If the predecessor is cancelled, or its status cannot be set to SIGNAL,
            // unpark the thread right away so it can resynchronize and compete for the lock itself.
            // The source describes the possibly wrong waitStatus as transient and harmless.
            // Either way the thread stays inside AQS's acquireQueued until it gets the lock.
            if (ws > 0 || !compareAndSetWaitStatus(p, ws, Node.SIGNAL))
                LockSupport.unpark(node.thread);
            return true;
        }
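
The move from the condition queue to the sync queue can be watched through ReentrantLock's monitoring methods getWaitQueueLength(Condition) and getQueueLength(). This is a sketch rather than a proof: the class QueueTransferDemo and its method are invented for this example, and getQueueLength() is documented as an estimate, although nothing can enter or leave the queue here while the signalling thread still holds the lock.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

final class QueueTransferDemo {

    // Returns {condition-queue length before signalAll,
    //          condition-queue length after signalAll,
    //          sync-queue length after signalAll}.
    static int[] queueLengths(int waiters) {
        ReentrantLock lock = new ReentrantLock();
        Condition trip = lock.newCondition();
        boolean[] released = {false};   // guarded by lock
        Thread[] threads = new Thread[waiters];
        for (int i = 0; i < waiters; i++) {
            threads[i] = new Thread(() -> {
                lock.lock();
                try {
                    while (!released[0]) {
                        trip.await();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    lock.unlock();
                }
            });
            threads[i].start();
        }

        int[] lengths = new int[3];
        try {
            boolean done = false;
            while (!done) {
                lock.lock();
                try {
                    // Wait until every thread is parked on the condition queue.
                    if (lock.getWaitQueueLength(trip) == waiters) {
                        lengths[0] = lock.getWaitQueueLength(trip);
                        released[0] = true;
                        trip.signalAll();
                        // Still holding the lock: the waiters have left the condition
                        // queue but cannot return from await() yet.
                        lengths[1] = lock.getWaitQueueLength(trip);
                        lengths[2] = lock.getQueueLength();
                        done = true;
                    }
                } finally {
                    lock.unlock();
                }
                if (!done) {
                    Thread.sleep(10);
                }
            }
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return lengths;
    }

    public static void main(String[] args) {
        int[] l = queueLengths(3);
        System.out.println(l[0] + " " + l[1] + " " + l[2]);
    }
}
```

With three waiters I would expect the three numbers to be 3, 0 and 3: all waiters on the condition queue before signalAll, none afterwards, and all of them queued for the lock.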

Afterword

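To sum up: to make a fixed number of threads all reach the same point before any of them continues, the threads have to wait and be counted.
The condition queue and sync queue in AQS are designed around exactly this idea. A thread that has to wait joins the condition queue. When the threads are released they still have to reacquire the lock before they can get past CyclicBarrier's await method, so they are moved to the sync queue to acquire it.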
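
To make the summary concrete, here is a stripped-down barrier built from the same pieces: a ReentrantLock, a Condition, a count and a generation marker. It is a teaching sketch, not the JDK implementation. The class name SimpleBarrier is made up, and broken-barrier handling, timeouts and the barrier action are deliberately left out, so an interrupted waiter leaves this barrier in an unusable state.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical teaching class: wait, count, and wake everyone on the last arrival.
final class SimpleBarrier {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition trip = lock.newCondition();
    private final int parties;
    private int count;                         // parties still missing
    private Object generation = new Object();  // replaced on every trip

    SimpleBarrier(int parties) {
        if (parties <= 0) throw new IllegalArgumentException("parties must be positive");
        this.parties = parties;
        this.count = parties;
    }

    // Returns the arrival index: parties - 1 for the first arrival, 0 for the last.
    int await() throws InterruptedException {
        lock.lock();
        try {
            Object g = generation;
            int index = --count;
            if (index == 0) {                  // last arrival trips the barrier
                count = parties;
                generation = new Object();
                trip.signalAll();              // moves waiters to the sync queue
                return 0;
            }
            while (g == generation) {          // guards against spurious wakeups
                trip.await();                  // releases the lock while parked
            }
            return index;
        } finally {
            lock.unlock();
        }
    }

    // Demo helper: sums the arrival indexes seen by `parties` threads.
    static int sumOfArrivalIndexes(int parties) {
        SimpleBarrier barrier = new SimpleBarrier(parties);
        AtomicInteger sum = new AtomicInteger();
        Thread[] threads = new Thread[parties];
        for (int i = 0; i < parties; i++) {
            threads[i] = new Thread(() -> {
                try {
                    sum.addAndGet(barrier.await());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum.get();
    }

    public static void main(String[] args) {
        System.out.println(sumOfArrivalIndexes(4));   // prints 6 (3 + 2 + 1 + 0)
    }
}
```

Comparing this with dowait shows what the real class adds on top: the broken flag, interrupt and timeout handling, and the optional barrier action.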
