JUC Source Code Walkthrough Series: ThreadPoolExecutor

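Thread pools free developers from managing threads by hand, leaving more attention for business code. But because JUC wraps the pool so thoroughly, many developers have lost touch with, or never learned, how it works underneath. It is worth knowing not only what the pool does but why, and the topic is also a staple of interviews at large tech companies. This article walks through the ThreadPoolExecutor source code.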

1. Constructor parameters

// Whichever of the three Executors factory methods below we use to create a pool,
// the pool underneath is a ThreadPoolExecutor
ExecutorService cachedThreadPool = Executors.newCachedThreadPool();
ExecutorService fixedThreadPool = Executors.newFixedThreadPool(2);
ExecutorService singleThreadExecutor = Executors.newSingleThreadExecutor();
/**
 * Creates a new {@code ThreadPoolExecutor} with the given initial
 * parameters.
 *
 * @param corePoolSize the number of threads to keep in the pool, even
 *        if they are idle, unless {@code allowCoreThreadTimeOut} is set
 * @param maximumPoolSize the maximum number of threads to allow in the
 *        pool
 * @param keepAliveTime when the number of threads is greater than
 *        the core, this is the maximum time that excess idle threads
 *        will wait for new tasks before terminating.
 * @param unit the time unit for the {@code keepAliveTime} argument
 * @param workQueue the queue to use for holding tasks before they are
 *        executed.  This queue will hold only the {@code Runnable}
 *        tasks submitted by the {@code execute} method.
 * @param threadFactory the factory to use when the executor
 *        creates a new thread
 * @param handler the handler to use when execution is blocked
 *        because the thread bounds and queue capacities are reached
 * @throws IllegalArgumentException if one of the following holds:
 *         {@code corePoolSize < 0}
 *         {@code keepAliveTime < 0}
 *         {@code maximumPoolSize <= 0}
 *         {@code maximumPoolSize < corePoolSize}
 * @throws NullPointerException if {@code workQueue}
 *         or {@code threadFactory} or {@code handler} is null
 */
public ThreadPoolExecutor(int corePoolSize,                    // number of threads to keep in the pool, even when idle
                          int maximumPoolSize,                 // maximum number of threads in the pool
                          long keepAliveTime,                  // maximum idle time for threads beyond corePoolSize
                          TimeUnit unit,                       // time unit of keepAliveTime
                          BlockingQueue<Runnable> workQueue,   // queue holding tasks that are waiting to run
                          ThreadFactory threadFactory,         // factory used to create threads
                          RejectedExecutionHandler handler) {  // policy applied when the queue is full and no thread is available for an incoming task
    if (corePoolSize < 0 ||
        maximumPoolSize <= 0 ||
        maximumPoolSize < corePoolSize ||
        keepAliveTime < 0)
        throw new IllegalArgumentException();
    if (workQueue == null || threadFactory == null || handler == null)
        throw new NullPointerException();
    this.corePoolSize = corePoolSize;
    this.maximumPoolSize = maximumPoolSize;
    // newCachedThreadPool uses a SynchronousQueue by default,
    // an interesting queue that will be covered in detail later in this series.
    // newSingleThreadExecutor and newFixedThreadPool use a LinkedBlockingQueue
    // created with the no-arg constructor, so its capacity is the default Integer.MAX_VALUE;
    // a burst of submitted tasks can pile up in it and risk an OOM.
    this.workQueue = workQueue;
    this.keepAliveTime = unit.toNanos(keepAliveTime);
    this.threadFactory = threadFactory;
    // ThreadPoolExecutor ships four rejection policies:
    // CallerRunsPolicy      the calling thread runs the task itself
    // AbortPolicy           throws RejectedExecutionException
    // DiscardPolicy         silently drops the task
    // DiscardOldestPolicy   drops the oldest queued task, then retries the current one
    this.handler = handler;
}
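
The comments in the constructor point out that the factory methods rely on an unbounded LinkedBlockingQueue. As a minimal sketch of the alternative, the pool can be built directly with a bounded queue. The class name BoundedPoolFactory and every size below (2, 4, 60 seconds, capacity 100) are illustrative assumptions, not recommendations from the source.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedPoolFactory {
    // Hypothetical helper: the numbers are placeholders chosen for illustration.
    static ThreadPoolExecutor newBoundedPool() {
        return new ThreadPoolExecutor(
                2,                                      // corePoolSize
                4,                                      // maximumPoolSize
                60L, TimeUnit.SECONDS,                  // keepAliveTime for idle non-core threads
                new ArrayBlockingQueue<Runnable>(100),  // bounded queue: at most 100 waiting tasks
                Executors.defaultThreadFactory(),
                new ThreadPoolExecutor.AbortPolicy());  // reject by throwing RejectedExecutionException
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newBoundedPool();
        System.out.println(pool.getCorePoolSize() + " / " + pool.getMaximumPoolSize());
        pool.shutdown();
    }
}
```

With a bounded queue, overload shows up as rejected tasks rather than as unbounded memory growth, so the rejection policy becomes a deliberate design choice.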

2. Task execution flow

execute()

public void execute(Runnable command) {
    if (command == null)
        throw new NullPointerException();
    /*
     * Proceed in 3 steps:
     *
     * 1. If fewer than corePoolSize threads are running, try to
     * start a new thread with the given command as its first
     * task.  The call to addWorker atomically checks runState and
     * workerCount, and so prevents false alarms that would add
     * threads when it shouldn't, by returning false.
     *
     * 2. If a task can be successfully queued, then we still need
     * to double-check whether we should have added a thread
     * (because existing ones died since last checking) or that
     * the pool shut down since entry into this method. So we
     * recheck state and if necessary roll back the enqueuing if
     * stopped, or start a new thread if there are none.
     *
     * 3. If we cannot queue task, then we try to add a new
     * thread.  If it fails, we know we are shut down or saturated
     * and so reject the task.
     */
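    // The high 3 bits of ctl hold the pool state; the low 29 bits hold the worker count
    int c = ctl.get();
    // If the pool has fewer threads than corePoolSize, create a new core thread
    // whose first task is the submitted command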
    if (workerCountOf(c) < corePoolSize) {
        if (addWorker(command, true))
            return;
        // addWorker failed, either because of a race with other submitters or because
        // the pool is no longer RUNNING; re-read ctl to get the latest state
        c = ctl.get();
    }
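    // Reached when the pool already has corePoolSize threads or the core thread could not be created.
    // If the pool is RUNNING and the task is successfully enqueued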
    if (isRunning(c) && workQueue.offer(command)) {
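        // Recheck the pool state:
        //     if the pool is no longer RUNNING it accepts no new tasks, so try to remove the task from the queue;
        //         if removal succeeds, the task has not run yet, so apply the rejection policy;
        //         if removal fails, a worker has already taken the task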
        int recheck = ctl.get();
        if (! isRunning(recheck) && remove(command))
            reject(command);
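        // If the pool is still RUNNING, remove(command) is never called and the task stays queued.
        //      There may be no live worker left to run it, so if the worker count is 0, try to create one.
        // If the pool is not RUNNING and remove(command) failed, a worker has already taken the task.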
        else if (workerCountOf(recheck) == 0)
            addWorker(null, false);
    }
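    // If enqueueing failed (or the pool is not RUNNING), try to create a non-core thread for the task; if that fails, reject it.
    // When the pool is not RUNNING it accepts no new tasks, so addWorker is guaranteed to fail here.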
    else if (!addWorker(command, false))
        reject(command);
}
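
The three branches of execute() can be observed from the outside. The following is a minimal sketch, assuming a deliberately tiny pool (corePoolSize 1, maximumPoolSize 2, queue capacity 1) and tasks that block on a latch; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class ExecuteStepsDemo {
    /** Returns {pool size after task 1, queue size after task 2, pool size after task 3, 1 if task 4 was rejected}. */
    static int[] observe() {
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();                 // keep the worker busy until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 0L, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(1));
        int[] seen = new int[4];
        try {
            pool.execute(blocker);               // step 1: below corePoolSize, a core thread is created
            seen[0] = pool.getPoolSize();
            pool.execute(blocker);               // step 2: core thread is busy, the task is queued
            seen[1] = pool.getQueue().size();
            pool.execute(blocker);               // step 3: queue is full, a non-core thread is created
            seen[2] = pool.getPoolSize();
            try {
                pool.execute(blocker);           // saturated: the default AbortPolicy throws
            } catch (RejectedExecutionException e) {
                seen[3] = 1;
            }
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(observe()));
    }
}
```

Because every task blocks until the latch is released, the pool cannot make progress between submissions, which keeps the observed sizes stable.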

Pool state definitions

private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
private static final int COUNT_BITS = Integer.SIZE - 3;
private static final int CAPACITY   = (1 << COUNT_BITS) - 1; // maximum worker count that can be recorded

/**
 *   RUNNING:  Accept new tasks and process queued tasks
 *   SHUTDOWN: Don't accept new tasks, but process queued tasks
 *   STOP:     Don't accept new tasks, don't process queued tasks,
 *             and interrupt in-progress tasks
 *   TIDYING:  All tasks have terminated, workerCount is zero,
 *             the thread transitioning to state TIDYING
 *             will run the terminated() hook method
 *   TERMINATED: terminated() has completed
 */
// runState is stored in the high-order bits
// The high three bits record the pool state
private static final int RUNNING    = -1 << COUNT_BITS; // high bits 111: RUNNING, accepts new tasks and processes queued tasks
private static final int SHUTDOWN   =  0 << COUNT_BITS; // high bits 000: SHUTDOWN, accepts no new tasks but processes queued tasks
private static final int STOP       =  1 << COUNT_BITS; // high bits 001: STOP, accepts no new tasks, ignores queued tasks, interrupts running tasks
private static final int TIDYING    =  2 << COUNT_BITS; // high bits 010: TIDYING, all tasks have ended, worker count is 0, terminated() is about to run
private static final int TERMINATED =  3 << COUNT_BITS; // high bits 011: TERMINATED, the state after terminated() has completed
// Packing and unpacking ctl
private static int runStateOf(int c)     { return c & ~CAPACITY; } // extract the pool state from ctl
private static int workerCountOf(int c)  { return c & CAPACITY; } // extract the worker count from ctl
private static int ctlOf(int rs, int wc) { return rs | wc; } // combine a pool state and a worker count into ctl
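
To make the bit layout concrete, here is a small standalone sketch that copies the packing arithmetic into a hypothetical class, CtlPacking, so it can be run outside the JDK. Only three of the five states are reproduced.

```java
final class CtlPacking {
    static final int COUNT_BITS = Integer.SIZE - 3;       // 29 bits for the worker count
    static final int CAPACITY = (1 << COUNT_BITS) - 1;    // low 29 bits all set
    static final int RUNNING = -1 << COUNT_BITS;          // high bits 111
    static final int SHUTDOWN = 0 << COUNT_BITS;          // high bits 000
    static final int STOP = 1 << COUNT_BITS;              // high bits 001

    static int runStateOf(int c) { return c & ~CAPACITY; }    // keep only the high 3 bits
    static int workerCountOf(int c) { return c & CAPACITY; }  // keep only the low 29 bits
    static int ctlOf(int rs, int wc) { return rs | wc; }      // merge state and count

    public static void main(String[] args) {
        int ctl = ctlOf(RUNNING, 5);
        System.out.println(Integer.toBinaryString(ctl));
        System.out.println(runStateOf(ctl) == RUNNING);
        System.out.println(workerCountOf(ctl));
    }
}
```

RUNNING is the only negative state value, which is why the source can test for it with a simple ordered comparison against SHUTDOWN.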

addWorker()

private boolean addWorker(Runnable firstTask, boolean core) {
    retry:
    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c);

        // Check if queue empty only if necessary.
        // When rs >= SHUTDOWN (that is, SHUTDOWN, STOP, TIDYING or TERMINATED), check further:
        //     none of these states accepts new tasks,
        //     and only SHUTDOWN keeps running tasks that are already queued.
        //     So a thread is created here only if the state is SHUTDOWN, no first task
        //     was passed in, and the work queue is not empty.
        if (rs >= SHUTDOWN &&
            ! (rs == SHUTDOWN &&
               firstTask == null &&
               ! workQueue.isEmpty()))
            return false;

        for (;;) {
            int wc = workerCountOf(c);
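            // Fail if the worker count has reached CAPACITY or the applicable bound
            if (wc >= CAPACITY ||
                wc >= (core ? corePoolSize : maximumPoolSize))
                return false;
            // CAS the worker count up by one; on success, leave the outer loop
            if (compareAndIncrementWorkerCount(c))
                break retry;
            // The CAS operates on the whole ctl value, so a failure means either the
            // pool state or the worker count changed. If the state changed, restart
            // the outer loop so the state checks run again.
            c = ctl.get();  // Re-read ctl
            if (runStateOf(c) != rs)
                continue retry;
            // If only the worker count changed, stay in the inner loop and retry the CAS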
            // else CAS failed due to workerCount change; retry inner loop
        }
    }

    // Create a new worker thread and start it
    boolean workerStarted = false;
    boolean workerAdded = false;
    Worker w = null;
    try {
        // The Worker constructor is shown below. The thread it creates is given
        // `this`, the Worker itself, as its Runnable,
        // so the later call to t.start() runs Worker.run(),
        // and Worker.run(), also shown below,
        // simply calls runWorker()
        /**
         *  Worker(Runnable firstTask) {
         *     setState(-1); // inhibit interrupts until runWorker
         *     this.firstTask = firstTask;
         *     this.thread = getThreadFactory().newThread(this);
         *  }
         * 
         *  public void run() { runWorker(this);}
         */
        w = new Worker(firstTask);
        final Thread t = w.thread;
        if (t != null) {
            final ReentrantLock mainLock = this.mainLock;
            mainLock.lock();
            // workers is the pool's single shared set of worker threads, so access to it must be mutually exclusive
            try {
                // Recheck while holding lock.
                // Back out on ThreadFactory failure or if
                // shut down before lock acquired.
                int rs = runStateOf(ctl.get());

                if (rs < SHUTDOWN ||
                    (rs == SHUTDOWN && firstTask == null)) {
                    if (t.isAlive()) // precheck that t is startable
                        // A freshly created thread should not be alive before start() is called
                        throw new IllegalThreadStateException();
                    workers.add(w);
                    int s = workers.size();
                    if (s > largestPoolSize)
                        // largestPoolSize records the largest pool size ever reached
                        largestPoolSize = s;
                    workerAdded = true;
                }
            } finally {
                mainLock.unlock();
            }
            if (workerAdded) {
                t.start();
                workerStarted = true;
            }
        }
    } finally {
        if (! workerStarted)
            addWorkerFailed(w);
    }
    return workerStarted;
}
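
The inner loop of addWorker() is essentially a bounded increment done with compare-and-set. The following is a simplified sketch of that pattern on a plain AtomicInteger. The class BoundedCounter is hypothetical and leaves out the run-state bits that the real ctl also carries.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedCounter {
    private final AtomicInteger count = new AtomicInteger();
    private final int limit;

    BoundedCounter(int limit) {
        this.limit = limit;
    }

    /** Reserves one slot, or returns false if the limit has already been reached. */
    boolean tryIncrement() {
        for (;;) {
            int current = count.get();
            if (current >= limit)
                return false;                       // same role as the wc >= bound check
            if (count.compareAndSet(current, current + 1))
                return true;                        // slot reserved
            // Another thread changed the value first: re-read and try again
        }
    }

    int get() {
        return count.get();
    }

    public static void main(String[] args) {
        BoundedCounter counter = new BoundedCounter(2);
        System.out.println(counter.tryIncrement());
        System.out.println(counter.tryIncrement());
        System.out.println(counter.tryIncrement());
    }
}
```

Reserving the slot before creating the thread is what lets addWorker() enforce the bound without holding mainLock during the check.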

runWorker()

final void runWorker(Worker w) {
    Thread wt = Thread.currentThread();
    Runnable task = w.firstTask;
    w.firstTask = null;
    w.unlock(); // allow interrupts
    // Run the first task if there is one, then loop taking tasks from the queue
    boolean completedAbruptly = true;
    try {
        while (task != null || (task = getTask()) != null) {
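            // Hold the Worker lock while running a task so that interruptIdleWorkers(),
            // which uses tryLock(), does not treat this thread as idle and interrupt it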
            w.lock();
            // If pool is stopping, ensure thread is interrupted;
            // if not, ensure thread is not interrupted.  This
            // requires a recheck in second case to deal with
            // shutdownNow race while clearing interrupt
            // Interrupt the thread only if the pool state is STOP or later; otherwise clear any pending interrupt
            if ((runStateAtLeast(ctl.get(), STOP) ||
                 (Thread.interrupted() &&
                  runStateAtLeast(ctl.get(), STOP))) &&
                !wt.isInterrupted())
                wt.interrupt();
            try {
                beforeExecute(wt, task);
                Throwable thrown = null;
                try {
                    task.run();
                } catch (RuntimeException x) {
                    thrown = x; throw x;
                } catch (Error x) {
                    thrown = x; throw x;
                } catch (Throwable x) {
                    thrown = x; throw new Error(x);
                } finally {
                    afterExecute(task, thrown);
                }
            } finally {
                // Clear the finished task so the next iteration fetches from the queue
                task = null;
                // Increment this worker's completed-task counter
                w.completedTasks++;
                w.unlock();
            }
        }
        // The loop ended normally, so clear the flag; it stays true if a task threw
        completedAbruptly = false;
    } finally {
        // Post-processing as the worker leaves the pool
        processWorkerExit(w, completedAbruptly);
    }
}
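
runWorker() calls beforeExecute() and afterExecute() around every task, and a task that throws ends its worker, which is then replaced. A minimal sketch of using the afterExecute() hook to count outcomes follows. HookedPool, its counters, and the silent uncaught-exception handler are all assumptions made for the demonstration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class HookedPool extends ThreadPoolExecutor {
    final AtomicInteger succeeded = new AtomicInteger();
    final AtomicInteger failed = new AtomicInteger();

    HookedPool() {
        super(1, 1, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(), quietThreads());
    }

    // Threads that swallow uncaught exceptions, only to keep the demo output clean
    private static ThreadFactory quietThreads() {
        return r -> {
            Thread t = new Thread(r);
            t.setUncaughtExceptionHandler((thread, error) -> { });
            return t;
        };
    }

    @Override
    protected void afterExecute(Runnable task, Throwable thrown) {
        super.afterExecute(task, thrown);
        if (thrown == null)
            succeeded.incrementAndGet();
        else
            failed.incrementAndGet();   // thrown is the exception that escaped task.run()
    }

    /** Returns {tasks that completed normally, tasks that threw}. */
    static int[] runDemo() {
        HookedPool pool = new HookedPool();
        pool.execute(() -> { });
        pool.execute(() -> { throw new IllegalStateException("boom"); });
        pool.execute(() -> { });        // runs on the replacement worker
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { pool.succeeded.get(), pool.failed.get() };
    }

    public static void main(String[] args) {
        int[] result = runDemo();
        System.out.println(result[0] + " succeeded, " + result[1] + " failed");
    }
}
```

Note that this applies to tasks passed to execute(). Tasks passed to submit() are wrapped in a FutureTask, which captures the exception, so afterExecute() sees a null Throwable for them.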

getTask()

private Runnable getTask() {
    boolean timedOut = false; // Did the last poll() time out?

    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c);

        // Check if queue empty only if necessary.
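        // Exit if the pool is STOP or later, or is SHUTDOWN with an empty queue
        if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
            // There is no more work for this thread: decrement the worker count first,
            // then return null so the worker exits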
            decrementWorkerCount();
            return null;
        }

        int wc = workerCountOf(c);

        // Are workers subject to culling?
        // If allowCoreThreadTimeOut is true, every thread is subject to the idle timeout;
        // otherwise only threads beyond corePoolSize are
        boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;

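        if ((wc > maximumPoolSize || (timed && timedOut)) // over the limit, or this thread's idle wait timed out
            && (wc > 1 || workQueue.isEmpty())) { // and either another worker remains or the queue is empty
            if (compareAndDecrementWorkerCount(c)) // CAS succeeded: return null so the worker exits
                return null;
            continue; // CAS failed: loop again and re-check
        }

        // Take a task from the queue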
        try {
            Runnable r = timed ?
                workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                workQueue.take();
            if (r != null)
                return r;
            timedOut = true;
        } catch (InterruptedException retry) {
            timedOut = false;
        }
    }
}
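
The choice between poll() with a timeout and take() is what turns keepAliveTime into thread expiry. The sketch below isolates that choice in a hypothetical helper, TimedFetch.fetch. Unlike getTask(), it returns null on interruption instead of retrying, which is a simplification.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class TimedFetch {
    /** Returns the next task, or null if timed is true and nothing arrived within keepAliveNanos. */
    static Runnable fetch(BlockingQueue<Runnable> queue, boolean timed, long keepAliveNanos) {
        try {
            return timed
                    ? queue.poll(keepAliveNanos, TimeUnit.NANOSECONDS)  // gives up after the timeout
                    : queue.take();                                     // waits until a task arrives
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // simplification: getTask() loops and retries here
            return null;
        }
    }

    public static void main(String[] args) {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<Runnable>();
        long tenMillis = TimeUnit.MILLISECONDS.toNanos(10);
        System.out.println(fetch(queue, true, tenMillis) == null);
        queue.add(() -> { });
        System.out.println(fetch(queue, true, tenMillis) != null);
    }
}
```

A null result is the signal that the idle wait expired, which getTask() records as timedOut before deciding on the next iteration whether the worker should exit.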

processWorkerExit()

private void processWorkerExit(Worker w, boolean completedAbruptly) {
    // The worker died from an exception, so the count must be decremented here; a normal exit has already adjusted it in getTask()
    if (completedAbruptly) // If abrupt, then workerCount wasn't adjusted
        decrementWorkerCount();

    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        // Add this worker's completed tasks to the pool total
        completedTaskCount += w.completedTasks;
        // Remove the worker from the set
        workers.remove(w);
    } finally {
        mainLock.unlock();
    }

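    // Try to terminate the pool.
    // shutdown() also calls tryTerminate(), but that call returns early while tasks
    // are still queued or workers are still alive, so the transition to TIDYING
    // finally happens here, when the last worker exits after the queue has been drained.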
    tryTerminate();

    int c = ctl.get();
    if (runStateLessThan(c, STOP)) {
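        // The worker exited normally
        if (!completedAbruptly) {
            // Minimum number of workers the pool currently needs.
            // If core threads may time out, the minimum depends on whether tasks are pending:
            //     one worker if there are, none otherwise;
            // if they may not, the pool keeps corePoolSize threads alive even when idle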
            int min = allowCoreThreadTimeOut ? 0 : corePoolSize;
            if (min == 0 && ! workQueue.isEmpty())
                min = 1;
            if (workerCountOf(c) >= min)
                return; // replacement not needed
        }
        // Send in a replacement worker
        addWorker(null, false);
    }
}

tryTerminate()

final void tryTerminate() {
    for (;;) {
        int c = ctl.get();
        // A RUNNING pool must not be terminated;
        // a pool already in TIDYING or TERMINATED needs nothing more;
        // a SHUTDOWN pool with queued tasks cannot be terminated yet
        if (isRunning(c) ||
            runStateAtLeast(c, TIDYING) ||
            (runStateOf(c) == SHUTDOWN && ! workQueue.isEmpty()))
            return;
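        // The worker count is not 0: try to interrupt one idle worker.
        // When is a worker idle? While it is blocked in getTask() waiting for a task
        // it does not hold its Worker lock, so tryLock() on it can succeed
        if (workerCountOf(c) != 0) { // Eligible to terminate
            // Interrupt at most one idle worker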
            interruptIdleWorkers(ONLY_ONE);
            return;
        }

        // The conditions for TIDYING are met: set the pool state to TIDYING
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            if (ctl.compareAndSet(c, ctlOf(TIDYING, 0))) {
                try {
                    terminated();
                } finally {
                    // After terminated() returns, the state moves from TIDYING to TERMINATED
                    ctl.set(ctlOf(TERMINATED, 0));
                    termination.signalAll();
                }
                return;
            }
        } finally {
            mainLock.unlock();
        }
        // else retry on failed CAS
    }
}
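
tryTerminate() runs the terminated() hook between TIDYING and TERMINATED, so a subclass can use it for cleanup. A minimal sketch follows, assuming a hypothetical subclass TerminationAwarePool that only records that the hook ran.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class TerminationAwarePool extends ThreadPoolExecutor {
    final AtomicBoolean hookRan = new AtomicBoolean();

    TerminationAwarePool() {
        super(1, 1, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
    }

    @Override
    protected void terminated() {
        super.terminated();
        hookRan.set(true);      // called once, while the pool is in the TIDYING state
    }

    /** Returns {hook ran before shutdown, hook ran after termination, isTerminated()}. */
    static boolean[] runDemo() {
        TerminationAwarePool pool = new TerminationAwarePool();
        pool.execute(() -> { });
        boolean before = pool.hookRan.get();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new boolean[] { before, pool.hookRan.get(), pool.isTerminated() };
    }

    public static void main(String[] args) {
        boolean[] result = runDemo();
        System.out.println(result[0] + " " + result[1] + " " + result[2]);
    }
}
```

Since awaitTermination() only returns true once the state is TERMINATED, the hook is guaranteed to have finished by then.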

interruptIdleWorkers()

private void interruptIdleWorkers(boolean onlyOne) {
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        for (Worker w : workers) {
            Thread t = w.thread;
            if (!t.isInterrupted() && w.tryLock()) {
                try {
                    t.interrupt();
                } catch (SecurityException ignore) {
                } finally {
                    w.unlock();
                }
            }
            if (onlyOne)
                break;
        }
    } finally {
        mainLock.unlock();
    }
}

addWorkerFailed()

private void addWorkerFailed(Worker w) {
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        if (w != null)
            workers.remove(w);
        decrementWorkerCount();
        tryTerminate();
    } finally {
        mainLock.unlock();
    }
}

remove()

public boolean remove(Runnable task) {
    boolean removed = workQueue.remove(task);
    tryTerminate(); // In case SHUTDOWN and now empty
    return removed;
}

reject()

final void reject(Runnable command) {
    // Apply the configured rejection policy; the default is AbortPolicy, which throws an exception
    handler.rejectedExecution(command, this);
}
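
reject() just delegates to the configured handler, so swapping the handler changes what saturation means for the caller. The sketch below uses CallerRunsPolicy on a deliberately saturated pool; the class name and the pool sizes are assumptions chosen to force a rejection.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

final class CallerRunsDemo {
    /** Returns the name of the thread that ran the task rejected by a saturated pool. */
    static String rejectedTaskThread() {
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(1),
                new ThreadPoolExecutor.CallerRunsPolicy());
        AtomicReference<String> ranOn = new AtomicReference<String>();
        try {
            pool.execute(blocker);      // occupies the only worker
            pool.execute(blocker);      // fills the queue
            // Rejected: CallerRunsPolicy runs it synchronously on the submitting thread
            pool.execute(() -> ranOn.set(Thread.currentThread().getName()));
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return ranOn.get();
    }

    public static void main(String[] args) {
        System.out.println(rejectedTaskThread());
    }
}
```

Running the task on the submitting thread slows the producer down, which acts as a simple form of back-pressure.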

shutdown()

public void shutdown() {
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        checkShutdownAccess();
        advanceRunState(SHUTDOWN); // loop with CAS until the pool state is at least SHUTDOWN
        interruptIdleWorkers(); // interrupt all idle workers
        onShutdown(); // hook for ScheduledThreadPoolExecutor
    } finally {
        mainLock.unlock();
    }
    tryTerminate();
}

advanceRunState()

private void advanceRunState(int targetState) {
    for (;;) {
        int c = ctl.get();
        if (runStateAtLeast(c, targetState) ||
            ctl.compareAndSet(c, ctlOf(targetState, workerCountOf(c))))
            break;
    }
}
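
shutdown() lets queued tasks finish, while shutdownNow() (not listed above) moves the pool to STOP, interrupts workers, and hands back the tasks that never started. A minimal sketch of the latter follows, assuming a single-threaded pool with one blocked task and two queued no-op tasks; the class name is hypothetical.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class ShutdownNowDemo {
    /** Returns {number of queued tasks handed back, 1 if the running task saw the interrupt}. */
    static int[] runDemo() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch never = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean();
        pool.execute(() -> {
            started.countDown();
            try {
                never.await();                  // blocks until interrupted
            } catch (InterruptedException e) {
                interrupted.set(true);
            }
        });
        pool.execute(() -> { });                // stays in the queue
        pool.execute(() -> { });                // stays in the queue
        try {
            started.await();                    // make sure the first task is running
            List<Runnable> pending = pool.shutdownNow();
            pool.awaitTermination(5, TimeUnit.SECONDS);
            return new int[] { pending.size(), interrupted.get() ? 1 : 0 };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pool.shutdownNow();
            return new int[] { -1, -1 };
        }
    }

    public static void main(String[] args) {
        int[] result = runDemo();
        System.out.println(result[0] + " pending, interrupted=" + (result[1] == 1));
    }
}
```

Interruption is cooperative: a running task stops early only if it responds to the interrupt, as the blocked latch wait does here.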

3. Summary

3.1 Workflow

The comment in the execute method describes the workflow as follows:

/*
 * Proceed in 3 steps:
 *
 * 1. If fewer than corePoolSize threads are running, try to
 * start a new thread with the given command as its first
 * task.  The call to addWorker atomically checks runState and
 * workerCount, and so prevents false alarms that would add
 * threads when it shouldn't, by returning false.
 *
 * 2. If a task can be successfully queued, then we still need
 * to double-check whether we should have added a thread
 * (because existing ones died since last checking) or that
 * the pool shut down since entry into this method. So we
 * recheck state and if necessary roll back the enqueuing if
 * stopped, or start a new thread if there are none.
 *
 * 3. If we cannot queue task, then we try to add a new
 * thread.  If it fails, we know we are shut down or saturated
 * and so reject the task.
 */

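From this description, the pool's workflow is roughly:

  1. If fewer than corePoolSize threads are running, try to create a new core thread with the submitted task as its firstTask; once started, the thread runs that task first.
  2. If the first condition does not hold, or creating the core thread fails, try to enqueue the task. After a successful enqueue, recheck the pool state and decide whether a new thread is needed to run the task.
  3. If enqueueing fails in step 2, try to create a non-core thread with the submitted task as its firstTask; once started, the thread runs that task first. If creation fails, the pool has been shut down or is saturated, so the rejection policy is applied.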

3.2 State transitions

The source code describes the pool states as follows:

/**
 * The runState provides the main lifecycle control, taking on values:
 *
 *   RUNNING:  Accept new tasks and process queued tasks
 *   SHUTDOWN: Don't accept new tasks, but process queued tasks
 *   STOP:     Don't accept new tasks, don't process queued tasks,
 *             and interrupt in-progress tasks
 *   TIDYING:  All tasks have terminated, workerCount is zero,
 *             the thread transitioning to state TIDYING
 *             will run the terminated() hook method
 *   TERMINATED: terminated() has completed
 *
 * The numerical order among these values matters, to allow
 * ordered comparisons. The runState monotonically increases over
 * time, but need not hit each state. The transitions are:
 *
 * RUNNING -> SHUTDOWN
 *    On invocation of shutdown(), perhaps implicitly in finalize()
 * (RUNNING or SHUTDOWN) -> STOP
 *    On invocation of shutdownNow()
 * SHUTDOWN -> TIDYING
 *    When both queue and pool are empty
 * STOP -> TIDYING
 *    When pool is empty
 * TIDYING -> TERMINATED
 *    When the terminated() hook method has completed
 *
 * Threads waiting in awaitTermination() will return when the
 * state reaches TERMINATED.
 *
 */

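From the official description we can derive the following state transition diagram:
[Figure: thread pool state transition diagram]
RUNNING: accepts new tasks and processes queued tasks;
SHUTDOWN: accepts no new tasks but still processes queued tasks;
STOP: accepts no new tasks, does not process queued tasks, and interrupts running tasks;
TIDYING: all tasks have terminated, the worker count is zero, and the terminated() hook is about to run;
TERMINATED: the state of the pool after the terminated() hook has completed.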
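
The raw state is private, but isShutdown(), isTerminating() and isTerminated() expose enough to follow the transitions from outside. The sketch below samples them at three points of a pool's life; LifecycleDemo is a hypothetical name, and a latch-blocked task keeps the pool from terminating too early.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class LifecycleDemo {
    private static boolean[] snapshot(ThreadPoolExecutor p) {
        return new boolean[] { p.isShutdown(), p.isTerminating(), p.isTerminated() };
    }

    /** Returns {isShutdown, isTerminating, isTerminated} sampled while RUNNING, after shutdown(), and after termination. */
    static boolean[][] observe() {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
        pool.execute(() -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        boolean[][] seen = new boolean[3][];
        seen[0] = snapshot(pool);       // RUNNING
        pool.shutdown();
        seen[1] = snapshot(pool);       // SHUTDOWN: the blocked task keeps a worker alive
        release.countDown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        seen[2] = snapshot(pool);       // TERMINATED
        return seen;
    }

    public static void main(String[] args) {
        for (boolean[] s : observe())
            System.out.println(s[0] + " " + s[1] + " " + s[2]);
    }
}
```

isTerminating() is true for every state between SHUTDOWN and TIDYING inclusive, so it cannot distinguish SHUTDOWN from STOP.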

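3.3 Birth and death of pool threads

  1. When are threads created?
    a. When a new task arrives and the worker count is below corePoolSize, a core thread is created.
    b. When a new task arrives, the worker count is at least corePoolSize, the task queue is full, and the worker count is below maximumPoolSize, a non-core thread is created.

  2. When are threads destroyed?
    a. If core threads are allowed to time out, a core thread that stays idle longer than keepAliveTime is destroyed.
    b. Threads beyond corePoolSize are destroyed once they stay idle longer than keepAliveTime.
    c. When shutdown() is called, idle threads are interrupted and exit once no queued tasks remain; busy threads exit after the queued tasks have been processed.
    d. When shutdownNow() is called, threads running a task are interrupted and exit once that task returns; idle threads are interrupted and exit, and queued tasks are not run.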
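
Case 2a can be observed directly. The sketch below enables allowCoreThreadTimeOut on a one-thread pool and polls until the idle core thread has gone. The 50 ms keep-alive, the 5 second polling deadline, and the class name are all assumptions made for the demonstration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class CoreTimeoutDemo {
    /** Returns the pool size observed after the idle core thread had time to expire (waits at most about 5 s). */
    static int poolSizeAfterIdle() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 50L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        pool.allowCoreThreadTimeOut(true);      // requires a keepAliveTime greater than zero
        pool.execute(() -> { });                // creates the single core thread
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        try {
            while (pool.getPoolSize() > 0 && System.nanoTime() < deadline) {
                Thread.sleep(10);               // poll until the idle thread has exited
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        int size = pool.getPoolSize();
        pool.shutdown();
        return size;
    }

    public static void main(String[] args) {
        System.out.println(poolSizeAfterIdle());
    }
}
```

Without the allowCoreThreadTimeOut(true) call, the core thread would block in take() indefinitely and the pool size would stay at 1.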
