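Thread pools free developers from managing threads by hand, leaving more attention for business code. But because JUC wraps thread pools so thoroughly, many developers have gradually lost touch with how they work underneath. A developer who cares about the craft should understand not only what works but also why it works, and this topic is a staple of interviews at large tech companies. In this article we walk through the thread pool source code to see what is going on inside.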
// Whichever of the three Executors factory methods below we use to create a thread pool,
// the pool is ultimately built on ThreadPoolExecutor.
ExecutorService cachedThreadPool = Executors.newCachedThreadPool();
ExecutorService fixedThreadPool = Executors.newFixedThreadPool(2);
ExecutorService singleThreadExecutor = Executors.newSingleThreadExecutor();
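To make the claim above concrete, here is a minimal sketch that builds pools by hand with the arguments that, as far as I know from the JDK 8 sources, `Executors.newFixedThreadPool` and `Executors.newCachedThreadPool` pass to `ThreadPoolExecutor`. The class name `FactoryEquivalents` and its helper methods are made up for illustration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class FactoryEquivalents {
    // Hypothetical helper: core == max, no keep-alive, unbounded LinkedBlockingQueue.
    static ThreadPoolExecutor fixed(int n) {
        return new ThreadPoolExecutor(n, n, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    // Hypothetical helper: no core threads, practically unlimited maximum,
    // 60 second keep-alive, direct hand-off through a SynchronousQueue.
    static ThreadPoolExecutor cached() {
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
    }

    public static void main(String[] args) {
        ThreadPoolExecutor fixed = fixed(2);
        ThreadPoolExecutor cached = cached();
        System.out.println("fixed:  " + fixed.getCorePoolSize() + "/" + fixed.getMaximumPoolSize());
        System.out.println("cached: " + cached.getCorePoolSize() + "/" + cached.getMaximumPoolSize());
        fixed.shutdown();
        cached.shutdown();
    }
}
```

Writing the constructor call out like this also makes it easy to swap in a bounded queue or a different rejection policy.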
/**
* Creates a new {@code ThreadPoolExecutor} with the given initial
* parameters.
*
* @param corePoolSize the number of threads to keep in the pool, even
* if they are idle, unless {@code allowCoreThreadTimeOut} is set
* @param maximumPoolSize the maximum number of threads to allow in the
* pool
* @param keepAliveTime when the number of threads is greater than
* the core, this is the maximum time that excess idle threads
* will wait for new tasks before terminating.
* @param unit the time unit for the {@code keepAliveTime} argument
* @param workQueue the queue to use for holding tasks before they are
* executed. This queue will hold only the {@code Runnable}
* tasks submitted by the {@code execute} method.
* @param threadFactory the factory to use when the executor
* creates a new thread
* @param handler the handler to use when execution is blocked
* because the thread bounds and queue capacities are reached
* @throws IllegalArgumentException if one of the following holds:
* {@code corePoolSize < 0}
* {@code keepAliveTime < 0}
* {@code maximumPoolSize <= 0}
* {@code maximumPoolSize < corePoolSize}
* @throws NullPointerException if {@code workQueue}
* or {@code threadFactory} or {@code handler} is null
*/
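public ThreadPoolExecutor(int corePoolSize,// number of core threads kept in the pool
int maximumPoolSize,// maximum number of threads allowed in the pool
long keepAliveTime,// maximum idle time for threads beyond corePoolSize
TimeUnit unit,// time unit of keepAliveTime
BlockingQueue<Runnable> workQueue,// queue holding tasks that are waiting to run
ThreadFactory threadFactory,// factory used to create threads
RejectedExecutionHandler handler) {// rejection policy applied when the queue is full and no thread can take the incoming task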
if (corePoolSize < 0 ||
maximumPoolSize <= 0 ||
maximumPoolSize < corePoolSize ||
keepAliveTime < 0)
throw new IllegalArgumentException();
if (workQueue == null || threadFactory == null || handler == null)
throw new NullPointerException();
this.corePoolSize = corePoolSize;
this.maximumPoolSize = maximumPoolSize;
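// newCachedThreadPool uses a SynchronousQueue by default,
// an interesting queue that will be covered in detail later in this series.
// newSingleThreadExecutor and newFixedThreadPool use a LinkedBlockingQueue by default,
// created with the no-arg constructor, so its capacity is Integer.MAX_VALUE; a flood of tasks can pile up and risk an OOM.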
this.workQueue = workQueue;
this.keepAliveTime = unit.toNanos(keepAliveTime);
this.threadFactory = threadFactory;
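// ThreadPoolExecutor ships four rejection policy implementations:
// CallerRunsPolicy runs the task on the calling thread
// AbortPolicy throws a RejectedExecutionException
// DiscardPolicy silently drops the task
// DiscardOldestPolicy drops the oldest queued task, then retries the current one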
this.handler = handler;
}
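As a small illustration of the constructor checks and of the OOM remark above, the sketch below builds a pool on a bounded `ArrayBlockingQueue` and probes which argument combinations the constructor refuses. The class `BoundedPoolExample`, its helpers, and the sizes used are assumptions chosen for the example, not recommendations.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedPoolExample {
    // Hypothetical helper: a pool whose queue is bounded, so a burst of tasks
    // triggers the rejection policy instead of piling up in memory.
    static ThreadPoolExecutor newBoundedPool(int core, int max, int queueCapacity) {
        return new ThreadPoolExecutor(core, max, 30L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity),
                Executors.defaultThreadFactory(),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // Returns true if the constructor throws IllegalArgumentException for these sizes.
    static boolean isRejected(int core, int max) {
        try {
            newBoundedPool(core, max, 10);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("core=4, max=2 rejected: " + isRejected(4, 2));
        System.out.println("core=2, max=4 rejected: " + isRejected(2, 4));
    }
}
```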
execute()
public void execute(Runnable command) {
if (command == null)
throw new NullPointerException();
/*
* Proceed in 3 steps:
*
* 1. If fewer than corePoolSize threads are running, try to
* start a new thread with the given command as its first
* task. The call to addWorker atomically checks runState and
* workerCount, and so prevents false alarms that would add
* threads when it shouldn't, by returning false.
*
* 2. If a task can be successfully queued, then we still need
* to double-check whether we should have added a thread
* (because existing ones died since last checking) or that
* the pool shut down since entry into this method. So we
* recheck state and if necessary roll back the enqueuing if
* stopped, or start a new thread if there are none.
*
* 3. If we cannot queue task, then we try to add a new
* thread. If it fails, we know we are shut down or saturated
* and so reject the task.
*/
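// the high bits of ctl hold the pool state, the low bits hold the worker count
int c = ctl.get();
// if the pool has fewer threads than corePoolSize, create a new core thread
// whose first task is the current task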
if (workerCountOf(c) < corePoolSize) {
if (addWorker(command, true))
return;
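// addWorker failed because of a concurrent change or because the pool is no longer RUNNING,
// so re-read the latest pool state
c = ctl.get();
}
// reached when the pool already has corePoolSize threads or the core thread could not be created:
// if the pool is RUNNING, try to enqueue the task
if (isRunning(c) && workQueue.offer(command)) {
// recheck the pool state:
// if the pool is no longer RUNNING it accepts no new tasks, so remove the task from the queue;
// if the removal succeeds, the task has not run yet and the rejection policy is applied;
// if the removal fails, the task has already been taken from the queue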
int recheck = ctl.get();
if (! isRunning(recheck) && remove(command))
reject(command);
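// If the pool is still RUNNING, remove(command) is not called and the task stays in the queue.
// We cannot be sure a live thread exists to run it, so if the worker count is zero we try to create one.
// If the pool is not RUNNING and remove(command) was called but failed, the task has already been taken from the queue.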
else if (workerCountOf(recheck) == 0)
addWorker(null, false);
}
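// If enqueuing failed (or the pool is not RUNNING), try to create a non-core thread for the task; if that fails, apply the rejection policy.
// When the pool is not RUNNING it accepts no new tasks, so addWorker is certain to fail here.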
else if (!addWorker(command, false))
reject(command);
}
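The three steps in `execute()` can be observed from the outside. The sketch below is a minimal experiment, assuming a pool with core size 1, maximum size 2 and a queue of capacity 1; the class name `ExecuteStepsDemo` and the chosen sizes are made up for illustration. Every task blocks on a latch, so nothing finishes while we submit.

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ExecuteStepsDemo {
    // Submits four blocking tasks and records what happened to each submission.
    static String[] run() {
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 2, 1L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(1));
        String[] outcome = new String[4];
        try {
            for (int i = 0; i < outcome.length; i++) {
                try {
                    pool.execute(blocker);
                    outcome[i] = "threads=" + pool.getPoolSize()
                            + " queued=" + pool.getQueue().size();
                } catch (RejectedExecutionException e) {
                    // The default AbortPolicy throws once threads and queue are exhausted.
                    outcome[i] = "rejected";
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return outcome;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```

The first task starts the core thread, the second is queued, the third finds the queue full and starts a non-core thread, and the fourth is rejected.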
Thread pool state definitions
private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
private static final int COUNT_BITS = Integer.SIZE - 3;
private static final int CAPACITY = (1 << COUNT_BITS) - 1;// maximum worker count that can be recorded
/**
* RUNNING: Accept new tasks and process queued tasks
* SHUTDOWN: Don't accept new tasks, but process queued tasks
* STOP: Don't accept new tasks, don't process queued tasks,
* and interrupt in-progress tasks
* TIDYING: All tasks have terminated, workerCount is zero,
* the thread transitioning to state TIDYING
* will run the terminated() hook method
* TERMINATED: terminated() has completed
*/
// runState is stored in the high-order bits
// the high three bits hold the pool state
private static final int RUNNING = -1 << COUNT_BITS;// high bits 111: RUNNING, accepts new tasks and processes queued tasks
private static final int SHUTDOWN = 0 << COUNT_BITS;// high bits 000: SHUTDOWN, accepts no new tasks but processes queued tasks
private static final int STOP = 1 << COUNT_BITS;// high bits 001: STOP, accepts no new tasks, does not process queued tasks, and interrupts running tasks
private static final int TIDYING = 2 << COUNT_BITS;// high bits 010: TIDYING, all tasks have ended, the worker count is 0, and terminated() is about to run
private static final int TERMINATED = 3 << COUNT_BITS;// high bits 011: TERMINATED, the state after terminated() has completed
// Packing and unpacking ctl
private static int runStateOf(int c) { return c & ~CAPACITY; } // extract the pool state from ctl
private static int workerCountOf(int c) { return c & CAPACITY; } // extract the worker count from ctl
private static int ctlOf(int rs, int wc) { return rs | wc; } // combine pool state and worker count into ctl
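The packing is plain bit arithmetic, so it can be tried outside the executor. The sketch below copies the constants and helpers into a standalone class (the name `CtlPackingDemo` is made up) and checks that state and worker count survive a round trip.

```java
class CtlPackingDemo {
    static final int COUNT_BITS = Integer.SIZE - 3;     // 29 bits for the worker count
    static final int CAPACITY = (1 << COUNT_BITS) - 1;  // mask with the low 29 bits set
    static final int RUNNING = -1 << COUNT_BITS;
    static final int SHUTDOWN = 0 << COUNT_BITS;
    static final int STOP = 1 << COUNT_BITS;

    static int runStateOf(int c) { return c & ~CAPACITY; }    // keep only the high 3 bits
    static int workerCountOf(int c) { return c & CAPACITY; }  // keep only the low 29 bits
    static int ctlOf(int rs, int wc) { return rs | wc; }      // merge the two fields

    public static void main(String[] args) {
        int c = ctlOf(RUNNING, 5);
        System.out.println("ctl bits:       " + Integer.toBinaryString(c));
        System.out.println("state RUNNING?  " + (runStateOf(c) == RUNNING));
        System.out.println("worker count:   " + workerCountOf(c));
        // RUNNING is negative, so the states compare in lifecycle order.
        System.out.println("ordered states: " + (RUNNING < SHUTDOWN && SHUTDOWN < STOP));
    }
}
```

Because RUNNING is the only negative state, checks such as `rs >= SHUTDOWN` reduce to a single integer comparison.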
addWorker()
private boolean addWorker(Runnable firstTask, boolean core) {
retry:
for (;;) {
int c = ctl.get();
int rs = runStateOf(c);
// Check if queue empty only if necessary.
// When rs >= SHUTDOWN (that is, SHUTDOWN, STOP, TIDYING or TERMINATED),
// check further:
// none of these states accepts new tasks,
// and only SHUTDOWN still runs the tasks left in the queue.
// So a thread is created only when the state is SHUTDOWN, no first task was passed in,
// and the task queue is not empty.
if (rs >= SHUTDOWN &&
! (rs == SHUTDOWN &&
firstTask == null &&
! workQueue.isEmpty()))
return false;
for (;;) {
int wc = workerCountOf(c);
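// fail if the worker count would exceed CAPACITY or the applicable limit
if (wc >= CAPACITY ||
wc >= (core ? corePoolSize : maximumPoolSize))
return false;
// CAS the worker count up by one; on success leave the outer retry loop
if (compareAndIncrementWorkerCount(c))
break retry;
// The CAS operates on ctl as a whole, so a failure means
// either the pool state or the worker count changed.
// If the pool state changed, restart the outer loop
// so the state checks run again.
c = ctl.get(); // Re-read ctl
if (runStateOf(c) != rs)
continue retry;
// if only the worker count changed, stay in the inner loop and retry the CAS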
// else CAS failed due to workerCount change; retry inner loop
}
}
// create a new worker thread and start it
boolean workerStarted = false;
boolean workerAdded = false;
Worker w = null;
try {
// The Worker constructor is shown below. Note that the Worker's thread is created
// with this, the Worker itself, as its Runnable.
// So when t.start() is called later, the thread runs Worker.run(),
// which is also shown below
// and simply delegates to runWorker().
/**
* Worker(Runnable firstTask) {
* setState(-1); // inhibit interrupts until runWorker
* this.firstTask = firstTask;
* this.thread = getThreadFactory().newThread(this);
* }
*
* public void run() { runWorker(this);}
*/
w = new Worker(firstTask);
final Thread t = w.thread;
if (t != null) {
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
// workers is the pool's single shared set of workers, so access to it must be mutually exclusive
try {
// Recheck while holding lock.
// Back out on ThreadFactory failure or if
// shut down before lock acquired.
int rs = runStateOf(ctl.get());
if (rs < SHUTDOWN ||
(rs == SHUTDOWN && firstTask == null)) {
if (t.isAlive()) // precheck that t is startable
// a newly created thread should not be alive before it is started
throw new IllegalThreadStateException();
workers.add(w);
int s = workers.size();
if (s > largestPoolSize)
// largestPoolSize records the largest number of threads the pool has ever held
largestPoolSize = s;
workerAdded = true;
}
} finally {
mainLock.unlock();
}
if (workerAdded) {
t.start();
workerStarted = true;
}
}
} finally {
if (! workerStarted)
addWorkerFailed(w);
}
return workerStarted;
}
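`addWorker()` is private, but in the JDK 8 sources the public `prestartCoreThread()` and `prestartAllCoreThreads()` methods go through `addWorker(null, true)`, which makes them a convenient way to watch the core limit and `largestPoolSize` at work. A minimal sketch, with the class name `PrestartDemo` made up for illustration:

```java
import java.util.Arrays;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PrestartDemo {
    // Returns {threads started, pool size, largest pool size, 1 if an extra core thread started}.
    static int[] run(int core) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(core, core, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        try {
            // Starts workers that have no first task until the core limit is reached.
            int started = pool.prestartAllCoreThreads();
            // The core limit is already reached, so no further worker is added.
            boolean extra = pool.prestartCoreThread();
            return new int[] { started, pool.getPoolSize(), pool.getLargestPoolSize(), extra ? 1 : 0 };
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run(2)));
    }
}
```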
runWorker()
final void runWorker(Worker w) {
Thread wt = Thread.currentThread();
Runnable task = w.firstTask;
w.firstTask = null;
w.unlock(); // allow interrupts
// run the first task if there is one, then loop taking tasks from the queue and running them
boolean completedAbruptly = true;
try {
while (task != null || (task = getTask()) != null) {
// hold the Worker lock while running a task so that interruptIdleWorkers() does not interrupt a busy thread
w.lock();
// If pool is stopping, ensure thread is interrupted;
// if not, ensure thread is not interrupted. This
// requires a recheck in second case to deal with
// shutdownNow race while clearing interrupt
// interrupt the thread only when the pool state is STOP or later
if ((runStateAtLeast(ctl.get(), STOP) ||
(Thread.interrupted() &&
runStateAtLeast(ctl.get(), STOP))) &&
!wt.isInterrupted())
wt.interrupt();
try {
beforeExecute(wt, task);
Throwable thrown = null;
try {
task.run();
} catch (RuntimeException x) {
thrown = x; throw x;
} catch (Error x) {
thrown = x; throw x;
} catch (Throwable x) {
thrown = x; throw new Error(x);
} finally {
afterExecute(task, thrown);
}
} finally {
// clear the finished task so the next iteration fetches one from the queue
task = null;
// increment this worker's completed-task counter
w.completedTasks++;
w.unlock();
}
}
// the loop ended normally, so clear the flag; it stays true if an exception escaped
completedAbruptly = false;
} finally {
// the thread is leaving the pool; do the exit bookkeeping
processWorkerExit(w, completedAbruptly);
}
}
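The `beforeExecute` and `afterExecute` calls in `runWorker()` are protected hooks that a subclass can override. The sketch below records the order in which the hooks and the task run; the class name `HookDemo` and the event labels are made up for the example.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class HookDemo {
    // Runs one task on a pool with overridden hooks and returns the recorded events.
    static List<String> run() {
        final List<String> events = new CopyOnWriteArrayList<String>();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>()) {
            @Override
            protected void beforeExecute(Thread t, Runnable r) {
                super.beforeExecute(t, r);
                events.add("before");
            }

            @Override
            protected void afterExecute(Runnable r, Throwable t) {
                super.afterExecute(r, t);
                events.add("after");
            }
        };
        pool.execute(() -> events.add("task"));
        pool.shutdown();
        try {
            // Termination implies the worker has left runWorker(), so all hooks have run.
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Both hooks run on the worker thread itself, which is why they are a common place for per-task logging or timing.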
getTask()
private Runnable getTask() {
boolean timedOut = false; // Did the last poll() time out?
for (;;) {
int c = ctl.get();
int rs = runStateOf(c);
// Check if queue empty only if necessary.
// exit if the pool is STOP or later, or is SHUTDOWN with an empty queue
if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
// the current thread must end:
// decrement the worker count first, then return null so the thread exits
decrementWorkerCount();
return null;
}
int wc = workerCountOf(c);
// Are workers subject to culling?
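// if allowCoreThreadTimeOut is true, every thread is subject to the idle timeout;
// otherwise only threads beyond corePoolSize are
boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;
if ((wc > maximumPoolSize || (timed && timedOut))// too many threads, or this thread has timed out while idle
&& (wc > 1 || workQueue.isEmpty())) {// and either more than one thread remains or the queue is empty
if (compareAndDecrementWorkerCount(c))// CAS succeeded: return null to end the thread
return null;
continue;// CAS failed: loop and re-evaluate
}
// take a task from the queue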
try {
Runnable r = timed ?
workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
workQueue.take();
if (r != null)
return r;
timedOut = true;
} catch (InterruptedException retry) {
timedOut = false;
}
}
}
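The timed `poll` branch in `getTask()` is what lets idle threads die. A minimal sketch of the effect, assuming a single core thread, a 50 ms keep-alive and `allowCoreThreadTimeOut(true)`; the class name `KeepAliveDemo` and the polling loop with its 5 second deadline are illustration choices:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class KeepAliveDemo {
    // Returns the pool size observed after the idle worker has had time to time out.
    static int run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 1, 50L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        pool.allowCoreThreadTimeOut(true); // core threads now use the timed poll as well
        pool.execute(() -> { });           // starts the single worker
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        try {
            // Poll instead of sleeping for a fixed time, so the demo does not depend on exact timing.
            while (pool.getPoolSize() > 0 && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        int size = pool.getPoolSize();
        pool.shutdown();
        return size;
    }

    public static void main(String[] args) {
        System.out.println("pool size after idle timeout: " + run());
    }
}
```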
processWorkerExit()
private void processWorkerExit(Worker w, boolean completedAbruptly) {
// the thread ended abruptly because of an exception, so decrement the worker count here; on a normal exit getTask() has already decremented it
if (completedAbruptly) // If abrupt, then workerCount wasn't adjusted
decrementWorkerCount();
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
// add this worker's completed tasks to the pool-wide total
completedTaskCount += w.completedTasks;
// remove the worker from the set
workers.remove(w);
} finally {
mainLock.unlock();
}
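// Try to terminate the pool.
// When shutdown() is called from outside while tasks are still queued,
// the pool state cannot be advanced to TIDYING right away;
// the transition happens here, when the last thread exits after the queue has been drained.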
tryTerminate();
int c = ctl.get();
if (runStateLessThan(c, STOP)) {
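// the thread ended normally
if (!completedAbruptly) {
// Minimum number of threads the pool currently needs.
// If core threads may time out, the minimum depends on whether tasks are waiting:
// one thread if there are, otherwise the pool may be empty.
// If core threads may not time out, the pool keeps corePoolSize threads alive even when idle.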
int min = allowCoreThreadTimeOut ? 0 : corePoolSize;
if (min == 0 && ! workQueue.isEmpty())
min = 1;
if (workerCountOf(c) >= min)
return; // replacement not needed
}
// bring in a replacement worker
addWorker(null, false);
}
}
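The replacement branch at the end of `processWorkerExit()` can be seen when a task submitted with `execute()` throws. In the sketch below the first task kills its worker and a later task still runs, on a different thread. The class name `WorkerReplacementDemo`, the `worker-N` thread names and the silent uncaught-exception handler are assumptions made to keep the output readable.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class WorkerReplacementDemo {
    // Returns the names of the threads that ran the failing task and the follow-up task.
    static List<String> run() {
        final List<String> names = new CopyOnWriteArrayList<String>();
        final AtomicInteger counter = new AtomicInteger();
        ThreadFactory factory = r -> {
            Thread t = new Thread(r, "worker-" + counter.incrementAndGet());
            t.setUncaughtExceptionHandler((thread, ex) -> { }); // swallow the stack trace
            return t;
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>(), factory);
        pool.execute(() -> {
            names.add(Thread.currentThread().getName());
            throw new IllegalStateException("boom"); // the worker exits abruptly
        });
        pool.execute(() -> names.add(Thread.currentThread().getName()));
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Note that this applies to `execute()`; tasks passed to `submit()` are wrapped in a `FutureTask`, which captures the exception instead of letting it reach the worker.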
tryTerminate()
final void tryTerminate() {
for (;;) {
int c = ctl.get();
// a RUNNING pool must not be terminated
// a pool already in TIDYING or TERMINATED needs no further work
// a SHUTDOWN pool with tasks still queued cannot be terminated yet
if (isRunning(c) ||
runStateAtLeast(c, TIDYING) ||
(runStateOf(c) == SHUTDOWN && ! workQueue.isEmpty()))
return;
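// The worker count is not zero: try to interrupt one idle thread.
// When is a thread idle?
// While it is blocked in getTask() waiting for a task it does not hold its Worker lock, so tryLock() on the Worker can succeed.
if (workerCountOf(c) != 0) { // Eligible to terminate
// try to interrupt a single idle thread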
interruptIdleWorkers(ONLY_ONE);
return;
}
// the conditions for TIDYING are met, so set the pool state to TIDYING
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
if (ctl.compareAndSet(c, ctlOf(TIDYING, 0))) {
try {
terminated();
} finally {
// once terminated() has run, the pool state moves from TIDYING to TERMINATED
ctl.set(ctlOf(TERMINATED, 0));
termination.signalAll();
}
return;
}
} finally {
mainLock.unlock();
}
// else retry on failed CAS
}
}
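The `terminated()` call inside `tryTerminate()` is another protected hook. The sketch below overrides it to show that it runs before the pool reports itself as terminated, which matches the TIDYING to TERMINATED step in the code above. The class name `TerminatedHookDemo` is made up for illustration.

```java
import java.util.Arrays;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class TerminatedHookDemo {
    // Returns {hook ran, isTerminated() as seen inside the hook, isTerminated() afterwards}.
    static boolean[] run() {
        final AtomicBoolean hookRan = new AtomicBoolean(false);
        final AtomicBoolean terminatedInsideHook = new AtomicBoolean(true);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>()) {
            @Override
            protected void terminated() {
                super.terminated();
                // The pool is in TIDYING while this hook runs.
                terminatedInsideHook.set(isTerminated());
                hookRan.set(true);
            }
        };
        pool.execute(() -> { });
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new boolean[] { hookRan.get(), terminatedInsideHook.get(), pool.isTerminated() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```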
interruptIdleWorkers()
private void interruptIdleWorkers(boolean onlyOne) {
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
for (Worker w : workers) {
Thread t = w.thread;
if (!t.isInterrupted() && w.tryLock()) {
try {
t.interrupt();
} catch (SecurityException ignore) {
} finally {
w.unlock();
}
}
if (onlyOne)
break;
}
} finally {
mainLock.unlock();
}
}
addWorkerFailed()
private void addWorkerFailed(Worker w) {
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
if (w != null)
workers.remove(w);
decrementWorkerCount();
tryTerminate();
} finally {
mainLock.unlock();
}
}
remove()
public boolean remove(Runnable task) {
boolean removed = workQueue.remove(task);
tryTerminate(); // In case SHUTDOWN and now empty
return removed;
}
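`remove()` only affects tasks that are still waiting in the queue. A minimal sketch, assuming a single worker that is kept busy by a latch so that the second task has to be queued; the class name `RemoveDemo` is made up for illustration.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class RemoveDemo {
    // Returns {first remove succeeded, second remove succeeded, removed task ever ran}.
    static boolean[] run() {
        CountDownLatch release = new CountDownLatch(1);
        AtomicBoolean secondRan = new AtomicBoolean(false);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        Runnable second = () -> secondRan.set(true);
        pool.execute(() -> {
            try {
                release.await(); // keep the only worker busy
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        pool.execute(second);                        // the worker is busy, so this is queued
        boolean removed = pool.remove(second);       // still in the queue
        boolean removedAgain = pool.remove(second);  // already gone
        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new boolean[] { removed, removedAgain, secondRan.get() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```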
reject()
final void reject(Runnable command) {
// apply the configured rejection policy; the default is AbortPolicy, which throws an exception
handler.rejectedExecution(command, this);
}
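To see a policy other than the default in action, the sketch below uses `CallerRunsPolicy` with a `SynchronousQueue` and a single thread, so the second submission can be neither queued nor given a new thread. The class name `CallerRunsDemo` is made up for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class CallerRunsDemo {
    // Returns the name of the thread that ended up running the rejected task.
    static String run() {
        CountDownLatch release = new CountDownLatch(1);
        AtomicReference<String> ranOn = new AtomicReference<String>();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>(),
                new ThreadPoolExecutor.CallerRunsPolicy());
        pool.execute(() -> {
            try {
                release.await(); // keep the only worker busy
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // No idle worker is polling the queue and the thread limit is reached,
        // so reject() is called and the policy runs the task on this thread.
        pool.execute(() -> ranOn.set(Thread.currentThread().getName()));
        release.countDown();
        pool.shutdown();
        return ranOn.get();
    }

    public static void main(String[] args) {
        System.out.println("rejected task ran on: " + run());
    }
}
```

Running the task on the caller slows the submitter down, which acts as a simple form of back-pressure.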
shutdown()
public void shutdown() {
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
checkShutdownAccess();
advanceRunState(SHUTDOWN);// spin until the pool state is SHUTDOWN
interruptIdleWorkers();// interrupt all idle threads
onShutdown(); // hook for ScheduledThreadPoolExecutor
} finally {
mainLock.unlock();
}
tryTerminate();
}
private void advanceRunState(int targetState) {
for (;;) {
int c = ctl.get();
if (runStateAtLeast(c, targetState) ||
ctl.compareAndSet(c, ctlOf(targetState, workerCountOf(c))))
break;
}
}
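From the caller's point of view, `shutdown()` means that tasks submitted earlier still run while new submissions are refused. A minimal sketch, with the class name `ShutdownDemo` made up for illustration:

```java
import java.util.Arrays;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ShutdownDemo {
    // Returns {tasks completed, 1 if a task submitted after shutdown() was rejected}.
    static int[] run() {
        AtomicInteger done = new AtomicInteger();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        for (int i = 0; i < 3; i++) {
            pool.execute(done::incrementAndGet);
        }
        pool.shutdown(); // the state becomes SHUTDOWN; queued tasks are still processed
        boolean rejected = false;
        try {
            pool.execute(done::incrementAndGet);
        } catch (RejectedExecutionException e) {
            rejected = true; // the default AbortPolicy throws
        }
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { done.get(), rejected ? 1 : 0 };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```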
The comment in the execute method describes the workflow as follows:
/*
* Proceed in 3 steps:
*
* 1. If fewer than corePoolSize threads are running, try to
* start a new thread with the given command as its first
* task. The call to addWorker atomically checks runState and
* workerCount, and so prevents false alarms that would add
* threads when it shouldn't, by returning false.
*
* 2. If a task can be successfully queued, then we still need
* to double-check whether we should have added a thread
* (because existing ones died since last checking) or that
* the pool shut down since entry into this method. So we
* recheck state and if necessary roll back the enqueuing if
* stopped, or start a new thread if there are none.
*
* 3. If we cannot queue task, then we try to add a new
* thread. If it fails, we know we are shut down or saturated
* and so reject the task.
*/
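From this description, the thread pool's workflow is roughly as follows: if fewer than corePoolSize threads are running, a new core thread is started for the task; otherwise the task is queued; if the queue is full, a non-core thread is started as long as maximumPoolSize has not been reached; otherwise the task is rejected.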
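The decision order can also be written down as a pure function. This is a simplified model rather than the real implementation: it assumes the pool is RUNNING and ignores the state rechecks and races that `execute()` handles. The class `ExecuteDecisionSketch`, its enum and its parameters are all hypothetical.

```java
class ExecuteDecisionSketch {
    enum Outcome { NEW_CORE_THREAD, QUEUED, NEW_NON_CORE_THREAD, REJECTED }

    // Simplified model of execute(): assumes a RUNNING pool and no concurrent changes.
    static Outcome decide(int workerCount, int corePoolSize, int maximumPoolSize,
                          boolean queueHasRoom) {
        if (workerCount < corePoolSize) {
            return Outcome.NEW_CORE_THREAD;      // step 1
        }
        if (queueHasRoom) {
            return Outcome.QUEUED;               // step 2
        }
        if (workerCount < maximumPoolSize) {
            return Outcome.NEW_NON_CORE_THREAD;  // step 3, thread can be added
        }
        return Outcome.REJECTED;                 // step 3, pool is saturated
    }

    public static void main(String[] args) {
        System.out.println(decide(0, 1, 2, true));
        System.out.println(decide(1, 1, 2, true));
        System.out.println(decide(1, 1, 2, false));
        System.out.println(decide(2, 1, 2, false));
    }
}
```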
The source describes the thread pool states as follows:
/**
* The runState provides the main lifecycle control, taking on values:
*
* RUNNING: Accept new tasks and process queued tasks
* SHUTDOWN: Don't accept new tasks, but process queued tasks
* STOP: Don't accept new tasks, don't process queued tasks,
* and interrupt in-progress tasks
* TIDYING: All tasks have terminated, workerCount is zero,
* the thread transitioning to state TIDYING
* will run the terminated() hook method
* TERMINATED: terminated() has completed
*
* The numerical order among these values matters, to allow
* ordered comparisons. The runState monotonically increases over
* time, but need not hit each state. The transitions are:
*
* RUNNING -> SHUTDOWN
* On invocation of shutdown(), perhaps implicitly in finalize()
* (RUNNING or SHUTDOWN) -> STOP
* On invocation of shutdownNow()
* SHUTDOWN -> TIDYING
* When both queue and pool are empty
* STOP -> TIDYING
* When pool is empty
* TIDYING -> TERMINATED
* When the terminated() hook method has completed
*
* Threads waiting in awaitTermination() will return when the
* state reaches TERMINATED.
*
*/
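From the official description we can derive the state transitions (RUNNING -> SHUTDOWN -> TIDYING -> TERMINATED, or RUNNING / SHUTDOWN -> STOP -> TIDYING -> TERMINATED), where the states mean:
RUNNING: accepts new tasks and processes the tasks in the queue;
SHUTDOWN: accepts no new tasks but still processes the tasks in the queue;
STOP: accepts no new tasks, does not process the tasks in the queue, and interrupts tasks that are running;
TIDYING: all tasks have ended and the worker count is zero; the terminated() hook is about to run;
TERMINATED: the state of the pool after the terminated() hook has run.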
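The run state itself is private, but `isShutdown()`, `isTerminating()` and `isTerminated()` expose enough to follow a pool through its lifecycle. A minimal sketch, assuming one worker that is held busy by a latch while `shutdown()` is called; the class name `LifecycleDemo` and the output format are made up for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class LifecycleDemo {
    // Formats the three public lifecycle flags as isShutdown/isTerminating/isTerminated.
    static String describe(ThreadPoolExecutor p) {
        return p.isShutdown() + "/" + p.isTerminating() + "/" + p.isTerminated();
    }

    // Samples the flags while RUNNING, after shutdown() with a busy worker, and after termination.
    static String run() {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        pool.execute(() -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        StringBuilder sb = new StringBuilder();
        sb.append(describe(pool)).append(' ');  // RUNNING
        pool.shutdown();
        sb.append(describe(pool)).append(' ');  // SHUTDOWN, one worker still busy
        release.countDown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        sb.append(describe(pool));              // TERMINATED
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```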
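When are threads created?
a. When the pool accepts a new task and the worker count is below the core thread count, a core thread is created.
b. When the pool accepts a new task, the worker count has reached the core thread count, the task queue is full, and the worker count is below the maximum thread count, a non-core thread is created.
When are threads destroyed?
a. If core threads are allowed to time out, a core thread that stays idle longer than the keep-alive time is destroyed.
b. Threads beyond the core thread count are destroyed once they stay idle longer than the keep-alive time.
c. When shutdown() is called from outside, idle threads are destroyed, and busy threads are destroyed after the tasks in the queue have been processed.
d. When shutdownNow() is called from outside, threads that are running a task are interrupted and destroyed once that task returns, and idle threads are destroyed.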
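The creation and destruction rules can be followed through `getPoolSize()`. The sketch below assumes core size 1, maximum size 2, a queue of capacity 1 and a 50 ms keep-alive; the class name `PoolSizeDemo`, the sizes and the 5 second polling deadline are illustration choices.

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolSizeDemo {
    // Returns the pool size: before any task, after tasks 1, 2 and 3, and after the idle timeout.
    static int[] run() {
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 2, 50L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(1));
        int[] sizes = new int[5];
        sizes[0] = pool.getPoolSize();  // threads are created lazily
        pool.execute(blocker);
        sizes[1] = pool.getPoolSize();  // core thread created
        pool.execute(blocker);
        sizes[2] = pool.getPoolSize();  // task queued, no new thread
        pool.execute(blocker);
        sizes[3] = pool.getPoolSize();  // queue full, non-core thread created
        release.countDown();
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        try {
            // Wait for the surplus thread to exceed its keep-alive time.
            while (pool.getPoolSize() > 1 && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        sizes[4] = pool.getPoolSize();  // only the core thread remains
        pool.shutdown();
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```

Which of the two threads survives is not fixed: whichever one times out first while the count is above the core size is the one that exits.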