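ThreadPoolExecutor is the most important implementation class in the Executor framework. It provides two basic capabilities: thread pool management and task management. This article walks through the ThreadPoolExecutor source code to see how an executor based on the producer-consumer model is designed and implemented.
The producer-consumer model
The producer-consumer model has three roles: producer, work queue, and consumer. For ThreadPoolExecutor:
1. The producer is the task submitter, that is, any external thread that calls ThreadPoolExecutor.
2. The work queue is declared through the blocking queue interface, BlockingQueue<Runnable>, so the concrete implementation can vary.
3. The consumers are a collection of Worker objects, each wrapping a thread, held in a HashSet<Worker>.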
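To make the three roles concrete, here is a minimal sketch of a toy executor. It is not how the JDK implements ThreadPoolExecutor; the class name TinyExecutor and its methods are hypothetical, and it leaves out state control, rejection, and thread reclamation entirely.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical toy executor showing only the three roles.
class TinyExecutor {
    // Work queue: the hand-off point between producers and consumers.
    private final BlockingQueue<Runnable> workQueue = new LinkedBlockingQueue<Runnable>();
    // Consumers: a fixed set of worker threads.
    private final Set<Thread> workers = new HashSet<Thread>();

    TinyExecutor(int nThreads) {
        for (int i = 0; i < nThreads; i++) {
            Thread t = new Thread(this::consume, "tiny-worker-" + i);
            t.setDaemon(true); // do not keep the JVM alive for this demo
            workers.add(t);
            t.start();
        }
    }

    // Producer side: the calling thread only enqueues the task.
    void execute(Runnable task) {
        workQueue.add(task);
    }

    // Consumer side: block on the queue and run whatever arrives.
    private void consume() {
        try {
            while (true) {
                workQueue.take().run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // exit when interrupted
        }
    }

    // Submits n trivial tasks and returns how many of them ran.
    static int runDemo(int n) throws InterruptedException {
        TinyExecutor executor = new TinyExecutor(2);
        AtomicInteger ran = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(n);
        for (int i = 0; i < n; i++) {
            executor.execute(() -> {
                ran.incrementAndGet();
                done.countDown();
            });
        }
        done.await();
        return ran.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("tasks run: " + runDemo(10));
    }
}
```

Everything ThreadPoolExecutor adds on top of this shape (the ctl word, Worker locking, sizing and rejection policies) exists to manage the worker set and the queue safely.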
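Main fields
With the basic execution model clear, here are its main fields:
1. private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0)); A 32-bit atomic integer that serves as the pool's control word. The low 29 bits hold the number of worker threads, so there can be at most 2^29 - 1 workers. The high 3 bits hold the run state of the pool. ThreadPoolExecutor has 5 states:
* RUNNING: accepts new tasks and executes them
* SHUTDOWN: accepts no new tasks, but still executes the tasks already in the work queue
* STOP: accepts no new tasks, does not execute queued tasks, and interrupts tasks that are running
* TIDYING: all tasks have terminated and the worker count is 0; the pool is about to run the terminated() hook
* TERMINATED: terminated() has completed
The definition of ctl and its helper methods follow.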
private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
private static final int COUNT_BITS = Integer.SIZE - 3;
private static final int CAPACITY   = (1 << COUNT_BITS) - 1;
// runState is stored in the high-order bits
private static final int RUNNING    = -1 << COUNT_BITS;
private static final int SHUTDOWN   =  0 << COUNT_BITS;
private static final int STOP       =  1 << COUNT_BITS;
private static final int TIDYING    =  2 << COUNT_BITS;
private static final int TERMINATED =  3 << COUNT_BITS;
// Packing and unpacking ctl
private static int runStateOf(int c)     { return c & ~CAPACITY; }
private static int workerCountOf(int c)  { return c & CAPACITY; }
private static int ctlOf(int rs, int wc) { return rs | wc; }
private static boolean runStateLessThan(int c, int s) {
    return c < s;
}
private static boolean runStateAtLeast(int c, int s) {
    return c >= s;
}
private static boolean isRunning(int c) {
    return c < SHUTDOWN;
}
private boolean compareAndIncrementWorkerCount(int expect) {
    return ctl.compareAndSet(expect, expect + 1);
}
private boolean compareAndDecrementWorkerCount(int expect) {
    return ctl.compareAndSet(expect, expect - 1);
}
private void decrementWorkerCount() {
    do {} while (! compareAndDecrementWorkerCount(ctl.get()));
}
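2. private final BlockingQueue<Runnable> workQueue; The work queue that holds submitted tasks until a worker takes them.
3. private final ReentrantLock mainLock = new ReentrantLock(); The pool-wide reentrant lock. It guards access to the workers set and the related bookkeeping. Updates to ctl itself use CAS instead.
4. private final Condition termination = mainLock.newCondition(); A condition on mainLock. awaitTermination() waits on it and tryTerminate() wakes the waiters with signalAll(). Note that a Condition uses await()/signal(), not Object.wait()/notify().
5. private final HashSet<Worker> workers = new HashSet<Worker>(); The set of all worker threads in the pool. It is accessed only while holding mainLock.
6. private volatile ThreadFactory threadFactory; The factory that creates threads; supply your own to customize thread creation.
7. private volatile RejectedExecutionHandler handler; The handler for rejected tasks; supply your own to customize the rejection policy.
8. private volatile long keepAliveTime; How long an idle thread may stay alive. It is used to decide whether an idle thread has waited too long and should be reclaimed.
9. private volatile boolean allowCoreThreadTimeOut; Whether core threads may also time out and be reclaimed.
10. private volatile int corePoolSize; The minimum number of workers kept alive without timing out. If allowCoreThreadTimeOut is set, the effective minimum is 0.
11. private volatile int maximumPoolSize; The maximum number of worker threads.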
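The bit packing of ctl described in item 1 is easier to follow with concrete numbers. The sketch below copies the constants and the three packing helpers from the listing above into a standalone class; the class name CtlDemo and the main method are only for illustration.

```java
// Standalone copy of the ctl packing helpers, for experimentation only.
class CtlDemo {
    static final int COUNT_BITS = Integer.SIZE - 3;      // 29 bits for the worker count
    static final int CAPACITY = (1 << COUNT_BITS) - 1;   // 2^29 - 1
    static final int RUNNING = -1 << COUNT_BITS;         // high bits 111
    static final int SHUTDOWN = 0 << COUNT_BITS;         // high bits 000
    static final int STOP = 1 << COUNT_BITS;             // high bits 001

    static int runStateOf(int c) { return c & ~CAPACITY; }
    static int workerCountOf(int c) { return c & CAPACITY; }
    static int ctlOf(int rs, int wc) { return rs | wc; }

    public static void main(String[] args) {
        int c = ctlOf(RUNNING, 5);
        System.out.println(workerCountOf(c));             // 5
        System.out.println(runStateOf(c) == RUNNING);     // true
        // Adding 1 to ctl bumps the worker count and leaves the state alone.
        System.out.println(workerCountOf(c + 1));         // 6
        System.out.println(runStateOf(c + 1) == RUNNING); // true
        // States are ordered as plain ints, so isRunning(c) is just c < SHUTDOWN.
        System.out.println(RUNNING < SHUTDOWN && SHUTDOWN < STOP); // true
    }
}
```

Because RUNNING is negative and the later states are increasing non-negative values, state comparisons such as runStateAtLeast(c, STOP) can compare the whole ctl word without unpacking it first.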
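Worker creation and reclamation policy
ThreadPoolExecutor offers a flexible policy for creating and reclaiming worker threads through corePoolSize, maximumPoolSize, allowCoreThreadTimeOut, and keepAliveTime.
Creation policy:
1. While the worker count is below corePoolSize, a new worker is created for each newly submitted task, even if other workers are idle.
2. Once the worker count is at least corePoolSize but below maximumPoolSize, a new worker is created only when the work queue is full. While the queue has room, new tasks are simply enqueued.
3. Setting corePoolSize and maximumPoolSize to the same value gives a fixed-size thread pool.
Reclamation policy:
1. keepAliveTime sets the idle timeout for workers. When the worker count exceeds corePoolSize, a worker that has been idle for longer than keepAliveTime is reclaimed. How a worker is judged to be "idle" is explained later.
2. If allowCoreThreadTimeOut is set, core threads can be reclaimed as well: an idle core thread is also reclaimed, until the worker set is empty.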
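Work queue policy
The work queue, BlockingQueue<Runnable> workQueue, holds tasks that have been submitted but not yet taken by a worker.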
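The 4 basic rules:
1. While the worker count is below corePoolSize, a newly submitted task is always run by a newly created worker and is not enqueued.
2. When the worker count is at least corePoolSize and the work queue is not full, a newly submitted task is enqueued.
3. When the worker count is at least corePoolSize and below maximumPoolSize and the work queue is full, a newly submitted task is handed to a newly created worker and is not enqueued.
4. When the worker count has reached maximumPoolSize and the work queue is full, a newly submitted task is rejected. What happens next depends on the rejection policy.
Depending on the blocking queue implementation, there are a few additional strategies:
1. Use a SynchronousQueue to hand each task directly to a thread without storing it. This usually calls for an unbounded maximumPoolSize, so that a new worker can always be created when no idle worker is waiting. The benefit is that tasks start quickly; it suits workloads where the interval between task arrivals is longer than the task processing time. The drawback is that a burst of tasks can consume a very large number of threads.
2. Use an unbounded work queue such as LinkedBlockingQueue. Because the queue never fills up, the worker count never exceeds corePoolSize: once it reaches corePoolSize, new workers are created only when the queue is full. The benefit is a stable thread count, and with enough memory the pool can absorb a large backlog of requests. The drawbacks: if tasks depend on one another, the pool can deadlock, because once all workers are busy no new worker is created and the dependent tasks just wait in the queue; and an ever-growing queue can exhaust memory and cause an OutOfMemoryError.
3. Use a bounded work queue such as ArrayBlockingQueue. Memory use is then under control, but maximumPoolSize and the queue capacity must be tuned together because they affect each other. A short queue forces the pool to create more threads, and more threads mean extra overhead such as context switching. A long queue with a small maximumPoolSize keeps that overhead low but can limit throughput. When both the queue and the pool are saturated, the rejection policy is triggered.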
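The four rules can be observed from the outside with a small experiment. This is a sketch under assumed settings (corePoolSize 2, maximumPoolSize 4, queue capacity 2, default AbortPolicy); the class name GrowthDemo is hypothetical. Every task blocks on a latch so that no worker becomes free during the measurements.

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class GrowthDemo {
    // Returns {pool size after 2, 4 and 6 submissions, 1 if the 7th was rejected}.
    static int[] run() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 60L, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(2));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await(); // keep the worker busy until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        int[] result = new int[4];
        try {
            pool.execute(blocker);
            pool.execute(blocker);
            result[0] = pool.getPoolSize(); // rule 1: two core workers created
            pool.execute(blocker);
            pool.execute(blocker);
            result[1] = pool.getPoolSize(); // rule 2: tasks queued, no new workers
            pool.execute(blocker);
            pool.execute(blocker);
            result[2] = pool.getPoolSize(); // rule 3: queue full, pool grows to max
            try {
                pool.execute(blocker);      // rule 4: queue full and pool at max
            } catch (RejectedExecutionException e) {
                result[3] = 1;
            }
        } finally {
            release.countDown();
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(Arrays.toString(run())); // [2, 2, 4, 1]
    }
}
```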
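As a rough illustration, the three queue strategies correspond to the constructor calls below. The first two use the same parameters as Executors.newCachedThreadPool() and Executors.newFixedThreadPool(n); the class and method names are hypothetical, and the sizes passed to the bounded variant are arbitrary examples rather than recommendations.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class QueueChoices {
    // 1. Direct hand-off: no storage, so the pool size must be effectively unbounded.
    static ThreadPoolExecutor directHandoff() {
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
    }

    // 2. Unbounded queue: the pool never grows past corePoolSize.
    static ThreadPoolExecutor unboundedQueue(int nThreads) {
        return new ThreadPoolExecutor(nThreads, nThreads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    // 3. Bounded queue: queue capacity and maximumPoolSize are tuned together.
    static ThreadPoolExecutor boundedQueue(int core, int max, int capacity) {
        return new ThreadPoolExecutor(core, max, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(capacity));
    }

    public static void main(String[] args) {
        ThreadPoolExecutor a = directHandoff();
        ThreadPoolExecutor b = unboundedQueue(4);
        ThreadPoolExecutor c = boundedQueue(2, 8, 100);
        // remainingCapacity() shows how much each queue can store.
        System.out.println(a.getQueue().remainingCapacity()); // 0
        System.out.println(b.getQueue().remainingCapacity()); // Integer.MAX_VALUE
        System.out.println(c.getQueue().remainingCapacity()); // 100
        a.shutdown();
        b.shutdown();
        c.shutdown();
    }
}
```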
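Rejection policy
A newly submitted task is rejected when the executor has been shut down, or when the worker count has reached maximumPoolSize and the work queue is full. The RejectedExecutionHandler interface defines the rejection policy. The built-in policies are:
CallerRunsPolicy: the rejected task is run synchronously by the calling thread, unless the executor has been shut down, in which case the task is discarded
AbortPolicy: aborts the submission by throwing RejectedExecutionException
DiscardPolicy: silently discards the rejected task
DiscardOldestPolicy: discards the oldest task (the one at the head of the queue) and then retries execute() with the new task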
public static class CallerRunsPolicy implements RejectedExecutionHandler {
    /**
     * Creates a {@code CallerRunsPolicy}.
     */
    public CallerRunsPolicy() { }
    /**
     * Executes task r in the caller's thread, unless the executor
     * has been shut down, in which case the task is discarded.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
        if (!e.isShutdown()) {
            r.run();
        }
    }
}
/**
 * A handler for rejected tasks that throws a
 * {@code RejectedExecutionException}.
 */
public static class AbortPolicy implements RejectedExecutionHandler {
    /**
     * Creates an {@code AbortPolicy}.
     */
    public AbortPolicy() { }
    /**
     * Always throws RejectedExecutionException.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     * @throws RejectedExecutionException always.
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
        throw new RejectedExecutionException("Task " + r.toString() +
                                             " rejected from " +
                                             e.toString());
    }
}
/**
 * A handler for rejected tasks that silently discards the
 * rejected task.
 */
public static class DiscardPolicy implements RejectedExecutionHandler {
    /**
     * Creates a {@code DiscardPolicy}.
     */
    public DiscardPolicy() { }
    /**
     * Does nothing, which has the effect of discarding task r.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
    }
}
/**
 * A handler for rejected tasks that discards the oldest unhandled
 * request and then retries {@code execute}, unless the executor
 * is shut down, in which case the task is discarded.
 */
public static class DiscardOldestPolicy implements RejectedExecutionHandler {
    /**
     * Creates a {@code DiscardOldestPolicy} for the given executor.
     */
    public DiscardOldestPolicy() { }
    /**
     * Obtains and ignores the next task that the executor
     * would otherwise execute, if one is immediately available,
     * and then retries execution of task r, unless the executor
     * is shut down, in which case task r is instead discarded.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
        if (!e.isShutdown()) {
            e.getQueue().poll();
            e.execute(r);
        }
    }
}
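To see one of these handlers in action, the sketch below saturates a pool of one thread with a queue of one slot and then submits a third task under CallerRunsPolicy. The class name CallerRunsDemo is hypothetical and the pool sizes are chosen only to force a rejection.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class CallerRunsDemo {
    // Returns true if the rejected task ran on the submitting thread.
    static boolean rejectedTaskRanInCaller() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(1),
                new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        AtomicReference<Thread> ranOn = new AtomicReference<Thread>();
        try {
            pool.execute(blocker); // occupies the only worker
            pool.execute(blocker); // fills the only queue slot
            // Pool and queue are both full: the handler runs this task right here.
            pool.execute(() -> ranOn.set(Thread.currentThread()));
        } finally {
            release.countDown();
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
        return ranOn.get() == Thread.currentThread();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(rejectedTaskRanInCaller()); // true
    }
}
```

Running the task in the caller also slows the producer down, which gives this policy a simple back-pressure effect.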
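Design of the Worker
Workers do not use Thread directly. Instead, the Worker class wraps a Thread, mainly for better interrupt control. Worker extends AbstractQueuedSynchronizer directly and implements a non-reentrant mutex: state 0 means unlocked and state 1 means locked (the constructor also sets the state to -1 to suppress interrupts until runWorker starts). A task always runs while the Worker is locked, which prevents synchronization problems. So a locked Worker is running a task, and an unlocked Worker is "idle". A worker that stays idle longer than keepAliveTime may be reclaimed.
Worker also implements Runnable. The thread that runs it is the Thread object the Worker holds: as the Worker constructor shows, the Worker passes itself to the Thread when the Thread is created.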
private final class Worker
    extends AbstractQueuedSynchronizer
    implements Runnable
{
    /** Thread this worker is running in.  Null if factory fails. */
    final Thread thread;
    /** Initial task to run.  Possibly null. */
    Runnable firstTask;
    /** Per-thread task counter */
    volatile long completedTasks;
    Worker(Runnable firstTask) {
        setState(-1); // inhibit interrupts until runWorker
        this.firstTask = firstTask;
        // The Worker passes itself, as a Runnable, to the newly created Thread
        this.thread = getThreadFactory().newThread(this);
    }
    public void run() {
        runWorker(this);
    }
    // Lock methods
    //
    // The value 0 represents the unlocked state.
    // The value 1 represents the locked state.
    protected boolean isHeldExclusively() {
        return getState() != 0;
    }
    protected boolean tryAcquire(int unused) {
        if (compareAndSetState(0, 1)) {
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false;
    }
    protected boolean tryRelease(int unused) {
        setExclusiveOwnerThread(null);
        setState(0);
        return true;
    }
    public void lock()        { acquire(1); }
    public boolean tryLock()  { return tryAcquire(1); }
    public void unlock()      { release(1); }
    public boolean isLocked() { return isHeldExclusively(); }
    void interruptIfStarted() {
        Thread t;
        if (getState() >= 0 && (t = thread) != null && !t.isInterrupted()) {
            try {
                t.interrupt();
            } catch (SecurityException ignore) {
            }
        }
    }
}
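The non-reentrant part is easy to miss. The sketch below isolates the same tryAcquire/tryRelease logic in a hypothetical class SimpleMutex and contrasts it with ReentrantLock: a second tryLock() from the thread that already holds the lock fails. In the pool, this means a task that calls a pool control method such as setCorePoolSize() from inside a worker cannot re-acquire its own Worker lock and interrupt itself as if it were idle.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-alone version of the Worker lock: 0 = unlocked, 1 = locked.
class SimpleMutex extends AbstractQueuedSynchronizer {
    @Override
    protected boolean tryAcquire(int unused) {
        // Succeeds only on the 0 -> 1 transition, even for the current owner.
        if (compareAndSetState(0, 1)) {
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false;
    }

    @Override
    protected boolean tryRelease(int unused) {
        setExclusiveOwnerThread(null);
        setState(0);
        return true;
    }

    boolean tryLock() { return tryAcquire(1); }
    void unlock() { release(1); }
    boolean isLocked() { return getState() != 0; }

    public static void main(String[] args) {
        SimpleMutex mutex = new SimpleMutex();
        System.out.println(mutex.tryLock());  // true: first acquisition
        System.out.println(mutex.tryLock());  // false: not reentrant
        mutex.unlock();
        System.out.println(mutex.isLocked()); // false

        ReentrantLock reentrant = new ReentrantLock();
        System.out.println(reentrant.tryLock()); // true
        System.out.println(reentrant.tryLock()); // true: same thread may re-enter
        reentrant.unlock();
        reentrant.unlock();
    }
}
```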
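When the Worker is run by its thread, run() calls ThreadPoolExecutor's runWorker method.
1. wt refers to the thread executing Worker.run(), that is, the Worker's own thread.
2. task initially points to the Worker's firstTask, the task to run now.
3. While task is non-null, or getTask() returns a new task from the work queue, the worker first calls w.lock() to mark itself as running a task. Before actually calling task.run(), it checks whether the pool state is at least STOP. If so, it makes sure the worker thread is interrupted; otherwise it clears any pending interrupt.
4. After that check, the worker calls task.run(). While the task runs, Worker.interruptIfStarted() can interrupt the Worker's thread and thereby interrupt the running task.
5. beforeExecute(wt, task) and afterExecute(task, thrown) are two hook methods that let subclasses add behavior before and after each task runs.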
final void runWorker(Worker w) {
    Thread wt = Thread.currentThread();
    Runnable task = w.firstTask;
    w.firstTask = null;
    w.unlock(); // allow interrupts
    boolean completedAbruptly = true;
    try {
        while (task != null || (task = getTask()) != null) {
            w.lock();
            // If pool is stopping, ensure thread is interrupted;
            // if not, ensure thread is not interrupted.  This
            // requires a recheck in second case to deal with
            // shutdownNow race while clearing interrupt
            if ((runStateAtLeast(ctl.get(), STOP) ||
                 (Thread.interrupted() &&
                  runStateAtLeast(ctl.get(), STOP))) &&
                !wt.isInterrupted())
                wt.interrupt();
            try {
                beforeExecute(wt, task);
                Throwable thrown = null;
                try {
                    task.run();
                } catch (RuntimeException x) {
                    thrown = x; throw x;
                } catch (Error x) {
                    thrown = x; throw x;
                } catch (Throwable x) {
                    thrown = x; throw new Error(x);
                } finally {
                    afterExecute(task, thrown);
                }
            } finally {
                task = null;
                w.completedTasks++;
                w.unlock();
            }
        }
        completedAbruptly = false;
    } finally {
        processWorkerExit(w, completedAbruptly);
    }
}
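The two hooks are meant to be overridden in a subclass. Here is a minimal sketch of a hypothetical subclass HookedPool that counts started, finished, and failed tasks. The counters and the demo method are illustrative only. The deliberately failing task is expected to print a stack trace to stderr, since the exception propagates out of the worker thread.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical subclass that instruments task execution through the hooks.
class HookedPool extends ThreadPoolExecutor {
    final AtomicInteger started = new AtomicInteger();
    final AtomicInteger finished = new AtomicInteger();
    final AtomicInteger failed = new AtomicInteger();

    HookedPool(int nThreads) {
        super(nThreads, nThreads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        started.incrementAndGet();
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        finished.incrementAndGet();
        if (t != null) {
            failed.incrementAndGet(); // t is the exception thrown by task.run()
        }
    }

    // Runs two normal tasks and one failing task; returns {started, finished, failed}.
    static int[] demo() throws InterruptedException {
        HookedPool pool = new HookedPool(1);
        pool.execute(() -> { });
        pool.execute(() -> { throw new IllegalStateException("demo failure"); });
        pool.execute(() -> { });
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return new int[] { pool.started.get(), pool.finished.get(), pool.failed.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = demo();
        System.out.println(r[0] + " started, " + r[1] + " finished, " + r[2] + " failed");
    }
}
```

Note that tasks passed to submit() are wrapped in a FutureTask, which captures the exception itself, so afterExecute then receives a null Throwable. The sketch uses execute() for that reason.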
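Source code for creating and reclaiming Workers
Start with ThreadPoolExecutor's execute method, the entry point for task submission. Its logic matches the basic worker creation policy described earlier:
1. When the worker count is below corePoolSize, addWorker(command, true) creates a new worker to run the new task, which is not enqueued.
2. When the worker count is at least corePoolSize, the task is enqueued with BlockingQueue.offer(). After enqueuing, the state is rechecked: if the pool is no longer running, the task is removed and rejected; if the worker count has dropped to 0, addWorker(null, false) adds a worker so that the queued task gets picked up.
3. When the queue is full (offer() fails) and the worker count is below maximumPoolSize, a new worker is created to run the new task. If the worker count has already reached maximumPoolSize, the task is rejected.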
public void execute(Runnable command) {
    if (command == null)
        throw new NullPointerException();
    int c = ctl.get();
    if (workerCountOf(c) < corePoolSize) {
        if (addWorker(command, true))
            return;
        c = ctl.get();
    }
    if (isRunning(c) && workQueue.offer(command)) {
        int recheck = ctl.get();
        if (! isRunning(recheck) && remove(command))
            reject(command);
        else if (workerCountOf(recheck) == 0)
            addWorker(null, false);
    }
    else if (!addWorker(command, false))
        reject(command);
}
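addWorker is where Worker threads are actually created.
1. The retry loop checks the pool state and the bounds on the worker count. If a new worker is allowed, it first increments the worker count stored in ctl with a CAS.
2. Adding the worker to the workers set happens under mainLock, which guards the workers set and related bookkeeping such as largestPoolSize.
3. w = new Worker(firstTask); creates the Worker instance with firstTask as its current task. A null firstTask means: just create the Worker thread and let it fetch tasks from the work queue.
4. The new Worker instance is added to the workers set and the related statistics are updated.
5. Once the Worker has been added to the set, it is started by calling start() on the Thread it wraps. As noted above, that Thread's Runnable is the Worker itself, so the thread calls Worker.run(), which calls ThreadPoolExecutor.runWorker(), and runWorker is where tasks actually get executed.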
private boolean addWorker(Runnable firstTask, boolean core) {
    retry:
    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c);
        // Check if queue empty only if necessary.
        if (rs >= SHUTDOWN &&
            ! (rs == SHUTDOWN &&
               firstTask == null &&
               ! workQueue.isEmpty()))
            return false;
        for (;;) {
            int wc = workerCountOf(c);
            if (wc >= CAPACITY ||
                wc >= (core ? corePoolSize : maximumPoolSize))
                return false;
            if (compareAndIncrementWorkerCount(c))
                break retry;
            c = ctl.get();  // Re-read ctl
            if (runStateOf(c) != rs)
                continue retry;
            // else CAS failed due to workerCount change; retry inner loop
        }
    }
    boolean workerStarted = false;
    boolean workerAdded = false;
    Worker w = null;
    try {
        final ReentrantLock mainLock = this.mainLock;
        w = new Worker(firstTask);
        final Thread t = w.thread;
        if (t != null) {
            mainLock.lock();
            try {
                // Recheck while holding lock.
                // Back out on ThreadFactory failure or if
                // shut down before lock acquired.
                int c = ctl.get();
                int rs = runStateOf(c);
                if (rs < SHUTDOWN ||
                    (rs == SHUTDOWN && firstTask == null)) {
                    if (t.isAlive()) // precheck that t is startable
                        throw new IllegalThreadStateException();
                    workers.add(w);
                    int s = workers.size();
                    if (s > largestPoolSize)
                        largestPoolSize = s;
                    workerAdded = true;
                }
            } finally {
                mainLock.unlock();
            }
            if (workerAdded) {
                t.start();
                workerStarted = true;
            }
        }
    } finally {
        if (! workerStarted)
            addWorkerFailed(w);
    }
    return workerStarted;
}
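The first half of addWorker follows a common lock-free pattern: reserve a slot with a CAS loop first, do the expensive work afterwards, and give the slot back if that work fails. The sketch below shows only that pattern on a plain counter. SlotCounter and its methods are hypothetical names and it ignores the run-state checks that the real method interleaves with the count.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical counter that hands out at most `limit` slots without locking.
class SlotCounter {
    private final AtomicInteger count = new AtomicInteger();

    // Mirrors the inner loop of addWorker: check the bound, then CAS, else retry.
    boolean tryReserve(int limit) {
        for (;;) {
            int c = count.get();
            if (c >= limit) {
                return false;                 // bound reached, give up
            }
            if (count.compareAndSet(c, c + 1)) {
                return true;                  // slot reserved
            }
            // CAS failed because another thread changed the count; re-read and retry
        }
    }

    // Mirrors the rollback done when the worker cannot be started.
    void release() {
        count.decrementAndGet();
    }

    // Lets several threads compete for slots and returns how many were granted.
    static int contend(int threads, int attemptsPerThread, int limit)
            throws InterruptedException {
        SlotCounter counter = new SlotCounter();
        AtomicInteger granted = new AtomicInteger();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < attemptsPerThread; j++) {
                    if (counter.tryReserve(limit)) {
                        granted.incrementAndGet();
                    }
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            t.join();
        }
        return granted.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(contend(8, 100, 50)); // 50: never more than the limit
    }
}
```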
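Worker cleanup happens in processWorkerExit(), which is called when runWorker finishes. As mentioned, an idle worker may be reclaimed after keepAliveTime. That logic is implicit in runWorker and getTask and is explained in the section on fetching tasks from the work queue. processWorkerExit itself only does the bookkeeping for a worker that is exiting.
1. Looking at runWorker: if task.run() throws, completedAbruptly stays true. In that case getTask has not adjusted the worker count, so processWorkerExit decrements it. In both the normal and the abrupt case, the worker is removed from the workers set.
2. If completedAbruptly is true and the pool state is below STOP (RUNNING or SHUTDOWN), a replacement Worker is created.
3. If completedAbruptly is false and the pool state is below STOP, the minimum worker count is computed first: 0 if allowCoreThreadTimeOut is enabled, otherwise corePoolSize. If that minimum is 0 but the work queue is not empty, the minimum becomes 1. Finally, if the current worker count is below the minimum, a new worker is created.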
private void processWorkerExit(Worker w, boolean completedAbruptly) {
    if (completedAbruptly) // If abrupt, then workerCount wasn't adjusted
        decrementWorkerCount();
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        completedTaskCount += w.completedTasks;
        workers.remove(w);
    } finally {
        mainLock.unlock();
    }
    tryTerminate();
    int c = ctl.get();
    if (runStateLessThan(c, STOP)) {
        if (!completedAbruptly) {
            int min = allowCoreThreadTimeOut ? 0 : corePoolSize;
            if (min == 0 && ! workQueue.isEmpty())
                min = 1;
            if (workerCountOf(c) >= min)
                return; // replacement not needed
        }
        addWorker(null, false);
    }
}
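The replacement rule for abruptly terminated workers can be observed directly: in a single-thread pool, a task submitted with execute() that throws kills its worker, and the next task runs on a different thread. The sketch below records the two threads. ReplacementDemo is a hypothetical name, and the custom thread factory exists only to silence the stack trace that the failing task would otherwise print.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class ReplacementDemo {
    // Returns true if the second task ran on a different thread than the failing one.
    static boolean workerWasReplaced() throws InterruptedException {
        ThreadFactory quietFactory = r -> {
            Thread t = new Thread(r);
            t.setUncaughtExceptionHandler((thread, ex) -> { }); // swallow the demo failure
            return t;
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>(), quietFactory);
        AtomicReference<Thread> first = new AtomicReference<Thread>();
        AtomicReference<Thread> second = new AtomicReference<Thread>();

        pool.execute(() -> {
            first.set(Thread.currentThread());
            throw new IllegalStateException("worker dies here");
        });
        pool.execute(() -> second.set(Thread.currentThread()));

        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return first.get() != null && second.get() != null
                && first.get() != second.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(workerWasReplaced()); // true
    }
}
```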
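Fetching tasks
The code a worker uses to fetch tasks from the work queue is in getTask.
1. The timed variable says whether the wait should time out. If no task arrives within keepAliveTime, getTask returns null. From runWorker we know that a null return makes the Worker exit and be cleaned up, and this is how idle workers are reclaimed.
timed is true when allowCoreThreadTimeOut is true or when the worker count exceeds corePoolSize.
2. If timed is true, the task is fetched from the head of the queue with the timed call BlockingQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS). Otherwise take() is used, which blocks until a task is available.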
private Runnable getTask() {
    boolean timedOut = false; // Did the last poll() time out?
    retry:
    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c);
        // Check if queue empty only if necessary.
        if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
            decrementWorkerCount();
            return null;
        }
        boolean timed;      // Are workers subject to culling?
        for (;;) {
            int wc = workerCountOf(c);
            timed = allowCoreThreadTimeOut || wc > corePoolSize;
            if (wc <= maximumPoolSize && ! (timedOut && timed))
                break;
            if (compareAndDecrementWorkerCount(c))
                return null;
            c = ctl.get();  // Re-read ctl
            if (runStateOf(c) != rs)
                continue retry;
            // else CAS failed due to workerCount change; retry inner loop
        }
        try {
            Runnable r = timed ?
                workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                workQueue.take();
            if (r != null)
                return r;
            timedOut = true;
        } catch (InterruptedException retry) {
            timedOut = false;
        }
    }
}
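The effect of the timed poll can be seen from the outside by enabling allowCoreThreadTimeOut with a short keepAliveTime and watching the pool shrink. This is a sketch with assumed values (two threads, 50 ms keep-alive, a polling loop capped at 5 seconds); IdleReclaimDemo is a hypothetical name.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class IdleReclaimDemo {
    // Returns the pool size observed after the workers had time to idle out.
    static int poolSizeAfterIdle() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 50L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        // With this flag, timed is always true in getTask, so even core
        // workers use poll(keepAliveTime) instead of take().
        pool.allowCoreThreadTimeOut(true);
        pool.execute(() -> { });
        pool.execute(() -> { });

        // Poll instead of sleeping a fixed time: wait at most 5 s for the pool to empty.
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        while (pool.getPoolSize() > 0 && System.nanoTime() < deadline) {
            Thread.sleep(10);
        }
        int size = pool.getPoolSize();
        pool.shutdown();
        return size;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("pool size after idling: " + poolSizeAfterIdle()); // 0
    }
}
```

Without allowCoreThreadTimeOut(true), the two core workers would block in take() indefinitely and the pool size would stay at 2.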
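Shutting down the pool
The states SHUTDOWN, STOP, TIDYING, and TERMINATED all relate to shutting the pool down. Shutdown is usually divided into graceful shutdown and immediate forced shutdown.
Graceful shutdown means calling shutdown(): the pool enters the SHUTDOWN state, no longer accepts new tasks, and finishes the tasks already in the work queue before it ends.
Immediate forced shutdown means calling shutdownNow(): the pool goes straight to the STOP state, interrupts the workers that are running tasks, and empties the work queue.
1. shutdown() first sets the pool state to SHUTDOWN, then interrupts the idle workers, then calls the onShutdown hook, and finally calls tryTerminate().
2. shutdownNow() first sets the pool state to STOP, then interrupts all workers, then drains the work queue, and finally calls tryTerminate(). It returns the tasks that were still in the work queue so the caller can deal with them.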
public void shutdown() {
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        checkShutdownAccess();
        advanceRunState(SHUTDOWN);
        interruptIdleWorkers();
        onShutdown(); // hook for ScheduledThreadPoolExecutor
    } finally {
        mainLock.unlock();
    }
    tryTerminate();
}
public List<Runnable> shutdownNow() {
    List<Runnable> tasks;
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        checkShutdownAccess();
        advanceRunState(STOP);
        interruptWorkers();
        tasks = drainQueue();
    } finally {
        mainLock.unlock();
    }
    tryTerminate();
    return tasks;
}
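A short experiment shows what shutdownNow() does to running and queued tasks. In this sketch (the class name ShutdownNowDemo is hypothetical), one task is running when the pool is stopped and two more are still queued; the running task is written to respond to interruption.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class ShutdownNowDemo {
    // Returns {queued tasks handed back, 1 if the running task was interrupted,
    //          1 if the pool terminated within the timeout}.
    static int[] run() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean();

        pool.execute(() -> {
            started.countDown();
            try {
                Thread.sleep(60_000);      // long task that responds to interrupts
            } catch (InterruptedException e) {
                interrupted.set(true);
            }
        });
        pool.execute(() -> { });           // stays in the queue
        pool.execute(() -> { });           // stays in the queue

        started.await();                   // make sure the first task is running
        List<Runnable> neverRun = pool.shutdownNow();
        boolean terminated = pool.awaitTermination(5, TimeUnit.SECONDS);
        return new int[] { neverRun.size(), interrupted.get() ? 1 : 0, terminated ? 1 : 0 };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = run();
        System.out.println("returned=" + r[0] + " interrupted=" + r[1] + " terminated=" + r[2]);
    }
}
```

Had the same pool been closed with shutdown() instead, the running task would not have been interrupted and the two queued tasks would have run to completion before termination.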
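interruptIdleWorkers interrupts the idle workers. An idle worker is a Worker whose lock is not held, which is why the method uses w.tryLock() to test it. shutdown() calls the no-argument overload, which delegates to interruptIdleWorkers(false).
interruptWorkers, in contrast, interrupts every Worker by calling Worker.interruptIfStarted().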
private void interruptIdleWorkers(boolean onlyOne) {
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        for (Worker w : workers) {
            Thread t = w.thread;
            if (!t.isInterrupted() && w.tryLock()) {
                try {
                    t.interrupt();
                } catch (SecurityException ignore) {
                } finally {
                    w.unlock();
                }
            }
            if (onlyOne)
                break;
        }
    } finally {
        mainLock.unlock();
    }
}
private void interruptWorkers() {
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        for (Worker w : workers)
            w.interruptIfStarted();
    } finally {
        mainLock.unlock();
    }
}
// Worker.interruptIfStarted(), repeated here for reference
void interruptIfStarted() {
    Thread t;
    if (getState() >= 0 && (t = thread) != null && !t.isInterrupted()) {
        try {
            t.interrupt();
        } catch (SecurityException ignore) {
        }
    }
}
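tryTerminate tries to terminate the pool. It does nothing while the pool is still running, is already at least TIDYING, or is in SHUTDOWN with tasks left in the queue. If workers remain, it interrupts one idle worker and returns. Once the worker count is 0, it moves the state to TIDYING, calls the terminated() hook, sets the state to TERMINATED, and signals the threads waiting on the termination condition.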
final void tryTerminate() {
    for (;;) {
        int c = ctl.get();
        if (isRunning(c) ||
            runStateAtLeast(c, TIDYING) ||
            (runStateOf(c) == SHUTDOWN && ! workQueue.isEmpty()))
            return;
        if (workerCountOf(c) != 0) { // Eligible to terminate
            interruptIdleWorkers(ONLY_ONE);
            return;
        }
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            if (ctl.compareAndSet(c, ctlOf(TIDYING, 0))) {
                try {
                    terminated();
                } finally {
                    ctl.set(ctlOf(TERMINATED, 0));
                    termination.signalAll();
                }
                return;
            }
        } finally {
            mainLock.unlock();
        }
        // else retry on failed CAS
    }
}
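From the caller's side, the end of this state machine is visible through awaitTermination(), isTerminated(), and the terminated() hook. The sketch below overrides the hook in an anonymous subclass; TerminationDemo and the flag are hypothetical names used only for the demonstration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class TerminationDemo {
    // Returns {awaitTermination result, hook was called, isTerminated()}.
    static boolean[] run() throws InterruptedException {
        AtomicBoolean hookCalled = new AtomicBoolean();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>()) {
            @Override
            protected void terminated() {
                super.terminated();
                hookCalled.set(true); // runs once, in the TIDYING state
            }
        };
        pool.execute(() -> { });
        pool.shutdown();
        // Blocks on the termination condition until tryTerminate signals it.
        boolean done = pool.awaitTermination(5, TimeUnit.SECONDS);
        return new boolean[] { done, hookCalled.get(), pool.isTerminated() };
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // true true true
    }
}
```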
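One last note: the JVM exits only after all non-daemon threads have finished. Threads created by the default thread factory are non-daemon, so when a ThreadPoolExecutor runs a task that never finishes and does not respond to interrupts, that task keeps its worker thread alive, and the JVM keeps running and does not exit.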
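One possible mitigation, when losing in-flight work at exit is acceptable, is to give the pool a thread factory that creates daemon threads. This is a sketch, not a recommendation for every pool: the factory below and its naming scheme are hypothetical, and daemon workers are simply abandoned when the JVM exits, so tasks may be cut off midway.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

class DaemonFactoryDemo {
    // Hypothetical factory: names threads "<prefix>-<n>" and marks them as daemons.
    static ThreadFactory daemonFactory(String prefix) {
        AtomicInteger sequence = new AtomicInteger();
        return r -> {
            Thread t = new Thread(r, prefix + "-" + sequence.incrementAndGet());
            t.setDaemon(true); // daemon threads do not keep the JVM alive
            return t;
        };
    }

    // Runs a probe task and reports whether the worker thread is a daemon.
    static boolean workerIsDaemon() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1, daemonFactory("daemon-pool"));
        try {
            Callable<Boolean> probe = () -> Thread.currentThread().isDaemon();
            return pool.submit(probe).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(workerIsDaemon()); // true
    }
}
```

For tasks that must complete, the more robust approach is still to make them responsive to interruption and to shut the pool down explicitly.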