由于最近在写个爬虫相关的,所以对线程池相关的了解的一下。结合之前的使用以及书本上看的一些东西,在这儿做一些总结。顺便吐槽一下功能欠缺的Future。
线程池:
JDK自己也有提供一个线程池工具类java.util.concurrent.Executors
这个类中实现了如下一些方法:
如图中都是构建线程池的方法。其中分两个类。
newWorkStealingPool是通过调用ForkJoinPool来实现的。
其余的构造是调用ThreadPoolExecutor来实现的。
ThreadPoolExecutor
接下来看一下这集万千宠爱于一身的类的构造方法:
/**
* Creates a new {@code ThreadPoolExecutor} with the given initial
* parameters.
*
* @param corePoolSize the number of threads to keep in the pool, even
* if they are idle, unless {@code allowCoreThreadTimeOut} is set
* @param maximumPoolSize the maximum number of threads to allow in the
* pool
* @param keepAliveTime when the number of threads is greater than
* the core, this is the maximum time that excess idle threads
* will wait for new tasks before terminating.
* @param unit the time unit for the {@code keepAliveTime} argument
* @param workQueue the queue to use for holding tasks before they are
* executed. This queue will hold only the {@code Runnable}
* tasks submitted by the {@code execute} method.
* @param threadFactory the factory to use when the executor
* creates a new thread
* @param handler the handler to use when execution is blocked
* because the thread bounds and queue capacities are reached
* @throws IllegalArgumentException if one of the following holds:
* {@code corePoolSize < 0}
* {@code keepAliveTime < 0}
* {@code maximumPoolSize <= 0}
* {@code maximumPoolSize < corePoolSize}
* @throws NullPointerException if {@code workQueue}
* or {@code threadFactory} or {@code handler} is null
*/
public ThreadPoolExecutor(int corePoolSize,
int maximumPoolSize,
long keepAliveTime,
TimeUnit unit,
BlockingQueue workQueue,
ThreadFactory threadFactory,
RejectedExecutionHandler handler) {
if (corePoolSize < 0 ||
maximumPoolSize <= 0 ||
maximumPoolSize < corePoolSize ||
keepAliveTime < 0)
throw new IllegalArgumentException();
if (workQueue == null || threadFactory == null || handler == null)
throw new NullPointerException();
this.corePoolSize = corePoolSize;
this.maximumPoolSize = maximumPoolSize;
this.workQueue = workQueue;
this.keepAliveTime = unit.toNanos(keepAliveTime);
this.threadFactory = threadFactory;
this.handler = handler;
}
参数名 | 参数说明 |
---|---|
corePoolSize | 核心线程池大小 |
maximumPoolSize | 最大线程池大小,当core用尽,queue排满,就会根据max来创建新的临时的线程(临时工) |
keepAliveTime | 线程池中超过corePoolSize的线程数的线程的最大空闲存活时间 |
unit | 上一个时间属性的单位 |
workQueue | 当core线程用完了,会先把任务塞进该阻塞队列 |
threadFactory | 线程创建工厂类 |
handler | 看着名字就知道这是拒绝策略,当core线程池满了,线程队列满了,最大线程池大小也满了的时候,就会触发拒绝策略 |
拒绝策略 RejectedExecutionHandler
JDK自带的拒绝策略有:
ThreadPoolExecutor.AbortPolicy
线程池中的数量等于最大线程数时、直接抛出RejectedExecutionException
ThreadPoolExecutor.CallerRunsPolicy
重试执行当前的任务,交由调用者线程来执行任务
ThreadPoolExecutor.DiscardOldestPolicy
抛弃线程池中最后一个要执行的任务,并执行新传入的任务
ThreadPoolExecutor.DiscardPolicy
看着名字就知道,直接抛弃
偶尔我们可能也有自己的拒绝策略,比如实现当满了的时候等待。就可以如笔者下列创建的线程池这样写。
private ExecutorService synExecutorPool = new ThreadPoolExecutor(5, 5, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>(1), new ThreadFactory() {
@Override
public Thread newThread(Runnable r) {
return new Thread(r, "synExecutor Thread : " + (threadNum++));
}
},
//拒绝策略
(Runnable r, ThreadPoolExecutor executor) -> {
if (!executor.isShutdown()) {
try {
//阻塞添加该任务到queue,直到有资源被空出来
executor.getQueue().put(r);
} catch (InterruptedException e) {
logger.error(e.toString(), e);
Thread.currentThread().interrupt();
}
}
});
ThreadPoolExcutor Worker
任务的执行,一般都是调用方法
public void execute(Runnable command)
OR
public Future submit(Callable task)
execute方法在执行的时候就会去判断corePoolSize Queue 以及maximumPoolSize 来决定是否添加新的worker来执行,或者入队,或者添加新的thread,或者应该reject 。
这里我们就要说到Worker了。
Worker是ThreadPoolExecutor的一个内部类,实现了AbstractQueuedSynchronizer抽象类。
/**
* Class Worker mainly maintains interrupt control state for
* threads running tasks, along with other minor bookkeeping.
* This class opportunistically extends AbstractQueuedSynchronizer
* to simplify acquiring and releasing a lock surrounding each
* task execution. This protects against interrupts that are
* intended to wake up a worker thread waiting for a task from
* instead interrupting a task being run. We implement a simple
* non-reentrant mutual exclusion lock rather than use
* ReentrantLock because we do not want worker tasks to be able to
* reacquire the lock when they invoke pool control methods like
* setCorePoolSize. Additionally, to suppress interrupts until
* the thread actually starts running tasks, we initialize lock
* state to a negative value, and clear it upon start (in
* runWorker).
*/
private final class Worker extends AbstractQueuedSynchronizer implements Runnable
通过该类的描述,我们可以知道这个类是主要控制线程执行任务时候的interrupt操作。它集成了AQS,实现了非重入锁,以此保护一个正在执行任务的worker不被打断。为啥要不直接使用ReentrantLock,是因为不想Worker task在setCorePoolSize这种线程池控制方法调用时能重新获取到锁。
构造方法
Worker(Runnable firstTask) {
setState(-1); // inhibit interrupts until runWorker
this.firstTask = firstTask;
this.thread = getThreadFactory().newThread(this);
}
Run
/** Delegates main run loop to outer runWorker */
public void run() {
runWorker(this);
}
final void runWorker(Worker w) {
Thread wt = Thread.currentThread();
Runnable task = w.firstTask;
w.firstTask = null;
w.unlock(); // allow interrupts
boolean completedAbruptly = true;
try {
//循环获取任务。注意getTask是一个阻塞调用。
while (task != null || (task = getTask()) != null) {
w.lock();
// If pool is stopping, ensure thread is interrupted;
// if not, ensure thread is not interrupted. This
// requires a recheck in second case to deal with
// shutdownNow race while clearing interrupt
if ((runStateAtLeast(ctl.get(), STOP) ||
(Thread.interrupted() &&
runStateAtLeast(ctl.get(), STOP))) &&
!wt.isInterrupted())
wt.interrupt();
try {
beforeExecute(wt, task);
Throwable thrown = null;
try {
//执行线程的run方法
task.run();
} catch (RuntimeException x) {
thrown = x; throw x;
} catch (Error x) {
thrown = x; throw x;
} catch (Throwable x) {
thrown = x; throw new Error(x);
} finally {
afterExecute(task, thrown);
}
} finally {
task = null;
w.completedTasks++;
w.unlock();
}
}
completedAbruptly = false;
} finally {
processWorkerExit(w, completedAbruptly);
}
}
RUN方法调用的是ThreadPoolExecutor的runWorker方法。其中while循环的条件调用getTask()获取任务。
线程池核心状态ctl
读ThreadPoolExecutor源码之前,先了解一下ctl。它是ThreadPoolExecutor中的一个属性。
private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
他是个AtomicInteger, Integer,32位。低29位记录线程池中线程数,通过高3位表示线程池的运行状态:
1、RUNNING:-1 << COUNT_BITS,即高3位为111,该状态的线程池会接收新任务,并处理阻塞队列中的任务;
2、SHUTDOWN: 0 << COUNT_BITS,即高3位为000,该状态的线程池不会接收新任务,但会处理阻塞队列中的任务;
3、STOP : 1 << COUNT_BITS,即高3位为001,该状态的线程不会接收新任务,也不会处理阻塞队列中的任务,而且会中断正在运行的任务;
4、TIDYING : 2 << COUNT_BITS,即高3位为010, 所有的任务都已经终止;
5、TERMINATED: 3 << COUNT_BITS,即高3位为011, terminated()方法已经执行完成
getTask
private Runnable getTask() {
boolean timedOut = false; // Did the last poll() time out?
for (;;) {
//例行检查 线程池状态。
int c = ctl.get();
int rs = runStateOf(c);
// Check if queue empty only if necessary.
if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
decrementWorkerCount();
return null;
}
int wc = workerCountOf(c);
//判断是否需要剔除worker
boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;
if ((wc > maximumPoolSize || (timed && timedOut))
&& (wc > 1 || workQueue.isEmpty())) {
//通过CAS减少ctl的值,也就是更新worker的数量
if (compareAndDecrementWorkerCount(c))
return null;
continue;
}
try {
//从队列中获取任务 返回。如果设定是可以有删除的worker,就poll keepAliveTIme的时候,看是否有任务。如果没有任务就在下一轮for循环中删除
Runnable r = timed ?
workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
workQueue.take();
if (r != null)
return r;
timedOut = true;
} catch (InterruptedException retry) {
timedOut = false;
}
}
}
在getTask中通过poll和take从workQueue中获取任务,顺便判断是否需要减少coreSize的数量,以及判断空闲时间是否达到了需要减少maxSize的数量。
worker何时被调用的呢
其实从一开始,worker就已经被启用了。在调用submit 方法的时候,就有调用方法addWorker,添加一个新的worker。
private boolean addWorker(Runnable firstTask, boolean core) {
retry:
for (;;) {
//例行检查线程池状态
int c = ctl.get();
int rs = runStateOf(c);
// Check if queue empty only if necessary.
if (rs >= SHUTDOWN &&
! (rs == SHUTDOWN &&
firstTask == null &&
! workQueue.isEmpty()))
return false;
for (;;) {
int wc = workerCountOf(c);
if (wc >= CAPACITY ||
wc >= (core ? corePoolSize : maximumPoolSize))
return false;
//CAS增加数量
if (compareAndIncrementWorkerCount(c))
break retry;
c = ctl.get(); // Re-read ctl
if (runStateOf(c) != rs)
continue retry;
// else CAS failed due to workerCount change; retry inner loop
}
}
//上面的是一些判断,校验逻辑,下面的才是worker生成,运行
boolean workerStarted = false;
boolean workerAdded = false;
Worker w = null;
try {
//new一个新的worker,加入firstTask
w = new Worker(firstTask);
//拿到创建worker时候创建的线程
final Thread t = w.thread;
if (t != null) {
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
// Recheck while holding lock.
// Back out on ThreadFactory failure or if
// shut down before lock acquired.
int rs = runStateOf(ctl.get());
if (rs < SHUTDOWN ||
(rs == SHUTDOWN && firstTask == null)) {
if (t.isAlive()) // precheck that t is startable
throw new IllegalThreadStateException();
workers.add(w);
int s = workers.size();
if (s > largestPoolSize)
largestPoolSize = s;
workerAdded = true;
}
} finally {
mainLock.unlock();
}
if (workerAdded) {
//开启线程。也就是执行worker.run方法
t.start();
workerStarted = true;
}
}
} finally {
if (! workerStarted)
addWorkerFailed(w);
}
return workerStarted;
}
到这儿所有的事情都串起来了。
ThreadPoolExecutor.submit -> addWorker() -> Worker.thread.start()-> ThreadPoolExecutor.runWorker()->getTask()->workQueue.take()->task.run()
在task.run()的前后还有两个空实现方法
beforeExecute 和 afterExecute 提供给用户实现自己的线程池的时候进行扩展
问题
有同学问我为啥在调用getTask函数的时候还会有wc > maximumPoolSize的判断。当时我也懵逼了一下。然后我发现有个线程池完成初始化之后是可以调用set函数来重置corePoolSize和maximumSize的。
public void setCorePoolSize(int corePoolSize) {
if (corePoolSize < 0)
throw new IllegalArgumentException();
int delta = corePoolSize - this.corePoolSize;
this.corePoolSize = corePoolSize;
if (workerCountOf(ctl.get()) > corePoolSize)
interruptIdleWorkers();
else if (delta > 0) {
// We don't really know how many new threads are "needed".
// As a heuristic, prestart enough new workers (up to new
// core size) to handle the current number of tasks in
// queue, but stop if queue becomes empty while doing so.
int k = Math.min(delta, workQueue.size());
while (k-- > 0 && addWorker(null, true)) {
if (workQueue.isEmpty())
break;
}
}
}
public void setMaximumPoolSize(int maximumPoolSize) {
if (maximumPoolSize <= 0 || maximumPoolSize < corePoolSize)
throw new IllegalArgumentException();
this.maximumPoolSize = maximumPoolSize;
if (workerCountOf(ctl.get()) > maximumPoolSize)
interruptIdleWorkers();
}
对吧。这就很好理解了撒。在调用setMaximumPoolSize的时候会就会出现wc>maximumPoolSize的情况。然后会调用interruptIdleWorkers来中断回收一些空闲的workers。