JDK Thread Pool: ThreadPoolExecutor Source Code Analysis

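Contents

  • 1. Basic Model
    • 1.1 The task abstraction: FutureTask
      • 1.1.1 FutureTask class hierarchy
      • 1.1.2 FutureTask source code
    • 1.2 The thread pool abstraction: from Executor and ExecutorService to AbstractExecutorService
      • 1.2.1 Thread pool class hierarchy
      • 1.2.2 Thread pool interface definitions
      • 1.2.3 Summary
  • 2. How the Thread Pool Works
    • 2.1 Overview of the working mechanism
    • 2.2 Task submission and execution flow
  • 3. ThreadPoolExecutor Source Code Analysis
    • 3.1 The thread pool data model
    • 3.2 Task execution in the thread pool
  • 4. Summary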

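1. Basic Model

1.1 The task abstraction: FutureTask

1.1.1 FutureTask class hierarchy

[Figure 1: FutureTask class hierarchy diagram]
RunnableFuture can be regarded as the thread pool's unified abstraction of a task, so the pool's internal execution logic does not need to distinguish between Runnable and Callable.

1.1.2 FutureTask source code

FutureTask holds a Callable internally. If the task is a Runnable, it is adapted into a Callable through RunnableAdapter, so FutureTask can handle the submitted task whether it is a Runnable or a Callable.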

public class FutureTask<V> implements RunnableFuture<V> {

    /**
     * The run state of this task, initially NEW.  The run state
     * transitions to a terminal state only in methods set,
     * setException, and cancel.  During completion, state may take on
     * transient values of COMPLETING (while outcome is being set) or
     * INTERRUPTING (only while interrupting the runner to satisfy a
     * cancel(true)). Transitions from these intermediate to final
     * states use cheaper ordered/lazy writes because values are unique
     * and cannot be further modified.
     *
     * Possible state transitions:
     * NEW -> COMPLETING -> NORMAL
     * NEW -> COMPLETING -> EXCEPTIONAL
     * NEW -> CANCELLED
     * NEW -> INTERRUPTING -> INTERRUPTED
     */
    private volatile int state;
    private static final int NEW          = 0;
    private static final int COMPLETING   = 1;
    private static final int NORMAL       = 2;
    private static final int EXCEPTIONAL  = 3;
    private static final int CANCELLED    = 4;
    private static final int INTERRUPTING = 5;
    private static final int INTERRUPTED  = 6;

    /** The underlying callable; nulled out after running */
    private Callable<V> callable;

    /** The result to return or exception to throw from get() */
    private Object outcome; // non-volatile, protected by state reads/writes
    /** The thread running the callable; CASed during run() */
    private volatile Thread runner;
    /** Treiber stack of waiting threads */
    private volatile WaitNode waiters;


    // Wraps a Callable
    public FutureTask(Callable<V> callable) {
        if (callable == null)
            throw new NullPointerException();
        this.callable = callable;
        this.state = NEW;       // ensure visibility of callable
    }

    // Wraps a Runnable: RunnableAdapter adapts the Runnable into a Callable
    public FutureTask(Runnable runnable, V result) {
        this.callable = Executors.callable(runnable, result);
        this.state = NEW;       // ensure visibility of callable
    }
}

// See java.util.concurrent.Executors.RunnableAdapter
static final class RunnableAdapter<T> implements Callable<T> {
    final Runnable task;
    final T result;
    RunnableAdapter(Runnable task, T result) {
        this.task = task;
        this.result = result;
    }
    public T call() {
        task.run();
        return result;
    }
}
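
To make the two constructors concrete, here is a minimal sketch that wraps one Callable and one Runnable in a FutureTask and runs each on the calling thread. The class and method names (FutureTaskWrapDemo, runCallable, runRunnable) are made up for this illustration; only FutureTask itself comes from the JDK.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

// Hypothetical demo class, not part of the JDK.
class FutureTaskWrapDemo {

    // Wraps a Callable directly and runs it on the calling thread.
    static String runCallable() {
        Callable<String> callable = () -> "from callable";
        FutureTask<String> task = new FutureTask<>(callable);
        task.run();                       // FutureTask is itself a Runnable
        return getQuietly(task);
    }

    // Wraps a Runnable plus a preset result. Internally the Runnable is
    // adapted to a Callable that runs it and then returns that result.
    static String runRunnable(StringBuilder sideEffect) {
        Runnable runnable = () -> sideEffect.append("ran");
        FutureTask<String> task = new FutureTask<>(runnable, "preset result");
        task.run();
        return getQuietly(task);
    }

    // get() declares checked exceptions; rethrow them unchecked for brevity.
    private static <V> V getQuietly(FutureTask<V> task) {
        try {
            return task.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runCallable());
        StringBuilder log = new StringBuilder();
        System.out.println(runRunnable(log) + " / side effect: " + log);
    }
}
```

In both cases the caller ends up with the same kind of object: something that can be run like a Runnable and queried like a Future.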

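1.2 The thread pool abstraction: from Executor and ExecutorService to AbstractExecutorService

1.2.1 Thread pool class hierarchy

[Figure 2: thread pool class hierarchy diagram]

1.2.2 Thread pool interface definitions

  • Executor
    The top-level interface. It defines only the most basic method, execute(Runnable command).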
public interface Executor {

    void execute(Runnable command);
}
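  • ExecutorService
    1. Defines lifecycle management methods such as shutdown() and shutdownNow(), so that clients can control the pool's lifecycle.
    2. Extends Executor with more ways to submit tasks, such as submit(Callable task) and submit(Runnable task), so clients are no longer limited to submitting Runnable tasks.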
public interface ExecutorService extends Executor {

    void shutdown();
    List<Runnable> shutdownNow();

    boolean isShutdown();

    boolean isTerminated();

    boolean awaitTermination(long timeout, TimeUnit unit)
        throws InterruptedException;

    <T> Future<T> submit(Callable<T> task);

    <T> Future<T> submit(Runnable task, T result);

    Future<?> submit(Runnable task);

    <T> List<Future<T>> invokeAll(Collection<? extends Callable<T>> tasks)
        throws InterruptedException;

    <T> List<Future<T>> invokeAll(Collection<? extends Callable<T>> tasks,
                                  long timeout, TimeUnit unit)
        throws InterruptedException;

    <T> T invokeAny(Collection<? extends Callable<T>> tasks)
        throws InterruptedException, ExecutionException;

    <T> T invokeAny(Collection<? extends Callable<T>> tasks,
                    long timeout, TimeUnit unit)
        throws InterruptedException, ExecutionException, TimeoutException;
}
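  • AbstractExecutorService
    1. Encapsulates the adaptation logic for both task types: submitted Runnable and Callable tasks are uniformly wrapped into a RunnableFuture.
    2. Defines the template methods for submitting tasks whose return value matters: first adapt the task into a RunnableFuture, then hand it to the subclass via execute(task).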
public abstract class AbstractExecutorService implements ExecutorService {

    protected <T> RunnableFuture<T> newTaskFor(Runnable runnable, T value) {
        // Wraps the Runnable in a FutureTask. This is slightly more involved than wrapping a Callable: RunnableAdapter adapts the Runnable into a Callable that returns the given value.
        // FutureTask itself implements Runnable.
        return new FutureTask<T>(runnable, value);
    }

    protected <T> RunnableFuture<T> newTaskFor(Callable<T> callable) {
        // Wraps the Callable in a FutureTask
        return new FutureTask<T>(callable);
    }

    public Future<?> submit(Runnable task) {
        if (task == null) throw new NullPointerException();
        RunnableFuture<Void> ftask = newTaskFor(task, null);
        // execute() comes from Executor's void execute(Runnable command). This class does not implement it and leaves it to subclasses: a textbook template method pattern!
        execute(ftask);
        return ftask;
    }

    public <T> Future<T> submit(Runnable task, T result) {
        if (task == null) throw new NullPointerException();
        RunnableFuture<T> ftask = newTaskFor(task, result);
        execute(ftask);
        return ftask;
    }

    public <T> Future<T> submit(Callable<T> task) {
        if (task == null) throw new NullPointerException();
        // Wraps the Callable in a RunnableFuture
        RunnableFuture<T> ftask = newTaskFor(task);
        execute(ftask);
        return ftask;
    }
}
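
One way to watch the template method at work is to look at what execute() actually receives after a submit() call. The sketch below does this through the beforeExecute hook of ThreadPoolExecutor; the class name SubmitWrapsDemo and its method are hypothetical and exist only for this illustration.

```java
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, not part of the JDK.
class SubmitWrapsDemo {

    // Submits one Callable and reports whether the Runnable that reached
    // the worker (observed in beforeExecute) is a FutureTask.
    static boolean executeReceivedFutureTask() {
        AtomicReference<Runnable> seen = new AtomicReference<>();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>()) {
            @Override
            protected void beforeExecute(Thread t, Runnable r) {
                seen.set(r);              // r is whatever execute() was given
            }
        };
        try {
            pool.submit(() -> 42).get();  // block until the task has run
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return seen.get() instanceof FutureTask;
    }

    public static void main(String[] args) {
        System.out.println("execute() received a FutureTask: "
                + executeReceivedFutureTask());
    }
}
```

The client passed a plain lambda, yet the pool only ever sees the FutureTask created by newTaskFor.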

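1.2.3 Summary

The submit methods of AbstractExecutorService show that, whether a Callable or a Runnable is submitted, the object actually passed on is a FutureTask, and it is ultimately handed to execute(Runnable command) for execution.

In JDK 8, submitting a task through CompletableFuture.supplyAsync() works slightly differently: the task is not wrapped in a FutureTask. The object actually passed is an AsyncSupply, which also implements Runnable. It still ends up in execute(Runnable command), so when we trace the pool's task execution flow later, the execute method of ThreadPoolExecutor is the one to focus on.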

public static <U> CompletableFuture<U> supplyAsync(Supplier<U> supplier) {
    return asyncSupplyStage(asyncPool, supplier);
}

static <U> CompletableFuture<U> asyncSupplyStage(Executor e,
                                                 Supplier<U> f) {
    if (f == null) throw new NullPointerException();
    CompletableFuture<U> d = new CompletableFuture<U>();
    e.execute(new AsyncSupply<U>(d, f));
    return d;
}
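
The same observation can be checked from the outside with the two-argument overload supplyAsync(supplier, executor). In this sketch the executor is a hypothetical inline one that records the Runnable it is given and runs it on the calling thread; SupplyAsyncDemo and probe are names invented for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, not part of the JDK.
class SupplyAsyncDemo {

    // Runs a supplier through supplyAsync and reports whether the Runnable
    // handed to the executor was a FutureTask, plus the computed value.
    static String probe() {
        AtomicReference<Runnable> seen = new AtomicReference<>();
        // Inline executor: remember the command, then run it right here.
        Executor inline = command -> {
            seen.set(command);
            command.run();
        };
        int value = CompletableFuture.supplyAsync(() -> 21 * 2, inline).join();
        boolean futureTask = seen.get() instanceof FutureTask;
        return "futureTask=" + futureTask + " value=" + value;
    }

    public static void main(String[] args) {
        System.out.println(probe());   // prints "futureTask=false value=42"
    }
}
```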

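2. How the Thread Pool Works

2.1 Overview of the working mechanism

The purpose of a thread pool is to decouple task submission from task execution. What does that mean? Let's first look at the overall flow:

  • 1. The client submits a task directly to the thread pool.
  • 2. The thread pool takes the task and executes it.
  • 3. If the client cares about the return value, the thread pool returns the result to the client.
    [Figure 3: overview of the thread pool working mechanism]
    Steps 1 and 3 are straightforward. The complexity lies in step 2: a task may be executed directly by a Worker, buffered in the task queue workQueue, or handed to the configured saturation policy (RejectedExecutionHandler), which decides what to do with it. Which conditions lead to each of these three branches? With that question in mind, let's see what the pool actually does between submission and execution.

2.2 Task submission and execution flow

[Figure 4: task submission and execution flowchart]
As the figure above shows, when a client submits a new task, the thread pool processes it as follows:

  • 1. First, has the number of threads reached the core pool size? If not, create a new worker thread to run the task directly; if so, go to the next step.
  • 2. Next, is the task queue full? If not, buffer the newly submitted task in the queue; if so, go to the next step.
  • 3. Finally, has the number of threads reached the pool's maximum? If not, create a new worker thread to run the task directly; if so, hand the task to the saturation policy.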
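
The three steps can be observed on a deliberately tiny pool. The sketch below assumes corePoolSize=1, maximumPoolSize=2 and a queue of capacity 1, and submits four tasks that all block on a latch, so nothing finishes while we look. SubmitFlowDemo and trace are hypothetical names for this illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
class SubmitFlowDemo {

    // Returns "poolSize/queueSize" after each of four submissions,
    // or "rejected" when the saturation policy refuses the task.
    static String trace() {
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();          // keep the worker busy
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 1L, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(1));
        StringBuilder sb = new StringBuilder();
        try {
            for (int i = 1; i <= 4; i++) {
                try {
                    pool.execute(blocker);
                    sb.append(pool.getPoolSize()).append('/')
                      .append(pool.getQueue().size());
                } catch (RejectedExecutionException e) {
                    sb.append("rejected"); // default policy is AbortPolicy
                }
                if (i < 4) sb.append(' ');
            }
        } finally {
            release.countDown();           // let the blocked tasks finish
            pool.shutdown();
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(trace());       // prints "1/0 1/1 2/1 rejected"
    }
}
```

Task 1 starts the core thread, task 2 lands in the queue, task 3 forces a second thread because the queue is full, and task 4 is rejected because both limits have been reached.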

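Q1: If corePoolSize=3, have 3 threads already been created by the time the pool is constructed?
A1: No, there are 0. The pool is lazy: by default it only starts creating threads once tasks are submitted.

Q2: Are tasks executed in the order the client submitted them?
A2: No. Once the queue is full, if the thread limit has not been reached, later tasks are not queued but handled directly by newly created threads, so a task submitted later can run before one submitted earlier.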
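
Both answers can be checked with a small experiment. This is a sketch under assumed pool sizes (3/3 for Q1; core=1, max=2, queue capacity 1 for Q2), and LazyAndOrderDemo with its methods is a hypothetical name. It also uses prestartAllCoreThreads(), the JDK method that opts out of the lazy behaviour.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
class LazyAndOrderDemo {

    // Q1: returns {size after construction, threads prestarted, size afterwards}.
    static int[] lazyCreation() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                3, 3, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        try {
            int before = pool.getPoolSize();
            int started = pool.prestartAllCoreThreads();
            return new int[] { before, started, pool.getPoolSize() };
        } finally {
            pool.shutdown();
        }
    }

    // Q2: returns the order in which tasks 1, 2, 3 started running.
    static List<Integer> startOrder() {
        List<Integer> order = Collections.synchronizedList(new ArrayList<Integer>());
        CountDownLatch twoStarted = new CountDownLatch(2);
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 1L, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(1));
        try {
            for (int id = 1; id <= 3; id++) {
                final int taskId = id;
                pool.execute(() -> {
                    order.add(taskId);
                    twoStarted.countDown();
                    awaitQuietly(release);
                });
            }
            awaitQuietly(twoStarted);  // tasks 1 and 3 run, task 2 sits in the queue
        } finally {
            release.countDown();
            pool.shutdown();           // queued task 2 still gets executed
        }
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(order);
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        int[] s = lazyCreation();
        System.out.println("before=" + s[0] + " prestarted=" + s[1] + " after=" + s[2]);
        System.out.println("start order: " + startOrder());
    }
}
```

Tasks 1 and 3 may start in either order relative to each other, but task 2, which was submitted second, always starts last in this setup.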


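3. ThreadPoolExecutor Source Code Analysis

3.1 The thread pool data model

  • a. State-related fields
// The main pool control state. ctl packs two values: the high 3 bits hold the run state, the low 29 bits hold the number of live workers
// RUNNING is 111 00000 00000000 00000000 00000000, so the initial ctl is 111 00000 00000000 00000000 00000000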
private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
// 29
private static final int COUNT_BITS = Integer.SIZE - 3;

// Theoretical maximum number of workers (536870911 = 2^29 - 1)
// Shift 1 left by 29 bits, then subtract one: 000 11111 11111111 11111111 11111111 (536870911 = 2^29 - 1)
private static final int CAPACITY   = (1 << COUNT_BITS) - 1;

// The high 3 bits encode the run state
// runState is stored in the high-order bits

// -1 in sign-magnitude form:  100 00000 00000000 00000000 00000001
// -1 in ones' complement:     111 11111 11111111 11111111 11111110 (invert every bit of the sign-magnitude form except the sign bit)
// -1 in two's complement:     111 11111 11111111 11111111 11111111 (ones' complement plus 1)
// Shifted left by 29 bits, RUNNING is: 111 00000 00000000 00000000 00000000
private static final int RUNNING    = -1 << COUNT_BITS;
// 000 00000 00000000 00000000 00000000
private static final int SHUTDOWN   =  0 << COUNT_BITS;
// 001 00000 00000000 00000000 00000000
private static final int STOP       =  1 << COUNT_BITS;
// 010 00000 00000000 00000000 00000000
private static final int TIDYING    =  2 << COUNT_BITS;
// 011 00000 00000000 00000000 00000000
private static final int TERMINATED =  3 << COUNT_BITS;


// Reads the current runState: the high 3 bits of CAPACITY are 0, so ~CAPACITY is 111 00000 00000000 00000000 00000000, and the method simply extracts the high 3 bits of ctl
// ctl starts out in the RUNNING state, i.e. 111 00000 00000000 00000000 00000000
// Packing and unpacking ctl
private static int runStateOf(int c)     { return c & ~CAPACITY; }

// Reads the current number of live workers: the high 3 bits of CAPACITY are 0, so ANDing it with ctl simply extracts the low 29 bits of ctl
private static int workerCountOf(int c)  { return c & CAPACITY; }

// Packs a run state and a worker count into one ctl value (also used to initialize ctl)
private static int ctlOf(int rs, int wc) { return rs | wc; }
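
The packing scheme is easy to try out in isolation. The sketch below copies the constants and the three helper methods into a standalone class (CtlBitsDemo and bits are hypothetical names; the real fields are private to ThreadPoolExecutor) and packs a RUNNING state with 5 workers.

```java
// Hypothetical demo class that mirrors the private constants shown above.
class CtlBitsDemo {
    static final int COUNT_BITS = Integer.SIZE - 3;          // 29
    static final int CAPACITY   = (1 << COUNT_BITS) - 1;     // low 29 bits set
    static final int RUNNING    = -1 << COUNT_BITS;          // high bits 111
    static final int SHUTDOWN   =  0 << COUNT_BITS;          // high bits 000
    static final int STOP       =  1 << COUNT_BITS;          // high bits 001

    static int runStateOf(int c)     { return c & ~CAPACITY; }
    static int workerCountOf(int c)  { return c & CAPACITY; }
    static int ctlOf(int rs, int wc) { return rs | wc; }

    // Renders all 32 bits of v, left-padded with zeros.
    static String bits(int v) {
        String s = Integer.toBinaryString(v);
        StringBuilder sb = new StringBuilder();
        for (int i = s.length(); i < 32; i++) sb.append('0');
        return sb.append(s).toString();
    }

    public static void main(String[] args) {
        int ctl = ctlOf(RUNNING, 5);
        System.out.println("ctl         = " + bits(ctl));
        System.out.println("runState    = " + bits(runStateOf(ctl)));
        System.out.println("workerCount = " + workerCountOf(ctl));  // 5
        // RUNNING is negative, so plain integer comparison orders the states.
        System.out.println("RUNNING < SHUTDOWN < STOP: "
                + (RUNNING < SHUTDOWN && SHUTDOWN < STOP));
    }
}
```

Because RUNNING is the only negative state value, checks such as rs >= SHUTDOWN in the source are enough to ask "is the pool no longer running?".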

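The source code checks the pool's run state in many places, so it helps to keep the state machine in mind (the figure is based on the state-transition comment written by the source code's author).
[Figure 5: thread pool run-state transition diagram]

  • b. Other fields
    The two important fields here are the work queue workQueue and the set of worker threads workers. The work queue is a BlockingQueue and is itself thread-safe; workers is just a HashSet, which is not thread-safe, so a lock must be held whenever a worker is added or removed. The source uses a ReentrantLock for this.
// The work queue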
private final BlockingQueue<Runnable> workQueue;


/**
 * Lock held on access to workers set and related bookkeeping.
 * While we could use a concurrent set of some sort, it turns out
 * to be generally preferable to use a lock. Among the reasons is
 * that this serializes interruptIdleWorkers, which avoids
 * unnecessary interrupt storms, especially during shutdown.
 * Otherwise exiting threads would concurrently interrupt those
 * that have not yet interrupted. It also simplifies some of the
 * associated statistics bookkeeping of largestPoolSize etc. We
 * also hold mainLock on shutdown and shutdownNow, for the sake of
 * ensuring workers set is stable while separately checking
 * permission to interrupt and actually interrupting.
 */
// This lock must be held when accessing workers; changes to workers, largestPoolSize and completedTaskCount all happen under it
private final ReentrantLock mainLock = new ReentrantLock();

// The set of worker threads
/**
 * Set containing all worker threads in pool. Accessed only when
 * holding mainLock.
 */
private final HashSet<Worker> workers = new HashSet<Worker>();

/**
 * Wait condition to support awaitTermination
 */
private final Condition termination = mainLock.newCondition();

/**
 * Tracks largest attained pool size. Accessed only under
 * mainLock.
 */
private int largestPoolSize;

/**
 * Counter for completed tasks. Updated only on termination of
 * worker threads. Accessed only under mainLock.
 */
private long completedTaskCount;

// The thread factory
private volatile ThreadFactory threadFactory;

/**
 * Handler called when saturated or shutdown in execute.
 */
private volatile RejectedExecutionHandler handler;

/**
 * Timeout in nanoseconds for idle threads waiting for work.
 * Threads use this timeout when there are more than corePoolSize
 * present or if allowCoreThreadTimeOut. Otherwise they wait
 * forever for new work.
 */
private volatile long keepAliveTime;

/**
 * If false (default), core threads stay alive even when idle.
 * If true, core threads use keepAliveTime to time out waiting
 * for work.
 */
// Whether core threads may time out; false by default
private volatile boolean allowCoreThreadTimeOut;

/**
 * Core pool size is the minimum number of workers to keep alive
 * (and not allow to time out etc) unless allowCoreThreadTimeOut
 * is set, in which case the minimum is zero.
 */
private volatile int corePoolSize;

/**
 * Maximum pool size. Note that the actual maximum is internally
 * bounded by CAPACITY.
 */
private volatile int maximumPoolSize;

/**
 * The default rejected execution handler
 */
private static final RejectedExecutionHandler defaultHandler =
    new AbortPolicy();
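  • c. Constructor
    ThreadPoolExecutor offers several constructors for creating a thread pool, but they all end up in the 7-parameter constructor below. The parameter names are self-explanatory.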
public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler) {
    if (corePoolSize < 0 ||
        maximumPoolSize <= 0 ||
        maximumPoolSize < corePoolSize ||
        keepAliveTime < 0)
        throw new IllegalArgumentException();
    if (workQueue == null || threadFactory == null || handler == null)
        throw new NullPointerException();
    this.acc = System.getSecurityManager() == null ?
            null :
            AccessController.getContext();
    this.corePoolSize = corePoolSize;
    this.maximumPoolSize = maximumPoolSize;
    this.workQueue = workQueue;
    this.keepAliveTime = unit.toNanos(keepAliveTime);
    this.threadFactory = threadFactory;
    this.handler = handler;
}
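
As a usage illustration, the sketch below calls the 7-parameter constructor with explicit values and also triggers the argument check for maximumPoolSize < corePoolSize. All concrete values, the thread name prefix "demo-worker-" and the class name PoolConstructionDemo are assumptions chosen for the example, not recommendations.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the JDK.
class PoolConstructionDemo {

    // Builds a pool through the full 7-argument constructor.
    static ThreadPoolExecutor newPool() {
        AtomicInteger seq = new AtomicInteger();
        ThreadFactory factory = r -> {
            Thread t = new Thread(r, "demo-worker-" + seq.incrementAndGet());
            t.setDaemon(true);
            return t;
        };
        return new ThreadPoolExecutor(
                2,                                    // corePoolSize
                4,                                    // maximumPoolSize
                30L, TimeUnit.SECONDS,                // keepAliveTime + unit
                new ArrayBlockingQueue<Runnable>(100),// bounded work queue
                factory,                              // names the threads
                new ThreadPoolExecutor.AbortPolicy());// saturation policy
    }

    // Creates a pool, summarizes its settings, and shuts it down again.
    static String describeNewPool() {
        ThreadPoolExecutor pool = newPool();
        try {
            return "core=" + pool.getCorePoolSize()
                    + " max=" + pool.getMaximumPoolSize()
                    + " keepAliveNanos=" + pool.getKeepAliveTime(TimeUnit.NANOSECONDS);
        } finally {
            pool.shutdown();
        }
    }

    // maximumPoolSize < corePoolSize must fail the constructor's check.
    static boolean rejectsMaxBelowCore() {
        try {
            new ThreadPoolExecutor(4, 2, 30L, TimeUnit.SECONDS,
                    new ArrayBlockingQueue<Runnable>(100));
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(describeNewPool());
        System.out.println("rejects max < core: " + rejectsMaxBelowCore());
    }
}
```

Note that keepAliveTime is stored in nanoseconds regardless of the unit passed in, which matches the unit.toNanos(keepAliveTime) call in the constructor.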

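3.2 Task execution in the thread pool

  • execute
    execute is the core method of the thread pool. The overall flow that starts here covers task dispatching as well as creating and reclaiming worker threads.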
public void execute(Runnable command) {
    if (command == null)
        throw new NullPointerException();
    /*
     * Proceed in 3 steps:
     *
     * 1. If fewer than corePoolSize threads are running, try to
     * start a new thread with the given command as its first
     * task.  The call to addWorker atomically checks runState and
     * workerCount, and so prevents false alarms that would add
     * threads when it shouldn't, by returning false.
     *
     * 2. If a task can be successfully queued, then we still need
     * to double-check whether we should have added a thread
     * (because existing ones died since last checking) or that
     * the pool shut down since entry into this method. So we
     * recheck state and if necessary roll back the enqueuing if
     * stopped, or start a new thread if there are none.
     *
     * 3. If we cannot queue task, then we try to add a new
     * thread.  If it fails, we know we are shut down or saturated
     * and so reject the task.
     */
    int c = ctl.get();
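    // If the worker count is below the core pool size
    if (workerCountOf(c) < corePoolSize) {
        // [Key point 1: core threads not yet full, so start a new thread] Add a worker for this task; addWorker returns whether the thread was started
        if (addWorker(command, true))
            // Success: done
            return;
        // addWorker returned false, so re-read the control state. It fails without throwing, either because the pool is no longer RUNNING or because other threads raced the worker count up to corePoolSize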
        c = ctl.get();
    }

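    // [Key point 2: enqueue the task] If the pool is running and the task can be put into the blocking queue (the queue is not full)
    if (isRunning(c) && workQueue.offer(command)) {
        int recheck = ctl.get();
        // Recheck the pool state: if the pool is no longer running, remove the task from the queue and hand it to the saturation policy
        if (! isRunning(recheck) && remove(command))
            reject(command);
        // workerCount == 0: the pool has no threads, so create one to pull tasks from the blocking queue
        else if (workerCountOf(recheck) == 0)
            // Note that the first task is deliberately null here, so this thread does not run a task right away but takes one from the queue
            addWorker(null, false);
    }
    // [Key point 3: queue full but maximum pool size not reached, so a new thread handles the task directly] Enqueuing failed (the queue is full), so try to start a new thread bounded by maximumPoolSize
    // The new thread runs the current task directly instead of putting it in the queue
    else if (!addWorker(command, false))
        // Could not start a worker (the pool is saturated at maximumPoolSize or has been shut down), so hand the task to the saturation policy
        reject(command);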
}
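
The workerCountOf(recheck) == 0 branch in step 2 is easiest to see with corePoolSize = 0. In that configuration step 1 never applies, so the task is queued first and a worker with a null first task is created to drain the queue. The sketch below assumes core=0, max=1 and an unbounded queue; ZeroCoreDemo is a hypothetical name.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
class ZeroCoreDemo {

    // Returns the pool size observed while one task is running on a pool
    // whose corePoolSize is 0.
    static int poolSizeWhileRunning() {
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, 1, 1L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
        try {
            // 0 < 0 is false, so the task is offered to the queue; the
            // recheck then finds zero workers and calls addWorker(null, false).
            pool.execute(() -> {
                started.countDown();
                awaitQuietly(release);
            });
            awaitQuietly(started);
            return pool.getPoolSize();
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("pool size while running: " + poolSizeWhileRunning());
    }
}
```

Without that recheck, a task queued on a pool with no threads would sit in the queue with nobody to run it.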
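  • addWorker
    As the name suggests, this checks whether a new thread may be added to handle a task. If so, it creates the thread, which then runs firstTask directly (when firstTask is not null) without the task going through the queue. Note the core parameter: true bounds thread creation by corePoolSize, false bounds it by maximumPoolSize.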
private boolean addWorker(Runnable firstTask, boolean core) {
    retry:
    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c);

        // If runState is at least SHUTDOWN (note: the initial state RUNNING is less than SHUTDOWN)
        // Check if queue empty only if necessary.
        if (rs >= SHUTDOWN &&
            ! (rs == SHUTDOWN &&
               firstTask == null &&
               ! workQueue.isEmpty()))
            return false;

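        /** Here the pool is RUNNING, or SHUTDOWN with queued tasks still to drain */
        // Inner loop
        for (;;) {
            // Current worker count
            int wc = workerCountOf(c);
            // The worker count has reached CAPACITY or the applicable bound (corePoolSize or maximumPoolSize): return false, no new thread may be created
            if (wc >= CAPACITY ||
                wc >= (core ? corePoolSize : maximumPoolSize))
                return false;
            // CAS workerCount + 1; on success, break out of both loops (the loop labeled retry)
            if (compareAndIncrementWorkerCount(c))
                break retry;

            // CAS failed: re-read the pool control state
            c = ctl.get();  // Re-read ctl
            // If runState differs from the value read at the start, leave the inner loop and restart the outer loop
            if (runStateOf(c) != rs)
                continue retry;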
            // else CAS failed due to workerCount change; retry inner loop
        }
    }

    // The loops above have incremented workerCount, but only ctl has changed so far; the code below actually creates the worker and starts it
    boolean workerStarted = false;
    boolean workerAdded = false;
    Worker w = null;
    try {
        // Create a Worker; Worker implements Runnable, and each Worker holds one thread
        w = new Worker(firstTask);
        final Thread t = w.thread;
        if (t != null) {
            final ReentrantLock mainLock = this.mainLock;
            // Acquire the lock
            mainLock.lock();
            try {
                // Recheck while holding lock.
                // Back out on ThreadFactory failure or if
                // shut down before lock acquired.
                int rs = runStateOf(ctl.get());

                if (rs < SHUTDOWN ||
                    (rs == SHUTDOWN && firstTask == null)) {
                    
                    if (t.isAlive()) // precheck that t is startable
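                        // The thread is already alive even though it was just created and start() has not been called yet; that must be a bug, so throw
                        throw new IllegalThreadStateException();

                    // [Key] Add the worker to the set, a HashSet rather than a List, which makes removing a worker cheap
                    workers.add(w);
                    int s = workers.size();
                    // largestPoolSize records the largest size the pool has ever reached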
                    if (s > largestPoolSize)
                        largestPoolSize = s;
                    workerAdded = true;
                }
            } finally {
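                // Release the lock
                mainLock.unlock();
            }
            if (workerAdded) {
                // The worker was added, so start its thread
                t.start();
                // start() did not throw, so mark the worker as started
                workerStarted = true;
            }
        }
    } finally {
        // The worker was not started (this includes the case where the thread factory returned null)
        if (! workerStarted)
            // Roll back: remove the worker and decrement the worker count incremented at the beginning
            addWorkerFailed(w);
    }
    // Return whether the worker was started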
    return workerStarted;
}
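  • Worker
    Worker extends AQS. Why? According to the class comment in the JDK source, Worker mainly maintains interrupt-control state for the thread running tasks. A worker holds its own lock while it runs a task, and interruptIdleWorkers() uses tryLock() to tell idle workers apart from busy ones, so only workers that are waiting for a task get interrupted. (The lock is not there to make the task itself thread-safe: each task is run by a single worker anyway.)
    Also, tryAcquire and tryRelease ignore their argument: the JDK authors deliberately made Worker a non-reentrant lock. The aim is to prevent a task from reacquiring the lock when it calls pool control methods such as setCorePoolSize. If the lock were reentrant, a task calling setCorePoolSize from inside the pool could reach interruptIdleWorkers(), succeed in tryLock() on its own worker, and interrupt the very thread that is running it.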
private final class Worker
    extends AbstractQueuedSynchronizer
    implements Runnable
{
    /**
     * This class will never be serialized, but we provide a
     * serialVersionUID to suppress a javac warning.
     */
    private static final long serialVersionUID = 6138294804551838833L;

    // The worker thread; null if the thread factory fails to create one
    /** Thread this worker is running in.  Null if factory fails. */
    final Thread thread;
    
    // The initial task; may be null (deliberately passed as null)
    /** Initial task to run.  Possibly null. */
    Runnable firstTask;
     
    // Number of tasks this worker has completed (incremented whether the task succeeded or threw)
    /** Per-thread task counter */
    volatile long completedTasks;

    /**
     * Creates with given first task and thread from ThreadFactory.
     * @param firstTask the first task (null if none)
     */
    Worker(Runnable firstTask) {
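        // Initialize the AQS state to -1; runWorker() begins with an unlock that resets it to 0, so the thread cannot be interrupted before it actually starts running tasks
        setState(-1); // inhibit interrupts until runWorker
        // Set the first task, which may be null
        this.firstTask = firstTask;
        // Create the thread; Worker is itself a Runnable, so pass this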
        this.thread = getThreadFactory().newThread(this);
    }

    /** Delegates main run loop to outer runWorker  */
    public void run() {
        // After the thread starts it calls run(), which delegates to ThreadPoolExecutor.runWorker
        runWorker(this);
    }

    // Lock methods
    //
    // The value 0 represents the unlocked state.
    // The value 1 represents the locked state.
    // Whether the lock is held exclusively
    protected boolean isHeldExclusively() {
        return getState() != 0;
    }
    
    // Overrides AQS tryAcquire: try to take the lock
    protected boolean tryAcquire(int unused) {
        if (compareAndSetState(0, 1)) {
            // CAS succeeded: record the current thread as the exclusive owner
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false;
    }

    // Overrides AQS tryRelease: try to release the lock
    protected boolean tryRelease(int unused) {
        // Set to null to clear the exclusive owner thread
        setExclusiveOwnerThread(null);
        // Whatever the unused argument is, the AQS state is always reset to 0
        setState(0);
        return true;
    }

    public void lock()        { acquire(1); }
    public boolean tryLock()  { return tryAcquire(1); }
    public void unlock()      { release(1); }
    // Whether the lock is currently held
    public boolean isLocked() { return isHeldExclusively(); }

    void interruptIfStarted() {
        Thread t;
        if (getState() >= 0 && (t = thread) != null && !t.isInterrupted()) {
            try {
                t.interrupt();
            } catch (SecurityException ignore) {
            }
        }
    }
}
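
The effect of ignoring the acquire argument can be shown with a stripped-down lock built the same way. The sketch below is not the real Worker: SimpleMutex is a hypothetical class that copies only the tryAcquire/tryRelease pair, and it is compared against ReentrantLock from the same thread.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class, not part of the JDK.
class NonReentrantLockDemo {

    // A minimal mutex using the same 0 = unlocked / 1 = locked scheme.
    static final class SimpleMutex extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int unused) {
            // Succeeds only when nobody holds the lock, owner included.
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }

        @Override
        protected boolean tryRelease(int unused) {
            setExclusiveOwnerThread(null);
            setState(0);
            return true;
        }

        void lock()       { acquire(1); }
        boolean tryLock() { return tryAcquire(1); }
        void unlock()     { release(1); }
    }

    // Can the thread that already holds the mutex take it again?
    static boolean mutexAllowsReentry() {
        SimpleMutex mutex = new SimpleMutex();
        mutex.lock();
        try {
            return mutex.tryLock();
        } finally {
            mutex.unlock();
        }
    }

    // Same question for ReentrantLock, for contrast.
    static boolean reentrantLockAllowsReentry() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        try {
            boolean again = lock.tryLock();
            if (again) lock.unlock();   // undo the nested acquisition
            return again;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("SimpleMutex reentry:   " + mutexAllowsReentry());          // false
        System.out.println("ReentrantLock reentry: " + reentrantLockAllowsReentry());  // true
    }
}
```

With the non-reentrant variant, a tryLock() issued by the thread that is currently running a task fails, which is exactly what keeps a busy worker from being treated as idle.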
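  • runWorker
    This is where a Runnable task actually gets executed, by calling its run method. Tasks come from two sources: the first task handed directly to the worker, and tasks the worker pulls from the task queue. The worker holds its own lock while a task runs, which marks it as busy so it is not interrupted as an idle worker. Note the while loop that keeps calling getTask() to fetch tasks from the queue: the thread is squeezed dry, running task after task until the end of time. Unless a task throws, or getTask() returns null (for example because the pool is shutting down or an idle thread beyond the core size has timed out), the thread stays resident.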
final void runWorker(Worker w) {
    Thread wt = Thread.currentThread();
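    // The worker's first task
    Runnable task = w.firstTask;
    // Clear it on the worker (the task is already held in the local variable; this only drops w.firstTask's reference to it)
    w.firstTask = null;
    // unlock() calls AQS release, which sets the state from the initial -1 to 0; from this point on the worker may be interrupted
    w.unlock(); // allow interrupts
    // Whether the worker exits abruptly, i.e. because a task threw; set to false only when the loop ends normally
    boolean completedAbruptly = true;
    try {
        // Run the worker's own first task if there is one; otherwise fetch one via getTask(). This is a while loop, so the thread keeps watching the work queue for tasks
        while (task != null || (task = getTask()) != null) {
            // Take the worker lock: it marks this worker as busy, so interruptIdleWorkers (which uses tryLock) will not interrupt it mid-task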
            w.lock();
            // If pool is stopping, ensure thread is interrupted;
            // if not, ensure thread is not interrupted.  This
            // requires a recheck in second case to deal with
            // shutdownNow race while clearing interrupt
            if ((runStateAtLeast(ctl.get(), STOP) ||
                 (Thread.interrupted() &&
                  runStateAtLeast(ctl.get(), STOP))) &&
                !wt.isInterrupted())
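                // Set the interrupt flag; the task about to run will see it, and any interruptible blocking call it makes will throw InterruptedException
                wt.interrupt();

           // If the Runnable throws while running, control reaches processWorkerExit in the outermost finally block and the worker exits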
           try {
                beforeExecute(wt, task);
                Throwable thrown = null;
                try {
                    task.run();
                } catch (RuntimeException x) {
                    thrown = x; throw x;
                } catch (Error x) {
                    thrown = x; throw x;
                } catch (Throwable x) {
                    thrown = x; throw new Error(x);
                } finally {
                    afterExecute(task, thrown);
                }
            } finally {
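                // Clear task, increment the completed-task count, then unlock
                task = null;
                w.completedTasks++;
                // Release the worker lock
                w.unlock();
            }
        }
        // Reaching here means there was no first task and getTask() returned null: the pool is shutting down, or this idle worker timed out and is surplus. The while loop has ended; the finally block retires the worker
        completedAbruptly = false;
    } finally {
        // Retire this worker (the while loop has ended, either because getTask() returned null or because a task threw)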
        processWorkerExit(w, completedAbruptly);
    }
}
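
The "task threw, so the worker exits" path can be observed from the outside. In the sketch below, a single-thread pool runs a failing task followed by a second task, once via execute() and once via submit(). With execute() the exception escapes task.run() and kills the worker, so the second task runs on a replacement thread. With submit() the FutureTask catches the exception and the same worker carries on. WorkerReplacementDemo and its method are hypothetical names; the quiet uncaught-exception handler is only there to keep the demo output clean.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, not part of the JDK.
class WorkerReplacementDemo {

    // Runs a failing task and then a follow-up task on a 1-thread pool and
    // reports whether both ran on the same thread.
    static boolean secondTaskReusesThread(boolean viaSubmit) {
        AtomicReference<Thread> first = new AtomicReference<>();
        AtomicReference<Thread> second = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        // Swallow the uncaught exception instead of printing a stack trace.
        ThreadFactory quiet = r -> {
            Thread t = new Thread(r);
            t.setUncaughtExceptionHandler((thread, error) -> { });
            return t;
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>(), quiet);
        Runnable failing = () -> {
            first.set(Thread.currentThread());
            throw new IllegalStateException("boom");
        };
        Runnable followUp = () -> {
            second.set(Thread.currentThread());
            done.countDown();
        };
        try {
            if (viaSubmit) {
                pool.submit(failing);    // FutureTask captures the exception
                pool.submit(followUp);
            } else {
                pool.execute(failing);   // the exception escapes task.run()
                pool.execute(followUp);
            }
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("follow-up task did not run");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return first.get() == second.get();
    }

    public static void main(String[] args) {
        System.out.println("execute(): same thread? " + secondTaskReusesThread(false));
        System.out.println("submit():  same thread? " + secondTaskReusesThread(true));
    }
}
```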
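  • getTask
    A worker thread blocks here to fetch a task from the task queue. The method also contains a very important piece of logic: a thread that stays idle for too long is retired. Specifically, when the worker count exceeds corePoolSize (or allowCoreThreadTimeOut is set), the task is fetched with poll using keepAliveTime as the timeout. A timed poll returns null only when the queue stayed empty for the whole keepAliveTime. So the queue is empty and the worker count is above the core size: the surplus worker has to go. The exit logic is in processWorkerExit.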
private Runnable getTask() {
    boolean timedOut = false; // Did the last poll() time out?

    // Note: another retry loop!
    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c);
        // If runState is at least SHUTDOWN and either runState is at least STOP or the queue is empty, decrement workerCount and return null
        // Check if queue empty only if necessary.
        if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
            decrementWorkerCount();
            return null;
        }

        int wc = workerCountOf(c);
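        // Whether this worker may time out: always when allowCoreThreadTimeOut is set (false by default), otherwise only when the worker count exceeds corePoolSize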
        // Are workers subject to culling?
        boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;

        // The worker count exceeds maximumPoolSize, or the last poll timed out (timed && timedOut), and in addition there is more than one worker or the queue is empty
        if ((wc > maximumPoolSize || (timed && timedOut))
            && (wc > 1 || workQueue.isEmpty())) {
            // CAS the worker count down by 1
            if (compareAndDecrementWorkerCount(c))
                return null;
            // CAS failed: loop again
            continue;
        }

        try {
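            // If timed is true (wc > corePoolSize, or core threads may time out), call poll() and block for at most keepAliveTime; if nothing arrives it returns null, timedOut becomes true, and the next iteration uses it to decrement the worker count
            // Otherwise call take() and wait indefinitely for a task
            Runnable r = timed ?
                workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                workQueue.take();
            if (r != null)
                return r;
            // No task arrived within keepAliveTime: treat it as a timeout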
            timedOut = true;
        } catch (InterruptedException retry) {
            timedOut = false;
        }
    }
}
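
The timeout path can be exercised without a large pool by turning on allowCoreThreadTimeOut, which makes timed true even for the single core thread. The sketch below assumes a 50 ms keepAliveTime and polls the pool size for up to 5 seconds; both numbers and the class name IdleTimeoutDemo are arbitrary choices for the illustration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
class IdleTimeoutDemo {

    // Runs one task, then waits for the idle worker to time out.
    // Returns true if the pool shrank to zero threads before the deadline.
    static boolean shrinksToZero() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 50L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        pool.allowCoreThreadTimeOut(true);   // core threads now use the timed poll()
        try {
            pool.submit(() -> "warm up").get();   // forces one worker into existence
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (pool.getPoolSize() > 0 && System.nanoTime() < deadline) {
                Thread.sleep(10);                 // give the worker time to expire
            }
            return pool.getPoolSize() == 0;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("pool shrank to zero: " + shrinksToZero());
    }
}
```

With allowCoreThreadTimeOut left at its default of false, the same worker would sit in take() indefinitely and the pool size would stay at 1.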
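  • processWorkerExit
    A worker thread stops either because getTask() has no more work for it or because a task threw; either way the worker is cleaned up here.
private void processWorkerExit(Worker w, boolean completedAbruptly) {
    // If the worker exited abruptly (a task threw), decrement the worker count here; otherwise getTask() has already decremented it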
    if (completedAbruptly) // If abrupt, then workerCount wasn't adjusted
        decrementWorkerCount();

    final ReentrantLock mainLock = this.mainLock;
    // Every access to the workers set happens under the lock
    mainLock.lock();
    try {
        // Add this worker's completed-task count to the pool-wide total
        completedTaskCount += w.completedTasks;
        // Remove the worker from the HashSet
        workers.remove(w);
    } finally {
        mainLock.unlock();
    }

    // A worker has just been removed, so check whether the pool can now terminate
    tryTerminate();

    int c = ctl.get();
    // Is runState below STOP, i.e. RUNNING or SHUTDOWN? If so, the pool may still need workers
    if (runStateLessThan(c, STOP)) {
        // A normal exit means getTask() returned null, so check whether enough workers remain
        if (!completedAbruptly) {
            int min = allowCoreThreadTimeOut ? 0 : corePoolSize;
            // If the queue is not empty, keep at least one thread to drain it
            if (min == 0 && ! workQueue.isEmpty())
                min = 1;
            // workerCount >= min: enough workers remain, so return
            if (workerCountOf(c) >= min)
                return; // replacement not needed
        }
        // Abrupt exit, or too few workers left: add a replacement worker with no first task
        addWorker(null, false);
    }
}

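4. Summary

A thread pool does not exist in isolation; it works together with other data structures. So before getting to ThreadPoolExecutor itself, this article started with FutureTask, an important data structure around the thread pool, then covered the thread pool interface definitions, and finally used flowcharts plus annotated source code to dissect the pool's internal data structures and task execution flow. Some core topics (such as shutting down the pool, dynamically resizing it, and the saturation policies) were left out; I will add notes on them when I find the time!

The thread pool source touches on many other topics. The ones below are those I consider most important but did not cover in detail here. Readers who get familiar with them beforehand should find the ThreadPoolExecutor source much easier to read.

  • The Future, Runnable and Callable hierarchy
  • The AQS framework
  • CAS operations in the Unsafe class
  • Thread interruption

The end.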
