Using ManagedBlocker with ForkJoinPool to keep blocked threads from reducing throughput

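ForkJoinPool is suited to compute-intensive tasks that can be split into subtasks whose results are then combined (similar to MapReduce). Running such tasks makes full use of a multi-core processor and speeds up processing. In addition, the work-stealing queues inside ForkJoinPool perform considerably better than the BlockingQueue of an ordinary thread pool, which also makes ForkJoinPool a good fit for running large numbers of short, small tasks.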
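
As a point of reference for the split-and-combine style described above, here is a minimal sketch of a RecursiveTask that sums an array. The class name SumTaskSketch, the helper parallelSum, and the THRESHOLD value are illustrative assumptions and are not part of the test code in this article.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class SumTaskSketch {
    // Hypothetical cutoff: ranges at or below this size are summed directly.
    static final int THRESHOLD = 1_000;

    static class SumTask extends RecursiveTask<Long> {
        private final long[] data;
        private final int from; // inclusive
        private final int to;   // exclusive

        SumTask(long[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += data[i];
                }
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(data, from, mid);
            SumTask right = new SumTask(data, mid, to);
            left.fork();                      // make the left half available to other workers
            long rightSum = right.compute();  // work on the right half in this thread
            return left.join() + rightSum;    // combine the two partial results
        }
    }

    static long parallelSum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = new long[100_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(parallelSum(data)); // sum of 1..100000 = 5000050000
    }
}
```

Every leaf of this task tree is pure computation, which is the case ForkJoinPool is designed for. The rest of the article looks at what happens when the leaves block instead.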

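For tasks that block on IO you can of course set a larger parallelism so that enough tasks run concurrently. Otherwise, while tasks are blocked on IO, no worker thread is free to run newly submitted IO tasks, the CPU sits idle, and the pool's throughput stays low. See the example below: it creates 30 IO-blocking tasks and submits them all to ForkJoinPool at once. The static ForkJoinPool instance held inside ForkJoinPool (the common pool) acts as a global thread pool, and its default parallelism is the number of processors minus 1 (the test laptop has 4 cores, so parallelism = 3).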


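import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ManagedBlockerTest {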
    static String threadDateTimeInfo() {
        return DateTimeFormatter.ISO_TIME.format(LocalTime.now()) + Thread.currentThread().getName();
    }

    static void test1() {
        List<RecursiveTask<String>> tasks = Stream.generate(() -> new RecursiveTask<String>() {
            @Override
            protected String compute() {
                System.out.println(threadDateTimeInfo() + ":simulate io task blocking for 2 seconds···");
                try {
                    // Sleep for 2 seconds to simulate a blocking IO call
                    TimeUnit.SECONDS.sleep(2);
                } catch (InterruptedException e) {
                    throw new Error(e);
                }
                return threadDateTimeInfo() + ": io blocking task returns successfully";
            }
        }).limit(30).collect(Collectors.toList());
        tasks.forEach(e -> e.fork());
        tasks.forEach(e -> {
            try {
                System.out.println(e.get());
            } catch (Exception ex) {
                ex.printStackTrace();
            }
        });
    }

    public static void main(String[] args) {
        test1();
    }
}

The test results are as follows. Each task blocks for a fixed 2 seconds and the pool's parallelism is 3, so the total elapsed time is (30 / 3) * 2 = 20 seconds.

18:30:25.991ForkJoinPool.commonPool-worker-1:simulate io task blocking for 2 seconds···
18:30:25.991ForkJoinPool.commonPool-worker-2:simulate io task blocking for 2 seconds···
18:30:25.991ForkJoinPool.commonPool-worker-3:simulate io task blocking for 2 seconds···
18:30:27.999ForkJoinPool.commonPool-worker-1: io blocking task returns successfully
18:30:27.999ForkJoinPool.commonPool-worker-2: io blocking task returns successfully
18:30:27.999ForkJoinPool.commonPool-worker-3: io blocking task returns successfully
···
(repeated intermediate output omitted)
···
18:30:44.08ForkJoinPool.commonPool-worker-2:simulate io task blocking for 2 seconds···
18:30:44.08ForkJoinPool.commonPool-worker-3:simulate io task blocking for 2 seconds···
18:30:44.08ForkJoinPool.commonPool-worker-1:simulate io task blocking for 2 seconds···
18:30:46.089ForkJoinPool.commonPool-worker-2: io blocking task returns successfully
18:30:46.089ForkJoinPool.commonPool-worker-3: io blocking task returns successfully
18:30:46.089ForkJoinPool.commonPool-worker-1: io blocking task returns successfully
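
The 20-second figure follows from the tasks running in "waves" of at most parallelism tasks at a time. A small sketch of that arithmetic is shown here; the class BatchTimeEstimate and the method expectedSeconds are hypothetical names, and the estimate ignores scheduling overhead.

```java
import java.util.concurrent.ForkJoinPool;

public class BatchTimeEstimate {
    // Tasks run in waves of at most `parallelism`; each wave lasts `blockSeconds`.
    static int expectedSeconds(int tasks, int parallelism, int blockSeconds) {
        int waves = (tasks + parallelism - 1) / parallelism; // ceiling division
        return waves * blockSeconds;
    }

    public static void main(String[] args) {
        // Machine dependent: processors - 1 by default
        System.out.println("common pool parallelism = "
                + ForkJoinPool.getCommonPoolParallelism());
        System.out.println(expectedSeconds(30, 3, 2));  // 20, the case measured above
        System.out.println(expectedSeconds(30, 30, 2)); // 2, if all 30 could block at once
    }
}
```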

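What we actually want is for the pool to supply extra worker threads to run new IO tasks while existing tasks are blocked, instead of making them wait, so that all 30 tasks are submitted, block together, and return after 2 seconds. That is a 10x increase in throughput. How can this be achieved? new ForkJoinPool(100), which allows 100 worker threads, clearly meets the requirement, but for purely computational tasks the extra thread switching would then lower CPU efficiency. A more effective approach is to give IO-blocking tasks a ManagedBlocker, which tells the pool that the current task is about to block so that it can create a compensating worker thread to run newly submitted tasks. The example above, modified accordingly, looks like this: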

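import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ManagedBlockerTest {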
    static String threadDateTimeInfo() {
        return DateTimeFormatter.ISO_TIME.format(LocalTime.now()) + Thread.currentThread().getName();
    }

    static void test2() {
        List<IOBlockerTask<String>> tasks = Stream.generate(() -> new IOBlockerTask<String>(() -> {
            System.out.println(threadDateTimeInfo() + ":simulate io task blocking for 2 seconds···");
            try {
                TimeUnit.SECONDS.sleep(2);
            } catch (InterruptedException e) {
                throw new Error(e);
            }
            return threadDateTimeInfo() + ": io blocking task returns successfully";
        })).limit(30).collect(Collectors.toList());
        tasks.forEach(e -> e.fork());
        tasks.forEach(e -> {
            try {
                System.out.println(e.get());
            } catch (Exception ex) {
                ex.printStackTrace();
            }
        });

    }

    public static void main(String[] args) {
        test2();
    }
}

// Wraps a blocking Supplier so that it runs inside ForkJoinPool.managedBlock
class IOBlockerTask<T> extends RecursiveTask<T> {
    private final MyManagedBlockerImpl<T> blocker;

    public IOBlockerTask(Supplier<T> supplier) {
        this.blocker = new MyManagedBlockerImpl<T>(supplier);
    }

    static class MyManagedBlockerImpl<T> implements ForkJoinPool.ManagedBlocker {
        private final Supplier<T> supplier;
        private T result;

        public MyManagedBlockerImpl(Supplier<T> supplier) {
            this.supplier = supplier;
        }

        @Override
        public boolean block() throws InterruptedException {
            // The blocking work runs here; returning true means no further blocking is needed
            result = supplier.get();
            return true;
        }

        @Override
        public boolean isReleasable() {
            // Always report that blocking is required, so that managedBlock calls block()
            return false;
        }
    }

    @Override
    protected T compute() {
        try {
            // Tell the pool this worker is about to block so that it can compensate
            ForkJoinPool.managedBlock(blocker);
            setRawResult(blocker.result);
            return getRawResult();
        } catch (InterruptedException e) {
            throw new Error(e);
        }
    }
}

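A brief explanation: IOBlockerTask extends RecursiveTask and implements compute(). RecursiveTask's exec() method is final and cannot be overridden, so compute() is the hook available to subclasses. The details of the custom blocking work are therefore placed in a Supplier (the Supplier is passed into the ManagedBlocker and is ultimately executed in its block() method), while compute() merely calls ForkJoinPool.managedBlock() and stores the ManagedBlocker's result in the task.
The test results are as follows. They match expectations: all 30 tasks return within 2 seconds.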

18:54:56.817ForkJoinPool.commonPool-worker-17:simulate io task blocking for 2 seconds···
18:54:56.813ForkJoinPool.commonPool-worker-11:simulate io task blocking for 2 seconds···
18:54:56.816ForkJoinPool.commonPool-worker-7:simulate io task blocking for 2 seconds···
18:54:56.815ForkJoinPool.commonPool-worker-20:simulate io task blocking for 2 seconds···
18:54:56.814ForkJoinPool.commonPool-worker-23:simulate io task blocking for 2 seconds···
18:54:56.814ForkJoinPool.commonPool-worker-18:simulate io task blocking for 2 seconds···
···
(repeated output omitted)
···
18:54:58.824ForkJoinPool.commonPool-worker-17: io blocking task returns successfully
18:54:58.824ForkJoinPool.commonPool-worker-29: io blocking task returns successfully
18:54:58.824ForkJoinPool.commonPool-worker-22: io blocking task returns successfully
18:54:58.824ForkJoinPool.commonPool-worker-28: io blocking task returns successfully
18:54:58.824ForkJoinPool.commonPool-worker-21: io blocking task returns successfully
18:54:58.824ForkJoinPool.commonPool-worker-16: io blocking task returns successfully
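
The blocker used above always answers false from isReleasable(). A variant that tracks completion, so that isReleasable() turns true once the work is done, could look like the following sketch. The names ManagedBlocking, SupplierBlocker, callBlocking, and demoInPool are hypothetical, and this is only one possible shape for such a helper, not the approach measured in this article.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.function.Supplier;

public class ManagedBlocking {
    // Hypothetical blocker that remembers whether its supplier has already run.
    static final class SupplierBlocker<T> implements ForkJoinPool.ManagedBlocker {
        private final Supplier<T> supplier;
        private T result;
        private boolean done;

        SupplierBlocker(Supplier<T> supplier) {
            this.supplier = supplier;
        }

        @Override
        public boolean block() {
            if (!done) {
                result = supplier.get(); // the blocking call happens here
                done = true;
            }
            return true; // no further blocking is needed
        }

        @Override
        public boolean isReleasable() {
            return done; // false until the supplier has run
        }
    }

    // Runs a possibly blocking supplier under ForkJoinPool.managedBlock.
    static <T> T callBlocking(Supplier<T> supplier) {
        SupplierBlocker<T> blocker = new SupplierBlocker<>(supplier);
        try {
            ForkJoinPool.managedBlock(blocker);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException(e);
        }
        return blocker.result;
    }

    // Small demonstration: call the helper from inside a pool worker.
    static int demoInPool() {
        ForkJoinPool pool = new ForkJoinPool(2);
        try {
            Callable<Integer> job = () -> callBlocking(() -> 42);
            return pool.submit(job).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(callBlocking(() -> "ok")); // also works on a non-worker thread
        System.out.println(demoInPool());
    }
}
```

The important property is the same as in IOBlockerTask: the blocking call itself sits inside block(), so the pool knows for how long the worker is unavailable.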

The source of ForkJoinPool.managedBlock (JDK 8) is as follows:

    public static void managedBlock(ManagedBlocker blocker)
        throws InterruptedException {
        ForkJoinPool p;
        ForkJoinWorkerThread wt;
        Thread t = Thread.currentThread();
        // If the current thread is a pool worker thread
        if ((t instanceof ForkJoinWorkerThread) &&
            (p = (wt = (ForkJoinWorkerThread)t).pool) != null) {
            WorkQueue w = wt.workQueue;
            // isReleasable() returning true means blocking is unnecessary; our task must block,
            // so our implementation always returns false
            while (!blocker.isReleasable()) {
                // Try to create a compensating thread
                if (p.tryCompensate(w)) {
                    try {
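                        // Loop until the blocker is releasable; our Supplier runs here, inside block().
                        // The custom task is plain blocking IO with no interruption handling, so block()
                        // returns true as soon as the work is done and no further blocking occurs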
                        do {} while (!blocker.isReleasable() &&
                                     !blocker.block());
                    } finally {
                        // Add 1 back to the pool's active thread count
                        U.getAndAddLong(p, CTL, AC_UNIT);
                    }
                    break;
                }
            }
        }
        // Otherwise the caller is not a pool worker thread
        else {
            do {} while (!blocker.isReleasable() &&
                         !blocker.block());
        }
    }
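
The simplest path through the method above is the final else branch, taken when the caller is not a pool worker. The sketch below, using the hypothetical class ManagedBlockOnCallerThread and method blockCalls, counts how many times block() is invoked for each answer from isReleasable() when managedBlock is called from an ordinary thread such as main.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.atomic.AtomicInteger;

public class ManagedBlockOnCallerThread {
    // Returns how many times block() was invoked for a fixed isReleasable() answer.
    static int blockCalls(boolean releasable) {
        AtomicInteger calls = new AtomicInteger();
        ForkJoinPool.ManagedBlocker blocker = new ForkJoinPool.ManagedBlocker() {
            @Override
            public boolean block() {
                calls.incrementAndGet();
                return true; // report that no further blocking is needed
            }

            @Override
            public boolean isReleasable() {
                return releasable;
            }
        };
        try {
            ForkJoinPool.managedBlock(blocker);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println(blockCalls(true));  // 0: already releasable, block() is skipped
        System.out.println(blockCalls(false)); // 1: block() runs once and returns true
    }
}
```

On a worker thread the same contract applies, with the addition that the pool first tries to compensate for the thread that is about to block.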

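The source is simple: it looks as though a compensating thread is created whenever the ManagedBlocker's isReleasable() returns false. So I tried changing the code as shown below, calling ForkJoinPool.managedBlock inside compute() just before the blocking call in order to create the compensating thread. That would make the mechanism very flexible to use: for any operation that might block, I would only need to call ForkJoinPool.managedBlock(CommonBlocker.commonBlocker) beforehand.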

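import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ManagedBlockerTest {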
    static String threadDateTimeInfo() {
        return DateTimeFormatter.ISO_TIME.format(LocalTime.now()) + Thread.currentThread().getName();
    }
    static void test3() {
        List<RecursiveTask<String>> tasks = Stream.generate(() -> new RecursiveTask<String>() {
            @Override
            protected String compute() {
                System.out.println(threadDateTimeInfo() + ":simulate io task blocking for 2 seconds···");
                try {
                    ForkJoinPool.managedBlock(CommonBlocker.commonBlocker);
                    // Sleep for 2 seconds to simulate a blocking IO call
                    TimeUnit.SECONDS.sleep(2);
                } catch (InterruptedException e) {
                    throw new Error(e);
                }
                return threadDateTimeInfo() + ": io blocking task returns successfully";
            }
        }).limit(30).collect(Collectors.toList());
        tasks.forEach(e -> e.fork());
        tasks.forEach(e -> {
            try {
                System.out.println(e.get());
            } catch (Exception ex) {
                ex.printStackTrace();
            }
        });
    }
    public static void main(String[] args) {
        test3();
    }
}

class CommonBlocker implements ForkJoinPool.ManagedBlocker {
    public static ForkJoinPool.ManagedBlocker commonBlocker = new CommonBlocker();
    @Override
    public boolean block() throws InterruptedException {
        return true;
    }

    @Override
    public boolean isReleasable() {
        return false;
    }
}

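The actual test result, however, did not match expectations: it fell back to the behavior of test 1, and no compensating threads were created. I then tried changing CommonBlocker.block() to sleep for 100 milliseconds before returning true, and to my surprise that worked! The execution time of block() appears to affect whether compensating threads are created. A closer look at the source did reveal the difference. tryCompensate is the method that attempts to create a compensating thread; its source is below (the method is not explained in full here, only the key part):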

    private boolean tryCompensate(WorkQueue w) {
        boolean canBlock;
        WorkQueue[] ws; long c; int m, pc, sp;
        if (w == null || w.qlock < 0 ||           // caller terminating
            (ws = workQueues) == null || (m = ws.length - 1) <= 0 ||
            (pc = config & SMASK) == 0)           // parallelism disabled
            canBlock = false;
        else if ((sp = (int)(c = ctl)) != 0)      // release idle worker
            canBlock = tryRelease(c, ws[sp & m], 0L);
        else {
            int ac = (int)(c >> AC_SHIFT) + pc;
            int tc = (short)(c >> TC_SHIFT) + pc;
            int nbusy = 0;                        // validate saturation
            for (int i = 0; i <= m; ++i) {        // two passes of odd indices
                WorkQueue v;
                if ((v = ws[((i << 1) | 1) & m]) != null) {
                    if ((v.scanState & SCANNING) != 0)
                        break;
                    ++nbusy;
                }
            }
            if (nbusy != (tc << 1) || ctl != c)
                canBlock = false;                 // unstable or stale
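            /* Testing showed that the code above always ended up in this branch, so no new
             * compensating thread was created. The condition is: total threads >= parallelism
             * && active threads > 1 && the current worker's queue is empty. The first and third
             * conditions clearly hold, so the second is the key. createWorker at the end of this
             * method creates the compensating thread but does not increase the active count.
             * The caller of this method adjusts the active count afterwards (after block()
             * returns). In other words, under concurrency this branch decrements the active
             * count and the caller increments it again after block(), leaving it unchanged
             * overall. If block() is very short, ac is restored almost immediately after being
             * decremented here, so this condition always holds. If block() takes longer, then
             * with several threads decrementing ac here concurrently, one of them will
             * eventually see ac <= 1 and go on to create a compensating thread.
             */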
            else if (tc >= pc && ac > 1 && w.isEmpty()) {
                long nc = ((AC_MASK & (c - AC_UNIT)) |
                           (~AC_MASK & c));       // uncompensated
                canBlock = U.compareAndSwapLong(this, CTL, c, nc);
            }
            else if (tc >= MAX_CAP ||
                     (this == common && tc >= pc + commonMaxSpares))
                throw new RejectedExecutionException(
                    "Thread limit exceeded replacing blocked worker");
            else {                                // similar to tryAddWorker
                boolean add = false; int rs;      // CAS within lock
                long nc = ((AC_MASK & c) |
                           (TC_MASK & (c + TC_UNIT)));
                if (((rs = lockRunState()) & STOP) == 0)
                    add = U.compareAndSwapLong(this, CTL, c, nc);
                unlockRunState(rs, rs & ~RSLOCK);
                canBlock = add && createWorker(); // throws on exception
            }
        }
        return canBlock;
    }
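
To make the ac and tc arithmetic above easier to follow, here is a stand-alone model of how the two counts are packed into the upper half of ctl. It is not JDK code: the constants and the initial-value formula are reproduced from the JDK 8 ForkJoinPool source as I recall it and should be treated as assumptions, and the class CtlFieldsSketch and its helper methods are hypothetical.

```java
public class CtlFieldsSketch {
    // Assumed to mirror JDK 8's ForkJoinPool constants; verify against your JDK source.
    static final int  AC_SHIFT = 48;
    static final long AC_UNIT  = 0x0001L << AC_SHIFT;
    static final long AC_MASK  = 0xffffL << AC_SHIFT;
    static final int  TC_SHIFT = 32;
    static final long TC_UNIT  = 0x0001L << TC_SHIFT;
    static final long TC_MASK  = 0xffffL << TC_SHIFT;

    // Both 16-bit fields start at -parallelism, so a field value of 0 means "at target".
    static long initialCtl(int parallelism) {
        long np = -parallelism;
        return ((np << AC_SHIFT) & AC_MASK) | ((np << TC_SHIFT) & TC_MASK);
    }

    // Same expressions as `ac` and `tc` in tryCompensate, where pc is the parallelism.
    static int activeCount(long c, int pc) { return (int) (c >> AC_SHIFT) + pc; }
    static int totalCount(long c, int pc)  { return (short) (c >> TC_SHIFT) + pc; }

    // Adds one active worker; masking keeps one 16-bit field from carrying into the other.
    static long addWorker(long c) {
        return (AC_MASK & (c + AC_UNIT)) | (TC_MASK & (c + TC_UNIT));
    }

    // The "uncompensated" branch: active count minus 1, everything else unchanged.
    static long blockWithoutCompensation(long c) {
        return (AC_MASK & (c - AC_UNIT)) | (~AC_MASK & c);
    }

    // What managedBlock does in its finally block once block() has returned.
    static long unblock(long c) {
        return c + AC_UNIT;
    }

    public static void main(String[] args) {
        int pc = 3;
        long c = initialCtl(pc);
        for (int i = 0; i < pc; i++) {
            c = addWorker(c);
        }
        System.out.println(activeCount(c, pc) + " / " + totalCount(c, pc)); // 3 / 3
        c = blockWithoutCompensation(c);
        System.out.println(activeCount(c, pc) + " / " + totalCount(c, pc)); // 2 / 3
        c = unblock(c);
        System.out.println(activeCount(c, pc) + " / " + totalCount(c, pc)); // 3 / 3
    }
}
```

In this model a worker that passes through the uncompensated branch lowers ac by one only for as long as block() runs, which is why a near-instant block() leaves ac almost always above 1.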

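A note on tryCompensate: it is called from two places, managedBlock above and awaitJoin. The latter is normally invoked when the current task has to wait for a subtask to finish. When the current worker thread has run helpComplete, helpStealer, or tryRemoveAndExec and the awaited task is still not complete (possibly because the stolen task itself is blocked), the worker needs to try to create a compensating thread before it blocks waiting for the subtask. That wait is unbounded, lasting until the thread is woken up, so it is very likely that a compensating thread really does get created.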
