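JDK 1.7 introduced a new parallel programming model, "fork-join", a Java concurrency framework built on the divide-and-conquer idea. There are plenty of introductions to it online; this article starts from the framework's characteristics and uses a few examples to give a practical introduction.
The fork-join framework is hard for newcomers to grasp, so let's start with its characteristics. It has several:
With those ideas in mind, let's look at the utility classes the fork-join framework provides:
First, an example that uses RecursiveAction. The goal of this code is to compute a formula for every element x of a large array; the formula is sin(x)+cos(x)+tan(x):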
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class RecursiveActionExam {
    private final static int NUMBER = 10000000;

    public static void main(String[] args) {
        double[] array = new double[NUMBER];
        for (int i = 0; i < NUMBER; i++) {
            array[i] = i;
        }
        long startTime = System.currentTimeMillis();
        // Print the core count; the no-arg ForkJoinPool uses it as its parallelism level.
        System.out.println(Runtime.getRuntime().availableProcessors());
        ForkJoinPool forkJoinPool = new ForkJoinPool();
        // invoke() blocks until the root task and all of its subtasks have finished.
        forkJoinPool.invoke(new ComputeTask(array, 0, array.length));
        long endTime = System.currentTimeMillis();
        System.out.println("Time span = " + (endTime - startTime));
    }
}

// Replaces every element x in the half-open range [lo, hi) with sin(x) + cos(x) + tan(x).
class ComputeTask extends RecursiveAction {
    final double[] array;
    final int lo, hi;

    ComputeTask(double[] array, int lo, int hi) {
        this.array = array;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected void compute() {
        if (hi - lo < 2) {
            // Base case: at most one element left, so compute it directly.
            for (int i = lo; i < hi; ++i)
                array[i] = Math.sin(array[i]) + Math.cos(array[i]) + Math.tan(array[i]);
        } else {
            // Otherwise split the range in half and run both halves as subtasks.
            int mid = (lo + hi) >>> 1;
            invokeAll(new ComputeTask(array, lo, mid),
                      new ComputeTask(array, mid, hi));
        }
    }
}
Now the single-threaded case:
public class RecursiveSequenceExam {
    private final static int NUMBER = 10000000;

    public static void main(String[] args) {
        double[] array = new double[NUMBER];
        for (int i = 0; i < NUMBER; i++) {
            array[i] = i;
        }
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < NUMBER; i++) {
            array[i] = Math.sin(array[i]) + Math.cos(array[i]) + Math.tan(array[i]);
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Time span = " + (endTime - startTime));
    }
}
The output is Time span = 12030 (milliseconds).
Since my CPU has 4 cores, let's also look at the 4-thread case:
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class Recusive4ThreadExam {
    private final static int NUMBER = 10000000;

    public static void main(String[] args) throws InterruptedException {
        double[] array = new double[NUMBER];
        for (int i = 0; i < NUMBER; i++) {
            array[i] = i;
        }
        long startTime = System.currentTimeMillis();
        ExecutorService service = Executors.newFixedThreadPool(4);
        // Split the array into four equal quarters, one per worker thread.
        service.execute(new ArrayTask(array, 0, NUMBER / 4));
        service.execute(new ArrayTask(array, NUMBER / 4, NUMBER / 2));
        service.execute(new ArrayTask(array, NUMBER / 2, NUMBER * 3 / 4));
        service.execute(new ArrayTask(array, NUMBER * 3 / 4, NUMBER));
        // Stop accepting new tasks, then wait for the four submitted ones to finish.
        service.shutdown();
        service.awaitTermination(1, TimeUnit.DAYS);
        long endTime = System.currentTimeMillis();
        System.out.println("Time span = " + (endTime - startTime));
    }
}

// Replaces every element x in the half-open range [lo, hi) with sin(x) + cos(x) + tan(x).
class ArrayTask implements Runnable {
    final double[] array;
    final int lo, hi;

    ArrayTask(double[] array, int lo, int hi) {
        this.array = array;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    public void run() {
        for (int i = lo; i < hi; ++i)
            array[i] = Math.sin(array[i]) + Math.cos(array[i]) + Math.tan(array[i]);
    }
}
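The output is Time span = 4064 (milliseconds). In my runs the fork-join version came out slightly faster than this plain 4-thread version, which I attribute to the work-stealing algorithm the framework uses.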
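One way to check whether work stealing actually happens on your machine is to ask the pool afterwards. The following is only a sketch, not the code that was timed above: the class name StealCountDemo, the helper transform, and the leaf size THRESHOLD = 10_000 are assumptions made for illustration. It reuses the same sin/cos/tan workload and prints ForkJoinPool.getStealCount(), which the JDK documents as an estimate.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class StealCountDemo {
    // Hypothetical leaf size, chosen only for this sketch.
    static final int THRESHOLD = 10_000;

    // Replaces every element x in [lo, hi) with sin(x) + cos(x) + tan(x).
    static class TrigTask extends RecursiveAction {
        final double[] array;
        final int lo, hi;

        TrigTask(double[] array, int lo, int hi) {
            this.array = array;
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected void compute() {
            if (hi - lo <= THRESHOLD) {
                // Small enough: process the slice sequentially.
                for (int i = lo; i < hi; i++) {
                    array[i] = Math.sin(array[i]) + Math.cos(array[i]) + Math.tan(array[i]);
                }
            } else {
                // Too large: split in half and run both halves as subtasks.
                int mid = (lo + hi) >>> 1;
                invokeAll(new TrigTask(array, lo, mid), new TrigTask(array, mid, hi));
            }
        }
    }

    // Fills an array with 0..n-1, transforms it in the given pool, and returns it.
    static double[] transform(ForkJoinPool pool, int n) {
        double[] array = new double[n];
        for (int i = 0; i < n; i++) {
            array[i] = i;
        }
        pool.invoke(new TrigTask(array, 0, n));
        return array;
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool();
        double[] result = transform(pool, 1_000_000);
        System.out.println("parallelism = " + pool.getParallelism());
        // Estimated number of tasks taken from another worker's queue.
        System.out.println("steal count = " + pool.getStealCount());
        System.out.println("result[1]   = " + result[1]);
        pool.shutdown();
    }
}
```

The steal count varies from run to run and from machine to machine, so treat it as a rough indicator rather than a measurement.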
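Now for a more meaningful scenario: finding the maximum value of a large array. Here I use RecursiveTask (RecursiveAction would work too, with a member variable inside it holding the result). The code is as follows: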
import java.util.Random;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class RecursiveFindMax {
    private static final Random rand = new Random(47);
    private static final int NUMBER = 1000000;

    public static void main(String[] args) {
        double[] array = new double[NUMBER];
        for (int i = 0; i < NUMBER; i++) {
            array[i] = rand.nextDouble();
        }
        ForkJoinPool pool = new ForkJoinPool();
        // Both bounds are inclusive, so the last index is array.length - 1.
        TaskFindMax task = new TaskFindMax(0, array.length - 1, array);
        pool.invoke(task);
        System.out.println("MaxValue = " + task.join());
    }
}

// Finds the maximum of array[lo..hi], both bounds inclusive.
class TaskFindMax extends RecursiveTask<Double> {
    private final int lo;
    private final int hi;
    private final double[] array;
    // You can change THRESHOLD to get better efficiency.
    private final static int THRESHOLD = 10;

    TaskFindMax(int lo, int hi, double[] array) {
        this.lo = lo;
        this.hi = hi;
        this.array = array;
    }

    @Override
    protected Double compute() {
        if ((hi - lo) < THRESHOLD) {
            // Small enough: scan the slice sequentially.
            double max = array[lo];
            for (int i = lo + 1; i <= hi; i++) {
                max = Math.max(max, array[i]);
            }
            return max;
        } else {
            // Split into [lo, mid] and [mid + 1, hi] so no element is scanned twice.
            int mid = (lo + hi) >>> 1;
            TaskFindMax lhs = new TaskFindMax(lo, mid, array);
            TaskFindMax rhs = new TaskFindMax(mid + 1, hi, array);
            invokeAll(lhs, rhs);
            return Math.max(lhs.join(), rhs.join());
        }
    }
}
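pool.invoke(task) submits the initial task to the thread pool, which runs its compute() method. Inside compute(), if a condition holds (here, the difference between the upper and lower array bounds is below the threshold THRESHOLD), the maximum is searched for directly in that slice of the array. Otherwise the midpoint mid is computed, two subtasks lhs (left-hand side) and rhs (right-hand side) are created, and invokeAll(lhs, rhs) runs both of them in the pool. A task's join() method returns its result and blocks if the task has not finished yet. Because invokeAll only returns normally once both subtasks are done, the two join() calls here return immediately.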
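The RecursiveAction alternative mentioned earlier, with the result kept in a member variable, might look like the sketch below. The names FindMaxWithAction, MaxAction, and findMax are made up for this illustration, and the code assumes a non-empty array with inclusive bounds, as in the example above.

```java
import java.util.Random;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class FindMaxWithAction {
    // Finds the maximum of array[lo..hi] (both inclusive) and stores it in a field.
    static class MaxAction extends RecursiveAction {
        private static final int THRESHOLD = 10;
        private final double[] array;
        private final int lo, hi;
        private double result; // written in compute(), read only after the task is done

        MaxAction(double[] array, int lo, int hi) {
            this.array = array;
            this.lo = lo;
            this.hi = hi;
        }

        double getResult() {
            return result;
        }

        @Override
        protected void compute() {
            if (hi - lo < THRESHOLD) {
                // Small enough: scan the slice sequentially.
                double max = array[lo];
                for (int i = lo + 1; i <= hi; i++) {
                    max = Math.max(max, array[i]);
                }
                result = max;
            } else {
                int mid = (lo + hi) >>> 1;
                MaxAction lhs = new MaxAction(array, lo, mid);
                MaxAction rhs = new MaxAction(array, mid + 1, hi);
                // invokeAll returns once both subtasks are done, so their fields are set.
                invokeAll(lhs, rhs);
                result = Math.max(lhs.result, rhs.result);
            }
        }
    }

    // Convenience wrapper for this sketch; requires a non-empty array.
    static double findMax(double[] array) {
        if (array.length == 0) {
            throw new IllegalArgumentException("array must not be empty");
        }
        ForkJoinPool pool = new ForkJoinPool();
        try {
            MaxAction root = new MaxAction(array, 0, array.length - 1);
            pool.invoke(root);
            return root.getResult();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Random rand = new Random(47);
        double[] array = new double[1_000_000];
        for (int i = 0; i < array.length; i++) {
            array[i] = rand.nextDouble();
        }
        System.out.println("MaxValue = " + findMax(array));
    }
}
```

Compared with RecursiveTask, the action version needs an extra field and accessor, which is probably why RecursiveTask reads more naturally when a task has a return value.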
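This programming model carries the idea of recursion over into multithreading quite neatly. Note that adjusting THRESHOLD increases or decreases the number of tasks, which strongly affects how the threads execute. In many cases the fork-join framework is no more efficient than ordinary multithreading, and it can even be slower than a single thread. You therefore have to find a suitable scenario and tune repeatedly before you get a performance improvement.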
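To experiment with the threshold, it helps to make it a constructor parameter instead of a constant. The sketch below is one way to do that; the class name ThresholdTuning, the helper findMax, and the three threshold values tried are all assumptions picked for illustration, and the timings it prints are rough (no JIT warm-up, single run).

```java
import java.util.Random;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ThresholdTuning {
    // Finds the maximum of array[lo, hi) (half-open), splitting until the slice
    // has at most 'threshold' elements.
    static class MaxTask extends RecursiveTask<Double> {
        private final double[] array;
        private final int lo, hi, threshold;

        MaxTask(double[] array, int lo, int hi, int threshold) {
            this.array = array;
            this.lo = lo;
            this.hi = hi;
            this.threshold = threshold;
        }

        @Override
        protected Double compute() {
            if (hi - lo <= threshold) {
                // Small enough: scan the slice sequentially.
                double max = Double.NEGATIVE_INFINITY;
                for (int i = lo; i < hi; i++) {
                    max = Math.max(max, array[i]);
                }
                return max;
            }
            int mid = (lo + hi) >>> 1;
            MaxTask lhs = new MaxTask(array, lo, mid, threshold);
            MaxTask rhs = new MaxTask(array, mid, hi, threshold);
            invokeAll(lhs, rhs);
            return Math.max(lhs.join(), rhs.join());
        }
    }

    static double findMax(ForkJoinPool pool, double[] array, int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        return pool.invoke(new MaxTask(array, 0, array.length, threshold));
    }

    public static void main(String[] args) {
        Random rand = new Random(47);
        double[] array = new double[1_000_000];
        for (int i = 0; i < array.length; i++) {
            array[i] = rand.nextDouble();
        }
        ForkJoinPool pool = new ForkJoinPool();
        int[] thresholds = {10, 1_000, 100_000}; // hypothetical values to compare
        for (int threshold : thresholds) {
            long start = System.nanoTime();
            double max = findMax(pool, array, threshold);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("threshold=" + threshold + " max=" + max
                    + " time=" + elapsedMs + " ms");
        }
        pool.shutdown();
    }
}
```

The first iteration also pays for class loading and JIT compilation, so for a fair comparison run each threshold several times and discard the early runs.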
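Executors and thread pools were introduced because the designers of the concurrency package wanted to separate the creation, execution, and scheduling of threads, so that users can concentrate on business logic. The Callable and Future interfaces make it simpler to obtain the result of an asynchronous computation. ScheduledExecutorService has replaced Timer as the new standard for repeated and delayed execution. The TimeUnit enum simplifies expressing time spans. The five kinds of thread pool the package provides cover most of a programmer's needs, and in extreme cases you can build a custom pool with the ThreadPoolExecutor class. Finally, the Fork-Join framework introduced in JDK 1.7 brings the recursive divide-and-conquer idea into the thread pool and applies a work-stealing algorithm to improve task execution efficiency.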
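As a small illustration of the Callable and Future point, here is a sketch that finds an array maximum with a fixed thread pool, where each chunk returns its partial result through a Future. The class name CallableFutureMax, the method findMax, and the parts parameter are assumptions made for this example and do not appear earlier in the article.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CallableFutureMax {
    // Splits the array into 'parts' chunks, finds each chunk's maximum in a
    // fixed thread pool, and combines the partial results.
    static double findMax(final double[] array, int parts)
            throws InterruptedException, ExecutionException {
        ExecutorService service = Executors.newFixedThreadPool(parts);
        try {
            List<Callable<Double>> tasks = new ArrayList<>();
            for (int p = 0; p < parts; p++) {
                // long arithmetic avoids int overflow for very large arrays
                final int lo = (int) ((long) array.length * p / parts);
                final int hi = (int) ((long) array.length * (p + 1) / parts);
                tasks.add(new Callable<Double>() {
                    @Override
                    public Double call() {
                        double max = Double.NEGATIVE_INFINITY;
                        for (int i = lo; i < hi; i++) {
                            max = Math.max(max, array[i]);
                        }
                        return max;
                    }
                });
            }
            double max = Double.NEGATIVE_INFINITY;
            // invokeAll blocks until every task has completed.
            for (Future<Double> future : service.invokeAll(tasks)) {
                max = Math.max(max, future.get());
            }
            return max;
        } finally {
            service.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        double[] array = {0.3, 0.9, 0.1, 0.7, 0.5};
        System.out.println("MaxValue = " + findMax(array, 4));
    }
}
```

Unlike the Runnable-based 4-thread example, nothing here needs shared state to pass results back: each Callable returns its value and the caller collects it with Future.get().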