Summing a Very Large Array with Multiple Threads in Java

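Objective: measure how well Java handles multithreaded processing of a large data set.
Method: allocate an int array of about 2G elements (a Java array can hold at most 2^31 - 1 elements), split it into n chunks, let n threads sum one chunk each, then add the partial sums together and record the elapsed time.
Hardware: 2 x Xeon E5-2600 v2 @ 2.2 GHz (10 cores each, Hyper-Threading off), 128 GB DDR3-1600, 512 GB SSD.
Results: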

Threads  Computing time
1 43,710 ms
2 21,891 ms
4 11,036 ms
8 6,160 ms
16 3,158 ms
20 2,434 ms

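Conclusion: for this workload, processing speed grows almost in proportion to the number of threads. Going from 1 thread to 20 cuts the computing time from 43,710 ms to 2,434 ms, a speedup of about 18x. Each additional thread still carries some overhead, so the speedup stays slightly below linear.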
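
As a rough check on the conclusion, the speedup T(1)/T(n) and the parallel efficiency (speedup divided by thread count) can be recomputed from the measured times. The sketch below is only an illustration: the class name `SpeedupTable` and its helper methods are made up for this example, and the numbers are copied from the results table.

```java
// Recomputes speedup and efficiency from the measured computing times.
// SpeedupTable and its methods are hypothetical names used for illustration.
final class SpeedupTable {
    // Thread counts and computing times (ms) copied from the results table.
    static final int[] THREADS = {1, 2, 4, 8, 16, 20};
    static final long[] TIMES_MS = {43710, 21891, 11036, 6160, 3158, 2434};

    // Speedup relative to the single-threaded run: T(1) / T(n).
    static double speedup(long baselineMs, long timeMs) {
        return (double) baselineMs / timeMs;
    }

    // Efficiency: speedup divided by thread count (1.0 means perfectly linear).
    static double efficiency(long baselineMs, long timeMs, int threads) {
        return speedup(baselineMs, timeMs) / threads;
    }

    public static void main(String[] args) {
        System.out.println("threads  speedup  efficiency");
        for (int i = 0; i < THREADS.length; i++) {
            System.out.printf("%7d  %7.2f  %9.1f%%%n",
                    THREADS[i],
                    speedup(TIMES_MS[0], TIMES_MS[i]),
                    100 * efficiency(TIMES_MS[0], TIMES_MS[i], THREADS[i]));
        }
    }
}
```

An efficiency close to 1.0 at every thread count is what "almost proportional" means here; the gap to 1.0 is the overhead mentioned in the conclusion.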

Raw data

C:\data>java -Xmx20g -Xms20g -jar tests.jar 2000 1
DataSize: 2000 M, # of threads: 1.
Initializing time: 6906 ms
Coping time: 7988 ms
Computing time: 43710 ms
Result: -4429320714305072004.
C:\data>java -Xmx20g -Xms20g -jar tests.jar 2000 2
DataSize: 2000 M, # of threads: 2.
Initializing time: 6983 ms
Coping time: 7856 ms
Computing time: 21891 ms
Result: -4429320714305072004.
C:\data>java -Xmx20g -Xms20g -jar tests.jar 2000 4
DataSize: 2000 M, # of threads: 4.
Initializing time: 6982 ms
Coping time: 9571 ms
Computing time: 11036 ms
Result: -4429320714305072004.
C:\data>java -Xmx20g -Xms20g -jar tests.jar 2000 8
DataSize: 2000 M, # of threads: 8.
Initializing time: 6992 ms
Coping time: 8872 ms
Computing time: 6160 ms
Result: -4429320714305072004.
C:\data>java -Xmx20g -Xms20g -jar tests.jar 2000 16
DataSize: 2000 M, # of threads: 16.
Initializing time: 6932 ms
Coping time: 9919 ms
Computing time: 3158 ms
Result: -4429320714305072004.
C:\data>java -Xmx20g -Xms20g -jar tests.jar 2000 20
DataSize: 2000 M, # of threads: 20.
Initializing time: 6956 ms
Coping time: 9869 ms
Computing time: 2434 ms
Result: -4429320714305072004.

The main class that drives the computation.

package com.hao.thread;


public class MutiArraySum {

	public static void main(String[] args) throws Exception {
		int datasize = args.length > 0 ? Integer.parseInt(args[0]) : 100;
		int threadCount = args.length > 1 ? Integer.parseInt(args[1]) : 10;

		System.out.printf("DataSize: %d M, # of threads: %d.\n", datasize, threadCount);
		long t1 = System.currentTimeMillis();
		BigData bd = new BigData(datasize, 1024 * 1024);
		bd.initialize();
		long t2 = System.currentTimeMillis();

		ParallelProcessor[] pps = new ParallelProcessor[threadCount];
		int size = bd.arr.length / pps.length;
		for (int i = 0; i < pps.length; i++) {
			int start = i * size;
			int end = i < pps.length - 1 ? i * size + size : bd.arr.length;
			pps[i] = new ParallelProcessor(bd.arr, start, end);
		}
		
		for (int i = 0; i < pps.length; i++)
			pps[i].start();

		long t3 = System.currentTimeMillis();
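		// Wait with join() instead of busy-polling the done flag: polling keeps
		// a core busy and, unless the flag is volatile, is not guaranteed to
		// ever observe the update. join() also makes each result visible here.
		for (ParallelProcessor pp : pps)
			pp.join();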

		long t4 = System.currentTimeMillis();

		long result = 0;
		for (int i = 0; i < pps.length; i++)
			result += pps[i].result;

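		// Labels are kept exactly as they appear in the recorded output.
		// "Coping time" (t3 - t2) covers constructing and starting the workers.
		System.out.printf("Initializing time: %d ms\n", t2 - t1);
		System.out.printf("Coping time: %d ms\n", t3 - t2);
		System.out.printf("Computing time: %d ms\n", t4 - t3);
		System.out.printf("Result: %d.", result);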
	}
}
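
The split-and-wait logic in main can also be written with a thread pool from java.util.concurrent. The sketch below is an alternative, not the code that produced the measurements above. It makes two simplifying assumptions: it sums the raw array values instead of calling f, and each task reads its own range of the shared array rather than receiving a separate array. The class name `PoolSum` and the method `parallelSum` are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical alternative to MutiArraySum's manual thread handling.
final class PoolSum {
    // Sums arr using threadCount pool threads, one task per chunk.
    static long parallelSum(int[] arr, int threadCount) {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            int size = arr.length / threadCount;
            List<Future<Long>> parts = new ArrayList<>();
            for (int i = 0; i < threadCount; i++) {
                final int start = i * size;
                // The last chunk absorbs the remainder, as in MutiArraySum.
                final int end = i < threadCount - 1 ? start + size : arr.length;
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (int j = start; j < end; j++) sum += arr[j];
                    return sum;
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                try {
                    total += part.get(); // blocks until that chunk is done
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException("worker failed", e);
                }
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] arr = new int[1_000_000];
        for (int i = 0; i < arr.length; i++) arr[i] = i;
        System.out.println(parallelSum(arr, 4));
    }
}
```

Future.get() both waits for a task and hands back its partial sum, so no completion flag is needed.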

ParallelProcessor is the worker thread class; it accumulates the sum over the part of the array it is given.

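package com.hao.thread;

import java.util.Arrays;

public class ParallelProcessor extends Thread {
	long result;
	int[] data;
	// volatile so that a thread polling this flag from outside is guaranteed
	// to see the update.
	public volatile boolean done;

	// The original listing does not show the constructor that MutiArraySum
	// calls. This version is an assumption: it copies the slice [start, end),
	// which is consistent with run() iterating over data from index 0.
	public ParallelProcessor(int[] arr, int start, int end) {
		data = Arrays.copyOfRange(arr, start, end);
	}

	@Override
	public void run() {
		done = false;
		result = 0;
		for (int i = 0; i < data.length; i++) {
			result += f(data[i]);
		}
		done = true;
	}

	// Simulates a time-consuming computation.
	// Note: (v + 1) * (v + 2) is int arithmetic and wraps around for large v;
	// it is left unchanged so the output matches the recorded results.
	private long f(int v) {
		double dv = (Math.sqrt((v + 1) * (v + 2) * Math.E * Math.pow(v + 3, 2)));
		dv *= (Math.sqrt((v + 3) * (v + 4) * Math.E * Math.pow(v + 5, 2)));
		dv = Math.sqrt(dv);
		return (long) dv;
	}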
	
}
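
One detail of f is easy to miss: the product (v + 1) * (v + 2) is computed in int arithmetic, so it wraps around once v exceeds roughly 46,340. When the wrapped product is negative, Math.sqrt returns NaN, and casting NaN to long yields 0. Since f only simulates an expensive computation, this does not affect the timing experiment, but it does affect the printed result. The sketch below illustrates the wrap-around; the class name `OverflowDemo` and its methods are hypothetical.

```java
// Illustrates int overflow in a product like the one inside f.
// OverflowDemo is a hypothetical name used only for this example.
final class OverflowDemo {
    // Same expression shape as in f: evaluated entirely in int.
    static int intProduct(int v) {
        return (v + 1) * (v + 2);
    }

    // Widening one operand to long first keeps the exact product.
    static long longProduct(int v) {
        return (long) (v + 1) * (v + 2);
    }

    public static void main(String[] args) {
        int v = 46340;
        System.out.println("int product:  " + intProduct(v));
        System.out.println("long product: " + longProduct(v));
        // The square root of a negative number is NaN; (long) NaN is 0.
        System.out.println("(long) sqrt(-1.0): " + (long) Math.sqrt(-1.0));
    }
}
```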

BigData allocates the very large array and fills it with data.

package com.hao.thread;

import java.util.Random;

public class BigData {
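	public int unit = 1024*1024; // not used; the constructor takes its own unit

	static Random rand = new Random(); // not used by initialize()
	public int[] arr;

	// size * unit is an int product and must stay below 2^31;
	// with unit = 1024 * 1024 that means size <= 2047.
	public BigData(int size, int unit) {
		arr = new int[size * unit];
	}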
	
	public void initialize() {
		for(int i = 0; i < arr.length; i++)
			arr[i] = i;
	}
	
	public long sum(int start, int end) {
		long result = 0;
		for(int i = start; i < end; i++)
			result += arr[i];
		return result;
	}
}
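
A back-of-the-envelope footprint estimate helps explain the 20 GB heap used in the runs above: 2000 x 1024 x 1024 int elements take 4 bytes each, and any per-thread copies of the data need roughly the same amount again. The sketch below computes the element storage and the largest size argument that still fits in an int length. The class name `ArrayFootprint` and its methods are invented for illustration, and the estimate ignores array headers and other JVM overhead.

```java
// Rough memory estimate for the int array that BigData allocates.
// ArrayFootprint is a hypothetical name; JVM overhead is ignored.
final class ArrayFootprint {
    // Bytes needed for the int elements alone.
    static long intArrayBytes(int size, int unit) {
        // Widen to long before multiplying to avoid int overflow.
        return (long) size * unit * Integer.BYTES;
    }

    // Largest size for which size * unit still fits in an int.
    static int maxSize(int unit) {
        return Integer.MAX_VALUE / unit;
    }

    public static void main(String[] args) {
        int unit = 1024 * 1024;
        long bytes = intArrayBytes(2000, unit);
        System.out.printf("2000 M ints: %d bytes (%.2f GiB)%n",
                bytes, bytes / (1024.0 * 1024 * 1024));
        System.out.println("max size for this unit: " + maxSize(unit));
    }
}
```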
