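JMH, short for Java Microbenchmark Harness, is a toolkit built specifically for microbenchmarking code.
JMH is developed by the team that implements the Java Virtual Machine. Modern JVMs have become increasingly smart: optimizations may be applied when a Java file is compiled, when classes are loaded, and while the program runs, so the code a developer writes will not necessarily perform the way they expect. JMH lets ordinary developers see how their code actually behaves at run time.
JMH GitHub
First we add the JMH dependencies to our project. Here we use version 1.36, the latest at the time of writing.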
<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-core</artifactId>
    <version>1.36</version>
</dependency>
<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-generator-annprocess</artifactId>
    <version>1.36</version>
</dependency>
package com.myf.concurrent.wwj2.jmh;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import org.openjdk.jmh.runner.options.TimeValue;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.TimeUnit;
/**
* @author myf
*/
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Thread)
public class JmhExample01 {
private final static String DATA = "DUMMY DATA";
private List<String> arrayList;
private List<String> linkedList;
@Setup(Level.Invocation)
public void setUp(){
this.arrayList = new ArrayList<>();
this.linkedList = new LinkedList<>();
}
@Benchmark
public List<String> arrayListAdd(){
this.arrayList.add(DATA);
return arrayList;
}
@Benchmark
public List<String> linkedListAdd(){
this.linkedList.add(DATA);
return linkedList;
}
public static void main(String[] args) throws RunnerException {
final Options options = new OptionsBuilder()
.include(JmhExample01.class.getSimpleName())
.forks(1)
.measurementIterations(10)
.measurementTime(TimeValue.milliseconds(100))
.warmupIterations(10)
.warmupTime(TimeValue.milliseconds(100))
.build();
new Runner(options).run();
}
}
The program above uses a few basic JMH APIs. You may not be familiar with them yet; we will go through them one by one.
# JMH version: 1.36
# VM version: JDK 17.0.2, Java HotSpot(TM) 64-Bit Server VM, 17.0.2+8-LTS-86
# VM invoker: /Library/Java/JavaVirtualMachines/jdk-17.0.2.jdk/Contents/Home/bin/java
# VM options: -javaagent:/Applications/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=56684:/Applications/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 10 iterations, 100 ms each
# Measurement: 10 iterations, 100 ms each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.myf.concurrent.wwj2.jmh.JmhExample01.arrayListAdd
# Run progress: 0.00% complete, ETA 00:00:04
# Fork: 1 of 1
# Warmup Iteration 1: 63.109 ns/op
# Warmup Iteration 2: 64.557 ns/op
# Warmup Iteration 3: 29.116 ns/op
# Warmup Iteration 4: 23.864 ns/op
# Warmup Iteration 5: 23.078 ns/op
# Warmup Iteration 6: 23.269 ns/op
# Warmup Iteration 7: 24.339 ns/op
# Warmup Iteration 8: 37.417 ns/op
# Warmup Iteration 9: 37.715 ns/op
# Warmup Iteration 10: 33.296 ns/op
Iteration 1: 28.710 ns/op
Iteration 2: 23.029 ns/op
Iteration 3: 23.790 ns/op
Iteration 4: 34.254 ns/op
Iteration 5: 31.454 ns/op
Iteration 6: 30.230 ns/op
Iteration 7: 21.331 ns/op
Iteration 8: 25.501 ns/op
Iteration 9: 29.648 ns/op
Iteration 10: 38.189 ns/op
Result "com.myf.concurrent.wwj2.jmh.JmhExample01.arrayListAdd":
28.614 ±(99.9%) 8.007 ns/op [Average]
(min, avg, max) = (21.331, 28.614, 38.189), stdev = 5.296
CI (99.9%): [20.606, 36.621] (assumes normal distribution)
# JMH version: 1.36
# VM version: JDK 17.0.2, Java HotSpot(TM) 64-Bit Server VM, 17.0.2+8-LTS-86
# VM invoker: /Library/Java/JavaVirtualMachines/jdk-17.0.2.jdk/Contents/Home/bin/java
# VM options: -javaagent:/Applications/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=56684:/Applications/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 10 iterations, 100 ms each
# Measurement: 10 iterations, 100 ms each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.myf.concurrent.wwj2.jmh.JmhExample01.linkedListAdd
# Run progress: 50.00% complete, ETA 00:00:03
# Fork: 1 of 1
# Warmup Iteration 1: 21.183 ns/op
# Warmup Iteration 2: 15.947 ns/op
# Warmup Iteration 3: 16.283 ns/op
# Warmup Iteration 4: 17.095 ns/op
# Warmup Iteration 5: 16.899 ns/op
# Warmup Iteration 6: 16.618 ns/op
# Warmup Iteration 7: 16.415 ns/op
# Warmup Iteration 8: 16.034 ns/op
# Warmup Iteration 9: 16.170 ns/op
# Warmup Iteration 10: 17.569 ns/op
Iteration 1: 16.021 ns/op
Iteration 2: 16.734 ns/op
Iteration 3: 16.535 ns/op
Iteration 4: 17.296 ns/op
Iteration 5: 21.086 ns/op
Iteration 6: 16.149 ns/op
Iteration 7: 16.183 ns/op
Iteration 8: 16.088 ns/op
Iteration 9: 16.659 ns/op
Iteration 10: 16.116 ns/op
Result "com.myf.concurrent.wwj2.jmh.JmhExample01.linkedListAdd":
16.887 ±(99.9%) 2.311 ns/op [Average]
(min, avg, max) = (16.021, 16.887, 21.086), stdev = 1.528
CI (99.9%): [14.576, 19.197] (assumes normal distribution)
# Run complete. Total time: 00:00:05
REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.
NOTE: Current JVM experimentally supports Compiler Blackholes, and they are in use. Please exercise
extra caution when trusting the results, look into the generated code to check the benchmark still
works, and factor in a small probability of new VM bugs. Additionally, while comparisons between
different JVMs are already problematic, the performance difference caused by different Blackhole
modes can be very significant. Please make sure you use the consistent Blackhole mode for comparisons.
Benchmark Mode Cnt Score Error Units
JmhExample01.arrayListAdd avgt 10 28.614 ± 8.007 ns/op
JmhExample01.linkedListAdd avgt 10 16.887 ± 2.311 ns/op
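A JMH run prints a lot of information. For now, look only at the last two lines of the output: the average time per call of arrayListAdd is 28.614 ns with an error of ±8.007 ns, and the average time per call of linkedListAdd is 16.887 ns with an error of ±2.311 ns. In this run, the latter performs better than the former.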
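To make the Score and Error columns less opaque, here is a minimal plain-Java sketch that recomputes them from the ten arrayListAdd measurement iterations printed above. It assumes the error is the half-width of a 99.9% confidence interval based on Student's t distribution with n - 1 degrees of freedom, and it hard-codes the critical value 4.781 for 9 degrees of freedom from a standard t-table. The class and method names are invented for this illustration and are not part of JMH.

```java
import java.util.Arrays;

final class ScoreError {
    // Per-iteration scores (ns/op) copied from the arrayListAdd report above.
    static final double[] ARRAY_LIST_ADD_NS = {
        28.710, 23.029, 23.790, 34.254, 31.454,
        30.230, 21.331, 25.501, 29.648, 38.189
    };

    // Assumption: two-sided 99.9% critical value of Student's t for 9 degrees
    // of freedom, taken from a standard t-table (only valid for n = 10).
    static final double T_999_DF9 = 4.781;

    static double mean(double[] xs) {
        return Arrays.stream(xs).average().orElse(Double.NaN);
    }

    // Sample standard deviation (divides by n - 1).
    static double sampleStdev(double[] xs) {
        double m = mean(xs);
        double sumSq = Arrays.stream(xs).map(x -> (x - m) * (x - m)).sum();
        return Math.sqrt(sumSq / (xs.length - 1));
    }

    // Half-width of the confidence interval: t * stdev / sqrt(n).
    static double error(double[] xs, double tCritical) {
        return tCritical * sampleStdev(xs) / Math.sqrt(xs.length);
    }

    public static void main(String[] args) {
        double[] xs = ARRAY_LIST_ADD_NS;
        System.out.printf("%.3f ± %.3f ns/op (stdev = %.3f)%n",
                mean(xs), error(xs, T_999_DF9), sampleStdev(xs));
    }
}
```

Running it should give values close to the 28.614 ± 8.007 ns/op and stdev = 5.296 in the report. Small differences are possible because the inputs are the rounded per-iteration scores.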
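Just as JUnit 4 marks test methods with the @Test annotation, JMH requires benchmark methods to be marked with the @Benchmark annotation.
If a class contains no benchmark methods, running a benchmark against it fails with an exception.
Removing the two @Benchmark annotations from the code above produces the following error: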
Exception in thread "main" No benchmarks to run; check the include/exclude regexps.
at org.openjdk.jmh.runner.Runner.internalRun(Runner.java:258)
at org.openjdk.jmh.runner.Runner.run(Runner.java:209)
at com.myf.concurrent.wwj2.jmh.JmhExample02.main(JmhExample02.java:53)
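Warmup means exactly what the name says: before the benchmark code is formally measured, it is first warmed up, so that the code being measured has reached its final state after early class-loading optimizations, JVM run-time compilation, and JIT optimization.
Measurement is the actual measuring phase; only measurement iterations are counted in the statistics.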
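The idea that an iteration is a time-boxed batch of calls, and that warmup batches are run but not counted, can be sketched in plain Java. This is a deliberately naive illustration and not how JMH is implemented: NaiveIterations, Batch, and runBatch are hypothetical names, and calling System.nanoTime() inside the loop adds overhead that a real harness works hard to avoid.

```java
import java.util.ArrayList;
import java.util.List;

final class NaiveIterations {
    /** Outcome of one time-boxed batch: number of calls and average cost per call. */
    static final class Batch {
        final long invocations;
        final double avgNanosPerOp;

        Batch(long invocations, double avgNanosPerOp) {
            this.invocations = invocations;
            this.avgNanosPerOp = avgNanosPerOp;
        }
    }

    /** Calls op repeatedly until durationNanos have elapsed, then reports the batch. */
    static Batch runBatch(Runnable op, long durationNanos) {
        long start = System.nanoTime();
        long count = 0;
        long elapsed;
        do {
            op.run();
            count++;
            elapsed = System.nanoTime() - start;
        } while (elapsed < durationNanos);
        return new Batch(count, (double) elapsed / count);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        Runnable op = () -> {
            list.add("DUMMY DATA");
            if (list.size() >= 10_000) {
                list.clear(); // keep memory bounded in this toy loop
            }
        };
        long hundredMillis = 100_000_000L;

        // Warmup batches: executed and printed, but excluded from the statistics.
        for (int i = 1; i <= 3; i++) {
            Batch b = runBatch(op, hundredMillis);
            System.out.printf("# Warmup Iteration %d: %d calls, %.3f ns/op%n",
                    i, b.invocations, b.avgNanosPerOp);
        }

        // Measurement batches: only these contribute to the final average.
        double sum = 0;
        int measured = 3;
        for (int i = 1; i <= measured; i++) {
            Batch b = runBatch(op, hundredMillis);
            sum += b.avgNanosPerOp;
            System.out.printf("Iteration %d: %d calls, %.3f ns/op%n",
                    i, b.invocations, b.avgNanosPerOp);
        }
        System.out.printf("Average over measurement iterations: %.3f ns/op%n", sum / measured);
    }
}
```

Each batch typically contains a very large number of calls, which is why "10 iterations" must not be read as "10 method calls".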
These settings can be configured globally when building the Options, or through annotations on the class.
Options options = new OptionsBuilder()
.include(JmhExample02.class.getSimpleName())
.forks(1)
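// Number of measurement iterations (10 iterations, not 10 method calls)
.measurementIterations(10)
// Duration of each measurement iteration; the default is 10 s
.measurementTime(TimeValue.milliseconds(100))
// Number of warmup iterations
.warmupIterations(10)
// Duration of each warmup iteration; the default is 10 s
.warmupTime(TimeValue.milliseconds(100))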
.build();
Specifying the warmup and measurement settings through annotations:
@Measurement(iterations = 10, time = 100, timeUnit = TimeUnit.MILLISECONDS)
@Warmup(iterations = 10, time = 100, timeUnit = TimeUnit.MILLISECONDS)
public class JmhExample01 {
The annotations can also be placed on a benchmark method, in which case they apply only to that method.
@Benchmark
@Measurement(iterations = 5, time = 100, timeUnit = TimeUnit.MILLISECONDS)
@Warmup(iterations = 5, time = 100, timeUnit = TimeUnit.MILLISECONDS)
public List<String> arrayListAdd() {
this.arrayList.add(DATA);
return arrayList;
}
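In my tests with version 1.36, annotations on a benchmark method override the annotations on the class, but they cannot override settings made through Options.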
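The precedence observed above (Options first, then the method annotation, then the class annotation, then a built-in default) can be written down as a small lookup chain. This is only a plain-Java model of that observation; SettingPrecedence and resolve are hypothetical names, and the default value passed in main is a placeholder rather than a statement about JMH's real defaults.

```java
import java.util.OptionalInt;

final class SettingPrecedence {
    /** Returns the first value that is present, checking the highest-priority source first. */
    static int resolve(OptionalInt fromOptions,
                       OptionalInt fromMethodAnnotation,
                       OptionalInt fromClassAnnotation,
                       int defaultValue) {
        if (fromOptions.isPresent()) {
            return fromOptions.getAsInt();
        }
        if (fromMethodAnnotation.isPresent()) {
            return fromMethodAnnotation.getAsInt();
        }
        if (fromClassAnnotation.isPresent()) {
            return fromClassAnnotation.getAsInt();
        }
        return defaultValue;
    }

    public static void main(String[] args) {
        int placeholderDefault = 5; // assumption for the demo only

        // Class says 10, method says 5, Options is silent: the method wins.
        System.out.println(resolve(OptionalInt.empty(), OptionalInt.of(5),
                OptionalInt.of(10), placeholderDefault));

        // Options says 10: it wins over both annotations.
        System.out.println(resolve(OptionalInt.of(10), OptionalInt.of(5),
                OptionalInt.of(3), placeholderDefault));
    }
}
```

A practical consequence: if a setting in an annotation seems to be ignored, check whether the same setting is also made in the OptionsBuilder chain in main.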
# JMH version in use
# JMH version: 1.36
# JDK version information
# VM version: JDK 17.0.2, Java HotSpot(TM) 64-Bit Server VM, 17.0.2+8-LTS-86
# Path of the java command
# VM invoker: /Library/Java/JavaVirtualMachines/jdk-17.0.2.jdk/Contents/Home/bin/java
# Options passed to the JVM at run time
# VM options: -javaagent:/Applications/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=54283:/Applications/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# 10 warmup iterations; each iteration keeps calling arrayListAdd and lasts 100 ms
# Warmup: 10 iterations, 100 ms each
# 10 measurement iterations; only the data from these 10 iterations is counted in the statistics, and each lasts 100 ms as well
# Measurement: 10 iterations, 100 ms each
# Timeout for each iteration
# Timeout: 10 min per iteration
# Number of threads running the benchmark
# Threads: 1 thread, will synchronize iterations
# Benchmark mode; here the statistic is the average time taken by one method call
# Benchmark mode: Average time, time/op
# Fully qualified name of the benchmark method
# Benchmark: com.myf.concurrent.wwj2.jmh.JmhExample01.arrayListAdd
# Progress of the run
# Run progress: 0.00% complete, ETA 00:00:04
# Fork: 1 of 1
# Ten warmup iterations: the average time per call is 35.756 ns in the first iteration and 24.443 ns in the second
# Warmup Iteration 1: 35.756 ns/op
# Warmup Iteration 2: 24.443 ns/op
# Warmup Iteration 3: 28.334 ns/op
# Warmup Iteration 4: 35.592 ns/op
# Warmup Iteration 5: 22.711 ns/op
# Warmup Iteration 6: 25.595 ns/op
# Warmup Iteration 7: 28.407 ns/op
# Warmup Iteration 8: 37.375 ns/op
# Warmup Iteration 9: 39.011 ns/op
# Warmup Iteration 10: 33.058 ns/op
# The ten measurement iterations
Iteration 1: 26.739 ns/op
Iteration 2: 22.633 ns/op
Iteration 3: 24.204 ns/op
Iteration 4: 20.832 ns/op
Iteration 5: 20.575 ns/op
Iteration 6: 20.881 ns/op
Iteration 7: 21.091 ns/op
Iteration 8: 20.630 ns/op
Iteration 9: 21.314 ns/op
Iteration 10: 22.101 ns/op
# ... (output omitted) ...
# Final summary
Benchmark Mode Cnt Score Error Units
JmhExample01.arrayListAdd avgt 10 22.100 ± 3.000 ns/op
JmhExample01.linkedListAdd avgt 10 16.364 ± 1.325 ns/op
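JMH uses the @BenchmarkMode annotation to declare which mode a benchmark runs in, and it provides four modes.
Several modes can be active at the same time.
@BenchmarkMode can be placed on the class or on a benchmark method, and the setting on the method overrides the one on the class. The mode can also be set through .mode() when building the Options, and a benchmark method cannot override what is set there.
Average time (Mode.AverageTime): reports the time each call of the benchmark method takes on average.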
Benchmark Mode Cnt Score Error Units
JmhExample01.arrayListAdd avgt 10 22.100 ± 3.000 ns/op
JmhExample01.linkedListAdd avgt 10 16.364 ± 1.325 ns/op
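Throughput (Mode.Throughput): reports how many times the method can be called per unit of time.
For this run I changed the output unit to microseconds with @OutputTimeUnit(TimeUnit.MICROSECONDS).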
Benchmark Mode Cnt Score Error Units
JmhExample01.arrayListAdd thrpt 10 48.007 ± 1.863 ops/us
JmhExample01.linkedListAdd thrpt 10 62.169 ± 1.439 ops/us
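For a single benchmark thread, average time and throughput are two views of the same ratio: one is time per operation, the other is operations per time. The following sketch converts between ns/op and ops/us. ModeUnits and its methods are names invented for this illustration. The two tables above come from separate runs, so the converted numbers are only expected to land in the same range, not to match exactly.

```java
final class ModeUnits {
    private static final double NANOS_PER_MICRO = 1_000.0;

    /** Converts an average time in ns/op into a throughput in ops/us. */
    static double opsPerMicrosecond(double nanosPerOp) {
        return NANOS_PER_MICRO / nanosPerOp;
    }

    /** Converts a throughput in ops/us into an average time in ns/op. */
    static double nanosPerOp(double opsPerMicrosecond) {
        return NANOS_PER_MICRO / opsPerMicrosecond;
    }

    public static void main(String[] args) {
        // avgt scores from the average-time table
        System.out.printf("arrayListAdd  22.100 ns/op -> %.3f ops/us%n", opsPerMicrosecond(22.100));
        System.out.printf("linkedListAdd 16.364 ns/op -> %.3f ops/us%n", opsPerMicrosecond(16.364));
        // thrpt scores from the throughput table
        System.out.printf("arrayListAdd  48.007 ops/us -> %.3f ns/op%n", nanosPerOp(48.007));
        System.out.printf("linkedListAdd 62.169 ops/us -> %.3f ns/op%n", nanosPerOp(62.169));
    }
}
```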
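Sample time (Mode.SampleTime): measures the benchmark method by sampling. It collects the timings of the sampled calls and distributes them across intervals.
The output shows that 29951 samples were collected for arrayListAdd, with a mean of 0.244 microseconds, and that 29920 of the data points fall in the 0-50 microsecond interval.
# ... (output omitted) ...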
Result "com.myf.concurrent.wwj2.jmh.JmhExample01.arrayListAdd":
N = 29951
mean = 0.244 ±(99.9%) 0.131 us/op
Histogram, us/op:
[ 0.000, 50.000) = 29920
[ 50.000, 100.000) = 15
[100.000, 150.000) = 8
[150.000, 200.000) = 2
[200.000, 250.000) = 0
[250.000, 300.000) = 1
[300.000, 350.000) = 2
[350.000, 400.000) = 1
[400.000, 450.000) = 0
[450.000, 500.000) = 1
[500.000, 550.000) = 0
[550.000, 600.000) = 0
[600.000, 650.000) = 0
Percentiles, us/op:
p(0.0000) = ≈ 0 us/op
p(50.0000) = 0.042 us/op
p(90.0000) = 0.083 us/op
p(95.0000) = 0.125 us/op
p(99.0000) = 0.250 us/op
p(99.9000) = 51.807 us/op
p(99.9900) = 364.157 us/op
p(99.9990) = 679.936 us/op
p(99.9999) = 679.936 us/op
p(100.0000) = 679.936 us/op
# ... (output omitted) ...
Benchmark Mode Cnt Score Error Units
JmhExample01.arrayListAdd sample 29951 0.244 ± 0.131 us/op
JmhExample01.arrayListAdd:arrayListAdd·p0.00 sample ≈ 0 us/op
JmhExample01.arrayListAdd:arrayListAdd·p0.50 sample 0.042 us/op
JmhExample01.arrayListAdd:arrayListAdd·p0.90 sample 0.083 us/op
JmhExample01.arrayListAdd:arrayListAdd·p0.95 sample 0.125 us/op
JmhExample01.arrayListAdd:arrayListAdd·p0.99 sample 0.250 us/op
JmhExample01.arrayListAdd:arrayListAdd·p0.999 sample 51.807 us/op
JmhExample01.arrayListAdd:arrayListAdd·p0.9999 sample 364.157 us/op
JmhExample01.arrayListAdd:arrayListAdd·p1.00 sample 679.936 us/op
JmhExample01.linkedListAdd sample 26853 0.023 ± 0.003 us/op
JmhExample01.linkedListAdd:linkedListAdd·p0.00 sample ≈ 0 us/op
JmhExample01.linkedListAdd:linkedListAdd·p0.50 sample ≈ 0 us/op
JmhExample01.linkedListAdd:linkedListAdd·p0.90 sample 0.042 us/op
JmhExample01.linkedListAdd:linkedListAdd·p0.95 sample 0.042 us/op
JmhExample01.linkedListAdd:linkedListAdd·p0.99 sample 0.042 us/op
JmhExample01.linkedListAdd:linkedListAdd·p0.999 sample 0.125 us/op
JmhExample01.linkedListAdd:linkedListAdd·p0.9999 sample 11.186 us/op
JmhExample01.linkedListAdd:linkedListAdd·p1.00 sample 17.312 us/op
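The sample-time report is easier to read once you see how a histogram and percentiles are derived from raw samples, and why a few slow outliers pull the mean far above the median (compare the mean of 0.244 us with p0.50 = 0.042 us above). The sketch below uses fixed-width buckets and the nearest-rank percentile definition. Both are assumptions chosen for simplicity, JMH's own calculation may differ in detail, and the sample values are made up.

```java
import java.util.Arrays;

final class SampleStats {
    // Made-up latencies in us/op: eight fast calls and two slow outliers.
    static final double[] DEMO = {0.04, 0.04, 0.05, 0.08, 0.12, 0.25, 0.04, 0.04, 60.0, 120.0};

    /** Counts samples into buckets [i * width, (i + 1) * width). */
    static int[] histogram(double[] samples, double width, int buckets) {
        int[] counts = new int[buckets];
        for (double s : samples) {
            int idx = (int) (s / width);
            if (idx >= buckets) {
                idx = buckets - 1; // simplification: clamp outliers into the last bucket
            }
            counts[idx]++;
        }
        return counts;
    }

    /** Nearest-rank percentile for p in (0, 100]. */
    static double percentile(double[] samples, double p) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        System.out.println("histogram = " + Arrays.toString(histogram(DEMO, 50.0, 3)));
        System.out.println("mean      = " + Arrays.stream(DEMO).average().orElse(Double.NaN));
        System.out.println("p50       = " + percentile(DEMO, 50));
        System.out.println("p90       = " + percentile(DEMO, 90));
        System.out.println("p100      = " + percentile(DEMO, 100));
    }
}
```

With this data the median stays at 0.05 us while the two outliers drag the mean to roughly 18 us, which is why the percentile rows are often more informative than the mean for latency-style measurements.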
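Single shot (Mode.SingleShotTime), i.e. a cold test: in every iteration, warmup or measurement, the benchmark method is executed only once. The number of warmup iterations is usually set to 0.
# ... (output omitted) ...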
Result "com.myf.concurrent.wwj2.jmh.JmhExample01.arrayListAdd":
N = 10
mean = 1.421 ±(99.9%) 0.986 us/op
Histogram, us/op:
[0.000, 0.250) = 0
[0.250, 0.500) = 0
[0.500, 0.750) = 0
[0.750, 1.000) = 4
[1.000, 1.250) = 2
[1.250, 1.500) = 1
[1.500, 1.750) = 0
[1.750, 2.000) = 1
[2.000, 2.250) = 0
[2.250, 2.500) = 1
[2.500, 2.750) = 1
Percentiles, us/op:
p(0.0000) = 0.875 us/op
p(50.0000) = 1.125 us/op
p(90.0000) = 2.638 us/op
p(95.0000) = 2.667 us/op
p(99.0000) = 2.667 us/op
p(99.9000) = 2.667 us/op
p(99.9900) = 2.667 us/op
p(99.9990) = 2.667 us/op
p(99.9999) = 2.667 us/op
p(100.0000) = 2.667 us/op
# ... (output omitted) ...
Benchmark Mode Cnt Score Error Units
JmhExample01.arrayListAdd ss 10 1.421 ± 0.986 us/op
JmhExample01.linkedListAdd ss 10 1.817 ± 0.904 us/op
A benchmark method can be run in several modes at once.
@BenchmarkMode({Mode.AverageTime, Mode.Throughput})
Benchmark Mode Cnt Score Error Units
JmhExample01.arrayListAdd thrpt 10 44.052 ± 6.503 ops/us
JmhExample01.linkedListAdd thrpt 10 62.376 ± 1.905 ops/us
JmhExample01.arrayListAdd avgt 10 0.027 ± 0.016 us/op
JmhExample01.linkedListAdd avgt 10 0.017 ± 0.002 us/op
You can even enable all modes:
@BenchmarkMode(Mode.All)
Benchmark Mode Cnt Score Error Units
JmhExample01.arrayListAdd thrpt 10 45.572 ± 2.751 ops/us
JmhExample01.linkedListAdd thrpt 10 62.004 ± 2.739 ops/us
JmhExample01.arrayListAdd avgt 10 0.022 ± 0.001 us/op
JmhExample01.linkedListAdd avgt 10 0.016 ± 0.001 us/op
JmhExample01.arrayListAdd sample 24399 0.032 ± 0.014 us/op
JmhExample01.arrayListAdd:arrayListAdd·p0.00 sample ≈ 0 us/op
JmhExample01.arrayListAdd:arrayListAdd·p0.50 sample 0.041 us/op
JmhExample01.arrayListAdd:arrayListAdd·p0.90 sample 0.042 us/op
JmhExample01.arrayListAdd:arrayListAdd·p0.95 sample 0.042 us/op
JmhExample01.arrayListAdd:arrayListAdd·p0.99 sample 0.125 us/op
JmhExample01.arrayListAdd:arrayListAdd·p0.999 sample 0.817 us/op
JmhExample01.arrayListAdd:arrayListAdd·p0.9999 sample 13.082 us/op
JmhExample01.arrayListAdd:arrayListAdd·p1.00 sample 99.328 us/op
JmhExample01.linkedListAdd sample 26468 0.023 ± 0.001 us/op
JmhExample01.linkedListAdd:linkedListAdd·p0.00 sample ≈ 0 us/op
JmhExample01.linkedListAdd:linkedListAdd·p0.50 sample 0.041 us/op
JmhExample01.linkedListAdd:linkedListAdd·p0.90 sample 0.042 us/op
JmhExample01.linkedListAdd:linkedListAdd·p0.95 sample 0.042 us/op
JmhExample01.linkedListAdd:linkedListAdd·p0.99 sample 0.042 us/op
JmhExample01.linkedListAdd:linkedListAdd·p0.999 sample 0.272 us/op
JmhExample01.linkedListAdd:linkedListAdd·p0.9999 sample 3.668 us/op
JmhExample01.linkedListAdd:linkedListAdd·p1.00 sample 6.288 us/op
JmhExample01.arrayListAdd ss 10 0.583 ± 0.165 us/op
JmhExample01.linkedListAdd ss 10 1.100 ± 1.036 us/op
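OutputTimeUnit sets the time unit used when the results are printed.
State marks a state object. A state object encapsulates the state the benchmark works on, and its scope defines to what extent it is shared among worker threads.
State objects are usually injected into @Benchmark methods as arguments; JMH takes care of instantiating and sharing them. State objects can also be injected into the @Setup and @TearDown methods of other state objects to get staged initialization. The dependency graph between states must be a directed acyclic graph.
State objects can be inherited: you can put @State on a superclass and use subclasses as states.
With Scope.Thread, every thread that runs the benchmark method holds its own instance of the state object. The instance may be passed in as an argument of the benchmark method, or it may be the class that hosts the benchmark method (the earlier examples put the annotation on the class itself). Scope.Thread is mainly used for classes that are not thread-safe.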
package com.myf.concurrent.wwj2.jmh;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import java.util.concurrent.TimeUnit;
/**
* @author myf
*/
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(1)
@Warmup(iterations = 5, time = 100, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 100, timeUnit = TimeUnit.MILLISECONDS)
// Run the benchmark method with 5 threads
@Threads(5)
public class JmhExample06 {
// With 5 worker threads, each thread holds its own Test instance
@State(Scope.Thread)
public static class Test{
public Test() {
System.out.println("create Test instance");
}
public void method(){
}
}
// JMH injects the state object Test as an argument
@Benchmark
public void test(Test test){
test.method();
}
public static void main(String[] args) throws RunnerException {
final Options options = new OptionsBuilder()
.include(JmhExample06.class.getSimpleName())
.build();
new Runner(options).run();
}
}
Running the program above prints "create Test instance" 5 times, i.e. every thread holds its own Test instance.
# ... (output omitted) ...
# Benchmark: com.myf.concurrent.wwj2.jmh.JmhExample06.test
# Run progress: 0.00% complete, ETA 00:00:01
# Fork: 1 of 1
# Warmup Iteration 1: create Test instance
create Test instance
create Test instance
create Test instance
create Test instance
599.513 ±(99.9%) 1401.639 ns/op
# Warmup Iteration 2: 7.567 ±(99.9%) 4.938 ns/op
# Warmup Iteration 3: 3.152 ±(99.9%) 2.982 ns/op
# Warmup Iteration 4: 1.401 ±(99.9%) 0.911 ns/op
# Warmup Iteration 5: 1.688 ±(99.9%) 1.052 ns/op
# ... (output omitted) ...
To measure how a class performs when different threads operate on the same instance, JMH provides Scope.Benchmark, a state that is shared across threads.
package com.myf.concurrent.wwj2.jmh;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import java.util.concurrent.TimeUnit;
/**
* @author myf
*/
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(1)
@Warmup(iterations = 5, time = 100, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 100, timeUnit = TimeUnit.MILLISECONDS)
// Run the benchmark method with 5 threads
@Threads(5)
public class JmhExample07 {
// The Test instance is shared by all threads
@State(Scope.Benchmark)
public static class Test{
public Test() {
System.out.println("create Test instance");
}
public void method(){
}
}
// JMH injects the state object Test as an argument
@Benchmark
public void test(Test test){
test.method();
}
public static void main(String[] args) throws RunnerException {
final Options options = new OptionsBuilder()
.include(JmhExample07.class.getSimpleName())
.build();
new Runner(options).run();
}
}
Running the program above prints "create Test instance" only once, i.e. the Test instance is shared by all threads.
# ... (output omitted) ...
# Benchmark: com.myf.concurrent.wwj2.jmh.JmhExample07.test
# Run progress: 0.00% complete, ETA 00:00:01
# Fork: 1 of 1
# Warmup Iteration 1: create Test instance
6.185 ±(99.9%) 11.608 ns/op
# Warmup Iteration 2: 0.833 ±(99.9%) 0.245 ns/op
# Warmup Iteration 3: 1.544 ±(99.9%) 0.418 ns/op
# Warmup Iteration 4: 1.573 ±(99.9%) 0.917 ns/op
# ... (output omitted) ...
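The difference between the two scopes (five instances versus one) can be mimicked in plain Java with a ThreadLocal for the per-thread case and a single shared reference for the shared case. This is only an analogy for the instance counts seen in the two outputs above, not a description of how JMH manages state objects. ScopeAnalogy and countInstances are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class ScopeAnalogy {
    /**
     * Starts the given number of worker threads, lets each one obtain its state
     * object once, and returns how many state objects were created in total.
     */
    static int countInstances(int threads, boolean perThread) {
        AtomicInteger created = new AtomicInteger();
        Supplier<Object> factory = () -> {
            created.incrementAndGet();
            return new Object();
        };
        // Shared case: one instance created up front and handed to every worker.
        Object shared = perThread ? null : factory.get();
        // Per-thread case: each worker lazily creates its own instance.
        ThreadLocal<Object> local = ThreadLocal.withInitial(factory);

        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                Object state = perThread ? local.get() : shared;
                state.hashCode(); // stand-in for a benchmark body using its state
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return created.get();
    }

    public static void main(String[] args) {
        System.out.println("per-thread state, 5 threads: " + countInstances(5, true));  // 5
        System.out.println("shared state, 5 threads:     " + countInstances(5, false)); // 1
    }
}
```

The per-thread variant needs no synchronization inside the state object, while the shared variant is the one that exposes contention, which is exactly what you want to measure for thread-safe classes.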
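In the earlier examples, JMH sorts the benchmark methods by name in lexicographic order and runs them one after another, so two methods never run at the same time.
Suppose we want to test the following: first, a single instance under multiple threads; second, more than one benchmark method running in parallel.
This is where Scope.Group comes in.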
package com.myf.concurrent.wwj2.jmh;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import java.util.concurrent.TimeUnit;
/**
* @author myf
*/
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(1)
@Warmup(iterations = 1, time = 100, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 5, time = 100, timeUnit = TimeUnit.MILLISECONDS)
public class JmhExample08 {
// The Test instance is shared within a thread group
@State(Scope.Group)
public static class Test{
public Test() {
System.out.println("create Test instance");
}
public void write(){
System.out.println("write");
}
public void read(){
System.out.println("read");
}
}
// In thread group "test", three threads keep calling read() on the Test instance
@GroupThreads(3)
@Group("test")
@Benchmark
public void testRead(Test test){
test.read();
}
// In thread group "test", three threads keep calling write() on the Test instance
@GroupThreads(3)
@Group("test")
@Benchmark
public void testWrite(Test test){
test.write();
}
public static void main(String[] args) throws RunnerException {
final Options options = new OptionsBuilder()
.include(JmhExample08.class.getSimpleName())
.build();
new Runner(options).run();
}
}
Running the code above, we see read and write printed in an interleaved fashion.
# ... (output omitted) ...
# Timeout: 10 min per iteration
# 6 threads in total run the benchmark methods, and all 6 belong to the same group
# Threads: 6 threads (1 group; 3x "testRead", 3x "testWrite" in each group), will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.myf.concurrent.wwj2.jmh.JmhExample08.test
# Run progress: 0.00% complete, ETA 00:00:00
# Fork: 1 of 1
# Warmup Iteration 1: create Test instance
# ... (output omitted) ...
# read and write are printed interleaved
read
read
read
read
write
write
write
read
read
read
read
# ... (output omitted) ...
Benchmark Mode Cnt Score Error Units
JmhExample08.test avgt 5 84965.982 ± 71625.520 ns/op
JmhExample08.test:testRead avgt 5 92158.539 ± 147051.981 ns/op
JmhExample08.test:testWrite avgt 5 77773.425 ± 111967.836 ns/op
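The group idea, several readers and several writers hitting one shared object at the same time, can be imitated with plain threads. The sketch below is an analogy only: GroupAnalogy and run are hypothetical names, the shared state is a simple AtomicLong, and nothing is timed. Note also that the scores in the table above are most likely dominated by System.out.println, so they say little about read() and write() themselves.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

final class GroupAnalogy {
    /**
     * Runs reader and writer threads concurrently against one shared counter.
     * Returns {number of reads performed, final counter value}.
     */
    static long[] run(int readers, int writers, int opsPerThread) {
        AtomicLong shared = new AtomicLong();   // plays the role of the group state
        AtomicLong reads = new AtomicLong();
        CountDownLatch start = new CountDownLatch(1);

        Thread[] threads = new Thread[readers + writers];
        for (int i = 0; i < threads.length; i++) {
            final boolean writer = i >= readers;
            threads[i] = new Thread(() -> {
                try {
                    start.await(); // release all threads together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int n = 0; n < opsPerThread; n++) {
                    if (writer) {
                        shared.incrementAndGet();
                    } else {
                        shared.get();
                        reads.incrementAndGet();
                    }
                }
            });
            threads[i].start();
        }
        start.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new long[] {reads.get(), shared.get()};
    }

    public static void main(String[] args) {
        long[] result = run(3, 3, 1_000);
        System.out.println("reads = " + result[0] + ", writes = " + result[1]);
    }
}
```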
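Suppose you need a Map while writing code. First, it has to be thread-safe; second, it should perform well, for example with the fastest put or the fastest get. As Java programmers we have quite a few options in the JDK, such as ConcurrentHashMap, Hashtable, ConcurrentSkipListMap, and SynchronizedMap. All of them keep the data consistent under multithreaded access, but how does each one perform? To find out, we need to microbenchmark them. (Our test is rather one-sided: it only exercises put under multiple threads and does not cover reads or removals.)
With what we have learned so far, a benchmark for this comparison is easy to write.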
package com.myf.concurrent.wwj2.jmh;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
@BenchmarkMode(Mode.AverageTime)
@Fork(1)
@Warmup(iterations = 5, time = 100, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 100, timeUnit = TimeUnit.MILLISECONDS)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
// 5 threads operate on the shared resource at the same time
@Threads(5)
// The state object JmhExample09 is shared among the threads
@State(Scope.Benchmark)
public class JmhExample09 {
private Map<Long, Long> concurrentMap;
private Map<Long, Long> synchnoizedMap;
@Setup
public void setUp(){
concurrentMap = new ConcurrentHashMap<>();
synchnoizedMap = Collections.synchronizedMap(new HashMap<>());
}
@Benchmark
public void testConcurrentMap(){
this.concurrentMap.put(System.nanoTime(), System.nanoTime());
}
@Benchmark
public void testSynchnoizedMap(){
this.synchnoizedMap.put(System.nanoTime(), System.nanoTime());
}
public static void main(String[] args) throws RunnerException {
final Options options = new OptionsBuilder()
.include(JmhExample09.class.getSimpleName())
.build();
new Runner(options).run();
}
}
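Here the benchmark class itself is the state object, shared across threads with Scope.Benchmark, and five threads run the test. The output is as follows: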
Benchmark Mode Cnt Score Error Units
JmhExample09.testConcurrentMap avgt 10 1630.001 ± 1260.341 ns/op
JmhExample09.testSynchnoizedMap avgt 10 13360.557 ± 38052.348 ns/op
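The results suggest that ConcurrentHashMap writes faster than synchronizedMap, although the error margins in this short run are large.
ConcurrentHashMap and SynchronizedMap are not the only thread-safe Map implementations Java offers; ConcurrentSkipListMap and Hashtable are available too. To test them as well, we would have to add two more Map fields and two more benchmark methods. That approach obviously leads to a lot of duplicated code, so JMH provides the @Param annotation, which makes a field configurable: the field takes a different value for each benchmark run.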
package com.myf.concurrent.wwj2.jmh;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import java.util.Collections;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.TimeUnit;
@BenchmarkMode(Mode.AverageTime)
@Fork(1)
@Warmup(iterations = 5, time = 100, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 100, timeUnit = TimeUnit.MILLISECONDS)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
// 5 threads operate on the shared resource at the same time
@Threads(5)
// The state object JmhExample10 is shared among the threads
@State(Scope.Benchmark)
public class JmhExample10 {
@Param({"1", "2", "3", "4"})
private int type;
private Map<Long, Long> map;
@Setup
public void setUp() {
switch (type) {
case 1 -> this.map = new ConcurrentHashMap<>();
case 2 -> this.map = new ConcurrentSkipListMap<>();
case 3 -> this.map = new Hashtable<>();
case 4 -> this.map = Collections.synchronizedMap(new HashMap<>());
default -> throw new IllegalArgumentException("Illegal map type.");
}
}
@Benchmark
public void testConcurrentMap() {
this.map.put(System.nanoTime(), System.nanoTime());
}
public static void main(String[] args) throws RunnerException {
final Options options = new OptionsBuilder()
.include(JmhExample10.class.getSimpleName())
.build();
new Runner(options).run();
}
}
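With @Param making the field configurable, we only need to write one benchmark method. JMH runs and reports testConcurrentMap once for each value supplied by @Param, so we no longer need a separate benchmark method for every map.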
Benchmark (type) Mode Cnt Score Error Units
JmhExample10.testConcurrentMap 1 avgt 10 1905.870 ± 1855.904 ns/op
JmhExample10.testConcurrentMap 2 avgt 10 918.250 ± 267.272 ns/op
JmhExample10.testConcurrentMap 3 avgt 10 2333.470 ± 689.124 ns/op
JmhExample10.testConcurrentMap 4 avgt 10 4950.044 ± 3697.307 ns/op
Running the benchmark above, we find an extra (type) column in the output.
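Because the (type) column only shows 1 to 4, it helps to map each value back to the Map implementation it selects. The plain-Java sketch below mirrors the switch in setUp and prints the concrete class behind every type value. MapTypes, create, and describe are names invented for this illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

final class MapTypes {
    /** Same mapping as the switch in JmhExample10.setUp(). */
    static Map<Long, Long> create(int type) {
        switch (type) {
            case 1:
                return new ConcurrentHashMap<>();
            case 2:
                return new ConcurrentSkipListMap<>();
            case 3:
                return new Hashtable<>();
            case 4:
                return Collections.synchronizedMap(new HashMap<>());
            default:
                throw new IllegalArgumentException("Illegal map type: " + type);
        }
    }

    /** Simple class name of the implementation selected by the given type value. */
    static String describe(int type) {
        return create(type).getClass().getSimpleName();
    }

    public static void main(String[] args) {
        // One line per parameter value, in the order they appear in the report.
        for (int type = 1; type <= 4; type++) {
            System.out.println("type " + type + " -> " + describe(type));
        }
    }
}
```

Read this way, the table above says that in this run ConcurrentSkipListMap (type 2) had the lowest average put time and the synchronized HashMap (type 4) the highest. Using descriptive String values for the parameter instead of integers would make the report self-explanatory.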
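JMH provides two annotations, @Setup and @TearDown, for fixture methods.
@Setup methods are called before a benchmark method runs and are typically used to initialize resources.
@TearDown methods are called after a benchmark method has run and are typically used to release and clean up resources.
By default, @Setup and @TearDown run once before and once after all iterations of a benchmark method.
We only need to make the following changes to JmhExample10 and observe the difference.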
@Setup
public void setUp() {
System.out.println("setUp执行了" + Thread.currentThread().getName());
switch (type) {
case 1 -> this.map = new ConcurrentHashMap<>();
case 2 -> this.map = new ConcurrentSkipListMap<>();
case 3 -> this.map = new Hashtable<>();
case 4 -> this.map = Collections.synchronizedMap(new HashMap<>());
default -> throw new IllegalArgumentException("Illegal map type.");
}
}
@TearDown
public void tearDown() {
System.out.println("tearDown执行了" + Thread.currentThread().getName());
}
The output of each benchmark method now changes as follows:
# Run progress: 0.00% complete, ETA 00:00:06
# Fork: 1 of 1
# Warmup Iteration 1: setUp执行了com.myf.concurrent.wwj2.jmh.JmhExample10.testConcurrentMap-jmh-worker-1
1825.038 ±(99.9%) 2807.819 ns/op
# Warmup Iteration 2: 1423.850 ±(99.9%) 1637.913 ns/op
# Warmup Iteration 3: 1267.709 ±(99.9%) 1272.636 ns/op
# Warmup Iteration 4: 1475.872 ±(99.9%) 2460.539 ns/op
# Warmup Iteration 5: 896.906 ±(99.9%) 737.925 ns/op
Iteration 1: 1433.655 ±(99.9%) 1074.316 ns/op
Iteration 2: 2235.900 ±(99.9%) 3502.619 ns/op
Iteration 3: 2860.355 ±(99.9%) 4995.927 ns/op
Iteration 4: 964.292 ±(99.9%) 1568.305 ns/op
Iteration 5: 3285.678 ±(99.9%) 17619.895 ns/op
Iteration 6: 8562.194 ±(99.9%) 66921.642 ns/op
Iteration 7: 1234.296 ±(99.9%) 1633.239 ns/op
Iteration 8: 5399.636 ±(99.9%) 16028.597 ns/op
Iteration 9: 4187.621 ±(99.9%) 7297.895 ns/op
Iteration 10: tearDown执行了com.myf.concurrent.wwj2.jmh.JmhExample10.testConcurrentMap-jmh-worker-4
3684.771 ±(99.9%) 6960.884 ns/op
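If a fixture method needs to run before and after every iteration, or before and after every call of the benchmark method, @Setup and @TearDown need a little configuration.
Trial: the default for @Setup and @TearDown. The fixture method runs before and after all iterations of each benchmark method.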
@Setup(Level.Trial)
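Iteration: since warmup and measurement are configurable, every benchmark method runs for a number of iterations. To call the fixture method before and after each iteration, set the Level to Iteration.
Note that the 10 iterations configured for warmup and measurement do not mean 10 method calls.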
@Setup(Level.Iteration)
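Invocation: setting the Level to Invocation means that within every iteration, the fixture method runs before and after each single call of the benchmark method.
Again, the 10 iterations configured for warmup and measurement do not mean 10 method calls.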
@Setup(Level.Invocation)
Output:
setUp执行了com.myf.concurrent.wwj2.jmh.JmhExample10.testConcurrentMap-jmh-worker-5
setUp执行了com.myf.concurrent.wwj2.jmh.JmhExample10.testConcurrentMap-jmh-worker-5
setUp执行了com.myf.concurrent.wwj2.jmh.JmhExample10.testConcurrentMap-jmh-worker-5
setUp执行了com.myf.concurrent.wwj2.jmh.JmhExample10.testConcurrentMap-jmh-worker-5
setUp执行了com.myf.concurrent.wwj2.jmh.JmhExample10.testConcurrentMap-jmh-worker-5
setUp执行了com.myf.concurrent.wwj2.jmh.JmhExample10.testConcurrentMap-jmh-worker-5
setUp执行了com.myf.concurrent.wwj2.jmh.JmhExample10.testConcurrentMap-jmh-worker-5
setUp执行了com.myf.concurrent.wwj2.jmh.JmhExample10.testConcurrentMap-jmh-worker-5
tearDown执行了com.myf.concurrent.wwj2.jmh.JmhExample10.testConcurrentMap-jmh-worker-2
3740.522 ±(99.9%) 2590.521 ns/op
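Note that the fixture methods consume CPU time too, but JMH does not count that time in the benchmark method's statistics, which further shows how rigorous JMH is. That said, the JMH Javadoc for Level.Invocation warns that the per-call bookkeeping can distort benchmarks whose method body is very short, so use that level with care.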
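The three levels differ only in where the fixture call sits in the nested loop of trial, iteration, and invocation. The following sketch counts how often a setup hook would fire at each level. It is a simplified model with hypothetical names (LevelAnalogy, setupCalls), it ignores threads and forks, and it assumes a fixed number of invocations per iteration, whereas real JMH iterations are time-boxed.

```java
final class LevelAnalogy {
    enum Level { TRIAL, ITERATION, INVOCATION }

    /** Returns how many times a setup hook at the given level fires for one benchmark method. */
    static int setupCalls(Level level, int iterations, int invocationsPerIteration) {
        int calls = 0;
        if (level == Level.TRIAL) {
            calls++; // once, before all iterations
        }
        for (int i = 0; i < iterations; i++) {
            if (level == Level.ITERATION) {
                calls++; // once per warmup or measurement iteration
            }
            for (int n = 0; n < invocationsPerIteration; n++) {
                if (level == Level.INVOCATION) {
                    calls++; // before every single benchmark call
                }
                // the benchmark method body would run here
            }
        }
        return calls;
    }

    public static void main(String[] args) {
        int iterations = 15;            // e.g. 5 warmup + 10 measurement iterations
        int invocations = 1_000;        // assumed calls per iteration, for illustration
        for (Level level : Level.values()) {
            System.out.println(level + " -> " + setupCalls(level, iterations, invocations));
        }
    }
}
```

With these inputs the hook fires 1, 15, and 15000 times respectively, which matches the pattern in the outputs above: a single setUp line for the default level and a flood of setUp lines for Level.Invocation.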