The JMH Performance Testing Framework

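Introduction to JMH

  • JMH, the Java Microbenchmark Harness, is a tool and API built specifically for microbenchmarking code. It is a micro-benchmark framework developed within OpenJDK/Oracle by the engineers who developed the Java compiler. What is a micro benchmark? Put simply, it is a benchmark at the method level, with precision down to the microsecond level or finer. JMH is therefore most useful once you have already identified a hot method and want to optimize it further: it lets you quantify the effect of each optimization.

  • Typical use cases:

    • You want to know quantitatively how long a method takes to run, and how that time depends on the input size n
    • A method has two different implementations (for example, implementation A uses a FixedThreadPool and implementation B uses a ForkJoinPool) and you do not know which performs better
  • To learn how to use it, start with the official code samples, which are clear and easy to follow.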
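To make the second use case concrete, here is a minimal sketch of the kind of pair one might want to compare. The class name SumImplementations, the choice of "sum a long[]" as the workload, and the threshold of 1,000 elements are all assumptions made for illustration; they do not come from JMH or from the original example. The sketch only shows that both implementations compute the same result, which is the precondition for benchmarking them against each other.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;
import java.util.concurrent.RecursiveTask;

// Hypothetical example: two interchangeable implementations of "sum a long[]".
class SumImplementations {

    // Implementation A: split the array into equal chunks on a FixedThreadPool.
    static long sumWithFixedPool(long[] data, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (data.length + threads - 1) / threads;
            List<Future<Long>> parts = new ArrayList<>();
            for (int start = 0; start < data.length; start += chunk) {
                final int from = start;
                final int to = Math.min(data.length, start + chunk);
                Callable<Long> task = () -> sumRange(data, from, to);
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    // Implementation B: recursive divide-and-conquer on a ForkJoinPool.
    static long sumWithForkJoin(long[] data, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new SumTask(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }

    private static final class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 1_000; // arbitrary cut-off for this sketch
        private final long[] data;
        private final int from;
        private final int to;

        SumTask(long[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                return sumRange(data, from, to);
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(data, from, mid);
            left.fork();                                   // run the left half asynchronously
            long right = new SumTask(data, mid, to).compute();
            return left.join() + right;
        }
    }

    // Sequential sum of data[from, to).
    static long sumRange(long[] data, int from, int to) {
        long total = 0;
        for (int i = from; i < to; i++) {
            total += data[i];
        }
        return total;
    }

    public static void main(String[] args) {
        long[] data = new long[100_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sumWithFixedPool(data, 4));
        System.out.println(sumWithForkJoin(data, 4));
    }
}
```

Each of the two static methods could then be wrapped in its own benchmark method so that JMH reports their timings side by side.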

Using JMH

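  • Maven dependencies
    <properties>
        <jmh.version>1.21</jmh.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.openjdk.jmh</groupId>
            <artifactId>jmh-core</artifactId>
            <version>${jmh.version}</version>
        </dependency>
        <dependency>
            <groupId>org.openjdk.jmh</groupId>
            <artifactId>jmh-generator-annprocess</artifactId>
            <version>${jmh.version}</version>
            <scope>provided</scope>
        </dependency>
    </dependencies>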
  • A first benchmark example
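import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

import org.openjdk.jmh.annotations.*;

@BenchmarkMode({Mode.SampleTime})
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Warmup(iterations=3, time = 5, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations=1,batchSize = 100000000)
@Threads(2)
@Fork(1)
@State(Scope.Benchmark)
public class MyClass {
    Lock lock = new ReentrantLock();
    long i = 0;
    AtomicLong atomicLong = new AtomicLong(0);

    // Increment guarded by a ReentrantLock; unlock in finally so the lock is always released.
    @Benchmark
    public void measureLock() {
        lock.lock();
        try {
            i++;
        } finally {
            lock.unlock();
        }
    }

    // Increment via an atomic compare-and-swap.
    @Benchmark
    public void measureCAS() {
        atomicLong.incrementAndGet();
    }

    // Unsynchronized increment of shared state; with @Threads(2) this is a data race.
    @Benchmark
    public void measureNoLock() {
        i++;
    }
}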
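  • This example has three methods that measure the same increment done three ways: under a lock, with CAS (compare-and-swap), and with no synchronization at all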
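The three variants differ in correctness as well as speed, which is worth keeping in mind when reading the timings. The following is a plain-Java sketch, independent of JMH, that runs the three increments from two threads and returns the final counts. The class name CounterCorrectness and the method run are assumptions made for this illustration. The locked and atomic counters always reach the expected total; the unguarded counter may come up short because concurrent `i++` operations can overwrite each other.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo: counts with a lock, with CAS, and with no synchronization.
class CounterCorrectness {
    private final Lock lock = new ReentrantLock();
    private long locked = 0;
    private final AtomicLong atomic = new AtomicLong();
    private long unguarded = 0;

    // Runs all three increments from two threads; returns {locked, atomic, unguarded}.
    static long[] run(int perThread) {
        CounterCorrectness c = new CounterCorrectness();
        Runnable work = () -> {
            for (int n = 0; n < perThread; n++) {
                c.lock.lock();
                try {
                    c.locked++;               // safe: only one thread holds the lock
                } finally {
                    c.lock.unlock();
                }
                c.atomic.incrementAndGet();   // safe: atomic read-modify-write
                c.unguarded++;                // unsafe: updates can be lost
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { c.locked, c.atomic.get(), c.unguarded };
    }

    public static void main(String[] args) {
        long[] r = run(1_000_000);
        System.out.println("locked    = " + r[0]);
        System.out.println("atomic    = " + r[1]);
        System.out.println("unguarded = " + r[2] + " (may be less than 2000000)");
    }
}
```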

  • There are two ways to run a benchmark

    • The first is to build a jar with mvn install and run that jar from the command line
    • The second is to write a main method, as follows:
        Options options = new OptionsBuilder()
                .include(MyClass.class.getSimpleName())
                .output("D:/Benchmark.log")
                .build();
        new Runner(options).run();
    
  • The results are as follows:

# JMH version: 1.21
# VM version: JDK 1.8.0_144, Java HotSpot(TM) 64-Bit Server VM, 25.144-b01
# VM invoker: D:\sdk\java\jdk1.8\jre\bin\java.exe
# VM options: -ea -Didea.test.cyclic.buffer.size=1048576 -javaagent:C:\Program Files\JetBrains\IntelliJ IDEA 2019.1.3\lib\idea_rt.jar=57547:C:\Program Files\JetBrains\IntelliJ IDEA 2019.1.3\bin -Dfile.encoding=UTF-8
# Warmup: 3 iterations, 5 ms each
# Measurement: 1 iterations, 10 s each, 100000000 calls per op
# Timeout: 10 min per iteration
# Threads: 2 threads, will synchronize iterations
# Benchmark mode: Sampling time
# Benchmark: com.ctrip.car.testweb.javacode.MyClass.measureCAS

# Run progress: 0.00% complete, ETA 00:00:30
# Fork: 1 of 1
# Warmup Iteration   1: ≈ 10⁻⁴ ms/op
# Warmup Iteration   2: ≈ 10⁻⁴ ms/op
# Warmup Iteration   3: ≈ 10⁻⁴ ms/op
Iteration   1: 3270.359 ±(99.9%) 538.912 ms/op
                 measureCAS·p0.00:   2805.989 ms/op
                 measureCAS·p0.50:   3296.723 ms/op
                 measureCAS·p0.90:   3527.410 ms/op
                 measureCAS·p0.95:   3527.410 ms/op
                 measureCAS·p0.99:   3527.410 ms/op
                 measureCAS·p0.999:  3527.410 ms/op
                 measureCAS·p0.9999: 3527.410 ms/op
                 measureCAS·p1.00:   3527.410 ms/op



Result "com.ctrip.car.testweb.javacode.MyClass.measureCAS":
  N = 7
  mean =   3270.359 ±(99.9%) 538.912 ms/op

  Histogram, ms/op:
    [2800.000, 2850.000) = 1 
    [2850.000, 2900.000) = 0 
    [2900.000, 2950.000) = 0 
    [2950.000, 3000.000) = 0 
    [3000.000, 3050.000) = 0 
    [3050.000, 3100.000) = 0 
    [3100.000, 3150.000) = 0 
    [3150.000, 3200.000) = 0 
    [3200.000, 3250.000) = 2 
    [3250.000, 3300.000) = 1 
    [3300.000, 3350.000) = 1 
    [3350.000, 3400.000) = 0 
    [3400.000, 3450.000) = 0 
    [3450.000, 3500.000) = 0 
    [3500.000, 3550.000) = 2 

  Percentiles, ms/op:
      p(0.0000) =   2805.989 ms/op
     p(50.0000) =   3296.723 ms/op
     p(90.0000) =   3527.410 ms/op
     p(95.0000) =   3527.410 ms/op
     p(99.0000) =   3527.410 ms/op
     p(99.9000) =   3527.410 ms/op
     p(99.9900) =   3527.410 ms/op
     p(99.9990) =   3527.410 ms/op
     p(99.9999) =   3527.410 ms/op
    p(100.0000) =   3527.410 ms/op


# JMH version: 1.21
# VM version: JDK 1.8.0_144, Java HotSpot(TM) 64-Bit Server VM, 25.144-b01
# VM invoker: D:\sdk\java\jdk1.8\jre\bin\java.exe
# VM options: -ea -Didea.test.cyclic.buffer.size=1048576 -javaagent:C:\Program Files\JetBrains\IntelliJ IDEA 2019.1.3\lib\idea_rt.jar=57547:C:\Program Files\JetBrains\IntelliJ IDEA 2019.1.3\bin -Dfile.encoding=UTF-8
# Warmup: 3 iterations, 5 ms each
# Measurement: 1 iterations, 10 s each, 100000000 calls per op
# Timeout: 10 min per iteration
# Threads: 2 threads, will synchronize iterations
# Benchmark mode: Sampling time
# Benchmark: com.ctrip.car.testweb.javacode.MyClass.measureLock

# Run progress: 33.33% complete, ETA 00:00:28
# Fork: 1 of 1
# Warmup Iteration   1: 0.001 ±(99.9%) 0.001 ms/op
# Warmup Iteration   2: ≈ 10⁻³ ms/op
# Warmup Iteration   3: ≈ 10⁻³ ms/op
Iteration   1: 6339.690 ±(99.9%) 33223.544 ms/op
                 measureLock·p0.00:   3435.135 ms/op
                 measureLock·p0.50:   3940.549 ms/op
                 measureLock·p0.90:   14042.530 ms/op
                 measureLock·p0.95:   14042.530 ms/op
                 measureLock·p0.99:   14042.530 ms/op
                 measureLock·p0.999:  14042.530 ms/op
                 measureLock·p0.9999: 14042.530 ms/op
                 measureLock·p1.00:   14042.530 ms/op



Result "com.ctrip.car.testweb.javacode.MyClass.measureLock":
  N = 4
  mean =   6339.690 ±(99.9%) 33223.544 ms/op

  Histogram, ms/op:
    [    0.000,  1250.000) = 0 
    [ 1250.000,  2500.000) = 0 
    [ 2500.000,  3750.000) = 1 
    [ 3750.000,  5000.000) = 2 
    [ 5000.000,  6250.000) = 0 
    [ 6250.000,  7500.000) = 0 
    [ 7500.000,  8750.000) = 0 
    [ 8750.000, 10000.000) = 0 
    [10000.000, 11250.000) = 0 
    [11250.000, 12500.000) = 0 
    [12500.000, 13750.000) = 0 
    [13750.000, 15000.000) = 1 
    [15000.000, 16250.000) = 0 
    [16250.000, 17500.000) = 0 
    [17500.000, 18750.000) = 0 

  Percentiles, ms/op:
      p(0.0000) =   3435.135 ms/op
     p(50.0000) =   3940.549 ms/op
     p(90.0000) =  14042.530 ms/op
     p(95.0000) =  14042.530 ms/op
     p(99.0000) =  14042.530 ms/op
     p(99.9000) =  14042.530 ms/op
     p(99.9900) =  14042.530 ms/op
     p(99.9990) =  14042.530 ms/op
     p(99.9999) =  14042.530 ms/op
    p(100.0000) =  14042.530 ms/op


# JMH version: 1.21
# VM version: JDK 1.8.0_144, Java HotSpot(TM) 64-Bit Server VM, 25.144-b01
# VM invoker: D:\sdk\java\jdk1.8\jre\bin\java.exe
# VM options: -ea -Didea.test.cyclic.buffer.size=1048576 -javaagent:C:\Program Files\JetBrains\IntelliJ IDEA 2019.1.3\lib\idea_rt.jar=57547:C:\Program Files\JetBrains\IntelliJ IDEA 2019.1.3\bin -Dfile.encoding=UTF-8
# Warmup: 3 iterations, 5 ms each
# Measurement: 1 iterations, 10 s each, 100000000 calls per op
# Timeout: 10 min per iteration
# Threads: 2 threads, will synchronize iterations
# Benchmark mode: Sampling time
# Benchmark: com.ctrip.car.testweb.javacode.MyClass.measureNoLock

# Run progress: 66.67% complete, ETA 00:00:14
# Fork: 1 of 1
# Warmup Iteration   1: ≈ 10⁻⁴ ms/op
# Warmup Iteration   2: ≈ 10⁻⁴ ms/op
# Warmup Iteration   3: ≈ 10⁻⁴ ms/op
Iteration   1: 261.960 ±(99.9%) 4.585 ms/op
                 measureNoLock·p0.00:   183.239 ms/op
                 measureNoLock·p0.50:   262.144 ms/op
                 measureNoLock·p0.90:   272.944 ms/op
                 measureNoLock·p0.95:   278.449 ms/op
                 measureNoLock·p0.99:   281.543 ms/op
                 measureNoLock·p0.999:  281.543 ms/op
                 measureNoLock·p0.9999: 281.543 ms/op
                 measureNoLock·p1.00:   281.543 ms/op



Result "com.ctrip.car.testweb.javacode.MyClass.measureNoLock":
  N = 77
  mean =    261.960 ±(99.9%) 4.585 ms/op

  Histogram, ms/op:
    [180.000, 190.000) = 1 
    [190.000, 200.000) = 0 
    [200.000, 210.000) = 0 
    [210.000, 220.000) = 0 
    [220.000, 230.000) = 0 
    [230.000, 240.000) = 0 
    [240.000, 250.000) = 1 
    [250.000, 260.000) = 25 
    [260.000, 270.000) = 35 
    [270.000, 280.000) = 13 

  Percentiles, ms/op:
      p(0.0000) =    183.239 ms/op
     p(50.0000) =    262.144 ms/op
     p(90.0000) =    272.944 ms/op
     p(95.0000) =    278.449 ms/op
     p(99.0000) =    281.543 ms/op
     p(99.9000) =    281.543 ms/op
     p(99.9900) =    281.543 ms/op
     p(99.9990) =    281.543 ms/op
     p(99.9999) =    281.543 ms/op
    p(100.0000) =    281.543 ms/op


# Run complete. Total time: 00:00:39

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.

Benchmark                                      Mode  Cnt      Score       Error  Units
MyClass.measureCAS                           sample    7   3270.359 ±   538.912  ms/op
MyClass.measureCAS:measureCAS·p0.00          sample        2805.989              ms/op
MyClass.measureCAS:measureCAS·p0.50          sample        3296.723              ms/op
MyClass.measureCAS:measureCAS·p0.90          sample        3527.410              ms/op
MyClass.measureCAS:measureCAS·p0.95          sample        3527.410              ms/op
MyClass.measureCAS:measureCAS·p0.99          sample        3527.410              ms/op
MyClass.measureCAS:measureCAS·p0.999         sample        3527.410              ms/op
MyClass.measureCAS:measureCAS·p0.9999        sample        3527.410              ms/op
MyClass.measureCAS:measureCAS·p1.00          sample        3527.410              ms/op
MyClass.measureLock                          sample    4   6339.690 ± 33223.544  ms/op
MyClass.measureLock:measureLock·p0.00        sample        3435.135              ms/op
MyClass.measureLock:measureLock·p0.50        sample        3940.549              ms/op
MyClass.measureLock:measureLock·p0.90        sample       14042.530              ms/op
MyClass.measureLock:measureLock·p0.95        sample       14042.530              ms/op
MyClass.measureLock:measureLock·p0.99        sample       14042.530              ms/op
MyClass.measureLock:measureLock·p0.999       sample       14042.530              ms/op
MyClass.measureLock:measureLock·p0.9999      sample       14042.530              ms/op
MyClass.measureLock:measureLock·p1.00        sample       14042.530              ms/op
MyClass.measureNoLock                        sample   77    261.960 ±     4.585  ms/op
MyClass.measureNoLock:measureNoLock·p0.00    sample         183.239              ms/op
MyClass.measureNoLock:measureNoLock·p0.50    sample         262.144              ms/op
MyClass.measureNoLock:measureNoLock·p0.90    sample         272.944              ms/op
MyClass.measureNoLock:measureNoLock·p0.95    sample         278.449              ms/op
MyClass.measureNoLock:measureNoLock·p0.99    sample         281.543              ms/op
MyClass.measureNoLock:measureNoLock·p0.999   sample         281.543              ms/op
MyClass.measureNoLock:measureNoLock·p0.9999  sample         281.543              ms/op
MyClass.measureNoLock:measureNoLock·p1.00    sample         281.543              ms/op

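  • In summary, the slowest sample (p1.00) was 14042 ms with the lock, 3527 ms with CAS, and 281 ms with no lock. Each figure is for one operation, which here is a batch of 100000000 calls.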
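Because each "op" in the log is a batch of 100000000 calls, the scores are easier to compare when converted to a per-call cost. The short sketch below does that arithmetic for the three mean scores reported above; the class name PerCallCost and the method nanosPerCall are hypothetical helpers, not part of JMH. The results come out to roughly 32.7 ns per call for CAS, 63.4 ns for the lock, and 2.6 ns for the unsynchronized increment.

```java
// Hypothetical helper: converts a per-batch score (ms/op) into a per-call cost (ns).
class PerCallCost {

    // msPerOp is the JMH score; callsPerOp is the batchSize used for the run.
    static double nanosPerCall(double msPerOp, long callsPerOp) {
        return msPerOp * 1_000_000.0 / callsPerOp;
    }

    public static void main(String[] args) {
        long batch = 100_000_000L; // batchSize from @Measurement in the example
        System.out.println("CAS     ns/call: " + nanosPerCall(3270.359, batch));
        System.out.println("Lock    ns/call: " + nanosPerCall(6339.690, batch));
        System.out.println("No lock ns/call: " + nanosPerCall(261.960, batch));
    }
}
```

Note that the lock result rests on only 4 samples with a very wide error margin (± 33223.544 ms/op), so its per-call figure is a rough indication at best.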

Basic JMH Concepts

Mode

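  • Mode is the mode JMH uses when running a benchmark. Modes differ in what they measure or in how they measure it. JMH currently has four modes:

    • Throughput: overall throughput, for example "how many calls complete in 1 s"
    • AverageTime: average time per call, for example "each call takes XXX ms on average"
    • SampleTime: random sampling of call times, reported as a distribution, for example "99% of calls finish within XXX ms"
    • SingleShotTime: the modes above all run time-based iterations (1 s each by default in older JMH releases; the 1.21 log above shows 10 s), whereas SingleShotTime runs the method only once. It is often combined with a warmup count of 0 to measure cold-start performance.
  • Iteration

    • An iteration is the smallest unit of measurement in JMH. In most modes an iteration is a fixed span of time (10 s per iteration in the log above). During it JMH calls the benchmarked method repeatedly and, depending on the mode, samples call times, computes throughput, or computes the average execution time.
  • Warmup

    • Warmup means running the benchmark for a while before the real measurement starts. Why is it needed? Because of the JVM's JIT compiler: once a method has been called many times, the JVM tries to compile it to machine code to make it run faster. Warming up first brings the benchmark result closer to steady-state behavior.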
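The vocabulary above (mode, iteration, warmup) can be illustrated with a toy harness written in plain Java. This is only a sketch under stated assumptions: the class ToyHarness and its methods are hypothetical, it is not how JMH is implemented, and it has none of JMH's safeguards against JIT artifacts. It runs a few discarded warmup iterations, then timed measurement iterations, and derives a throughput and an average time from the same data. A nearest-rank percentile helper shows the kind of summary that sample-based output reports.

```java
import java.util.Arrays;

// Hypothetical toy harness; it illustrates the vocabulary and is no substitute for JMH.
class ToyHarness {
    static volatile long sink; // written by the task so the work is not optimized away

    // Runs task repeatedly for roughly `millis` ms and returns the number of calls made.
    static long runIteration(Runnable task, long millis) {
        long deadline = System.nanoTime() + millis * 1_000_000L;
        long calls = 0;
        do {
            task.run();
            calls++;
        } while (System.nanoTime() < deadline);
        return calls;
    }

    // Returns {throughput in ops/ms, average time in ns/op} over the measured iterations.
    static double[] measure(Runnable task, int warmups, int iterations, long millis) {
        for (int w = 0; w < warmups; w++) {
            runIteration(task, millis); // result discarded; gives the JIT time to compile
        }
        long totalCalls = 0;
        long start = System.nanoTime();
        for (int it = 0; it < iterations; it++) {
            totalCalls += runIteration(task, millis);
        }
        long elapsedNanos = System.nanoTime() - start;
        double throughput = totalCalls / (elapsedNanos / 1e6);
        double averageNanos = (double) elapsedNanos / totalCalls;
        return new double[] { throughput, averageNanos };
    }

    // Nearest-rank percentile (p in 0..100) over recorded per-call latencies.
    static long percentile(long[] samplesNanos, double p) {
        long[] sorted = samplesNanos.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    public static void main(String[] args) {
        double[] r = measure(() -> { sink = Long.hashCode(System.nanoTime()); }, 3, 5, 100);
        System.out.println("throughput (ops/ms): " + r[0]);
        System.out.println("average    (ns/op):  " + r[1]);
        System.out.println("median of samples:   " + percentile(new long[] {50, 10, 40, 20, 30}, 50.0));
    }
}
```

Throughput and average time are two views of the same counts and elapsed time, which is why JMH can report either from one kind of run, while a percentile needs individual samples.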

@BenchmarkMode

  • The benchmark mode. The example above uses SampleTime

@Warmup

  • Controls the number of warmup rounds and how long each one lasts

@Measurement

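  • The basic measurement parameters
    • iterations: the number of measurement rounds
    • time: the duration of each round
    • timeUnit: the unit of that duration
    • batchSize: the number of benchmark method calls that make up one operation (100000000 in the example above)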

@Threads

  • The number of test threads in each forked process; a common rule of thumb is twice the number of CPUs

@Fork

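  • The number of forks. With a fork count of 2, JMH runs the benchmark in two separate forked JVM processes (not threads), one after the other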

@OutputTimeUnit

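  • The time unit used to report benchmark results. It takes a java.util.concurrent.TimeUnit value, for example seconds, milliseconds, or microseconds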

@Benchmark

  • Method-level annotation marking the method as a benchmark target; it is used much like JUnit's @Test

@Param

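  • Field-level annotation. @Param specifies several values for one parameter, which makes it well suited to measuring how a method performs under different inputs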

@Setup

  • Method-level annotation for preparation work that must run before the measurement, such as initializing data

@TearDown

  • Method-level annotation for cleanup work after the measurement, such as shutting down a thread pool or closing database connections
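The lifecycle that @Param, @Setup, and @TearDown describe can be sketched in plain Java. This is an analogy only: the class LifecycleSketch, its SIZES array, and the events list are hypothetical and written for this illustration; in a real JMH benchmark the harness drives these steps, with the parameter declared as a field annotated with @Param and the two helper methods annotated with @Setup and @TearDown. The sketch runs setup, the measured method, and teardown once per parameter value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the lifecycle that @Param, @Setup and @TearDown describe.
class LifecycleSketch {
    static final int[] SIZES = {10, 100, 1_000};    // plays the role of the @Param values

    private List<Integer> input;                    // state prepared before measuring
    final List<String> events = new ArrayList<>();  // records the order of lifecycle calls

    // What a @Setup method would do: build the input for the current parameter.
    void setup(int size) {
        input = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            input.add(i);
        }
        events.add("setup(" + size + ")");
    }

    // The method under test, i.e. what would carry @Benchmark.
    long sum() {
        long total = 0;
        for (int v : input) {
            total += v;
        }
        return total;
    }

    // What a @TearDown method would do: release what setup created.
    void tearDown() {
        input = null;
        events.add("tearDown");
    }

    // Runs setup -> benchmark -> tearDown once per parameter value.
    static long[] runAll(LifecycleSketch s) {
        long[] results = new long[SIZES.length];
        for (int k = 0; k < SIZES.length; k++) {
            s.setup(SIZES[k]);
            try {
                results[k] = s.sum();
            } finally {
                s.tearDown();
            }
        }
        return results;
    }

    public static void main(String[] args) {
        LifecycleSketch s = new LifecycleSketch();
        long[] results = runAll(s);
        for (int k = 0; k < results.length; k++) {
            System.out.println("size " + SIZES[k] + " -> sum " + results[k]);
        }
        System.out.println(s.events);
    }
}
```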

@State

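  • When @Setup is used, the enclosing class must carry this annotation
  • @State declares that a class is a "state" and takes a Scope argument that defines how widely that state is shared. Many benchmarks need classes that hold state, and JMH lets you inject such classes into benchmark methods in a dependency-injection style. There are three main scopes.
    • Thread: each thread gets its own instance of the state
    • Group: the state is shared by all threads in the same group
    • Benchmark: the state is shared by all threads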
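The difference between a per-thread state and a state shared by all threads can be pictured with a plain-Java analogy. The class ScopeAnalogy below is hypothetical and does not use JMH's own mechanism: a local counter inside each worker stands in for Scope.Thread, and a single AtomicLong visible to both workers stands in for Scope.Benchmark. Scope.Group is not covered by this sketch.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical analogy for Scope.Thread versus Scope.Benchmark; not JMH's own mechanism.
class ScopeAnalogy {

    // Returns {count owned by thread 0, count owned by thread 1, shared count}.
    static long[] run(int perThread) {
        AtomicLong shared = new AtomicLong();      // one instance for all threads
        long[] perThreadTotals = new long[2];
        Thread[] workers = new Thread[2];
        for (int t = 0; t < workers.length; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                long own = 0;                      // one instance per thread
                for (int n = 0; n < perThread; n++) {
                    own++;                         // no contention: nobody else sees it
                    shared.incrementAndGet();      // contended: both threads update it
                }
                perThreadTotals[id] = own;
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { perThreadTotals[0], perThreadTotals[1], shared.get() };
    }

    public static void main(String[] args) {
        long[] r = run(1_000_000);
        System.out.println("thread 0 own count: " + r[0]);
        System.out.println("thread 1 own count: " + r[1]);
        System.out.println("shared count:       " + r[2]);
    }
}
```

The choice of scope therefore changes what a benchmark measures: shared state includes the cost of contention between threads, while per-thread state does not.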
