Foreword

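In day-to-day work we are bound to run into performance tuning and out-of-memory problems. This article uses a few small examples to give a rough introduction to how to locate performance problems and how to use the relevant tools.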

Categories of Performance Problems

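The server-side performance problems we run into most often fall into the following categories:

1. Interface latency is too high and TPS falls short of the target
2. Memory overflow (out of memory)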

About the Examples

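The examples in this article are two HTTP interfaces built quickly with Spring Boot. One sorts a list to simulate a time-consuming operation; the other keeps inserting data into a global list until memory overflows.
The list-sorting example uses two sorting methods: a simple bubble sort, and the JDK's own list sort, which is an enhanced merge sort. Two sorting algorithms are used mainly to demonstrate how JMH is used.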
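
To make the sorting example concrete, here is a minimal sketch of what the two sorts might look like. Only the names bubbleSort and process1 appear in this article; the class name SortDemo, the helper randomList, and the list size are assumptions made for illustration, not the original service code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for the demo service described in the article.
final class SortDemo {

    // O(n^2) bubble sort on a copy of the input, so the caller's list is untouched.
    static List<Integer> bubbleSort(List<Integer> input) {
        List<Integer> data = new ArrayList<>(input);
        for (int end = data.size() - 1; end > 0; end--) {
            boolean swapped = false;
            for (int i = 0; i < end; i++) {
                if (data.get(i) > data.get(i + 1)) {
                    Integer tmp = data.get(i);
                    data.set(i, data.get(i + 1));
                    data.set(i + 1, tmp);
                    swapped = true;
                }
            }
            if (!swapped) {
                break; // no swaps in this pass means the list is already sorted
            }
        }
        return data;
    }

    // JDK sort: List.sort(null) sorts by natural ordering.
    static List<Integer> jdkSort(List<Integer> input) {
        List<Integer> data = new ArrayList<>(input);
        data.sort(null);
        return data;
    }

    // Assumed helper: a reproducible list of random integers.
    static List<Integer> randomList(int size, long seed) {
        Random random = new Random(seed);
        List<Integer> data = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            data.add(random.nextInt());
        }
        return data;
    }

    // Mirrors the shape of process1: both sorts run on the same data.
    static void process1(int size) {
        List<Integer> data = randomList(size, 42L);
        bubbleSort(data);
        jdkSort(data);
    }

    public static void main(String[] args) {
        process1(5_000);
        System.out.println(bubbleSort(Arrays.asList(3, 1, 2)));
    }
}
```

Both methods sort a copy, so they can be called repeatedly on the same input when comparing their cost.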

Analyzing Substandard TPS

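Problems of this kind are usually discovered during the performance-testing phase, and tuning is then generally done in the performance-testing environment.
So how do we find the time-consuming operations? The JDK already provides a set of tools for locating this kind of problem; here we use Java VisualVM to diagnose interface performance.
First, open the JVM's JMX port in the startup script with -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Djava.rmi.server.hostname=10.234.196.199, together with -Dcom.sun.management.jmxremote.port=<port> to choose the port that VisualVM connects to. Disabling SSL and authentication like this is only appropriate in a test environment.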
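
VisualVM is the tool used in this article, but the same JMX endpoint can also be reached from code, which is a handy way to check that the flags took effect. The sketch below is only an illustration: the class name JmxProbe and port 9010 are assumptions, and the host and port must match whatever the target JVM was actually started with.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

final class JmxProbe {

    // RMI-based JMX service URL for a JVM started with a JMX remote port.
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    // Connects, reads heap usage through the remote MemoryMXBean, then disconnects.
    static long remoteHeapUsedBytes(String host, int port) throws Exception {
        JMXServiceURL url = new JMXServiceURL(serviceUrl(host, port));
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            MemoryMXBean memory = ManagementFactory.newPlatformMXBeanProxy(
                    connection, ManagementFactory.MEMORY_MXBEAN_NAME, MemoryMXBean.class);
            return memory.getHeapMemoryUsage().getUsed();
        }
    }

    public static void main(String[] args) throws Exception {
        // Usage: java JmxProbe <host> <port>
        if (args.length == 2) {
            System.out.println(remoteHeapUsedBytes(args[0], Integer.parseInt(args[1])));
        } else {
            // 9010 is an assumed port, used here only to show the URL format.
            System.out.println(serviceUrl("localhost", 9010));
        }
    }
}
```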
Once the service is started, we can monitor the JVM with Java VisualVM, as shown below:

性能监控1.png

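Open the Sampler and run CPU sampling to measure how much CPU time each interface consumes.
Use a load-testing tool to keep the problematic interface under sustained load; jmeter is used here. After the load has run for a while, take a CPU snapshot, as shown below: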

性能监控2.png

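Here we see that calling the getPerf interface goes on to call process1, which contains two calls: the bubbleSort method and the JDK's built-in sort. Both sort a list, and most of the time is spent in the bubble sort, so bubbleSort is what needs optimizing. There are many sorting algorithms, and their cost differs with the data volume and the algorithm chosen, so JMH is needed to evaluate how long each algorithm takes.
For an introduction to JMH, refer to JMH; related examples are also available for reference.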
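
JMH is an external library, so the sketch below is deliberately not JMH. It is a rough hand-rolled stand-in that uses only the JDK, meant to show the two ideas a benchmark harness is built on: warm up first, then average over many measured calls. The class name RoughTimer and the iteration counts are assumptions; for real measurements JMH is the better choice, since it also handles forking and guards against JIT optimizations that distort results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.Consumer;

final class RoughTimer {

    // Average nanoseconds per call of `sorter` on a fresh copy of `template`,
    // after `warmup` untimed calls that give the JIT a chance to compile it.
    static long averageNanos(Consumer<List<Integer>> sorter, List<Integer> template,
                             int warmup, int measured) {
        for (int i = 0; i < warmup; i++) {
            sorter.accept(new ArrayList<>(template));
        }
        long total = 0;
        for (int i = 0; i < measured; i++) {
            List<Integer> copy = new ArrayList<>(template); // copying is not timed
            long start = System.nanoTime();
            sorter.accept(copy);
            total += System.nanoTime() - start;
        }
        return total / measured;
    }

    public static void main(String[] args) {
        Random random = new Random(42L);
        for (int size : new int[] {100, 1_000, 10_000}) {
            List<Integer> template = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                template.add(random.nextInt());
            }
            long nanos = averageNanos(list -> list.sort(null), template, 20, 20);
            System.out.println("size=" + size + " jdk sort avg ns=" + nanos);
        }
    }
}
```

Any in-place sort, including a bubble sort, can be passed in as the `sorter` argument, and running it at several sizes shows how each algorithm's cost grows with the data volume.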

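Time-consuming operations can generally be optimized in the following ways:
1. Optimize the algorithm itself to lower its time complexity.
2. Make synchronous operations asynchronous. Asynchronous processing can in turn be done in the following ways:
   a. An asynchronous thread
   b. A thread pool with thread reuse (how to size the pool depends on whether the work is CPU-bound or IO-bound)
   c. Publish/subscribe (a message queue or Spring's event mechanism)
3. Use caching (multi-level caches; the issues are cache consistency, protection against concurrent cache loads, and avoiding cache avalanches, which is a big topic in its own right).
4. Optimize the business flow: provide a dedicated interface that serves only the current business case, without regard for reusability.
5. If database queries are slow, optimize the database (another big topic): optimize the SQL and the tables, and if there are join queries, consider relaxing third normal form and flattening the table structure.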
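
The thread-pool option raises the question of how large the pool should be. The sketch below uses a commonly cited heuristic, threads = cores * (1 + wait time / compute time), which is not taken from this article and should be treated as a starting point to validate under load. The class name AsyncSketch, the method slowLookup, and the 90/10 millisecond split are all assumptions for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class AsyncSketch {

    // Heuristic: pure CPU work (wait = 0) gets one thread per core,
    // IO-heavy work gets more because threads spend most of their time waiting.
    static int poolSize(int cores, double waitMillis, double computeMillis) {
        if (cores < 1 || waitMillis < 0 || computeMillis <= 0) {
            throw new IllegalArgumentException("cores >= 1, wait >= 0, compute > 0 required");
        }
        return (int) Math.max(1, Math.round(cores * (1 + waitMillis / computeMillis)));
    }

    // Fixed-size pool with a bounded queue; when the queue is full the
    // submitting thread runs the task itself, which slows producers down.
    static ExecutorService newPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // Hypothetical blocking call standing in for a slow downstream request.
    static String slowLookup() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "done";
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Assumed profile: 90 ms waiting on IO for every 10 ms of computation.
        ExecutorService pool = newPool(poolSize(cores, 90, 10), 1_000);
        try {
            CompletableFuture<String> result =
                    CompletableFuture.supplyAsync(AsyncSketch::slowLookup, pool);
            System.out.println(result.join());
        } finally {
            pool.shutdown();
        }
    }
}
```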

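If the problem cannot be reproduced in the test environment, you can use the arthas tool: attach it to the relevant process and use arthas commands to get a rough view of how long each request takes. For arthas usage, refer to arthas.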

Analyzing Memory Overflow

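Why does running out of memory cause high interface latency?
Our server side is generally written in Java. When the JVM runs short of memory it triggers a Full GC, and a Full GC consumes a lot of CPU time. If memory stays short, GC runs frequently, each collection stops the world (STW), CPU usage stays high, less CPU time is left for the business logic, and interface latency rises.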
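
One way to see how much time GC is taking is to read the JVM's own collector counters. The sketch below is a rough in-process check built on the standard java.lang.management API; the class name GcOverhead is an assumption, and the reported time is only an approximation of pause time, because some collectors do part of their work concurrently with the application.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcOverhead {

    // Total accumulated collection time across all collectors, in milliseconds.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 when the collector does not report it
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    // Approximate share of JVM uptime spent collecting garbage.
    static double gcTimeFraction() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptime <= 0 ? 0.0 : (double) totalGcMillis() / uptime;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + " count=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
        System.out.println("approx GC share of uptime: " + gcTimeFraction());
    }
}
```

A share that keeps climbing while throughput falls matches the situation described above, where GC takes CPU time away from the business logic.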
The example code for the memory overflow is as follows:

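    // `list` is assumed to be a class-level (global) List<String> field declared elsewhere; it is never cleared.
    public void process2() {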
        String name = "The Spring Framework provides a comprehensive programming and configuration model for" +
                "modern Java-based enterprise applications - on any kind of deployment platform" +
                "A key element of Spring is infrastructural support at the application level: Spring focuses on the" +
                "Complete set of java.time based setters on HttpHeaders, CacheControl, CorsConfiguration.\n" +
                "@RequestMapping has enhanced produces condition support such that if a media type is declared with a specific parameter, and the requested media types (e.g. from \"Accept\" header) also has that parameter, the parameter values must match. This can be used for example to differentiate methods producing ATOM feeds \"application/atom+xml;type=feed\" vs ATOM entries \"application/atom+xml;type=entry\".\n" +
                "CORS revision that adds Vary header for non CORS requests on CORS enabled endpoints and avoid considering same-origin requests with an Origin header as a CORS request.\n" +
                "Upgrade to Jackson 2.10\n" +
                "Spring Web MVC\n" +
                "New \"WebMvc.fn\" programming model, analogous to the existing \"WebFlux.fn\":\n" +
                "A functional alternative to annotated controllers built on the Servlet API.\n" +
                "WebMvc.fn Kotlin DSL.\n" +
                "Request mapping performance optimizations through caching of the lookup path per HandlerMapping, and pre-computing frequently used data in RequestCondition implementations.\n" +
                "Improved, compact logging of request mappings on startup.\n" +
                "Spring WebFlux\n" +
                "Refinements to WebClient API to make the retrieve() method useful for most common cases, specifically adding the ability to retrieve status and headers and addition to the body. The exchange() method is only for genuinely advanced cases, and when using it, applications can now rely on ClientResponse#createException to simplify selective handling of exceptions.\n" +
                "Support for Kotlin Coroutines.\n" +
                "Server and client now use Reactor checkpoints to insert information about the request URL being processed,sce or the handler used, that is then inserted into exceptions and logged below the exception stacktrace.\n" +
                "Request mapping performance optimizations through pre-computing frequently used data in RequestCondition implementations.\n" +
                "Header management performance optimizations by wrapping rather than copying server headers, and caching parsed representations of media types. Available from 5.1.1, see issue #21783 and commits under \"Issue Links\".\n" +
                "Improved, compact logging of request mappings on startup.\n" +
                "Add ServerWebExchangeContextFilter to expose the Reactor Context as an exchange attribute.\n" +
                "Add FreeMarker macros support.\n" +
                "MultipartBodyBuilder improvements to allow Publisher and Part as input along with option to specify the filename to use for a part.";
        // Every request appends one more large string, so the list only ever grows.
        list.add(name + System.currentTimeMillis());
    }

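Here a string is added to a global list, one per request.
-Xms200m -Xmx200m sets the JVM heap size to 200m.
Analyzing a memory overflow requires the GC log and a heap snapshot. The GC log can be analyzed at https://gceasy.io, which shows the Full GC and young GC activity. The GC analysis is shown below: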

gc分析.png

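The chart shows that the heap size does not drop noticeably after GC, which suggests the heap may be running short. Each button on the left of the chart corresponds to one analysis.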
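
The "heap after GC" figure that the chart is built from can also be sampled from inside the JVM. The following is a minimal sketch using the standard memory pool MXBeans; the class name PostGcHeap is an assumption, and not every pool supports this measurement, so pools that return no value are skipped.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

final class PostGcHeap {

    // Sum of heap pool usage as recorded right after each pool's most recent collection.
    static long usedAfterLastGcBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage afterGc = pool.getCollectionUsage(); // null when unsupported
            if (afterGc != null) {
                total += afterGc.getUsed();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory(); // reflects -Xmx
        System.out.println("heap used after last GC: " + usedAfterLastGcBytes() + " bytes");
        System.out.println("max heap: " + maxHeap + " bytes");
    }
}
```

If the post-GC figure stays close to the maximum heap across repeated samples, the collector is reclaiming very little, which is the pattern the chart shows.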

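Once we have established that memory is short, we need to see which objects in memory cannot be reclaimed. For that we use the jmap command to dump a heap snapshot:
jmap -dump:format=b,file=heapdump.hprof <pid>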
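
When running jmap against the process is not convenient, a HotSpot JVM can also write the same kind of .hprof file from inside the application. This is a sketch under that assumption (it relies on the HotSpot-specific com.sun.management API, so it may not work on other JVMs); the class name HeapDumper and the output file name are illustrative.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;
import java.nio.file.Paths;

final class HeapDumper {

    // Writes a heap snapshot to `target`. With liveOnly=true only reachable
    // objects are dumped. The target file must not exist yet.
    static Path dump(Path target, boolean liveOnly) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(target.toString(), liveOnly);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }

    public static void main(String[] args) {
        Path written = dump(Paths.get("heapdump.hprof"), true);
        System.out.println("heap dump written to " + written.toAbsolutePath());
    }
}
```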
Once you have the heap snapshot, you can analyze it with MAT.
Open the snapshot file in MAT, as shown below:

mat_preview.png

Here the largest block of memory is 81.7M. Click it in the pie chart to reach the following page:

list.png

The figure above shows a list in the class businessServiceImpl. The list holds 16081 elements of 5296 bytes each, about 82M in total.
Click value to see the actual contents of the list, as shown below:

list_value_detail.png

These turn out to be exactly the strings our code inserted.

High CPU Usage

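Use the top command to see which process has high CPU usage, then use top -H -p <pid> to see which thread is consuming the CPU. Print the Java process's thread stacks with jstack, convert the thread ID from top to hexadecimal, and match it against the nid field in the jstack output to find the corresponding Java thread. Combined with the Java code, this usually pinpoints the code that is burning CPU.
For the detailed steps, refer to the following article:
Who Stole Your Server's Performance (谁偷走了你的服务器性能)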
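
Two parts of this procedure can be sketched in code: the decimal-to-hex conversion needed to match a thread from top against jstack, and an in-process view of per-thread CPU time. The class name HotThreads is an assumption, and note that the IDs returned by ThreadMXBean are Java thread IDs, which are not the same as the OS thread IDs that top displays.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

final class HotThreads {

    // top -H shows the OS thread ID in decimal; jstack prints it as a hex "nid".
    static String toNid(long osThreadId) {
        return "0x" + Long.toHexString(osThreadId);
    }

    // Prints the accumulated CPU time of every live thread in this JVM.
    static void printCpuTimes() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isThreadCpuTimeSupported()) {
            System.out.println("thread CPU time is not supported on this JVM");
            return;
        }
        for (long id : threads.getAllThreadIds()) {
            long cpuNanos = threads.getThreadCpuTime(id); // -1 if the thread died or timing is disabled
            ThreadInfo info = threads.getThreadInfo(id);  // null if the thread is gone
            if (cpuNanos < 0 || info == null) {
                continue;
            }
            System.out.printf("%-40s cpu=%d ms%n", info.getThreadName(), cpuNanos / 1_000_000);
        }
    }

    public static void main(String[] args) {
        System.out.println(toNid(12345)); // example decimal ID as it would appear in top
        printCpuTimes();
    }
}
```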

Closing Notes

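This article has only used a few examples to show how the performance tools are used. It is just a starting point; later articles will go further into how to actually optimize performance.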

A Routine for Locating Performance Problems