使用Arthas排查性能问题

使用Arthas排查性能问题_第1张图片      Arthas 是一款线上监控诊断产品,通过全局视角实时查看应用 load、内存、gc、线程的状态信息,并能在不修改应用代码的情况下,对业务问题进行诊断,包括查看方法调用的出入参、异常,监测方法执行耗时,类加载信息等,大大提升线上问题排查效率

1.问题背景

今天一到公司开发人员就因线上的性能问题寻求帮助,售后反馈系统升级后物理产品更换业务处理十分缓慢,有时卡主10几秒。

我先尝试从开发环境复现问题,发现开发环境处理并不慢。考虑到线上环境存在性能问题而开发环境没有这个问题,一般情况下是两边数据量差引起的。我们的开发环境中的数据比现场环境低几个数量级。可能某些SQL在线上执行比较慢引发了问题。我尝试打印慢SQL,发现日志中没有执行比较慢的SQL以及可疑日志。 这个问题变得十分棘手。

2.问题分析

在毫无头绪时,只能借助一些工具来协助分析。本问题主要用到了Arthas 的trace命令

trace :获取方法内部调用路径,并输出方法路径上的每个节点上耗时

参考: trace | arthas

排查过程

1.运行arthas ,并选择进程20408,20408是系统客户端使用的进程

D:\staibossCliente\arthas-bin>
D:\staibossCliente\arthas-bin>java -jar arthas-boot.jar
[INFO] arthas-boot version: 3.5.4
[INFO] Process 20408 already using port 3658
[INFO] Process 20408 already using port 8563
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 20408 D:\staibossCliente\staribossclient\stariboss-new-client.exe
1
[INFO] arthas home: D:\staibossCliente\arthas-bin
[INFO] The target process already listen port 3658, skip attach.
[INFO] arthas-client connect 127.0.0.1 3658
  ,---.  ,------. ,--------.,--.  ,--.  ,---.   ,---.
 /  O  \ |  .--. ''--.  .--'|  '--'  | /  O  \ '   .-'
|  .-.  ||  '--'.'   |  |   |  .--.  ||  .-.  |`.  `-.
|  | |  ||  |\  \    |  |   |  |  |  ||  | |  |.-'    |
`--' `--'`--' '--'   `--'   `--'  `--'`--' `--'`-----'
 
 
wiki       https://arthas.aliyun.com/doc
tutorials  https://arthas.aliyun.com/doc/arthas-tutorials.html
version    3.5.4
main_class
pid        20408
time       2023-11-20 17:38:39
 

2. 根据业务行为查看方法内部的处理时间

[arthas@20408]$ trace -E com.star.sms.richclient.accept.view.accept.acceptEditor2.ExchangePhysicalProductEditor initOptionalPhysicalProducts
Affect(class count: 1 , method count: 0) cost in 61 ms, listenerId: 10
No class or method is affected, try:
1. Execute `sm CLASS_NAME METHOD_NAME` to make sure the method you are tracing actually exists (it might be in your parent class).
2. Execute `options unsafe true`, if you want to enhance the classes under the `java.*` package.
3. Execute `reset CLASS_NAME` and try again, your method body might be too large.
4. Check arthas log: C:\Users\95225/logs/arthas/arthas.log
5. Visit https://github.com/alibaba/arthas/issues/47 for more details.
[arthas@20408]$
[arthas@20408]$
[arthas@20408]$
[arthas@20408]$ trace -E com.star.sms.richclient.accept.view.accept.acceptEditor2.ExchangePhysicalProductEditor subscriberChanged
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 3) cost in 152 ms, listenerId: 12
`---ts=2023-11-20 17:57:00;thread_name=AWT-EventQueue-0;id=f;is_daemon=false;priority=6;TCCL=com.star.sms.launch.LaunchURLClassLoader@17a8a02
    `---[12855.2434ms] com.star.sms.richclient.accept.view.accept.acceptEditor2.ExchangePhysicalProductEditor:subscriberChanged()
        +---[0.0065ms] org.apache.commons.logging.Log:debug() #115
        +---[0.0061ms] com.star.sms.richclient.accept.view.accept.acceptEditor2.ExchangePhysicalProductEditor:getSelectedSubscriber() #116
        +---[0.0033ms] com.star.sms.richclient.accept.view.accept2.page.displayer.ExchangePhyProductDisplayer:setSubscriber() #117
        +---[min=0.0032ms,max=0.0032ms,total=0.0064ms,count=2] com.star.sms.richclient.accept.view.accept.acceptEditor2.ExchangePhysicalProductEditor:getCurrentAction() #122
        +---[0.0031ms] com.star.sms.richclient.accept.view.accept.acceptEditor2.ExchangePhysicalProductEditor:getCurrentCustomer() #122
        +---[5947.6627ms] com.star.sms.richclient.accept.view.accept.acceptEditor2.ExchangePhysicalProductEditor:getPhyProducts() #122    // getPhyProducts消耗了6S
        +---[6907.4171ms] com.star.sms.richclient.accept.view.accept2.page.displayer.ExchangePhyProductDisplayer:onSubscriberChanged() #122  // getPhyProducts消耗了7S
        +---[0.0056ms] com.star.sms.richclient.accept.view.accept.acceptEditor2.ExchangePhysicalProductEditor:getCurrentCustomer() #126
        `---[0.0048ms] com.star.sms.richclient.accept.view.accept2.page.displayer.ExchangePhyProductDisplayer:setCurrentCustomer() #126
  

3.分析日志

从日志中可见, getPhyProducts 消耗了6秒, onSubscriberChanged消耗了7S。定位2个性能慢的方法后,进一步分析

[arthas@20408]$ trace -E com.star.sms.richclient.accept.view.accept2.page.displayer.ExchangePhyProductDisplayer onSubscriberChanged|getExchangableInfos|initOptionalPhysicalProducts
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 3) cost in 132 ms, listenerId: 14
`---ts=2023-11-20 17:59:13;thread_name=AWT-EventQueue-0;id=f;is_daemon=false;priority=6;TCCL=com.star.sms.launch.LaunchURLClassLoader@17a8a02
    `---[0.4848ms] com.star.sms.richclient.accept.view.accept2.page.displayer.ExchangePhyProductDisplayer:onSubscriberChanged()
        `---[0.1634ms] com.star.sms.richclient.accept.view.accept2.page.displayer.ExchangePhyProductDisplayer:clearTable() #195
 
`---ts=2023-11-20 17:59:19;thread_name=AWT-EventQueue-0;id=f;is_daemon=false;priority=6;TCCL=com.star.sms.launch.LaunchURLClassLoader@17a8a02
    `---[6979.1925ms] com.star.sms.richclient.accept.view.accept2.page.displayer.ExchangePhyProductDisplayer:onSubscriberChanged()
        +---[342.3417ms] com.star.sms.richclient.accept.view.accept2.page.displayer.ExchangePhyProductDisplayer:getExchangableInfos() #198
com.star.sms.model.order.dto.ExchangeInfo:setReturnResourceState() #1082
        +---[93.7515ms] com.star.sms.richclient.accept.view.accept2.page.displayer.ExchangePhyProductDisplayer:setTableDatas() #198
        `---[6543.0383ms] com.star.sms.richclient.accept.view.accept2.page.displayer.ExchangePhyProductDisplayer:initOptionalPhysicalProducts() #200
            `---[6542.9974ms] com.star.sms.richclient.accept.view.accept2.page.displayer.ExchangePhyProductDisplayer:initOptionalPhysicalProducts()    //initOptionalPhysicalProducts 消耗时间7S
                +---[0.0147ms] org.apache.commons.logging.Log:debug() #907
                +---[min=0.0015ms,max=0.0175ms,total=0.7956ms,count=202] com.star.sms.model.product.core.catalog.PhysicalProduct:getResourceDirectory() #911
                +---[min=0.0017ms,max=0.0306ms,total=0.6809ms,count=202] com.star.sms.model.product.other.ResourceDirectory:getId() #911
                +---[min=29.9191ms,max=55.5523ms,total=6528.053ms,count=202] com.star.sms.outerInterface.resource.service.IResourceExService:getResCatalogById() #911
                +---[min=0.0017ms,max=0.0447ms,total=1.0478ms,count=202] com.star.sms.model.resource.core.resparam.ResourceCatalog:getResourceType() #913
                +---[min=0.0034ms,max=0.0327ms,total=1.4611ms,count=202] com.star.sms.richclient.accept.view.accept2.page.displayer.ExchangePhyProductDisplayer:setPhysicalProductResourceType() #914
                +---[min=0.0023ms,max=0.0376ms,total=0.9837ms,count=202] org.apache.commons.logging.Log:isDebugEnabled() #929
                `---[0.0128ms] org.apache.commons.logging.Log:debug() #934

继续查看执行时间,发现程序中initOptionalPhysicalProducts 方法用时最长,是一个程序瓶颈,此方法中使用了循环,循环中IResourceExService:getResCatalogById方法调用了202次。

查看程序代码,这是一个典型的N+1此查询性能问题。 循环中调用了N此业务查询获取ResourceCatalog信息。

  private void initOptionalPhysicalProducts() {
        logger.debug("<== Begin to initialize the optional physical product... ==>");

        for (PhysicalProduct physicalProduct : physicalProducts) {
            ResourceCatalog resourceCatalog = resourceExService.getResCatalogById(physicalProduct .getResourceDirectory().getId());   // 这里存在N+1次查询, 其中N=202
           
            //省略代码 dosomething

            if (logger.isDebugEnabled()) {
                logger.debug("<-- Get a product,name is:" + physicalProduct.getName()
                        + "; and its resource type Id is :" + resourceType.getId() + " -->");
            }
        }
        logger.debug("<~~The end of the physical product to initialize the optional~~>");
    }

3.解决方法

对于N+1此查询可以通过id集合查询+分组方式来处理,修改后的代码

    private void initOptionalPhysicalProducts() {
        logger.debug("<== Begin to initialize the optional physical product... ==>");
        
        //1.获取ID集合
		List catalogids = new ArrayList();
		for (PhysicalProduct physicalProduct : physicalProducts) {
			catalogids.add(physicalProduct.getResourceDirectory().getId());
		}

        //2.只查询1次
		List catalogs = resourceExService.getResCatalogByIds(catalogids);
        
        //3.分组
		Map map = new HashMap();
		for (ResourceCatalog resourceCatalog : catalogs) {
			map.put(resourceCatalog.getId(), resourceCatalog);
		}
        
        for (PhysicalProduct physicalProduct : physicalProducts) {
            ResourceCatalog resourceCatalog = map.get(physicalProduct.getResourceDirectory().getId()); //内存里获取
           
            //省略代码 dosomething
            
            if (logger.isDebugEnabled()) {
                logger.debug("<-- Get a product,name is:" + physicalProduct.getName()
                        + "; and its resource type Id is :" + resourceType.getId() + " -->");
            }
        }
        logger.debug("<~~The end of the physical product to initialize the optional~~>");
    }

按照此处理方式,优化处理getPhyProducts 后问题解决,修正后业务处理时间在1秒内

上一篇:Jboss启动报错Unrecognized VM option PermSize=128m

你可能感兴趣的:(Java,java,性能优化)