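Troubleshooting notes: high load and a slow system

Background

The system suddenly slowed down and pages took a long time to load. A large number of Java services are deployed on the same machine. Don't ask me why; the answer is "historical reasons".

1. Check system load with top

Linux: the top command, which displays system status in real time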
[Figure 1: screenshot of the top output]

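This is a 4-core server with 16 GB of memory, and the 1-minute load average had spiked to 24, about six times the core count, which is a little alarming. Page loads were taking more than 20 seconds.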
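
To put the load figure in proportion, it can help to divide it by the core count. The snippet below is a minimal sketch; load_per_core is a hypothetical helper name, not something from the original session. On Linux the load average counts tasks that are runnable or in uninterruptible sleep, so a value far above 1.0 per core suggests work is queuing, though not necessarily on the CPU alone.

```shell
# Divide a load average by the number of cores.
# load_per_core is a hypothetical helper introduced for illustration.
load_per_core() {
  # $1: load average, $2: number of CPU cores
  awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Figures from this incident: 1-minute load of 24 on 4 cores.
load_per_core 24 4

# On a live Linux host the inputs could come from /proc/loadavg and nproc:
# load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

With the numbers from this incident the result is 6.00 per core.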

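2. Check per-process CPU usage with pidstat

pidstat -u 5 reports CPU usage for each process, refreshing every 5 seconds.

# Report CPU usage for each process, refreshing every 5 seconds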
[root@srv-100-50-88-6 logs]# pidstat -u 5
Linux 3.10.0-327.el7.x86_64 (srv-100-50-88-6)     08/14/2022     _x86_64_    (4 CPU)

12:29:06 AM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
12:29:12 AM     0         7    0.00    0.20    0.00    0.20     0  migration/0
12:29:12 AM     0       169    0.00    0.60    0.00    0.60     0  rcu_sched
12:29:12 AM     0       171    0.00    0.20    0.00    0.20     2  rcuos/1
12:29:12 AM     0       172    0.00    0.20    0.00    0.20     3  rcuos/2
12:29:12 AM     0       173    0.00    0.40    0.00    0.40     1  rcuos/3
12:29:12 AM     0       343    0.00    5.75    0.00    5.75     3  ksoftirqd/3
12:29:12 AM     0      2341    0.20    0.00    0.00    0.20     0  ds_agent
12:29:12 AM     0     14799    6.94    0.60    0.00    7.54     0  java
12:29:12 AM     0     59866    0.00    0.20    0.00    0.20     2  qemu-ga
12:29:12 AM     0     69419    0.00    2.38    0.00    2.38     1  java
12:29:12 AM     0     72129    0.00   10.12    0.00   10.12     0  java
12:29:12 AM     0     86118    4.96    5.36    0.00   10.32     2  java
12:29:12 AM     0     95136    0.00    7.34    0.00    7.34     1  java
12:29:12 AM     0     97574    0.20    0.00    0.00    0.20     0  dockerd
12:29:12 AM     0    101854    0.00    3.37    0.00    3.37     3  java
12:29:12 AM     0    147597    0.60    1.19    0.00    1.79     2  node_exporter
12:29:12 AM     0    148215   30.95   12.30    0.00   43.25     0  java
12:29:12 AM     0    150973    0.00    0.20    0.00    0.20     2  kworker/2:1
12:29:12 AM     0    152459    0.20    0.79    0.00    0.99     0  pidstat
12:29:12 AM     0    152902    0.00    5.16    0.00    5.16     2  java
12:29:12 AM     0    153166    0.00    6.35    0.00    6.35     3  java

Process 148215 clearly stands out: its %CPU is 43.25% (30.95% user plus 12.30% system).
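
Rather than scanning the table by eye, the busiest processes can be ranked with a small filter. This is only a sketch: top_cpu_procs is a hypothetical helper name, and it assumes the pidstat header row contains the column names PID, %CPU and Command, as in the output above. It finds the columns from the header instead of hard-coding positions, because the timestamp may take one or two fields depending on the locale.

```shell
# Rank processes from pidstat -u output by the %CPU column.
# top_cpu_procs is a hypothetical helper; it reads pidstat output on stdin.
top_cpu_procs() {
  # $1: how many processes to show (default 3)
  awk '
    /^Average/ { next }                  # skip the summary block, if any
    /%CPU/ {                             # header row: remember column positions
      for (i = 1; i <= NF; i++) {
        if ($i == "PID")     p = i
        if ($i == "%CPU")    c = i
        if ($i == "Command") m = i
      }
      next
    }
    p && $p ~ /^[0-9]+$/ { print $c, $p, $m }   # data row: %CPU, PID, command
  ' | sort -rn | head -n "${1:-3}"
}
```

A possible invocation would be pidstat -u 5 1 | top_cpu_procs 3, which takes one 5-second sample and prints the three busiest processes as "%CPU PID Command".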

3. Locate the Java threads

# Locate the Java threads (sorted by CPU usage, lowest first)
[root@srv-100-50-88-6 logs]# ps H -eo pid,tid,%cpu --sort=%cpu |grep 148215
148215 149006  0.0
148215 149062  0.0
148215 150694  0.0
148215 151011  0.0
148215 151615  0.0
148215 152570  0.0
148215 148217  0.1
148215 148218  0.1
148215 148219  0.1
148215 148220  0.1
148215 148313  0.1
148215 148331  0.1
148215 148531  0.1
148215 148893  0.1
148215 148895  0.1
148215 148243  0.2
148215 148310  0.2
148215 148343  0.2
148215 148539  0.2
148215 148540  0.2
148215 148541  0.2
148215 148890  0.2
148215 148891  0.2
148215 148892  0.2
148215 148894  0.2
148215 148896  0.2
148215 148734  0.4
148215 148221  0.8
148215 148227  0.8
148215 148377  0.9
148215 148378  0.9
148215 150403  0.9
148215 148329  1.2
148215 148328  1.3
148215 148376  1.3
148215 148385  1.3
148215 148567  1.6
148215 148327  1.7
148215 148333  1.7
148215 148216  2.1
148215 148225  3.1
148215 148226  3.1
148215 151191  3.6
# Convert the thread TIDs to hexadecimal, ready for searching the jstack output; the seven threads with the highest CPU usage are converted here.
[root@srv-100-50-88-6 logs]# printf "0x%x\n" 151191
0x24e97
[root@srv-100-50-88-6 logs]# printf "0x%x\n" 148226
0x24302
[root@srv-100-50-88-6 logs]# printf "0x%x\n" 148225
0x24301
[root@srv-100-50-88-6 logs]# printf "0x%x\n" 148216
0x242f8
[root@srv-100-50-88-6 logs]# printf "0x%x\n" 148333
0x2436d
[root@srv-100-50-88-6 logs]# printf "0x%x\n" 148327
0x24367
[root@srv-100-50-88-6 logs]# printf "0x%x\n" 148567
0x24457
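
The sort-and-convert steps above can be combined into one filter. The following is a sketch under stated assumptions: busiest_nids is a hypothetical helper name, and its input is expected to be header-less rows of "PID TID %CPU". Note that ps reports %cpu as CPU time divided by the thread's lifetime, so it is an average and not an instantaneous reading.

```shell
# Turn "PID TID %CPU" rows into the hex thread IDs that jstack prints as nid.
# busiest_nids is a hypothetical helper; it reads the rows on stdin.
busiest_nids() {
  # $1: how many of the busiest threads to keep (default 7)
  sort -k3,3nr | head -n "${1:-7}" | while read -r _pid tid cpu; do
    # printf %x converts the decimal TID to hexadecimal
    printf '0x%x %s %s\n' "$tid" "$tid" "$cpu"
  done
}
```

One way to feed it, assuming the procps version of ps, would be ps -L -p 148215 -o pid=,tid=,%cpu= | busiest_nids 7; the trailing "=" signs suppress the header row. Each output line has the form "hex-nid TID %CPU".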

4. Locate the code

# Use jstack to take a snapshot of the JVM's threads at this moment
[root@srv-100-50-88-6 logs]# jstack -l 148215 >> jstacklog.out
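# With the stack dump saved to a file, download it and search for the hex TIDs to find each thread's entry, which shows directly what code the thread is executing.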
[root@srv-100-50-88-6 logs]# sz jstacklog.out 

Using a hexadecimal thread ID such as 0x24e97 (it appears in the nid field), find the matching stack in jstacklog.out.
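
The lookup can also be done on the server without downloading the file. This is a minimal sketch: stack_for_nid is a hypothetical helper name, and it assumes every thread entry in the dump begins with a line starting with a double quote, as in the jstack output shown here.

```shell
# Print the jstack entry for one thread, selected by its hex nid.
# stack_for_nid is a hypothetical helper introduced for illustration.
stack_for_nid() {
  # $1: hex nid such as 0x24e97, $2: path to a jstack dump
  # The trailing space in the pattern stops 0x24e9 from matching 0x24e97.
  awk -v nid="nid=$1 " '
    /^"/ { show = (index($0, nid) > 0) }   # a new thread header starts here
    show                                    # print lines while inside the wanted thread
  ' "$2"
}

# Example: stack_for_nid 0x24e97 jstacklog.out
```

Printing stops at the next thread header, so the output includes the "Locked ownable synchronizers" section that jstack -l appends to each thread.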


"Thread-166" #520 prio=10 os_prio=0 tid=0x00007f3290016800 nid=0x24e97 runnable [0x00007f32582db000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
    at sun.security.ssl.InputRecord.read(InputRecord.java:503)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
    - locked <0x00000006d212c5a0> (a java.lang.Object)
    at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933)
    at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
    - locked <0x00000006d2136e38> (a sun.security.ssl.AppInputStream)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at com.mysql.cj.protocol.FullReadInputStream.readFully(FullReadInputStream.java:64)
    at com.mysql.cj.protocol.a.SimplePacketReader.readHeader(SimplePacketReader.java:63)
    at com.mysql.cj.protocol.a.SimplePacketReader.readHeader(SimplePacketReader.java:45)
    at com.mysql.cj.protocol.a.TimeTrackingPacketReader.readHeader(TimeTrackingPacketReader.java:52)
    at com.mysql.cj.protocol.a.TimeTrackingPacketReader.readHeader(TimeTrackingPacketReader.java:41)
    at com.mysql.cj.protocol.a.MultiPacketReader.readHeader(MultiPacketReader.java:54)
    at com.mysql.cj.protocol.a.MultiPacketReader.readHeader(MultiPacketReader.java:44)
    at com.mysql.cj.protocol.a.NativeProtocol.readMessage(NativeProtocol.java:533)
    at com.mysql.cj.protocol.a.NativeProtocol.checkErrorMessage(NativeProtocol.java:703)
    at com.mysql.cj.protocol.a.NativeProtocol.sendCommand(NativeProtocol.java:642)
    at com.mysql.cj.protocol.a.NativeProtocol.sendCommand(NativeProtocol.java:128)
    at com.mysql.cj.NativeSession.sendCommand(NativeSession.java:311)
    at com.mysql.cj.NativeSession.queryServerVariable(NativeSession.java:1008)
    at com.mysql.cj.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:1406)
    at com.mysql.cj.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:1399)
    at com.mysql.cj.jdbc.ClientPreparedStatement.checkReadOnlySafeStatement(ClientPreparedStatement.java:307)
    - locked <0x00000006d21934e0> (a com.mysql.cj.jdbc.ConnectionImpl)
    at com.mysql.cj.jdbc.ClientPreparedStatement.execute(ClientPreparedStatement.java:317)
    - locked <0x00000006d21934e0> (a com.mysql.cj.jdbc.ConnectionImpl)
    at com.zaxxer.hikari.pool.ProxyPreparedStatement.execute(ProxyPreparedStatement.java:44)
    at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.execute(HikariProxyPreparedStatement.java)
    at sun.reflect.GeneratedMethodAccessor501.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.ibatis.logging.jdbc.PreparedStatementLogger.invoke(PreparedStatementLogger.java:59)
    at com.sun.proxy.$Proxy637.execute(Unknown Source)
    at org.apache.ibatis.executor.statement.PreparedStatementHandler.update(PreparedStatementHandler.java:47)
    at org.apache.ibatis.executor.statement.RoutingStatementHandler.update(RoutingStatementHandler.java:74)
    at sun.reflect.GeneratedMethodAccessor900.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.ibatis.plugin.Plugin.invoke(Plugin.java:63)
    at com.sun.proxy.$Proxy635.update(Unknown Source)
    at com.baomidou.mybatisplus.core.executor.MybatisSimpleExecutor.doUpdate(MybatisSimpleExecutor.java:56)
    at org.apache.ibatis.executor.BaseExecutor.update(BaseExecutor.java:117)
    at sun.reflect.GeneratedMethodAccessor774.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.ibatis.plugin.Invocation.proceed(Invocation.java:49)
    at com.baomidou.mybatisplus.extension.plugins.MybatisPlusInterceptor.intercept(MybatisPlusInterceptor.java:83)
    at org.apache.ibatis.plugin.Plugin.invoke(Plugin.java:61)
    at com.sun.proxy.$Proxy634.update(Unknown Source)
    at org.apache.ibatis.session.defaults.DefaultSqlSession.update(DefaultSqlSession.java:197)
    at sun.reflect.GeneratedMethodAccessor898.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:426)
    at com.sun.proxy.$Proxy159.update(Unknown Source)
    at org.mybatis.spring.SqlSessionTemplate.update(SqlSessionTemplate.java:287)
    at com.baomidou.mybatisplus.core.override.MybatisMapperMethod.execute(MybatisMapperMethod.java:65)
    at com.baomidou.mybatisplus.core.override.MybatisMapperProxy$PlainMethodInvoker.invoke(MybatisMapperProxy.java:148)
    at com.baomidou.mybatisplus.core.override.MybatisMapperProxy.invoke(MybatisMapperProxy.java:89)
    at com.sun.proxy.$Proxy418.updateById(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor905.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
    at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:139)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212)
    at com.sun.proxy.$Proxy419.updateById(Unknown Source)
    at com.baomidou.mybatisplus.extension.service.IService.updateById(IService.java:148)
    at com.baomidou.mybatisplus.extension.service.IService$$FastClassBySpringCGLIB$$f8525d18.invoke()
    at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
    at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:687)
    at com.segmentfault.business.service.impl.stat.StatQualityAnalysisServiceImpl$$EnhancerBySpringCGLIB$$eb9ef323.updateById()
    at com.segmentfault.business.task.service.StatQualityAnalysisService.lambda$saveOrUpdateAnalysis$23(StatQualityAnalysisService.java:1260)
    at com.segmentfault.business.task.service.StatQualityAnalysisService$$Lambda$1908/872329305.accept(Unknown Source)
    at java.util.ArrayList.forEach(ArrayList.java:1257)
    at com.segmentfault.business.task.service.StatQualityAnalysisService.saveOrUpdateAnalysis(StatQualityAnalysisService.java:1246)
    at com.segmentfault.business.task.service.StatQualityAnalysisService.collectTodayQualityInfo(StatQualityAnalysisService.java:138)
    at com.segmentfault.business.task.StatQualityAnalysisJob.collectTodayQualityInfo(StatQualityAnalysisJob.java:33)
    at sun.reflect.GeneratedMethodAccessor2274.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.xxl.job.core.handler.impl.MethodJobHandler.execute(MethodJobHandler.java:29)
    at com.xxl.job.core.thread.JobThread.run(JobThread.java:152)

   Locked ownable synchronizers:
    - None


"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f3310182800 nid=0x24302 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f331017f800 nid=0x24301 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None


"DestroyJavaVM" #180 prio=5 os_prio=0 tid=0x00007f3310009800 nid=0x242f8 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None


"com.alibaba.nacos.naming.beat.sender" #55 daemon prio=5 os_prio=0 tid=0x00007f3311ad9000 nid=0x2436d runnable [0x00007f326faf9000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000006c3640658> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - None

"com.alibaba.nacos.naming.beat.sender" #49 daemon prio=5 os_prio=0 tid=0x00007f3310f93000 nid=0x24367 waiting on condition [0x00007f32a81d8000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000006c3640658> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1088)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - None            


"Thread-74" #185 prio=10 os_prio=0 tid=0x00007f32dc00c800 nid=0x24457 waiting on condition [0x00007f325cd21000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000007a4ac0b90> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at org.apache.dubbo.common.threadpool.ThreadlessExecutor.waitAndDrain(ThreadlessExecutor.java:91)
    at org.apache.dubbo.rpc.AsyncRpcResult.get(AsyncRpcResult.java:179)
    at org.apache.dubbo.rpc.protocol.AsyncToSyncInvoker.invoke(AsyncToSyncInvoker.java:61)
    at org.apache.dubbo.rpc.listener.ListenerInvokerWrapper.invoke(ListenerInvokerWrapper.java:78)
    at org.apache.dubbo.monitor.support.MonitorFilter.invoke(MonitorFilter.java:91)
    at org.apache.dubbo.rpc.protocol.FilterNode.invoke(FilterNode.java:61)
    at org.apache.dubbo.rpc.protocol.dubbo.filter.FutureFilter.invoke(FutureFilter.java:52)
    at org.apache.dubbo.rpc.protocol.FilterNode.invoke(FilterNode.java:61)
    at com.segmentfault.common.user.interceptor.filter.ConsumerContextFilter.invoke(ConsumerContextFilter.java:56)
    at org.apache.dubbo.rpc.protocol.FilterNode.invoke(FilterNode.java:61)
    at org.apache.dubbo.rpc.filter.ConsumerContextFilter.invoke(ConsumerContextFilter.java:69)
    at org.apache.dubbo.rpc.protocol.FilterNode.invoke(FilterNode.java:61)
    at org.apache.dubbo.rpc.protocol.InvokerWrapper.invoke(InvokerWrapper.java:56)
    at org.apache.dubbo.rpc.cluster.support.FailoverClusterInvoker.doInvoke(FailoverClusterInvoker.java:79)
    at org.apache.dubbo.rpc.cluster.support.AbstractClusterInvoker.invoke(AbstractClusterInvoker.java:265)
    at org.apache.dubbo.rpc.cluster.interceptor.ClusterInterceptor.intercept(ClusterInterceptor.java:47)
    at org.apache.dubbo.rpc.cluster.support.wrapper.AbstractCluster$InterceptorInvokerNode.invoke(AbstractCluster.java:92)
    at org.apache.dubbo.rpc.cluster.support.wrapper.MockClusterInvoker.invoke(MockClusterInvoker.java:93)
    at org.apache.dubbo.registry.client.migration.MigrationInvoker.invoke(MigrationInvoker.java:170)
    at org.apache.dubbo.rpc.proxy.InvokerInvocationHandler.invoke(InvokerInvocationHandler.java:96)
    at org.apache.dubbo.common.bytecode.proxy12.getAreaNameByCode(proxy12.java)
    at com.segmentfault.business.system.api.utils.AreaInfoUtil.getAreaNameByCode(AreaInfoUtil.java:39)
    at com.segmentfault.business.service.impl.WhiteItemRemindConfigServiceImpl.remind(WhiteItemRemindConfigServiceImpl.java:164)
    at com.segmentfault.business.service.impl.WhiteItemRemindConfigServiceImpl$$FastClassBySpringCGLIB$$bcecf521.invoke()
    at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
    at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:687)
    at com.segmentfault.business.service.impl.WhiteItemRemindConfigServiceImpl$$EnhancerBySpringCGLIB$$2071afb8.remind()
    at com.segmentfault.business.task.WhiteItemStateJob.lambda$whiteStateHandle$0(WhiteItemStateJob.java:63)
    at com.segmentfault.business.task.WhiteItemStateJob$$Lambda$1783/2031958241.accept(Unknown Source)
    at java.util.ArrayList.forEach(ArrayList.java:1257)
    at com.segmentfault.business.task.WhiteItemStateJob.whiteStateHandle(WhiteItemStateJob.java:51)
    at sun.reflect.GeneratedMethodAccessor1505.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.xxl.job.core.handler.impl.MethodJobHandler.execute(MethodJobHandler.java:29)
    at com.xxl.job.core.thread.JobThread.run(JobThread.java:152)

   Locked ownable synchronizers:
    - None    

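The seven threads are: two xxl-job task threads (Thread-166 and Thread-74), two C2 compiler threads, two Nacos heartbeat sender threads (com.alibaba.nacos.naming.beat.sender), and the DestroyJavaVM thread. The busiest one, Thread-166 (nid 0x24e97), is the xxl-job task StatQualityAnalysisJob.collectTodayQualityInfo, which calls updateById once per element inside an ArrayList.forEach loop.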

