Troubleshooting High CPU Usage in a Java Process

Discovering the problem

Run the top command:

Process with excessive CPU usage:

6355 root 20 0 3624776 931128 7544 S 198.0 24.9 4643:34 java

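The offending process has PID 6355.

At this point its CPU usage is 198% and its memory usage is 24.9%. By default top reports CPU as a percentage of a single core, so 198% means the process is keeping roughly two cores busy.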

Troubleshooting

Inspect the threads of the process

List the threads sorted by CPU share, from highest to lowest:

ps -mp 6355 -o THREAD,tid,time | sort -k2,2 -rn

Problem threads

In this output, threads 6613 through 6619 all show relatively high CPU usage.
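The listing step can be wrapped in a small filter that keeps only the busiest threads. This is a minimal sketch: top_threads is a hypothetical helper name, and it assumes the column layout procps prints for `-o THREAD,tid,time` (%CPU in column 2, TID in column 8).

```shell
# Read `ps -mp <pid> -o THREAD,tid,time` output on stdin and print
# "TID %CPU" for the N busiest threads (N defaults to 5).
# top_threads is a hypothetical helper, not a standard tool.
top_threads() {
    # Assumed layout: USER %CPU PRI SCNT WCHAN USER SYSTEM TID TIME.
    # Skip the header (NR == 1) and the per-process summary row (TID "-").
    awk 'NR > 1 && $8 != "-" { print $8, $2 }' |
        sort -k2,2 -rn |
        head -n "${1:-5}"
}
```

It would be used as `ps -mp 6355 -o THREAD,tid,time | top_threads 7`, which leaves the TIDs ready for the hexadecimal conversion.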

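Inspect the stack of a problem thread

We take the thread with TID 6613 as an example:

1. Convert the thread ID to hexadecimal.
2. Print the thread's stack with the jstack command.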

Convert the TID to hexadecimal

Command:
printf "%x\n" 6613
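For repeated use, the conversion can be kept in two tiny shell functions, one for each direction. The names tid_to_nid and nid_to_tid are hypothetical; the sketch relies only on the shell's printf.

```shell
# Convert a decimal thread ID (as shown by ps or top) to the lowercase
# hexadecimal form that jstack prints after "nid=0x".
# tid_to_nid is a hypothetical helper name.
tid_to_nid() {
    printf '%x\n' "$1"
}

# Reverse direction: hexadecimal nid (without the 0x prefix) back to the
# decimal TID. nid_to_tid is a hypothetical helper name.
nid_to_tid() {
    printf '%d\n' "0x$1"
}
```

For the thread in this example, `tid_to_nid 6613` prints 19d5.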

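Print the thread stack with jstack

Command: jstack pid | grep tid -A 30
Here pid is the process ID and tid is the thread ID in hexadecimal, which jstack prints in the nid=0x field of each thread.
In our case that is: jstack 6355 | grep 19d5 -A 30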

Thread stack output

"org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1" #250 prio=5 os_prio=0 tid=0x00007f106dbf0800 nid=0x19d5 runnable [0x00007f10146e4000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked <0x00000000c9c22c18> (a sun.nio.ch.Util$3)
        - locked <0x00000000c9c22c08> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000c9c1a748> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.selectNow(SelectorImpl.java:105)
        at org.apache.kafka.common.network.Selector.select(Selector.java:845)
        at org.apache.kafka.common.network.Selector.poll(Selector.java:469)
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:549)
        at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:262)
        at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233)
        at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1308)
        at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1248)
        at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1216)
        at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doPoll(KafkaMessageListenerContainer.java:1091)
        at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollAndInvoke(KafkaMessageListenerContainer.java:1047)
        at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:972)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.lang.Thread.run(Thread.java:748)

"ThreadPoolTaskScheduler-1" #249 prio=5 os_prio=0 tid=0x00007f106dadb800 nid=0x19d4 waiting on condition [0x00007f10147e5000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000c9c41688> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)

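The stack shows that the load comes from a Kafka consumer thread: org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1 is RUNNABLE inside KafkaConsumer.poll.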
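Because grep -A 30 always prints a fixed number of lines, the output above runs on into the next thread (ThreadPoolTaskScheduler-1). A sketch that prints only the matching thread's block is shown here. thread_stack is a hypothetical helper name, and the approach assumes plain jstack output, where threads are separated by blank lines (with -l, the "Locked ownable synchronizers" section sits in its own block and would not be included).

```shell
# Print only the stack of the thread whose native ID matches, reading a
# jstack dump on stdin. thread_stack is a hypothetical helper name.
# Assumed usage: jstack 6355 | thread_stack 19d5
thread_stack() {
    # RS = "" puts awk in paragraph mode: each blank-line-separated block
    # (one thread in plain jstack output) becomes a single record.
    # The trailing space in the pattern keeps 0x19d5 from matching 0x19d5a.
    awk -v nid="nid=0x$1 " 'BEGIN { RS = "" } index($0, nid) { print }'
}
```

Matching on the whole nid=0x field rather than the bare hexadecimal string also avoids accidental hits on addresses or lock IDs elsewhere in the dump.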

Export the thread stacks of the process to a file

jstack -l 6355 >> /tmp/6355.dump
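Once the dump is in a file, a quick count of thread states gives a first overview before reading individual stacks. This is a minimal sketch; thread_state_summary is a hypothetical helper name, and its argument is the path of a saved dump file.

```shell
# Count threads per state in a saved jstack dump, most frequent first.
# thread_state_summary is a hypothetical helper name; $1 is the dump file.
thread_state_summary() {
    grep 'java.lang.Thread.State:' "$1" |
        sed 's/^[[:space:]]*java\.lang\.Thread\.State: //' |
        sort | uniq -c | sort -rn
}
```

A large number of RUNNABLE threads sharing the same top frames is usually a good place to start reading.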

Resolution

Since Kafka is the cause, the next step is simply to review how the project uses Kafka.

About the jstack command

The jstack command prints the thread stacks of a Java process.

[root@iZ8vb698vy6k1v365g0tioZ publish]# jstack -help
Usage:
    jstack [-l] <pid>
        (to connect to running process)
    jstack -F [-m] [-l] <pid>
        (to connect to a hung process)
    jstack [-m] [-l] <executable> <core>
        (to connect to a core file)
    jstack [-m] [-l] [server_id@]<remote server IP or hostname>
        (to connect to a remote debug server)

Options:
    -F  to force a thread dump. Use when jstack <pid> does not respond (process is hung)
    -m  to print both java and native frames (mixed mode)
    -l  long listing. Prints additional information about locks
    -h or -help to print this help message
