程序运行卡的时候,我们经常需要定位到底程序在执行什么代码最消耗CPU,这里有个我经常用的小技巧,分享一下。
首先,Linux下面top命令可以直接找出最消耗CPU的进程
top
可以看到如下显示,我们看到前面有三个都是100% CPU占用的
top - 19:53:23 up 118 days, 6:09, 4 users, load average: 3.00, 2.68, 1.68
Tasks: 446 total, 1 running, 438 sleeping, 7 stopped, 0 zombie
Cpu(s): 8.6%us, 4.2%sy, 0.0%ni, 87.2%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 132103700k total, 76585104k used, 55518596k free, 424052k buffers
Swap: 16383996k total, 0k used, 16383996k free, 36290728k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
156888 hdfs 20 0 34.9g 544m 11m S 100.8 0.4 9:44.08 java
156928 hdfs 20 0 34.9g 545m 11m S 100.8 0.4 9:42.84 java
156769 hdfs 20 0 34.9g 544m 11m S 100.2 0.4 9:53.26 java
153980 hdfs 20 0 64.0g 24g 55m S 1.7 19.6 5879:56 presto-server
105 root 20 0 0 0 0 S 0.3 0.0 45:26.08 events/6
106 root 20 0 0 0 0 S 0.3 0.0 4:41.29 events/7 代码片
接下来,我们一个个分析,先找第一个最消耗CPU的,我们使用命令
top -Hp 156888
这个命令可以展示进程内最消耗CPU的线程,可以得到如下结果
top - 20:00:25 up 118 days, 6:16, 2 users, load average: 3.00, 2.91, 2.14
Tasks: 39 total, 1 running, 38 sleeping, 0 stopped, 0 zombie
Cpu(s): 9.3%us, 3.5%sy, 0.0%ni, 87.2%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 132103700k total, 76578608k used, 55525092k free, 424052k buffers
Swap: 16383996k total, 0k used, 16383996k free, 36290948k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
156926 hdfs 20 0 34.9g 544m 11m R 99.8 0.4 16:43.39 java
156901 hdfs 20 0 34.9g 544m 11m S 0.3 0.4 0:00.16 java
156888 hdfs 20 0 34.9g 544m 11m S 0.0 0.4 0:00.00 java
156889 hdfs 20 0 34.9g 544m 11m S 0.0 0.4 0:00.12 java
156890 hdfs 20 0 34.9g 544m 11m S 0.0 0.4 0:00.16 java
156891 hdfs 20 0 34.9g 544m 11m S 0.0 0.4 0:00.17 java
156892 hdfs 20 0 34.9g 544m 11m S 0.0 0.4 0:00.18 java
156893 hdfs 20 0 34.9g 544m 11m S 0.0 0.4 0:00.17 java
排名第一的便是我们这个进程下面的线程号了。
接下来,我们使用一个java中的jstack命令去定位哪一行:
jstack 156888
这样子直接获取信息内容是这样子的:
"Service Thread" #17 daemon prio=9 os_prio=0 tid=0x00002b47080f6800 nid=0x264fc runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread11" #16 daemon prio=9 os_prio=0 tid=0x00002b47080f3800 nid=0x264fb waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread10" #15 daemon prio=9 os_prio=0 tid=0x00002b47080f1000 nid=0x264fa waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread9" #14 daemon prio=9 os_prio=0 tid=0x00002b47080ef000 nid=0x264f9 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread8" #13 daemon prio=9 os_prio=0 tid=0x00002b47080ed000 nid=0x264f8 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
我们大致读得出这个线程号,但是我们需要定位最消耗cpu的那个线程,我们看到nid项其实就是线程号,不过是16进制输出的,所以我们也要把我们的线程号转换成16进制,可以直接用计算器,没问题,我这里用linux下面打印命令:
printf "%x\n" 156926
结果为:
264fe
这个时候我们可以用这个数字结合我们的jstack命令去查询了:
jstack 156888 |grep 264fe
我们这样子可以搜索到这个线程,但是只有一行结果
"thread1" #18 prio=5 os_prio=0 tid=0x00002b4708167000 nid=0x264fe runnable [0x00002b4723545000]
我们可以在grep前后增加要显示的行数,可以看到上下文的内容,完整命令如下:
jstack 156888 |grep -A 20 -B 10 264fe
我们可以输出我们要的结果
"thread1" #18 prio=5 os_prio=0 tid=0x00002b4708167000 nid=0x264fe runnable [0x00002b4723545000]
java.lang.Thread.State: RUNNABLE
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
- locked <0x0000000080208ea8> (a java.io.BufferedOutputStream)
at java.io.PrintStream.write(PrintStream.java:482)
- locked <0x00000000802048c0> (a java.io.PrintStream)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
at sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:104)
- locked <0x00000000802048a8> (a java.io.OutputStreamWriter)
at java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:185)
at java.io.PrintStream.write(PrintStream.java:527)
- eliminated <0x00000000802048c0> (a java.io.PrintStream)
at java.io.PrintStream.print(PrintStream.java:669)
at java.io.PrintStream.println(PrintStream.java:806)
- locked <0x00000000802048c0> (a java.io.PrintStream)
at Demo1_16.lambda$main$0(Demo1_16.java:6)
at Demo1_16$$Lambda$1/834600351.run(Unknown Source)
从内容上面看,我们尅看到是在第6行代码上面有个打印,我把完整代码贴出来:
public class Demo1_16 {
public static void main(String[] args) {
new Thread(()->{
while (true){
System.out.println("hi!");
}
},"thread1").start();
}
}
可以看到是有个死循环导致的,至于为什么有三个占用最高的,那是因为我同时启动了三个这个程序。
我们把命令可以直接连起来:
jstack 156888 |grep `printf "%x" 264fe` -C6 --color
要定位最消耗的代码,我们的思路是缩小范围,顺序是进程->线程->代码,最后再看实际代码去解决问题