第一步 :执行top命令,查出当前机器线程情况
top - 09:14:36 up 146 days, 20:24, 1 user, load average: 0.31, 0.37, 0.45
Tasks: 338 total, 1 running, 337 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.4%us, 0.4%sy, 0.0%ni, 99.2%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 16331456k total, 14017124k used, 2314332k free, 316724k buffers
Swap: 16506876k total, 36200k used, 16470676k free, 5151404k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6107 root 20 0 9724m 1.3g 15m S 8.0 8.4 624:22.34 java
26678 root 20 0 9643m 498m 17m S 8.0 3.1 61:34.58 java
10168 root 20 0 12.9g 4.7g 12m S 1.7 30.0 329:11.58 java
11291 root 20 0 9654m 1.2g 15m S 0.7 7.7 35:17.04 java
28662 root 20 0 15168 1436 940 R 0.7 0.0 0:00.12 top
61 root 20 0 0 0 0 S 0.3 0.0 4:10.55 ksoftirqd/14
68 root 20 0 0 0 0 S 0.3 0.0 119:58.11 events/1
2987 root 20 0 0 0 0 S 0.3 0.0 31:22.27 flush-253:2
1 root 20 0 19344 1204 980 S 0.0 0.0 0:03.71 init
第二步:通过jstack命令dump当前内存中线程信息
jstack 26678
[root@172_16_1_177 logs]# jstack 26678
2017-08-30 10:34:29
Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.79-b02 mixed mode):
"Attach Listener" daemon prio=10 tid=0x00007f96dc001000 nid=0x70d5 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"DubboServerHandler-172.16.1.177:28082-thread-50" daemon prio=10 tid=0x00007f967c001800 nid=0x70a1 waiting on condition [0x00007f9543040000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000ca928680> (a java.util.concurrent.SynchronousQueue$TransferStack)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:458)
at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
at java.util.concurrent.SynchronousQueue.take(SynchronousQueue.java:925)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
"DubboServerHandler-172.16.1.177:28082-thread-49" daemon prio=10 tid=0x00007f95f820c000 nid=0x708e waiting on condition [0x00007f9543081000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000ca928680> (a java.util.concurrent.SynchronousQueue$TransferStack)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:458)
at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
at java.util.concurrent.SynchronousQueue.take(SynchronousQueue.java:925)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
"DubboServerHandler-172.16.1.177:28082-thread-48" daemon prio=10 tid=0x00007f9674001800 nid=0x7085 waiting on condition [0x00007f95430c2000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000ca928680> (a java.util.concurrent.SynchronousQueue$TransferStack)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:458)
at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
at java.util.concurrent.SynchronousQueue.take(SynchronousQueue.java:925)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
"DubboServerHandler-172.16.1.177:28082-thread-47" daemon prio=10 tid=0x00007f95f820a000 nid=0x7084 waiting on condition [0x00007f9543103000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000ca928680> (a java.util.concurrent.SynchronousQueue$TransferStack)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:458)
at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
at java.util.concurrent.SynchronousQueue.take(SynchronousQueue.java:925)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
"DubboServerHandler-172.16.1.177:28082-thread-46" daemon prio=10 tid=0x00007f95f8207800 nid=0x7082 waiting on condition [0x00007f9543144000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000ca928680> (a java.util.concurrent.SynchronousQueue$TransferStack)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:458)
at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
at java.util.concurrent.SynchronousQueue.take(SynchronousQueue.java:925)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
4.1、使用方案
cpu飙高,load高,响应很慢
方案:
* 一个请求过程中多次dump
* 对比多次dump文件的runnable线程,如果执行的方法有比较大变化,说明比较正常。如果在执行同一个方法,就有一些问题了。
查找占用cpu最多的线程信息
方案:
* 使用命令: top -H -p pid(pid为被测系统的进程号),找到导致cpu高的线程id。
上述Top命令找到的线程id,对应着dump thread信息中线程的nid,只不过一个是十进制,一个是十六进制。
* 在thread dump中,根据top命令查找的线程id,查找对应的线程堆栈信息。
cpu使用率不高但是响应很慢
方案:
* 进行dump,查看是否有很多thread struck在了i/o、数据库等地方,定位瓶颈原因。
请求无法响应
方案:
* 多次dump,对比是否所有的runnable线程都一直在执行相同的方法,如果是的,恭喜你,锁住了!
参考资料:http://blog.csdn.net/rachel_luo/article/details/8920596