1. Collect the complete diagnostic data
Reference: [url=http://www-01.ibm.com/support/docview.wss?rs=0&context=SSCMPB9&context=SSCMP9J&q1=MustGatherDocument&uid=swg21052641&loc=en_US&cs=utf-8&lang=#show-hide]MustGather: Performance, hang, or high CPU issues with WebSphere Application Server on AIX[/url]
2. The script output and log files to collect are:
aixperf_RESULTS.tar.gz (generated in the local directory)
three javacores ($WebSphere/profiles/AppSrv01/)
server logs (SystemOut.log, native_stderr.log,..../logs/server/)
3. Extract aixperf_RESULTS.tar.gz and open sleep.prof, which lists the processes and threads with the highest CPU usage:
Process Freq Total Kernel User Shared Other
======= ==== ===== ====== ==== ====== =====
/usr/java5/jre/bin/java 432 69.30 2.65 0.00 2.01 64.64
wait 32 29.80 29.80 0.00 0.00 0.00
/usr/bin/runmqsc 96 0.45 0.18 0.00 0.27 0.00
/usr/bin/tprof 1 0.23 0.01 0.19 0.02 0.00
vmmd 1 0.08 0.08 0.00 0.00 0.00
...
Process PID TID Total Kernel User Shared Other
======= === === ===== ====== ==== ====== =====
r/java5/jre/bin/java 7012560 109772901 4.99 0.01 0.00 0.00 4.98
r/java5/jre/bin/java 7012560 85065811 4.92 0.01 0.00 0.00 4.91
r/java5/jre/bin/java 7012560 28770661 4.71 0.00 0.00 0.01 4.69
r/java5/jre/bin/java 7012560 5439747 4.65 0.00 0.00 0.01 4.63
r/java5/jre/bin/java 7012560 78970915 4.49 0.01 0.00 0.03 4.45
r/java5/jre/bin/java 7012560 35913987 4.41 0.01 0.00 0.01 4.40
r/java5/jre/bin/java 7012560 111083605 4.21 0.01 0.00 0.00 4.20
r/java5/jre/bin/java 7012560 87490635 4.04 0.01 0.00 0.01 4.03
r/java5/jre/bin/java 7012560 63635485 4.00 0.00 0.00 0.00 4.00
r/java5/jre/bin/java 7012560 86376471 3.83 0.00 0.00 0.02 3.81
r/java5/jre/bin/java 7012560 74842269 3.83 0.00 0.00 0.00 3.83
r/java5/jre/bin/java 11337904 72351955 3.73 0.01 0.00 0.02 3.70
r/java5/jre/bin/java 7012560 100335747 3.61 0.00 0.00 0.01 3.60
...
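In this example, PIDs 7012560 and 11337904 are the two WAS servers. The Java threads of process 7012560 consume a lot of CPU, most of them above 3.5% each. Note the TID (thread ID) 109772901 of the busiest thread. The TID is reported in decimal; converted to hexadecimal it is 0x68B0065.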
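The lookup and conversion described above can be sketched in Python. This is a minimal sketch, not part of the MustGather tooling: the function names `parse_thread_rows` and `tid_to_javacore_hex` are made up for illustration, and the parser assumes the thread rows have exactly the eight whitespace-separated columns shown in the listing (Process, PID, TID, Total, Kernel, User, Shared, Other) with no spaces in the process name.

```python
def parse_thread_rows(lines, pid):
    """Return (tid, total_cpu_percent) pairs for one PID, highest CPU first."""
    rows = []
    for line in lines:
        fields = line.split()
        # Assumed layout: Process PID TID Total Kernel User Shared Other.
        # Header, separator and per-process summary rows fail this check.
        if len(fields) != 8 or not fields[1].isdigit() or not fields[2].isdigit():
            continue
        if int(fields[1]) == pid:
            rows.append((int(fields[2]), float(fields[3])))
    return sorted(rows, key=lambda row: row[1], reverse=True)


def tid_to_javacore_hex(tid):
    """Format a decimal TID as uppercase hex with a 0x prefix."""
    return "0x{:X}".format(tid)


# Three rows copied from the thread listing above.
sample = """\
r/java5/jre/bin/java 7012560 109772901 4.99 0.01 0.00 0.00 4.98
r/java5/jre/bin/java 7012560 85065811 4.92 0.01 0.00 0.00 4.91
r/java5/jre/bin/java 11337904 72351955 3.73 0.01 0.00 0.02 3.70
""".splitlines()

top_tid, top_cpu = parse_thread_rows(sample, 7012560)[0]
print(top_tid, top_cpu, tid_to_javacore_hex(top_tid))
# prints: 109772901 4.99 0x68B0065
```

For a one-off check, `hex(109772901)` in any Python shell gives the same value in lowercase.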
4. In the javacore file of the corresponding process, search for the hexadecimal thread ID 0x68B0065 to find the stack trace of that thread:
"batchImportQuertz_Worker-9" J9VMThread:0x0000000032E9AE00, j9thread_t:0x000000012BAB1FA0, java/lang/Thread:0x0000000719D73DA8, state:CW, prio=5
(native thread ID:0x68B0065, native priority:0x5, native policy:UNKNOWN)
Java callstack:
at com/app/xxx/JobMonitorCommonService.triggerComplete(JobMonitorCommonService.java:328(Compiled Code))
at com/app/xxx/XXX.executeInternal(XXX.java:37)
at org/springframework/scheduling/xxx/XXX.execute(QuartzJobBean.java:86)
at org/xxx/core/JobRunShell.run(JobRunShell.java:216)
at org/quartz/simpl/SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549(Compiled Code))
Native callstack:
_event_wait+0x2b8 (0x0900000000A9E9BC [libpthreads.a+0x169bc])
_cond_wait_local+0x4e4 (0x0900000000AAC768 [libpthreads.a+0x24768])
_cond_wait+0xbc (0x0900000000AACD40 [libpthreads.a+0x24d40])
pthread_cond_wait+0x1a8 (0x0900000000AAD9AC [libpthreads.a+0x259ac])
(0x0900000002A5B17C [libj9thr24.so+0x417c])
(0x0900000002A5AF40 [libj9thr24.so+0x3f40])
(0x0900000002A5AEE0 [libj9thr24.so+0x3ee0])
(0x09000000029DDB6C [libj9vm24.so+0x10b6c])
(0x09000000029E5AE8 [libj9vm24.so+0x18ae8])
(0x090000000305C2C8 [libj9jit24.so+0x54d2c8])
(0x09000000029D8E10 [libj9vm24.so+0xbe10])
(0x0900000002A6CF94 [libj9prt24.so+0x1f94])
(0x09000000029D8D30 [libj9vm24.so+0xbd30])
(0x0900000002A58C70 [libj9thr24.so+0x1c70])
_pthread_body+0xf0 (0x0900000000A8BD54 [libpthreads.a+0x3d54])
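When there are several javacores to check, the search in step 4 can be scripted. The following is a rough sketch under stated assumptions: `find_thread_block` is a hypothetical helper, it treats any line containing `J9VMThread:` as the start of a thread entry (as in the excerpt above), and the "other-thread" entry in the sample is invented purely to show where one entry ends and the next begins.

```python
import re


def find_thread_block(javacore_text, native_hex_id):
    """Return the lines of the thread entry whose native thread ID matches."""
    lines = javacore_text.splitlines()
    # \b keeps 0x68B006 from matching inside 0x68B0065.
    wanted = re.compile(
        r"native thread ID:\s*" + re.escape(native_hex_id) + r"\b",
        re.IGNORECASE,
    )
    for i, line in enumerate(lines):
        if not wanted.search(line):
            continue
        # Assumption: the thread name line sits directly above the ID line.
        start = i - 1 if i > 0 and "J9VMThread:" in lines[i - 1] else i
        end = i + 1
        # Collect lines until the next thread entry starts.
        while end < len(lines) and "J9VMThread:" not in lines[end]:
            end += 1
        return lines[start:end]
    return []


# "other-thread" is a made-up entry; the second entry is from the excerpt above.
sample = '''\
"other-thread" J9VMThread:0x01, j9thread_t:0x02, java/lang/Thread:0x03, state:R, prio=5
(native thread ID:0x1234, native priority:0x5, native policy:UNKNOWN)
Java callstack:
at com/example/Other.run(Other.java:10)
"batchImportQuertz_Worker-9" J9VMThread:0x0000000032E9AE00, j9thread_t:0x000000012BAB1FA0, java/lang/Thread:0x0000000719D73DA8, state:CW, prio=5
(native thread ID:0x68B0065, native priority:0x5, native policy:UNKNOWN)
Java callstack:
at com/app/xxx/JobMonitorCommonService.triggerComplete(JobMonitorCommonService.java:328(Compiled Code))
'''

block = find_thread_block(sample, "0x68B0065")
print("\n".join(block))
```

To use it on a real file, read the javacore into a string first and pass the hex value obtained from the TID. Running it against all three javacores shows whether the same thread stays in the same code path across dumps.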