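Contents
Preface:
1. Background
2. Investigation
2.1 Crash log summary
2.2 Crash log analysis
2.2.1 Manual analysis
2.2.2 Tool-assisted analysis
2.3 Troubleshooting
3. Current conclusion
Reference links: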
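This post is copied directly from my company's internal docs, since I did not want to write it twice. If any resources are missing, leave a comment and I will send them to you.
Some links in this post are only reachable from outside the Great Firewall.
Date: 2019-04-03
Scenario: fifth round of load testing for the market-data backend stock-api
Description: while the tester kept increasing the load, at roughly 500-600 QPS the stock-api service deployed on machine 10.10.21.27, which runs with the SkyWalking agent component, hit a JVM crash.
Detailed test report: fifth round of load testing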
The JVM crash log gives the following summary:
JVM information:
System information:
CPU and memory at the time of the crash:
Memory: 4k page, physical 65733504k(5283224k free), swap 0k(0k free)
load average:21.87 16.41 12.45
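As a quick sanity check on these two lines, here is a minimal Python sketch that turns them into numbers. The regular expression and the helper names (`parse_memory`, `parse_load_average`) are my own and only assume the line layout shown above. Whether the load average counts as high depends on the number of CPU cores, which this excerpt does not show.

```python
import re

# Lines copied from the crash log summary above.
MEMORY_LINE = "Memory: 4k page, physical 65733504k(5283224k free), swap 0k(0k free)"
LOAD_LINE = "load average:21.87 16.41 12.45"


def parse_memory(line):
    """Return (physical_kb, free_kb, swap_kb, swap_free_kb) from a Memory line."""
    m = re.search(
        r"physical (\d+)k\((\d+)k free\), swap (\d+)k\((\d+)k free\)", line
    )
    if m is None:
        raise ValueError("unrecognized Memory line: " + line)
    return tuple(int(g) for g in m.groups())


def parse_load_average(line):
    """Return the 1, 5 and 15 minute load averages as floats."""
    _, _, values = line.partition(":")
    return tuple(float(v) for v in values.split())


physical_kb, free_kb, swap_kb, swap_free_kb = parse_memory(MEMORY_LINE)
free_ratio = free_kb / physical_kb
load_1m, load_5m, load_15m = parse_load_average(LOAD_LINE)

print(f"physical: {physical_kb / 1024 / 1024:.1f} GiB")
print(f"free:     {free_kb / 1024 / 1024:.1f} GiB ({free_ratio:.1%})")
print(f"swap:     {swap_kb} kB")
print(f"load avg: {load_1m} {load_5m} {load_15m}")
```

The free figure here is what the crash log reports as free physical memory; it says nothing about page cache, so it is only a rough indicator.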
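Summary of the current exception:
V denotes a VM frame; the other frame types are:
Analysis of the thread that caused the crash:
Going through the thread details (registers, stack frames, and so on) and the thread stack shows that the instruction that made the VM terminate unexpectedly was executing inside the JVM itself.
Top of the stack that led to the problem: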
Stack: [0x00007f971d0f7000,0x00007f971d1f8000], sp=0x00007f971d1f6a30, free space=1022k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x4c0f9b] oopDesc::size()+0x2b
V [libjvm.so+0x4c2830] CMTask::deal_with_reference(oopDesc*)+0x180
V [libjvm.so+0x462164] ClassLoaderData::oops_do(OopClosure*, KlassClosure*, bool)+0x64
V [libjvm.so+0x648d40] InstanceMirrorKlass::oop_oop_iterate_nv(oopDesc*, G1CMOopClosure*)+0x40
V [libjvm.so+0x4c40a7] CMBitMapClosure::do_bit(unsigned long)+0xa7
V [libjvm.so+0x4bd6c4] CMTask::do_marking_step(double, bool, bool)+0x914
V [libjvm.so+0x4c3d8d] CMConcurrentMarkingTask::work(unsigned int)+0xdd
V [libjvm.so+0xacc17f] GangWorker::loop()+0xcf
V [libjvm.so+0x910de8] java_start(Thread*)+0x108
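To read a stack like this mechanically, a small parser helps. The following is a rough Python sketch, not part of any JDK tooling: the regular expression, the `Frame` type and `parse_frames` are hypothetical names of mine, and the pattern only assumes the `KIND [library+offset] symbol+offset` layout seen above. Only the first three frames are embedded to keep it short.

```python
import re
from typing import NamedTuple

# Three frames copied from the stack above; a real script would read the whole log.
FRAMES = """\
V [libjvm.so+0x4c0f9b] oopDesc::size()+0x2b
V [libjvm.so+0x4c2830] CMTask::deal_with_reference(oopDesc*)+0x180
V [libjvm.so+0x462164] ClassLoaderData::oops_do(OopClosure*, KlassClosure*, bool)+0x64
"""

# Frame-type letters, taken from the "Native frames:" legend in the log.
FRAME_KINDS = {
    "J": "compiled Java code",
    "j": "interpreted",
    "V": "VM code",
    "v": "VM code",
    "C": "native code",
}

# Assumed line layout: KIND [library+0xOFFSET] symbol+0xOFFSET
FRAME_RE = re.compile(
    r"^(?P<kind>[JjVvC])\s+\[(?P<lib>[^+\]]+)\+(?P<offset>0x[0-9a-fA-F]+)\]\s+"
    r"(?P<symbol>.+)\+(?P<sym_offset>0x[0-9a-fA-F]+)$"
)


class Frame(NamedTuple):
    kind: str
    library: str
    offset: int          # offset of the pc inside the library
    symbol: str
    symbol_offset: int   # offset of the pc inside the function


def parse_frames(text):
    """Parse native frame lines, silently skipping lines that do not match."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append(
                Frame(
                    m["kind"],
                    m["lib"],
                    int(m["offset"], 16),
                    m["symbol"],
                    int(m["sym_offset"], 16),
                )
            )
    return frames


frames = parse_frames(FRAMES)
top = frames[0]
print(f"faulting frame: {top.symbol} in {top.library} ({FRAME_KINDS[top.kind]})")
for caller in frames[1:]:
    print(f"  called from: {caller.symbol}")
```

Every frame in this stack is of type V inside libjvm.so, which is what supports reading the crash as happening inside the JVM rather than in application or third-party native code.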
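Running a third-party helper tool, CrashAnalysis (a crash-file analysis tool on GitHub), produced the following:
Diagnosis:
This problem was caused by an error in the JVM.
Analyze it using the problem points listed below; the OpenJDK implementation is needed to help with the analysis.
The context in the thread information also shows where in the code execution failed.
Check the internal error information in the runtime information section.
There are two main directions to investigate for this kind of error:
1. Operating system: whether system resources or system parameters caused the problem
2. Calls into third-party native libraries caused the error
If neither applies, it may be a JDK bug; try a different OS or a different JDK.
Possible problem points:
Problem module: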
# V [libjvm.so+0x4c0f9b] oopDesc::size()+0x2b
-------------------------------------------------------
Exception module:
# SIGSEGV (0xb) at pc=0x00007f9801d60f9b, pid=2944, tid=140287005390592
-------------------------------------------------------
Thread information:
Currently executing thread:
Current thread (0x00007f97fc0a6000): ConcurrentGCThread [stack: 0x00007f971d0f7000,0x00007f971d1f8000] [id=3005]
-------------------------------------------------------
Corresponding stack:
Stack: [0x00007f971d0f7000,0x00007f971d1f8000], sp=0x00007f971d1f6a30, free space=1022k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x4c0f9b] oopDesc::size()+0x2b
V [libjvm.so+0x4c2830] CMTask::deal_with_reference(oopDesc*)+0x180
V [libjvm.so+0x462164] ClassLoaderData::oops_do(OopClosure*, KlassClosure*, bool)+0x64
V [libjvm.so+0x648d40] InstanceMirrorKlass::oop_oop_iterate_nv(oopDesc*, G1CMOopClosure*)+0x40
V [libjvm.so+0x4c40a7] CMBitMapClosure::do_bit(unsigned long)+0xa7
V [libjvm.so+0x4bd6c4] CMTask::do_marking_step(double, bool, bool)+0x914
V [libjvm.so+0x4c3d8d] CMConcurrentMarkingTask::work(unsigned int)+0xdd
V [libjvm.so+0xacc17f] GangWorker::loop()+0xcf
V [libjvm.so+0x910de8] java_start(Thread*)+0x108
-------------------------------------------------------
Runtime information:
JVM exception events:
Event: 1065.499 Thread 0x00007f92c80cd800 Exception (0x00000000eee2d518) thrown at [/RE-WORK/workspace/8-2-build-linux-amd64/jdk8u45/3457/hotspot/src/share/vm/prims/jni.cpp, line 735]
Event: 1065.499 Thread 0x00007f92c80cd800 Exception (0x00000000eeef34b0) thrown at [/RE-WORK/workspace/8-2-build-linux-amd64/jdk8u45/3457/hotspot/src/share/vm/prims/jni.cpp, line 735]
Event: 1065.500 Thread 0x00007f92d400c000 Exception (0x00000000ee887148) thrown at [/RE-WORK/workspace/8-2-build-linux-amd64/jdk8u45/3457/hotspot/src/share/vm/prims/jni.cpp, line 735]
Event: 1065.501 Thread 0x00007f92d400c000 Exception (0x00000000ee888e28) thrown at [/RE-WORK/workspace/8-2-build-linux-amd64/jdk8u45/3457/hotspot/src/share/vm/prims/jni.cpp, line 735]
Event: 1065.501 Thread 0x00007f92d4060000 Exception (0x00000000ef4140c0) thrown at [/RE-WORK/workspace/8-2-build-linux-amd64/jdk8u45/3457/hotspot/src/share/vm/prims/jni.cpp, line 735]
Event: 1065.501 Thread 0x00007f92d4058800 Exception (0x00000000ee532ab0) thrown at [/RE-WORK/workspace/8-2-build-linux-amd64/jdk8u45/3457/hotspot/src/share/vm/prims/jni.cpp, line 735]
Event: 1065.502 Thread 0x00007f92d4058800 Exception (0x00000000ee534790) thrown at [/RE-WORK/workspace/8-2-build-linux-amd64/jdk8u45/3457/hotspot/src/share/vm/prims/jni.cpp, line 735]
Event: 1067.101 Thread 0x00007f95e933a800 Exception (0x00000000f891e000) thrown at [/RE-WORK/workspace/8-2-build-linux-amd64/jdk8u45/3457/hotspot/src/share/vm/prims/jni.cpp, line 709]
Event: 1093.851 Thread 0x00007f92c8034800 Implicit null exception at 0x00007f97f2817963 to 0x00007f97f2817b11
Event: 1100.826 Thread 0x00007f95e933a800 Exception (0x00000000fa3bd060) thrown at [/RE-WORK/workspace/8-2-build-linux-amd64/jdk8u45/3457/hotspot/src/share/vm/prims/jni.cpp, line 709]
-------------------------------------------------------
Compilation events:
Event: 1100.733 Thread 0x00007f97fd6cb800 nmethod 46822 0x00007f97f0ceee50 code [0x00007f97f0cef100, 0x00007f97f0cf0408]
Event: 1100.735 Thread 0x00007f97fd6c9800 nmethod 46823 0x00007f97f0cddd90 code [0x00007f97f0cde1a0, 0x00007f97f0ce20c8]
Event: 1100.805 Thread 0x00007f97fd6b4000 46824 4 java.util.concurrent.LinkedTransferQueue::xfer (292 bytes)
Event: 1100.824 Thread 0x00007f97fd6b4000 nmethod 46824 0x00007f97ee2c4350 code [0x00007f97ee2c44e0, 0x00007f97ee2c52f8]
Event: 1103.449 Thread 0x00007f97fd6d0000 46826 3 org.apache.catalina.util.LifecycleSupport::fireLifecycleEvent (59 bytes)
Event: 1103.449 Thread 0x00007f97fd6cd800 46825 3 org.apache.catalina.util.LifecycleBase::fireLifecycleEvent (10 bytes)
Event: 1103.449 Thread 0x00007f97fd6c7800 46827 3 org.apache.catalina.LifecycleEvent::
Event: 1103.451 Thread 0x00007f97fd6cd800 nmethod 46825 0x00007f97f0c7bd50 code [0x00007f97f0c7bec0, 0x00007f97f0c7c0e8]
Event: 1103.451 Thread 0x00007f97fd6c7800 nmethod 46827 0x00007f97f0d10550 code [0x00007f97f0d106e0, 0x00007f97f0d10b28]
Event: 1103.452 Thread 0x00007f97fd6d0000 nmethod 46826 0x00007f97efd957d0 code [0x00007f97efd959e0, 0x00007f97efd963a8]
-------------------------------------------------------
Event information:
Event: 1106.688 Thread 0x00007f92e8046000 DEOPT PACKING pc=0x00007f97f2a081f5 sp=0x00007f92b2cc5300
Event: 1106.688 Thread 0x00007f92e8046000 DEOPT UNPACKING pc=0x00007f97ed047633 sp=0x00007f92b2cc50c8 mode 1
Event: 1106.688 Thread 0x00007f92e8046000 DEOPT PACKING pc=0x00007f97f293d004 sp=0x00007f92b2cc55e0
Event: 1106.688 Thread 0x00007f92e8046000 DEOPT UNPACKING pc=0x00007f97ed047633 sp=0x00007f92b2cc5340 mode 1
Event: 1106.688 Thread 0x00007f92e8046000 DEOPT PACKING pc=0x00007f97f3bc3280 sp=0x00007f92b2cc5650
Event: 1106.688 Thread 0x00007f92e8046000 DEOPT UNPACKING pc=0x00007f97ed047633 sp=0x00007f92b2cc5238 mode 1
Event: 1106.690 Thread 0x00007f9648e56000 DEOPT PACKING pc=0x00007f97f41d4d8c sp=0x00007f92aa53f9f0
Event: 1106.690 Thread 0x00007f9648e56000 DEOPT UNPACKING pc=0x00007f97ed047633 sp=0x00007f92aa53f438 mode 1
Event: 1106.690 Thread 0x00007f9648e56000 DEOPT PACKING pc=0x00007f97f41d394c sp=0x00007f92aa53f900
Event: 1106.690 Thread 0x00007f9648e56000 DEOPT UNPACKING pc=0x00007f97ed047633 sp=0x00007f92aa53f3d0 mode 1
-------------------------------------------------------
System information:
Machine memory:
Memory: 4k page, physical 65733504k(5283224k free), swap 0k(0k free)
-------------------------------------------------------
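The signal line, the top frame and the stack line in the tool output above can be cross-checked against each other. The Python sketch below is only an illustration under my own assumptions: the regular expressions and variable names are mine, and subtracting the frame offset from the faulting pc gives only an inferred load address for libjvm.so, which would still have to be confirmed against the dynamic libraries section of the full crash log (not reproduced in this post).

```python
import re

# Lines copied from the tool output above.
SIGNAL_LINE = "# SIGSEGV (0xb) at pc=0x00007f9801d60f9b, pid=2944, tid=140287005390592"
TOP_FRAME = "# V [libjvm.so+0x4c0f9b] oopDesc::size()+0x2b"
STACK_LINE = (
    "Stack: [0x00007f971d0f7000,0x00007f971d1f8000], "
    "sp=0x00007f971d1f6a30, free space=1022k"
)

sig = re.search(
    r"(?P<name>SIG\w+) \((?P<num>0x[0-9a-f]+)\) at pc=(?P<pc>0x[0-9a-f]+), "
    r"pid=(?P<pid>\d+), tid=(?P<tid>\d+)",
    SIGNAL_LINE,
)
frame = re.search(r"\[(?P<lib>[^+]+)\+(?P<offset>0x[0-9a-f]+)\]", TOP_FRAME)
stack = re.search(
    r"\[(?P<low>0x[0-9a-f]+),(?P<high>0x[0-9a-f]+)\], sp=(?P<sp>0x[0-9a-f]+)",
    STACK_LINE,
)

pc = int(sig["pc"], 16)
offset = int(frame["offset"], 16)
# Inferred, not read from the log: where the library would have to be mapped
# for the faulting pc to sit at this offset inside it.
load_base = pc - offset

low, high, sp = (int(stack[k], 16) for k in ("low", "high", "sp"))
free_kb = (sp - low) // 1024  # stack grows down, so free space is sp - low

print(f"signal:    {sig['name']} ({int(sig['num'], 16)})")
print(f"library:   {frame['lib']} inferred base {load_base:#x}")
print(f"sp inside thread stack: {low <= sp < high}")
print(f"free stack: {free_kb}k")
```

With well over 1000k of stack still free, a stack overflow in the GC thread looks unlikely from these numbers alone, which fits the reading that the SIGSEGV comes from a bad reference followed during concurrent marking.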
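The analysis in 2.2.1 and 2.2.2 leads to a basic conclusion.
The JVM parameters of the running project are fine, and the CPU load and memory of the host machine are fine.
Checking the JVM source:
I read through g1ConcurrentMark.cpp (without fully understanding it) and found that the source has since changed; I could not find the issue for the bugfix there.
Searching the Oracle Java Bug Database for the deal_with_reference method of the G1 garbage collector turned up the following reply: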
Comments
This issue is duplicate of JDK-8168914 as reported. Issue observed on 8u144 b01.
This issue is already fixed in 8u152 b04.
Kindly update to latest Java version to avoid this issue - http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
Please do let us know if you still observe the issue.
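The problem is caused by the JDK version and had already been reported by others. In JDK 8 it is fixed as of update 152.
The JDK bugfix history shows that this issue, JDK-8168914: Crash in ClassLoaderData/JNIHandleBlock::oops_do during concurrent marking, is recorded in the backports list.
The cause of this JVM crash is that the current JDK version, 8u45 b14, is too old; upgrading to 8u152 b04 or later resolves it.
As of 2019-04-11 the latest version is 8u201. JDK download: https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html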
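To check other machines against this conclusion, the version comparison can be scripted. This is a minimal Python sketch under two assumptions of mine: version strings look like `8u45 b14` as written in this post, and any JDK 8 build at or after 8u152 b04 carries the fix. The helper names `parse_jdk8_build` and `has_fix` are hypothetical, and a missing build number is treated as b00.

```python
import re


def parse_jdk8_build(text):
    """Parse strings such as '8u45 b14' or '8u201' into (update, build)."""
    m = re.fullmatch(r"8u(\d+)(?:\s+b(\d+))?", text.strip())
    if m is None:
        raise ValueError("not a JDK 8 version string: " + text)
    # Assumption: no build number given means build 0.
    return int(m.group(1)), int(m.group(2) or 0)


# Fix version quoted in the bug database reply.
FIXED_IN = parse_jdk8_build("8u152 b04")


def has_fix(version):
    """True if the version is at or after the build that carries the fix."""
    return parse_jdk8_build(version) >= FIXED_IN


for version in ("8u45 b14", "8u144 b01", "8u152 b04", "8u201"):
    status = "fix included" if has_fix(version) else "affected, upgrade needed"
    print(f"{version:10s} {status}")
```

Comparing (update, build) tuples is a simplification; for anything beyond a quick inventory check, the release notes of the specific JDK build remain the authority.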
---------------------------------------------------------------------------------------------------------------------------------------------
The above is the investigation result as of 2019-04-11. Please get in touch with any questions, and watch this page for updates.
1.https://blog.csdn.net/chenssy/article/details/78271744
2.http://www.raychase.net/1459
3.https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8191009
4.https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8168914
5.https://hg.openjdk.java.net/jdk/jdk
6.https://github.com/openjdk
7.https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html