Log analysis of a Full GC episode


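Basic concepts:
Full GC: the young generation, the old generation, and the permanent generation are all collected, which means this GC was a Stop-The-World collection.
ParNew: the name of the region being collected, which depends on the collector. Here it is the ParNew collector. With the Serial collector the log shows DefNew, and with the Parallel Scavenge collector the young generation is reported as "PSYoungGen".
GC 332674.627: the number after "GC" is the time in seconds since the JVM started.
4194304K->2097152K(4194304K): usage before GC -> usage after GC (capacity of that region), followed by the time the collection took.
12708916K->11511721K(18874368K): the figures outside the inner square brackets are the usage and capacity of the whole Java heap.
CMS Perm : 62737K->62170K(262144K): permanent generation usage before and after GC (capacity), sized with -XX:PermSize and -XX:MaxPermSize.
[Times: user=16.08 sys=0.67, real=2.01 secs]: user, sys, and real have the same meaning as in the output of the Linux time command. They are the CPU time spent in user mode, the CPU time spent in kernel mode, and the wall clock time from the start of the operation to its end. On a multi-core machine the GC threads run in parallel, so user plus sys can be larger than real.
[1 CMS-initial-mark: 9414569K(14680064K): old generation usage (capacity). The capacity is the value of -Xmx minus the value of -Xmn.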
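The fields described above can be pulled out of a ParNew line with a regular expression. The following is a minimal sketch in Python, assuming the exact log layout shown in this post; the names `YOUNG_GC`, `parse_young_gc`, and `cpu_to_wall_ratio` are made up for illustration and are not part of any JVM tool. The sample line is copied from the log in this post.

```python
import re

# Matches a ParNew young-collection line in the exact layout shown in this post.
YOUNG_GC = re.compile(
    r"\[GC (?P<uptime>[\d.]+): "
    r"\[ParNew: (?P<young_before>\d+)K->(?P<young_after>\d+)K\((?P<young_cap>\d+)K\), [\d.]+ secs\] "
    r"(?P<heap_before>\d+)K->(?P<heap_after>\d+)K\((?P<heap_cap>\d+)K\), (?P<pause>[\d.]+) secs\] "
    r"\[Times: user=(?P<user>[\d.]+) sys=(?P<sys>[\d.]+), real=(?P<real>[\d.]+) secs\]"
)

SIZE_FIELDS = ("young_before", "young_after", "young_cap",
               "heap_before", "heap_after", "heap_cap")


def parse_young_gc(line):
    """Return the fields of one ParNew line as a dict, or None if it does not match."""
    m = YOUNG_GC.search(line)
    if m is None:
        return None
    fields = {}
    for name, text in m.groupdict().items():
        # Sizes are whole kilobytes; everything else is a time in seconds.
        fields[name] = int(text) if name in SIZE_FIELDS else float(text)
    return fields


def cpu_to_wall_ratio(fields):
    """(user + sys) / real: a rough count of cores kept busy during the pause."""
    return (fields["user"] + fields["sys"]) / fields["real"]


SAMPLE = ("2016-03-07T16:19:41.876+0800: 332665.291: [GC 332665.291: "
          "[ParNew: 4194304K->2097152K(4194304K), 2.0051080 secs] "
          "12708916K->11511721K(18874368K), 2.0059970 secs] "
          "[Times: user=16.08 sys=0.67, real=2.01 secs]")

gc = parse_young_gc(SAMPLE)
print(gc["young_before"], gc["young_after"], gc["heap_after"])
print(round(cpu_to_wall_ratio(gc), 1))
```

For the sample line the ratio comes out at a little over 8, which matches the remark that user plus sys can exceed real when several GC threads work in parallel.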

The log at the time:
2016-03-07T16:19:41.876+0800: 332665.291: [GC 332665.291: [ParNew: 4194304K->2097152K(4194304K), 2.0051080 secs] 12708916K->11511721K(18874368K), 2.0059970 secs] [Times: user=16.08 sys=0.67, real=2.01 secs]
2016-03-07T16:19:43.886+0800: 332667.301: [GC [1 CMS-initial-mark: 9414569K(14680064K)] 11550068K(18874368K), 3.3386230 secs] [Times: user=3.33 sys=0.00, real=3.34 secs]
2016-03-07T16:19:47.226+0800: 332670.640: [CMS-concurrent-mark-start]
2016-03-07T16:19:51.212+0800: 332674.626: [GC 332674.627: [ParNew: 4194304K->2097152K(4194304K), 2.1854100 secs] 13608873K->12438922K(18874368K), 2.1863460 secs] [Times: user=17.75 sys=0.73, real=2.18 secs]
2016-03-07T16:19:57.435+0800: 332680.849: [GC 332680.850: [ParNew: 4194304K->2097152K(4194304K), 2.2184010 secs] 14536074K->13394507K(18874368K), 2.2193090 secs] [Times: user=17.90 sys=0.73, real=2.22 secs]
2016-03-07T16:20:03.112+0800: 332686.526: [GC 332686.527: [ParNew: 4194304K->1828266K(4194304K), 1.9129620 secs] 15491659K->14026818K(18874368K), 1.9138510 secs] [Times: user=15.57 sys=0.64, real=1.91 secs]
2016-03-07T16:20:09.003+0800: 332692.418: [GC 332692.418: [ParNew: 3925418K->2097152K(4194304K), 2.0599170 secs] 16123970K->15235066K(18874368K), 2.0608010 secs] [Times: user=17.08 sys=0.60, real=2.06 secs]
2016-03-07T16:20:14.993+0800: 332698.407: [GC 332698.408: [ParNew: 4194304K->2097152K(4194304K), 1.9636240 secs] 17332218K->16063831K(18874368K), 1.9645380 secs] [Times: user=16.40 sys=0.56, real=1.96 secs]
2016-03-07T16:20:20.981+0800: 332704.396: [GC 332704.396: [ParNew: 4194304K->4194304K(4194304K), 0.0000300 secs]332704.396: [CMS2016-03-07T16:20:26.292+0800: 332709.706: [CMS-concurrent-mark: 28.673/39.066 secs] [Times: user=198.02 sys=3.86, real=39.06 secs]
 (concurrent mode failure): 13966679K->14680064K(14680064K), 83.3313810 secs] 18160983K->15733347K(18874368K), [CMS Perm : 62737K->62170K(262144K)], 83.3323760 secs] [Times: user=93.59 sys=0.21, real=83.32 secs]
2016-03-07T16:21:44.355+0800: 332787.769: [GC [1 CMS-initial-mark: 14680064K(14680064K)] 15794750K(18874368K), 1.9568130 secs] [Times: user=1.96 sys=0.00, real=1.96 secs]
2016-03-07T16:21:46.312+0800: 332789.727: [CMS-concurrent-mark-start]
2016-03-07T16:22:06.161+0800: 332809.575: [CMS-concurrent-mark: 19.485/19.849 secs] [Times: user=80.27 sys=0.34, real=19.85 secs]
2016-03-07T16:22:06.161+0800: 332809.576: [CMS-concurrent-preclean-start]
2016-03-07T16:22:14.802+0800: 332818.216: [Full GC 332818.217: [CMS2016-03-07T16:22:39.945+0800: 332843.359: [CMS-concurrent-preclean: 33.775/33.784 secs] [Times: user=43.01 sys=0.00, real=33.78 secs]
 (concurrent mode failure): 14680064K->14680064K(14680064K), 107.8068800 secs] 18874368K->16636083K(18874368K), [CMS Perm : 62208K->62189K(262144K)], 107.8080400 secs] [Times: user=107.84 sys=0.00, real=107.79 secs]
2016-03-07T16:24:03.175+0800: 332926.590: [GC [1 CMS-initial-mark: 14680064K(14680064K)] 16799809K(18874368K), 3.6179980 secs] [Times: user=3.62 sys=0.00, real=3.62 secs]
2016-03-07T16:24:06.794+0800: 332930.208: [CMS-concurrent-mark-start]
2016-03-07T16:24:22.329+0800: 332945.744: [CMS-concurrent-mark: 15.519/15.535 secs] [Times: user=62.76 sys=0.00, real=15.54 secs]
2016-03-07T16:24:22.330+0800: 332945.744: [CMS-concurrent-preclean-start]
2016-03-07T16:24:33.575+0800: 332956.989: [Full GC 332956.990: [CMS2016-03-07T16:24:56.657+0800: 332980.071: [CMS-concurrent-preclean: 34.312/34.327 secs] [Times: user=46.40 sys=0.00, real=34.32 secs]
 (concurrent mode failure): 14680064K->14680063K(14680064K), 109.5823190 secs] 18874367K->17654739K(18874368K), [CMS Perm : 62193K->62189K(262144K)], 109.5831290 secs] [Times: user=109.65 sys=0.00, real=109.57 secs]
2016-03-07T16:26:23.920+0800: 333067.335: [GC [1 CMS-initial-mark: 14680063K(14680064K)] 17712480K(18874368K), 5.1742070 secs] [Times: user=5.18 sys=0.00, real=5.17 secs]
2016-03-07T16:26:29.095+0800: 333072.510: [CMS-concurrent-mark-start]
2016-03-07T16:26:44.112+0800: 333087.526: [Full GC 333087.526: [CMS2016-03-07T16:26:44.585+0800: 333087.999: [CMS-concurrent-mark: 15.467/15.490 secs] [Times: user=61.61 sys=0.00, real=15.49 secs]
 (concurrent mode failure): 14680063K->14680064K(14680064K), 77.7754890 secs] 18874367K->15185441K(18874368K), [CMS Perm : 62193K->62179K(262144K)], 77.7762560 secs] [Times: user=78.75 sys=0.00, real=77.76 secs]
2016-03-07T16:28:01.996+0800: 333165.410: [GC [1 CMS-initial-mark: 14680064K(14680064K)] 15279065K(18874368K), 1.2033830 secs] [Times: user=1.21 sys=0.00, real=1.20 secs]
2016-03-07T16:28:03.200+0800: 333166.614: [CMS-concurrent-mark-start]
2016-03-07T16:28:22.552+0800: 333185.966: [CMS-concurrent-mark: 19.330/19.352 secs] [Times: user=81.59 sys=0.00, real=19.35 secs]
2016-03-07T16:28:22.552+0800: 333185.966: [CMS-concurrent-preclean-start]
2016-03-07T16:28:35.178+0800: 333198.592: [Full GC 333198.593: [CMS2016-03-07T16:28:57.794+0800: 333221.208: [CMS-concurrent-preclean: 35.234/35.242 secs] [Times: user=50.21 sys=0.00, real=35.24 secs]
 (concurrent mode failure): 14680064K->14680063K(14680064K), 103.5000360 secs] 18874367K->15402236K(18874368K), [CMS Perm : 62184K->62180K(262144K)], 103.5009710 secs] [Times: user=103.52 sys=0.00, real=103.48 secs]
2016-03-07T16:30:18.751+0800: 333302.165: [GC [1 CMS-initial-mark: 14680063K(14680064K)] 15517749K(18874368K), 1.5607860 secs] [Times: user=1.57 sys=0.00, real=1.56 secs]
2016-03-07T16:30:20.312+0800: 333303.727: [CMS-concurrent-mark-start]
2016-03-07T16:30:42.220+0800: 333325.634: [CMS-concurrent-mark: 21.881/21.908 secs] [Times: user=88.56 sys=0.00, real=21.91 secs]
2016-03-07T16:30:42.220+0800: 333325.635: [CMS-concurrent-preclean-start]
2016-03-07T16:30:50.347+0800: 333333.761: [Full GC 333333.762: [CMS2016-03-07T16:31:16.589+0800: 333360.004: [CMS-concurrent-preclean: 34.361/34.369 secs] [Times: user=42.84 sys=0.00, real=34.36 secs]
 (concurrent mode failure): 14680063K->14680063K(14680064K), 106.9292860 secs] 18874367K->15439329K(18874368K), [CMS Perm : 62189K->62180K(262144K)], 106.9302400 secs] [Times: user=106.94 sys=0.00, real=106.92 secs]
2016-03-07T16:32:37.752+0800: 333441.166: [GC [1 CMS-initial-mark: 14680063K(14680064K)] 15681799K(18874368K), 1.9164530 secs]
2016-03-07T16:42:31.839+0800: 10.199: [GC 10.199: [ParNew: 2097152K->15901K(4194304K), 0.0188800 secs] 2097152K->15901K(18874368K), 0.0190050 secs] [Times: user=0.13 sys=0.01, real=0.02 secs]
2016-03-07T16:52:30.715+0800: 609.075: [GC 609.076: [ParNew: 2099051K->291937K(4194304K), 0.7579600 secs] 2099051K->291937K(18874368K), 0.7583510 secs] [Times: user=0.85 sys=0.37, real=0.76 secs]
2016-03-07T17:02:02.482+0800: 1180.842: [GC 1180.843: [ParNew: 2389089K->374558K(4194304K), 0.7744590 secs] 2389089K->374558K(18874368K), 0.7748880 secs] [Times: user=0.84 sys=0.36, real=0.78 secs]
Heap
 par new generation   total 4194304K, used 1872694K [0x00000002f0000000, 0x0000000470000000, 0x0000000470000000)
  eden space 2097152K,  71% used [0x00000002f0000000, 0x000000034b705f88, 0x0000000370000000)
  from space 2097152K,  17% used [0x00000003f0000000, 0x0000000406dc7908, 0x0000000470000000)
  to   space 2097152K,   0% used [0x0000000370000000, 0x0000000370000000, 0x00000003f0000000)
 concurrent mark-sweep generation total 14680064K, used 0K [0x0000000470000000, 0x00000007f0000000, 0x00000007f0000000)
 concurrent-mark-sweep perm gen total 262144K, used 50238K [0x00000007f0000000, 0x0000000800000000, 0x0000000800000000)




The JVM parameters at the time:
-server -Xms20g -Xmx20g -Xmn6g -XX:PermSize=256m -XX:MaxPermSize=256m -Xss256K -XX:+DisableExplicitGC -XX:SurvivorRatio=1 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:LargePageSizeInBytes=128m -XX:+UseFastAccessorMethods -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=60 -Dconf.type=confcenter -Djava.awt.headless=true -Dfile.encoding=gbk
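The capacities that appear in the log follow from these flags by simple arithmetic. The sketch below is only an illustration in Python: the helper name `heap_layout` is hypothetical, and it assumes the usual reading that -XX:SurvivorRatio is the ratio of eden to one survivor space and that the reported young and heap totals leave out one survivor space. A real JVM may round region sizes for alignment.

```python
KB_PER_GB = 1024 * 1024


def heap_layout(xmx_gb, xmn_gb, survivor_ratio, cms_fraction):
    """Derive region sizes in KB from -Xmx, -Xmn, SurvivorRatio and
    CMSInitiatingOccupancyFraction (all values taken from the flag line)."""
    young_k = xmn_gb * KB_PER_GB
    # Young generation = eden + 2 survivors, with eden = survivor_ratio * survivor.
    survivor_k = young_k // (survivor_ratio + 2)
    eden_k = young_k - 2 * survivor_k
    old_k = (xmx_gb - xmn_gb) * KB_PER_GB
    return {
        "eden_k": eden_k,
        "survivor_k": survivor_k,
        "old_k": old_k,
        # Only one survivor space holds objects at a time, so the log
        # reports eden + one survivor as the young "total".
        "young_reported_k": eden_k + survivor_k,
        "heap_reported_k": eden_k + survivor_k + old_k,
        # Old-generation usage at which a CMS cycle is started.
        "cms_trigger_k": old_k * cms_fraction // 100,
    }


layout = heap_layout(xmx_gb=20, xmn_gb=6, survivor_ratio=1, cms_fraction=60)
for name, value in layout.items():
    print(name, value)
```

With these inputs the sketch gives 4194304K for the reported young generation, 14680064K for the old generation, and 18874368K for the reported heap, the same capacities that appear in the log. The 60% threshold works out to 8808038K, and the first CMS-initial-mark line in the log shows the old generation at 9414569K, already past it.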


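The GC process:
The ParNew collector is a parallel GC for the young generation, built to work together with the CMS GC of the old generation. CMS (concurrent mark sweep) is a concurrent GC: its pauses are short, but a collection needs three marking passes, so the whole cycle takes a long time.
When a CMS GC is triggered, the three marking passes begin:
First, CMS-initial-mark. This step pauses the whole application and marks the old generation objects that are directly reachable from the GC roots.
Second, CMS-concurrent-mark. The application threads resume, and the collector concurrently traces from the objects already marked to mark everything reachable from them.
Third, the final mark (remark). The whole application is paused again, because the program may have changed reference relationships during step two, so the affected objects have to be marked again.
Finally, sweep. All application threads resume and the unmarked objects are reclaimed. This is what a normal cycle looks like; the log here contains an error, which is analyzed further down.
Because CMS produces fragmentation, -XX:+UseCMSCompactAtFullCollection is used to compact memory on every Full GC. The compaction also pauses the whole application.
By default CMS runs several concurrent threads (the count is (number of parallel GC threads + 3) / 4). -XX:CMSInitiatingOccupancyFraction=60 makes a CMS GC start once old generation usage reaches 60%.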
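As a compact restatement of the cycle just described, here is a small Python sketch that records which steps stop the application and evaluates the thread-count formula quoted above. The names `CMS_PHASES`, `stop_the_world_phases`, and `concurrent_gc_threads` are illustrative only, and the thread count of 8 is an assumed example value, not something stated in the log.

```python
# (phase name, stops application threads?) in the order described in the text.
CMS_PHASES = [
    ("initial mark", True),
    ("concurrent mark", False),
    ("remark", True),
    ("concurrent sweep", False),
]


def stop_the_world_phases(phases=CMS_PHASES):
    """Names of the phases during which the application is paused."""
    return [name for name, stops_world in phases if stops_world]


def concurrent_gc_threads(parallel_gc_threads):
    """Thread count from the formula in the text: (parallel GC threads + 3) / 4,
    using integer division."""
    return (parallel_gc_threads + 3) // 4


print(stop_the_world_phases())
# 8 parallel GC threads is an assumption made for this example.
print(concurrent_gc_threads(8))
```

Only two of the four steps pause the application, which is why a healthy CMS cycle has short pauses even though the cycle as a whole is long.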


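Looking at the log more closely: the overlapping entries (several log records on one line) are caused by concurrent operation.
The CMS-concurrent-preclean phase rescans the references of objects that were newly created in the heap during the concurrent mark phase, or that were promoted from the young generation to the old generation, in order to shorten the remark step.
concurrent mode failure triggered the Full GC. This error occurs when objects need to be placed in the old generation while a CMS GC is running and the old generation does not have enough space for them.
Countermeasures: enlarge the old generation, or lower the occupancy ratio at which the concurrent GC starts. In addition, JDK 5 and JDK 6 have a JDK bug that makes the sweep start long after the CMS remark has finished; this can be avoided by setting -XX:CMSMaxAbortablePrecleanTime=5 (in ms).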
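The concurrent mode failure records can be picked out of the log mechanically. The sketch below, in Python, assumes the record layout seen in this post; `CMF` and `find_failures` are hypothetical names, and the two sample records are copied from the log.

```python
import re

# Matches the tail of a collection that fell back to a stop-the-world
# Full GC after a concurrent mode failure, in the layout used in this post.
CMF = re.compile(
    r"\(concurrent mode failure\): "
    r"(?P<old_before>\d+)K->(?P<old_after>\d+)K\((?P<old_cap>\d+)K\), [\d.]+ secs\] "
    r"(?P<heap_before>\d+)K->(?P<heap_after>\d+)K\(\d+K\), "
    r"\[CMS Perm : \d+K->\d+K\(\d+K\)\], (?P<pause>[\d.]+) secs\]"
)


def find_failures(log_text):
    """Return one dict per concurrent mode failure found in the log text."""
    failures = []
    for m in CMF.finditer(log_text):
        failures.append({
            "old_before_k": int(m.group("old_before")),
            "old_after_k": int(m.group("old_after")),
            "old_cap_k": int(m.group("old_cap")),
            "heap_after_k": int(m.group("heap_after")),
            "pause_s": float(m.group("pause")),
        })
    return failures


LOG = """\
 (concurrent mode failure): 13966679K->14680064K(14680064K), 83.3313810 secs] 18160983K->15733347K(18874368K), [CMS Perm : 62737K->62170K(262144K)], 83.3323760 secs] [Times: user=93.59 sys=0.21, real=83.32 secs]
 (concurrent mode failure): 14680064K->14680064K(14680064K), 107.8068800 secs] 18874368K->16636083K(18874368K), [CMS Perm : 62208K->62189K(262144K)], 107.8080400 secs] [Times: user=107.84 sys=0.00, real=107.79 secs]
"""

failures = find_failures(LOG)
for f in failures:
    full = f["old_after_k"] == f["old_cap_k"]
    print(f["pause_s"], "s pause, old generation still full:", full)
```

In both sample records the old generation is at its full capacity of 14680064K after the collection, so the fallback collection did not free any room there.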


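Overall, the log shows that while a CMS GC was in progress, young generation GCs kept running and promoting objects into the old generation. Because the CMS GC had not finished yet, the old generation ran out of space, which raised concurrent mode failure and led to a Full GC. Going by the numbers, the old generation had almost exactly 5 GB free when the CMS GC started, which in principle should have been enough for the objects coming from the young generation, but allocation may have failed because of fragmentation. The Full GC then failed and the whole program crashed; after that the application restarted automatically and returned to normal.
After recovering, the program printed the heap state: the young generation is 6 GB in total (eden 2 GB, two survivor spaces of 2 GB each), the old generation is 14 GB, and the permanent generation is 256 MB.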
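The numbers behind this summary can be checked directly from the ParNew records: subtracting the young generation usage after a collection from the heap usage after it gives the old generation usage at that moment. The Python sketch below does this for the six young collections logged between the first CMS-initial-mark and the first concurrent mode failure. The tuples are copied from the log by hand, and the function names are illustrative.

```python
OLD_CAPACITY_K = 14680064

# (JVM uptime in s, young usage after GC in K, heap usage after GC in K),
# copied from the six ParNew records before the first failure.
YOUNG_GCS = [
    (332665.291, 2097152, 11511721),
    (332674.627, 2097152, 12438922),
    (332680.850, 2097152, 13394507),
    (332686.527, 1828266, 14026818),
    (332692.418, 2097152, 15235066),
    (332698.408, 2097152, 16063831),
]


def old_gen_usage(records):
    """Old-generation usage after each young GC: heap after minus young after."""
    return [heap_after - young_after for _, young_after, heap_after in records]


def average_growth(usage):
    """Average growth of old-generation usage per young collection, in K."""
    return (usage[-1] - usage[0]) / (len(usage) - 1)


usage = old_gen_usage(YOUNG_GCS)
headroom = OLD_CAPACITY_K - usage[-1]

print("old generation after each young GC (K):", usage)
print("average growth per young GC (K):", round(average_growth(usage)))
print("headroom after the last young GC (K):", headroom)
```

The first value, 9414569K, equals the figure on the CMS-initial-mark line, and the last, 13966679K, equals the starting figure on the first concurrent mode failure line. On these numbers the old generation grew by roughly 0.87 GB per young collection, and the headroom left after the last one (about 0.68 GB) was smaller than that, which is consistent with the old generation running out of room before the concurrent cycle finished. The log alone does not show whether fragmentation also played a part.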
