Preface
This article takes a first look at ShenandoahGC, newly introduced in JDK 12.
ShenandoahGC
Shenandoah is a concurrent and parallel garbage collector.
- Like ZGC, it targets low pause times; ZGC is built on colored pointers, whereas Shenandoah GC is built on Brooks pointers.
- Compared with G1 GC: G1's evacuation is parallel but not concurrent, whereas Shenandoah's evacuation is concurrent, which reduces pause times further.
- Like G1 GC, ShenandoahGC is region-based; unlike G1, it is not logically generational, so there is no young/old distinction.
GC cycle
A ShenandoahGC cycle consists of the following main phases:
Snapshot-at-the-beginning concurrent mark
This phase comprises Init Mark (pause), Concurrent Mark, and Final Mark (pause). Marking uses the tri-color algorithm: White (not yet visited), Gray (visited, but references are not scanned yet), and Black (visited, and fully scanned).
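The tri-color scheme can be sketched as a worklist traversal. What follows is a minimal single-threaded illustration, not Shenandoah's implementation: the `TriColorMarking` class, its `Node` type, and the `mark` method are hypothetical names, and the sketch ignores the running application and the SATB barrier that a concurrent collector needs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class TriColorMarking {
    enum Color { WHITE, GRAY, BLACK }

    // Hypothetical heap object: a color plus its outgoing references.
    static final class Node {
        final String name;
        Color color = Color.WHITE;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Marks everything reachable from the roots and returns the number of black nodes.
    static int mark(List<Node> roots) {
        Deque<Node> gray = new ArrayDeque<>();
        for (Node root : roots) {
            if (root.color == Color.WHITE) {
                root.color = Color.GRAY;      // visited, references not scanned yet
                gray.push(root);
            }
        }
        int live = 0;
        while (!gray.isEmpty()) {
            Node node = gray.pop();
            for (Node child : node.refs) {
                if (child.color == Color.WHITE) {
                    child.color = Color.GRAY;
                    gray.push(child);
                }
            }
            node.color = Color.BLACK;         // visited and fully scanned
            live++;
        }
        return live;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);
        b.refs.add(c);
        c.refs.add(a);                        // cycle a -> b -> c -> a; d is unreachable
        int live = mark(List.of(a));
        System.out.println("live=" + live + ", d=" + d.color);
    }
}
```

Nodes that are still White when the worklist drains are unreachable and can be reclaimed.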
Concurrent evacuation
This is the evacuation phase that differs from G1: it runs concurrently with the application. Objects are copied using Brooks pointers (object version change with additional atomically changed indirection).
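The "atomically changed indirection" idea can be illustrated with a small simulation. This is only a sketch under assumptions: `BrooksPointerSketch`, `Obj`, `resolve`, and `evacuate` are hypothetical names, and an `AtomicReference` field stands in for the forwarding word that the JVM keeps next to each object.

```java
import java.util.concurrent.atomic.AtomicReference;

final class BrooksPointerSketch {
    // Hypothetical object layout: a forwarding slot plus a payload.
    static final class Obj {
        final AtomicReference<Obj> forwardee = new AtomicReference<>();
        volatile int payload;

        Obj(int payload) {
            this.payload = payload;
            forwardee.set(this);              // initially every object forwards to itself
        }
    }

    // Barrier stand-in: every access goes through the forwarding pointer.
    static Obj resolve(Obj o) {
        return o.forwardee.get();
    }

    // Copies the object, then installs the copy with a single CAS.
    // If another thread wins the race, its copy is used and ours is discarded.
    static Obj evacuate(Obj from) {
        Obj current = from.forwardee.get();
        if (current != from) {
            return current;                   // already evacuated
        }
        Obj copy = new Obj(from.payload);
        if (from.forwardee.compareAndSet(from, copy)) {
            return copy;
        }
        return from.forwardee.get();
    }

    public static void main(String[] args) {
        Obj old = new Obj(42);
        Obj copy = evacuate(old);
        resolve(old).payload = 43;            // a write through the stale reference lands in the copy
        System.out.println(copy.payload);     // prints 43
    }
}
```

Because stale references still reach the new copy through the indirection, updating those references can be deferred to a later phase.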
Concurrent update references (optional)
This phase comprises Init Update Refs (pause), Concurrent Update Refs, and Final Update Refs (pause).
Concurrent cleanup may run after either Final Mark or Final Update Refs; it collects garbage and reclaims regions.
Related flags
Flags prefixed with Shenandoah
bool ShenandoahAcmpBarrier = true {diagnostic} {default}
bool ShenandoahAllocFailureALot = false {diagnostic} {default}
uintx ShenandoahAllocSpikeFactor = 5 {experimental} {default}
intx ShenandoahAllocationStallThreshold = 10000 {diagnostic} {default}
uintx ShenandoahAllocationThreshold = 0 {experimental} {default}
bool ShenandoahAllocationTrace = false {diagnostic} {default}
bool ShenandoahAllowMixedAllocs = true {diagnostic} {default}
bool ShenandoahAlwaysClearSoftRefs = false {experimental} {default}
bool ShenandoahAlwaysPreTouch = false {diagnostic} {default}
bool ShenandoahCASBarrier = true {diagnostic} {default}
bool ShenandoahCloneBarrier = true {diagnostic} {default}
uintx ShenandoahCodeRootsStyle = 2 {experimental} {default}
bool ShenandoahCommonGCStateLoads = false {experimental} {default}
bool ShenandoahConcurrentScanCodeRoots = true {experimental} {default}
uintx ShenandoahControlIntervalAdjustPeriod = 1000 {experimental} {default}
uintx ShenandoahControlIntervalMax = 10 {experimental} {default}
uintx ShenandoahControlIntervalMin = 1 {experimental} {default}
uintx ShenandoahCriticalFreeThreshold = 1 {experimental} {default}
bool ShenandoahDecreaseRegisterPressure = false {diagnostic} {default}
bool ShenandoahDegeneratedGC = true {diagnostic} {default}
bool ShenandoahDontIncreaseWBFreq = true {experimental} {default}
bool ShenandoahElasticTLAB = true {diagnostic} {default}
uintx ShenandoahEvacAssist = 10 {experimental} {default}
uintx ShenandoahEvacReserve = 5 {experimental} {default}
bool ShenandoahEvacReserveOverflow = true {experimental} {default}
double ShenandoahEvacWaste = 1.200000 {experimental} {default}
uintx ShenandoahFreeThreshold = 10 {experimental} {default}
uintx ShenandoahFullGCThreshold = 3 {experimental} {default}
ccstr ShenandoahGCHeuristics = adaptive {experimental} {default}
uintx ShenandoahGarbageThreshold = 60 {experimental} {default}
uintx ShenandoahGuaranteedGCInterval = 300000 {experimental} {default}
size_t ShenandoahHeapRegionSize = 0 {experimental} {default}
bool ShenandoahHumongousMoves = true {experimental} {default}
intx ShenandoahHumongousThreshold = 100 {experimental} {default}
uintx ShenandoahImmediateThreshold = 90 {experimental} {default}
bool ShenandoahImplicitGCInvokesConcurrent = true {experimental} {default}
uintx ShenandoahInitFreeThreshold = 70 {experimental} {default}
bool ShenandoahKeepAliveBarrier = true {diagnostic} {default}
uintx ShenandoahLearningSteps = 5 {experimental} {default}
bool ShenandoahLoopOptsAfterExpansion = true {experimental} {default}
uintx ShenandoahMarkLoopStride = 1000 {experimental} {default}
intx ShenandoahMarkScanPrefetch = 32 {experimental} {default}
size_t ShenandoahMaxRegionSize = 33554432 {experimental} {default}
uintx ShenandoahMergeUpdateRefsMaxGap = 200 {experimental} {default}
uintx ShenandoahMergeUpdateRefsMinGap = 100 {experimental} {default}
uintx ShenandoahMinFreeThreshold = 10 {experimental} {default}
size_t ShenandoahMinRegionSize = 262144 {experimental} {default}
bool ShenandoahOOMDuringEvacALot = false {diagnostic} {default}
bool ShenandoahOptimizeInstanceFinals = false {experimental} {default}
bool ShenandoahOptimizeStableFinals = false {experimental} {default}
bool ShenandoahOptimizeStaticFinals = true {experimental} {default}
bool ShenandoahPacing = true {experimental} {default}
uintx ShenandoahPacingCycleSlack = 10 {experimental} {default}
uintx ShenandoahPacingIdleSlack = 2 {experimental} {default}
uintx ShenandoahPacingMaxDelay = 10 {experimental} {default}
double ShenandoahPacingSurcharge = 1.100000 {experimental} {default}
uintx ShenandoahParallelRegionStride = 1024 {experimental} {default}
uint ShenandoahParallelSafepointThreads = 4 {experimental} {default}
bool ShenandoahPreclean = true {experimental} {default}
bool ShenandoahReadBarrier = true {diagnostic} {default}
uintx ShenandoahRefProcFrequency = 5 {experimental} {default}
bool ShenandoahRegionSampling = true {experimental} {command line}
int ShenandoahRegionSamplingRate = 40 {experimental} {default}
bool ShenandoahSATBBarrier = true {diagnostic} {default}
uintx ShenandoahSATBBufferFlushInterval = 100 {experimental} {default}
size_t ShenandoahSATBBufferSize = 1024 {experimental} {default}
bool ShenandoahStoreCheck = false {diagnostic} {default}
bool ShenandoahStoreValEnqueueBarrier = false {diagnostic} {default}
bool ShenandoahStoreValReadBarrier = true {diagnostic} {default}
bool ShenandoahSuspendibleWorkers = false {experimental} {default}
size_t ShenandoahTargetNumRegions = 2048 {experimental} {default}
bool ShenandoahTerminationTrace = false {diagnostic} {default}
bool ShenandoahUncommit = true {experimental} {default}
uintx ShenandoahUncommitDelay = 300000 {experimental} {default}
uintx ShenandoahUnloadClassesFrequency = 0 {experimental} {default}
ccstr ShenandoahUpdateRefsEarly = adaptive {experimental} {default}
bool ShenandoahVerify = false {diagnostic} {default}
intx ShenandoahVerifyLevel = 4 {diagnostic} {default}
bool ShenandoahWriteBarrier = true {diagnostic} {default}
Some of these are for diagnostic use, for example ShenandoahAcmpBarrier, ShenandoahAllocFailureALot, and ShenandoahAllocationStallThreshold.
Heuristics-related flags
ccstr ShenandoahGCHeuristics = adaptive {experimental} {default}
uintx ShenandoahInitFreeThreshold = 70 {experimental} {default}
uintx ShenandoahMinFreeThreshold = 10 {experimental} {default}
uintx ShenandoahAllocSpikeFactor = 5 {experimental} {default}
uintx ShenandoahGarbageThreshold = 60 {experimental} {default}
uintx ShenandoahFreeThreshold = 10 {experimental} {default}
uintx ShenandoahAllocationThreshold = 0 {experimental} {default}
ccstr ShenandoahUpdateRefsEarly = adaptive {experimental} {default}
Heuristics tell Shenandoah when to start a GC cycle. ShenandoahGCHeuristics selects the strategy; the possible values are adaptive (the default), static, compact, passive (for diagnostics), and aggressive (for diagnostics).
- adaptive decides when to start a GC cycle mainly through ShenandoahInitFreeThreshold (Initial remaining free heap threshold for learning steps), ShenandoahMinFreeThreshold (free space threshold at which heuristics triggers the GC unconditionally), ShenandoahAllocSpikeFactor (How much heap to reserve for absorbing allocation spikes), and ShenandoahGarbageThreshold (Sets the percentage of garbage a region need to contain before it can be marked for collection).
- static decides whether to start a GC cycle based on heap occupancy and allocation pressure. Related flags: ShenandoahFreeThreshold (Set the percentage of free heap at which a GC cycle is started), ShenandoahAllocationThreshold (Set percentage of memory allocated since last GC cycle before a new GC cycle is started), and ShenandoahGarbageThreshold.
- compact runs continuously: as long as allocation has happened, a new GC cycle starts as soon as the previous one finishes. Related flags: ConcGCThreads (Trim down the number of concurrent GC threads to make more room for application to run) and ShenandoahAllocationThreshold.
- passive is entirely passive: a stop-the-world collection is triggered only when memory runs out. It is normally used for diagnostics.
- aggressive is entirely active: a new GC cycle starts as soon as the previous one finishes (somewhat like compact), but it evacuates all live objects. It is normally used for diagnostics.
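To make the two static triggers concrete, here is a simplified sketch of the decision as described above. It is not the HotSpot code: `StaticTriggerSketch` and `shouldStartCycle` are hypothetical names, the parameters merely mirror ShenandoahFreeThreshold and ShenandoahAllocationThreshold, and treating an allocation threshold of 0 as "disabled" is an assumption made for this illustration.

```java
final class StaticTriggerSketch {
    // Returns true when a new GC cycle should start.
    // All sizes use the same unit (for example megabytes); thresholds are percentages of capacity.
    static boolean shouldStartCycle(long capacity,
                                    long free,
                                    long allocatedSinceLastGc,
                                    int freeThresholdPercent,
                                    int allocationThresholdPercent) {
        // Heap occupancy: free space has dropped below the configured share of the heap.
        boolean lowFree = free * 100 < capacity * freeThresholdPercent;

        // Allocation pressure: too much was allocated since the last cycle.
        // Assumption for this sketch: a threshold of 0 disables this check.
        boolean allocationPressure = allocationThresholdPercent > 0
                && allocatedSinceLastGc * 100 > capacity * allocationThresholdPercent;

        return lowFree || allocationPressure;
    }

    public static void main(String[] args) {
        // Figures loosely based on the log in this article: 2048M heap, 1813M free.
        System.out.println(shouldStartCycle(2048, 1813, 132, 10, 0)); // prints false
        System.out.println(shouldStartCycle(2048, 100, 132, 10, 0));  // prints true
    }
}
```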
Failure Modes
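When an allocation failure occurs, Shenandoah handles it through a graceful degradation ladder:
- Pacing (<10 ms)
ShenandoahPacing is enabled by default. The pacer stalls allocating threads when the GC is not keeping up and releases them once the GC catches up. The stall is bounded: ShenandoahPacingMaxDelay (in milliseconds) sets the maximum wait, after which the allocation proceeds anyway. Under heavy allocation pressure the pacer can no longer help, and the next step is taken.
- Degenerated GC (<100 ms)
ShenandoahDegeneratedGC is enabled by default. In a Degenerated cycle, the number of threads Shenandoah uses is taken from ParallelGCThreads rather than ConcGCThreads.
- Full GC (>100 ms)
If there is still not enough memory after a Degenerated GC, Shenandoah enters a Full GC cycle, which compacts as much as possible and frees memory to avoid an OOM.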
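The bounded stall in the pacing step can be sketched with a semaphore. This is an illustration of the idea only, not Shenandoah's pacer (which works with the tax rates visible in the log): `PacerSketch`, `paceForAlloc`, and `reportProgress` are hypothetical names, and `maxDelayMillis` merely plays the role of ShenandoahPacingMaxDelay.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

final class PacerSketch {
    private final Semaphore budget;           // allocation budget granted by GC progress
    private final long maxDelayMillis;        // upper bound on how long a thread is stalled

    PacerSketch(int initialBudget, long maxDelayMillis) {
        this.budget = new Semaphore(initialBudget);
        this.maxDelayMillis = maxDelayMillis;
    }

    // Called before an allocation. Returns true if the budget was claimed in time,
    // false if the wait timed out (or was interrupted) and the allocation goes ahead unpaced.
    boolean paceForAlloc(int units) {
        try {
            return budget.tryAcquire(units, maxDelayMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Called by the GC as it makes progress; releases stalled allocators.
    void reportProgress(int units) {
        budget.release(units);
    }

    public static void main(String[] args) {
        PacerSketch pacer = new PacerSketch(10, 10);
        System.out.println(pacer.paceForAlloc(5));   // budget available
        System.out.println(pacer.paceForAlloc(10));  // only 5 units left: stalls, then gives up
        pacer.reportProgress(10);
        System.out.println(pacer.paceForAlloc(10));  // GC caught up, budget available again
    }
}
```

In either case the caller allocates afterwards; the return value only tells whether the thread was paced successfully or hit the maximum delay.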
Example
Startup flags
-server -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC -XX:+UsePerfData -XX:+ShenandoahRegionSampling -XX:ParallelGCThreads=4 -XX:ConcGCThreads=4 -XX:+UnlockDiagnosticVMOptions -Xlog:age*,ergo*,gc*=info
GC log
[2019-03-21T15:12:53.771-0800][8707][gc] Consider -XX:+ClassUnloadingWithConcurrentMark if large pause times are observed on class-unloading sensitive workloads
[2019-03-21T15:12:53.862-0800][8707][gc,init] Regions: 2048 x 1024K
[2019-03-21T15:12:53.862-0800][8707][gc,init] Humongous object threshold: 1024K
[2019-03-21T15:12:53.863-0800][8707][gc,init] Max TLAB size: 1024K
[2019-03-21T15:12:53.863-0800][8707][gc,init] GC threads: 4 parallel, 4 concurrent
[2019-03-21T15:12:53.863-0800][8707][gc,init] Reference processing: parallel
[2019-03-21T15:12:53.864-0800][8707][gc ] Heuristics ergonomically sets -XX:+ExplicitGCInvokesConcurrent
[2019-03-21T15:12:53.864-0800][8707][gc ] Heuristics ergonomically sets -XX:+ShenandoahImplicitGCInvokesConcurrent
[2019-03-21T15:12:53.864-0800][8707][gc,init] Shenandoah heuristics: adaptive
[2019-03-21T15:12:53.864-0800][8707][gc,heap] Initialize Shenandoah heap with initial size 128M
[2019-03-21T15:12:53.865-0800][8707][gc,ergo] Pacer for Idle. Initial: 40M, Alloc Tax Rate: 1.0x
[2019-03-21T15:12:53.883-0800][8707][gc,init] Safepointing mechanism: global-page poll
[2019-03-21T15:12:53.883-0800][8707][gc ] Using Shenandoah
[2019-03-21T15:12:53.884-0800][8707][gc,heap,coops] Heap address: 0x0000000780000000, size: 2048 MB, Compressed Oops mode: Zero based, Oop shift amount: 3
[2019-03-21T15:12:59.530-0800][14083][gc ] Trigger: Metadata GC Threshold
[2019-03-21T15:12:59.532-0800][14083][gc,ergo ] Free: 1813M (1813 regions), Max regular: 1024K, Max humongous: 1855488K, External frag: 1%, Internal frag: 0%
[2019-03-21T15:12:59.532-0800][14083][gc,ergo ] Evacuation Reserve: 103M (103 regions), Max regular: 1024K
[2019-03-21T15:12:59.532-0800][14083][gc,start ] GC(0) Concurrent reset
[2019-03-21T15:12:59.533-0800][14083][gc,task ] GC(0) Using 4 of 4 workers for concurrent reset
[2019-03-21T15:12:59.533-0800][14083][gc ] GC(0) Concurrent reset 132M->132M(2048M) 0.441ms
[2019-03-21T15:12:59.533-0800][15619][gc,start ] GC(0) Pause Init Mark (process weakrefs) (unload classes)
[2019-03-21T15:12:59.533-0800][15619][gc,task ] GC(0) Using 4 of 4 workers for init marking
[2019-03-21T15:12:59.541-0800][15619][gc,ergo ] GC(0) Pacer for Mark. Expected Live: 204M, Free: 1813M, Non-Taxable: 181M, Alloc Tax Rate: 0.4x
[2019-03-21T15:12:59.541-0800][15619][gc ] GC(0) Pause Init Mark (process weakrefs) (unload classes) 7.568ms
[2019-03-21T15:12:59.541-0800][14083][gc,start ] GC(0) Concurrent marking (process weakrefs) (unload classes)
[2019-03-21T15:12:59.541-0800][14083][gc,task ] GC(0) Using 4 of 4 workers for concurrent marking
[2019-03-21T15:12:59.619-0800][14083][gc ] GC(0) Concurrent marking (process weakrefs) (unload classes) 132M->134M(2048M) 78.373ms
[2019-03-21T15:12:59.619-0800][14083][gc,start ] GC(0) Concurrent precleaning
[2019-03-21T15:12:59.619-0800][14083][gc,task ] GC(0) Using 1 of 4 workers for concurrent preclean
[2019-03-21T15:12:59.622-0800][14083][gc ] GC(0) Concurrent precleaning 134M->134M(2048M) 2.397ms
[2019-03-21T15:12:59.622-0800][15619][gc,start ] GC(0) Pause Final Mark (process weakrefs) (unload classes)
[2019-03-21T15:12:59.622-0800][15619][gc,task ] GC(0) Using 4 of 4 workers for final marking
[2019-03-21T15:12:59.625-0800][15619][gc,stringtable] GC(0) Cleaned string table, strings: 13692 processed, 50 removed
[2019-03-21T15:12:59.626-0800][15619][gc,ergo ] GC(0) Adaptive CSet Selection. Target Free: 204M, Actual Free: 1914M, Max CSet: 85M, Min Garbage: 0M
[2019-03-21T15:12:59.626-0800][15619][gc,ergo ] GC(0) Collectable Garbage: 117M (97% of total), 8M CSet, 126 CSet regions
[2019-03-21T15:12:59.626-0800][15619][gc,ergo ] GC(0) Immediate Garbage: 0M (0% of total), 0 regions
[2019-03-21T15:12:59.626-0800][15619][gc,ergo ] GC(0) Pacer for Evacuation. Used CSet: 126M, Free: 1811M, Non-Taxable: 181M, Alloc Tax Rate: 1.1x
[2019-03-21T15:12:59.626-0800][15619][gc ] GC(0) Pause Final Mark (process weakrefs) (unload classes) 4.712ms
[2019-03-21T15:12:59.626-0800][14083][gc,start ] GC(0) Concurrent cleanup
[2019-03-21T15:12:59.627-0800][14083][gc ] GC(0) Concurrent cleanup 134M->135M(2048M) 0.132ms
[2019-03-21T15:12:59.627-0800][14083][gc,ergo ] GC(0) Free: 1810M (1810 regions), Max regular: 1024K, Max humongous: 1852416K, External frag: 1%, Internal frag: 0%
[2019-03-21T15:12:59.627-0800][14083][gc,ergo ] GC(0) Evacuation Reserve: 102M (103 regions), Max regular: 1024K
[2019-03-21T15:12:59.627-0800][14083][gc,start ] GC(0) Concurrent evacuation
[2019-03-21T15:12:59.627-0800][14083][gc,task ] GC(0) Using 4 of 4 workers for concurrent evacuation
[2019-03-21T15:12:59.643-0800][14083][gc ] GC(0) Concurrent evacuation 135M->145M(2048M) 15.912ms
[2019-03-21T15:12:59.643-0800][15619][gc,start ] GC(0) Pause Init Update Refs
[2019-03-21T15:12:59.643-0800][15619][gc,ergo ] GC(0) Pacer for Update Refs. Used: 145M, Free: 1810M, Non-Taxable: 181M, Alloc Tax Rate: 1.1x
[2019-03-21T15:12:59.643-0800][15619][gc ] GC(0) Pause Init Update Refs 0.090ms
[2019-03-21T15:12:59.643-0800][14083][gc,start ] GC(0) Concurrent update references
[2019-03-21T15:12:59.643-0800][14083][gc,task ] GC(0) Using 4 of 4 workers for concurrent reference update
[2019-03-21T15:12:59.652-0800][14083][gc ] GC(0) Concurrent update references 145M->147M(2048M) 9.028ms
[2019-03-21T15:12:59.652-0800][15619][gc,start ] GC(0) Pause Final Update Refs
[2019-03-21T15:12:59.652-0800][15619][gc,task ] GC(0) Using 4 of 4 workers for final reference update
[2019-03-21T15:12:59.653-0800][15619][gc ] GC(0) Pause Final Update Refs 0.489ms
[2019-03-21T15:12:59.653-0800][14083][gc,start ] GC(0) Concurrent cleanup
[2019-03-21T15:12:59.653-0800][14083][gc ] GC(0) Concurrent cleanup 147M->21M(2048M) 0.088ms
[2019-03-21T15:12:59.653-0800][14083][gc,ergo ] Free: 1924M (1924 regions), Max regular: 1024K, Max humongous: 1840128K, External frag: 7%, Internal frag: 0%
[2019-03-21T15:12:59.653-0800][14083][gc,ergo ] Evacuation Reserve: 103M (103 regions), Max regular: 1024K
[2019-03-21T15:12:59.653-0800][14083][gc,ergo ] Pacer for Idle. Initial: 40M, Alloc Tax Rate: 1.0x
[2019-03-21T15:17:59.666-0800][14083][gc ] Trigger: Time since last GC (300009 ms) is larger than guaranteed interval (300000 ms)
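To read the actual pause times out of a log like this, the completed "Pause ..." lines can be picked out with a regular expression. The sketch below assumes the line format shown above; `PauseLogParser` and `pauseMillis` are hypothetical names, and other JDK versions or logging configurations may format these lines differently.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PauseLogParser {
    // Matches completed pause lines such as
    // "... GC(0) Pause Init Mark (process weakrefs) (unload classes) 7.568ms".
    // The matching "gc,start" lines carry no duration and are skipped.
    private static final Pattern PAUSE = Pattern.compile(
            "GC\\(\\d+\\) (Pause [A-Za-z ]+?)(?: \\(.*\\))? (\\d+\\.\\d+)ms$");

    // Sums the pause duration in milliseconds per pause phase, in order of first appearance.
    static Map<String, Double> pauseMillis(List<String> lines) {
        Map<String, Double> byPhase = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = PAUSE.matcher(line.trim());
            if (m.find()) {
                byPhase.merge(m.group(1), Double.parseDouble(m.group(2)), Double::sum);
            }
        }
        return byPhase;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "[2019-03-21T15:12:59.533-0800][15619][gc,start ] GC(0) Pause Init Mark (process weakrefs) (unload classes)",
                "[2019-03-21T15:12:59.541-0800][15619][gc ] GC(0) Pause Init Mark (process weakrefs) (unload classes) 7.568ms",
                "[2019-03-21T15:12:59.626-0800][15619][gc ] GC(0) Pause Final Mark (process weakrefs) (unload classes) 4.712ms",
                "[2019-03-21T15:12:59.643-0800][15619][gc ] GC(0) Pause Init Update Refs 0.090ms",
                "[2019-03-21T15:12:59.653-0800][15619][gc ] GC(0) Pause Final Update Refs 0.489ms");
        pauseMillis(lines).forEach((phase, ms) -> System.out.println(phase + ": " + ms + " ms"));
    }
}
```

For GC(0) above, the four pauses add up to roughly 12.9 ms, while marking, evacuation, and reference updating ran concurrently with the application.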
gc visualizer
The shenandoah-visualizer tool can be used to visualize ShenandoahGC.
Summary
- Shenandoah is a concurrent and parallel garbage collector. Like ZGC, it targets low pause times; ZGC is built on colored pointers, whereas Shenandoah GC is built on Brooks pointers. Compared with G1 GC, G1's evacuation is parallel but not concurrent, whereas Shenandoah's evacuation is concurrent, which reduces pause times further. Like G1 GC, ShenandoahGC is region-based; unlike G1, it is not logically generational, so there is no young/old distinction.
- A Shenandoah GC cycle consists of Snapshot-at-the-beginning concurrent mark (Init Mark (pause), Concurrent Mark, Final Mark (pause)), Concurrent evacuation, and Concurrent update references (optional; Init Update Refs (pause), Concurrent Update Refs, Final Update Refs (pause)). Concurrent cleanup may run after either Final Mark or Final Update Refs; it collects garbage and reclaims regions.
- Heuristics tell Shenandoah when to start a GC. ShenandoahGCHeuristics selects the strategy; the possible values are adaptive (the default), static, compact, passive (for diagnostics), and aggressive (for diagnostics). In addition, when an allocation failure occurs, Shenandoah handles it through a graceful degradation ladder: Pacing (<10 ms), Degenerated GC (<100 ms), and Full GC (>100 ms).
doc
- Shenandoah GC
- JEP 189: Shenandoah: A Low-Pause-Time Garbage Collector (Experimental)
- Changes to Garbage Collection in Java 12
- 9 Garbage-First Garbage Collector
- G1GC – Java 9 Garbage Collector explained in 5 minutes
- devoxx-Nov2017-shenandoah (some of the images are taken from this PDF)