The G1 Garbage Collector

References

https://www.infoq.com/articles/tuning-tips-G1-GC/

https://plumbr.io/handbook/garbage-collection-algorithms-implementations#g1

https://blog.csdn.net/coderlius/article/details/79272773

https://www.oracle.com/technetwork/tutorials/tutorials-1876574.html

 

G1 overview and memory layout

One of the key design goals of G1 was to make the duration and distribution of stop-the-world pauses due to garbage collection predictable and configurable. In fact, Garbage-First is a soft real-time garbage collector, meaning that you can set specific performance goals to it. You can request the stop-the-world pauses to be no longer than x milliseconds within any given y-millisecond long time range, e.g. no more than 5 milliseconds in any given second. Garbage-First GC will do its best to meet this goal with high probability (but not with certainty, that would be hard real-time).

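G1 is intended to replace CMS. It is more flexible than CMS: you can set a target for the maximum stop-the-world pause, which gives lower latency than CMS, and it avoids the heap fragmentation that CMS suffers from.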

To achieve this, G1 builds upon a number of insights. First, the heap does not have to be split into contiguous Young and Old generation. Instead, the heap is split into a number (typically about 2048) smaller heap regions that can house objects. Each region may be an Eden region, a Survivor region or an Old region. The logical union of all Eden and Survivor regions is the Young Generation, and all the Old regions put together is the Old Generation:

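Compared with CMS, G1 lays out memory in a completely different way. The heap is divided into many regions, and each region can be an Eden, Survivor, or Old region, as shown in the figure below.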

[Figure 1]
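To make the region layout more concrete, here is a rough sketch of how a region size could be derived from the heap size. It assumes the heuristic described in the Oracle tutorial (aim for roughly 2048 regions, each between 1 MB and 32 MB) plus power-of-two region sizes. The class and method names are made up for illustration, and a real JVM may use different inputs or bounds.

```java
// Hypothetical helper, not JVM code: estimates a G1-style region size.
final class RegionSizing {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION = 1 * MB;    // assumed lower bound
    static final long MAX_REGION = 32 * MB;   // assumed upper bound
    static final long TARGET_REGIONS = 2048;  // "typically about 2048" regions

    // Divide the heap into about TARGET_REGIONS pieces, round the piece
    // size down to a power of two, and clamp it to the allowed range.
    static long regionSizeBytes(long heapBytes) {
        long raw = Math.max(heapBytes / TARGET_REGIONS, MIN_REGION);
        long powerOfTwo = Long.highestOneBit(raw);
        return Math.min(powerOfTwo, MAX_REGION);
    }

    static long regionCount(long heapBytes) {
        return heapBytes / regionSizeBytes(heapBytes);
    }

    public static void main(String[] args) {
        long[] heapsMb = {512, 4096, 8192, 131072};
        for (long mb : heapsMb) {
            long heap = mb * MB;
            System.out.printf("heap=%d MB -> region=%d MB, regions=%d%n",
                    mb, regionSizeBytes(heap) / MB, regionCount(heap));
        }
    }
}
```

With these assumptions, small heaps end up with fewer than 2048 regions because the region size cannot drop below 1 MB, and very large heaps end up with more because it cannot grow past 32 MB.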

This allows the GC to avoid collecting the entire heap at once, and instead approach the problem incrementally: only a subset of the regions, called the collection set will be considered at a time. All the Young regions are collected during each pause, but some Old regions may be included as well:

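This design means G1 does not have to collect the whole heap every time. Each collection handles only a subset of the regions, called the collection set. Every pause collects all the young regions, and some old regions may be collected along with them.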

 

When performing garbage collections, G1 operates in a manner similar to the CMS collector. G1 performs a concurrent global marking phase to determine the liveness of objects throughout the heap. After the mark phase completes, G1 knows which regions are mostly empty. It collects in these regions first, which usually yields a large amount of free space. This is why this method of garbage collection is called Garbage-First. As the name suggests, G1 concentrates its collection and compaction activity on the areas of the heap that are likely to be full of reclaimable objects, that is, garbage. G1 uses a pause prediction model to meet a user-defined pause time target and selects the number of regions to collect based on the specified pause time target.

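G1 collects in a way that resembles CMS. It runs a concurrent marking phase, and once marking completes it knows which regions contain the most garbage. Collecting those regions first usually frees a large amount of memory, which is where the name Garbage-First comes from. G1 uses the pause time target set by the user to decide how many regions to collect each time.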

The regions identified by G1 as ripe for reclamation are garbage collected using evacuation. G1 copies objects from one or more regions of the heap to a single region on the heap, and in the process both compacts and frees up memory. This evacuation is performed in parallel on multi-processors, to decrease pause times and increase throughput. Thus, with each garbage collection, G1 continuously works to reduce fragmentation, working within the user defined pause times. This is beyond the capability of both the previous methods. The CMS (Concurrent Mark Sweep) garbage collector does not do compaction. ParallelOld garbage collection performs only whole-heap compaction, which results in considerable pause times.

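G1 uses a mark-copy algorithm that copies live objects from several regions into a single region. This avoids the fragmentation problem of CMS, and it also avoids the long pauses of ParallelOld, which compacts the whole heap every time. Copying (moving) the live objects out of a region is called evacuation.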

It is important to note that G1 is not a real-time collector. It meets the set pause time target with high probability but not absolute certainty. Based on data from previous collections, G1 does an estimate of how many regions can be collected within the user specified target time. Thus, the collector has a reasonably accurate model of the cost of collecting the regions, and it uses this model to determine which and how many regions to collect while staying within the pause time target.
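The idea of picking regions under a pause budget can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the per-region cost is given as an input (real G1 predicts it from previous collections), all young regions are always included, and old regions are added in order of most garbage first until the next one no longer fits. The Region class and all names here are hypothetical, not JVM types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of choosing a collection set within a pause target.
final class CollectionSetChooser {
    static final class Region {
        final int id;
        final boolean young;
        final long garbageBytes;       // reclaimable bytes found by marking
        final double predictedCostMs;  // assumed to come from a cost model

        Region(int id, boolean young, long garbageBytes, double predictedCostMs) {
            this.id = id;
            this.young = young;
            this.garbageBytes = garbageBytes;
            this.predictedCostMs = predictedCostMs;
        }
    }

    static List<Region> choose(List<Region> regions, double pauseTargetMs) {
        List<Region> cset = new ArrayList<>();
        List<Region> oldCandidates = new ArrayList<>();
        double budgetMs = pauseTargetMs;

        // Every young region is collected in each pause.
        for (Region r : regions) {
            if (r.young) {
                cset.add(r);
                budgetMs -= r.predictedCostMs;
            } else {
                oldCandidates.add(r);
            }
        }

        // Garbage first: old regions with the most reclaimable space come first.
        oldCandidates.sort((a, b) -> Long.compare(b.garbageBytes, a.garbageBytes));
        for (Region r : oldCandidates) {
            if (r.predictedCostMs > budgetMs) {
                break; // stop at the first region that would exceed the budget
            }
            cset.add(r);
            budgetMs -= r.predictedCostMs;
        }
        return cset;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region(0, true, 900, 20));
        regions.add(new Region(1, true, 800, 30));
        regions.add(new Region(2, false, 700, 60));
        regions.add(new Region(3, false, 950, 80));
        regions.add(new Region(4, false, 100, 10));
        for (Region r : choose(regions, 180)) {
            System.out.println("collect region " + r.id);
        }
    }
}
```

Because the costs are only predictions, a selection like this can still overshoot the target, which matches the point above that the pause goal is met with high probability rather than with certainty.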

 

Note: G1 has both concurrent (runs along with application threads, e.g., refinement, marking, cleanup) and parallel (multi-threaded, e.g., stop the world) phases. Full garbage collections are still single threaded, but if tuned properly your applications should avoid full GCs.

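Like CMS, G1 has concurrent phases, which run alongside the application threads (for example marking and cleanup), and parallel phases, which are stop-the-world and run on multiple GC threads. Full GCs are single-threaded in the JDK versions the quoted tutorial covers (JDK 10 and later run them in parallel), and G1 tries to avoid full GCs in the first place.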

 

Data structures used by G1

Remembered Sets or RSets track object references into a given region. There is one RSet per region in the heap. The RSet enables the parallel and independent collection of a region. The overall footprint impact of RSets is less than 5%.

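Each region is divided into cards (compare the card table in CMS). If an object in a card references an object in another region, the RSet of the referenced region records that card.

RSets add less than 5% to the overall memory footprint.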

[Figure 2]
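A minimal sketch of the bookkeeping described above might look like this. It models addresses as plain numbers and assumes 1 MB regions and 512-byte cards. Both sizes and all names are assumptions made for the example, and a real RSet implementation is considerably more compact and more selective about what it records.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of per-region remembered sets keyed by card index.
final class RememberedSets {
    static final long REGION_BYTES = 1024L * 1024L; // assumed region size
    static final long CARD_BYTES = 512L;            // assumed card size

    // For each region: the cards elsewhere in the heap that point into it.
    private final Map<Long, Set<Long>> rsetByRegion = new HashMap<>();

    static long regionOf(long address) {
        return address / REGION_BYTES;
    }

    static long cardOf(long address) {
        return address / CARD_BYTES;
    }

    // Called when a reference stored at fieldAddress points to targetAddress.
    void recordReference(long fieldAddress, long targetAddress) {
        long from = regionOf(fieldAddress);
        long to = regionOf(targetAddress);
        if (from == to) {
            return; // references inside one region need no RSet entry
        }
        rsetByRegion.computeIfAbsent(to, k -> new TreeSet<>())
                    .add(cardOf(fieldAddress));
    }

    Set<Long> cardsPointingInto(long region) {
        return rsetByRegion.getOrDefault(region, Collections.<Long>emptySet());
    }

    public static void main(String[] args) {
        RememberedSets rs = new RememberedSets();
        rs.recordReference(0x100, 0x200);                               // same region, ignored
        rs.recordReference(0x100, 3 * REGION_BYTES + 64);               // region 0 -> region 3
        rs.recordReference(REGION_BYTES + 1024, 3 * REGION_BYTES + 128); // region 1 -> region 3
        System.out.println("cards pointing into region 3: " + rs.cardsPointingInto(3));
    }
}
```

When region 3 is collected, only the recorded cards need to be scanned for incoming references, which is what lets a region be collected independently of the rest of the heap.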

Collection Sets or CSets are the sets of regions that will be collected in a GC. All live data in a CSet is evacuated (copied/moved) during a GC. Sets of regions can be Eden, survivor, and/or old generation. CSets have a less than 1% impact on the size of the JVM.

A CSet holds the regions (young or old) that the JVM has selected for collection. CSets account for less than 1% of the JVM's memory.

 

G1 young gc

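1. When the JVM starts, the whole heap is free and is divided into regions. After the application has run for a while, the heap looks like this: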

[Figure 3]

 

 

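3. Young GC in G1

Young GC uses a mark-copy algorithm: live objects in the young regions are copied to other regions, and objects that have reached the tenuring threshold are copied to old regions. Young GC is a stop-the-world pause and runs on multiple GC threads.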

Before the young GC starts:

[Figure 4]

After it finishes:

[Figure 5]
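The copy-or-promote decision in a young GC can be sketched as follows. This is a toy model with made-up names: each object carries an age, live objects get one collection older when they are copied, and the exact comparison against the tenuring threshold is an assumption rather than a statement of what HotSpot does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of evacuating young regions during a young GC.
final class YoungEvacuation {
    static final class Obj {
        final String name;
        final boolean live;
        int age; // number of young collections survived so far

        Obj(String name, boolean live, int age) {
            this.name = name;
            this.live = live;
            this.age = age;
        }
    }

    static final class Result {
        final List<Obj> survivor = new ArrayList<>();
        final List<Obj> old = new ArrayList<>();
    }

    static Result evacuate(List<Obj> youngObjects, int tenuringThreshold) {
        Result result = new Result();
        for (Obj o : youngObjects) {
            if (!o.live) {
                continue; // dead objects are not copied; their region is reclaimed whole
            }
            o.age++;
            if (o.age >= tenuringThreshold) {
                result.old.add(o);      // promoted to an old region
            } else {
                result.survivor.add(o); // copied to a survivor region
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Obj> young = new ArrayList<>();
        young.add(new Obj("a", true, 0));
        young.add(new Obj("b", false, 0));
        young.add(new Obj("c", true, 14));
        Result r = evacuate(young, 15);
        System.out.println("survivor: " + r.survivor.size() + ", old: " + r.old.size());
    }
}
```

Nothing is done for dead objects individually, which is why the cost of an evacuation pause depends mostly on how much live data has to be copied.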

G1 old gc

 

Phase Description
(1) Initial Mark
(Stop the World Event)

This is a stop the world event. With G1, it is piggybacked on a normal young GC. Mark survivor regions (root regions) which may have references to objects in old generation.

This step is part of a young GC. It marks all the survivor regions, which serve as the root regions.

(2) Root Region Scanning

Scan survivor regions for references into the old generation. This happens while the application continues to run. The phase must be completed before a young GC can occur.

Scan the root regions (the survivor regions marked in the previous step) to find which of them hold references to objects in the old generation.

(3) Concurrent Marking

Find live objects over the entire heap. This happens while the application is running. This phase can be interrupted by young generation garbage collections.

Comparable to concurrent marking in CMS. It runs alongside the application threads.

(4) Remark
(Stop the World Event)

Completes the marking of live objects in the heap. Uses an algorithm called snapshot-at-the-beginning (SATB) which is much faster than what was used in the CMS collector.

Comparable to the remark phase in CMS: a stop-the-world pause that finishes the marking.
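The text only names SATB, so the following is a generic sketch of the snapshot-at-the-beginning idea rather than G1's actual barrier code. While marking is active, every reference store first logs the value it is about to overwrite, so that anything reachable when marking started still gets marked. The remark pause then only has to process what is left in the log. All class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration of an SATB-style pre-write barrier.
final class SatbSketch {
    static final class Holder {
        Object field;
    }

    private boolean markingActive;
    private final Deque<Object> satbBuffer = new ArrayDeque<>();

    void startMarking() {
        markingActive = true;
    }

    // Stands in for the remark pause: process the remaining log, then stop logging.
    void finishMarking() {
        satbBuffer.clear();
        markingActive = false;
    }

    // Every reference store goes through this method in the model.
    void writeField(Holder holder, Object newValue) {
        Object overwritten = holder.field;
        if (markingActive && overwritten != null) {
            satbBuffer.add(overwritten); // keep the old referent visible to the marker
        }
        holder.field = newValue;
    }

    int pendingCount() {
        return satbBuffer.size();
    }

    boolean isPending(Object o) {
        return satbBuffer.contains(o);
    }

    public static void main(String[] args) {
        SatbSketch heap = new SatbSketch();
        Holder h = new Holder();
        heap.writeField(h, "first");  // marking not active: nothing is logged
        heap.startMarking();
        heap.writeField(h, "second"); // "first" is logged before being overwritten
        System.out.println("logged entries: " + heap.pendingCount());
    }
}
```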

(5) Cleanup
(Stop the World Event and Concurrent)
  • Performs accounting on live objects and completely free regions. (Stop the world)
  • Scrubs the Remembered Sets. (Stop the world)
  • Reset the empty regions and return them to the free list. (Concurrent)

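There are three steps:

  1. Account for live objects and for completely free regions (stop the world).
  2. Scrub the RSet data, since the marking work for this cycle is finished (stop the world).
  3. Reclaim the emptied regions and return them to the free list (concurrent).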

(*) Copying
(Stop the World Event)

* Note: this is also counted as part of step 5.

These are the stop the world pauses to evacuate or copy live objects to new unused regions. This can be done with young generation regions which are logged as [GC pause (young)]. Or both young and old generation regions which are logged as [GC Pause (mixed)].

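This is a stop-the-world pause that copies live objects using the copying algorithm. Copying is shared by young GC and old GC: if the log shows pause (young), only young regions were involved, and if it shows pause (mixed), both young and old regions were.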
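To tell the two kinds of pause apart when scanning a log, something like the following could work. It relies only on the two markers quoted above, (young) and (mixed). Real log lines carry more fields and their exact format depends on the JDK version and logging options, so treat the matching rules and the class name as assumptions.

```java
import java.util.Locale;

// Hypothetical classifier for the pause markers quoted in the text.
final class PauseKind {
    enum Kind { YOUNG, MIXED, OTHER }

    static Kind classify(String logLine) {
        String s = logLine.toLowerCase(Locale.ROOT);
        if (!s.contains("gc pause")) {
            return Kind.OTHER; // not an evacuation pause line in this model
        }
        if (s.contains("(mixed)")) {
            return Kind.MIXED; // young and old regions were evacuated
        }
        if (s.contains("(young)")) {
            return Kind.YOUNG; // only young regions were evacuated
        }
        return Kind.OTHER;
    }

    public static void main(String[] args) {
        String[] samples = {"[GC pause (young)]", "[GC Pause (mixed)]", "application started"};
        for (String line : samples) {
            System.out.println(line + " -> " + classify(line));
        }
    }
}
```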

 

Usage and best practices

-XX:+UseG1GC - Tells the JVM to use the G1 Garbage collector. In other words, this flag enables G1.

-XX:MaxGCPauseMillis=200 - Sets a target for the maximum GC pause time. This is a soft goal, and the JVM will make its best effort to achieve it. Therefore, the pause time goal will sometimes not be met. The default value is 200 milliseconds.

Sets the target for the maximum pause time. The default is 200 milliseconds.

-XX:InitiatingHeapOccupancyPercent=45 - Percentage of the (entire) heap occupancy to start a concurrent GC cycle. It is used by G1 to trigger a concurrent GC cycle based on the occupancy of the entire heap, not just one of the generations. A value of 0 denotes 'do constant GC cycles'. The default value is 45 (i.e., 45% full or occupied).

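This is the heap occupancy that triggers a concurrent GC cycle. The default is 45%: once the occupancy of the whole heap exceeds 45%, a concurrent cycle starts.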
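The occupancy check behind InitiatingHeapOccupancyPercent can be pictured like this. The sketch follows the description above (occupancy of the whole heap compared against a percentage, with 0 meaning a cycle is always due). The method name is made up, the choice of >= over > is an assumption, and actual JVMs may measure occupancy differently depending on the version.

```java
// Hypothetical check modeled on the description of InitiatingHeapOccupancyPercent.
final class IhopCheck {
    static boolean shouldStartConcurrentCycle(long usedBytes, long capacityBytes, int ihopPercent) {
        if (ihopPercent <= 0) {
            return true; // per the flag description, 0 means constant GC cycles
        }
        // Compare used/capacity against the threshold without floating point.
        return usedBytes * 100 >= capacityBytes * (long) ihopPercent;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long capacity = rt.totalMemory();
        long used = capacity - rt.freeMemory();
        System.out.printf("used=%d capacity=%d -> start cycle at 45%%: %b%n",
                used, capacity, shouldStartConcurrentCycle(used, capacity, 45));
    }
}
```

Lowering the percentage makes the check succeed sooner, which is the "trigger GC earlier" option for heaps that fill up faster than marking can finish.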

 

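If a G1 collection fails, the usual cause is insufficient memory. Like CMS, G1 cannot use 100% of the heap, because some memory has to stay reserved for G1 itself and for copying regions.

  1. You can increase the share of the heap that G1 keeps in reserve (-XX:G1ReservePercent, 10% by default),
  2. or increase the number of GC threads so that collection finishes faster,
  3. or trigger GC earlier (for example by lowering -XX:InitiatingHeapOccupancyPercent).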

 

 
