A Supplement on the G1 Garbage Collector

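Note: I copied this text directly from Oracle's documentation and hope there is no copyright problem. (I will take it down if it infringes.)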

Official Oracle link: https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/g1_gc.html#garbage_first_garbage_collection

9 Garbage-First Garbage Collector

The Garbage-First (G1) garbage collector is a server-style garbage collector, targeted for multiprocessor machines with large memories. It attempts to meet garbage collection (GC) pause time goals with high probability while achieving high throughput. Whole-heap operations, such as global marking, are performed concurrently with the application threads. This prevents interruptions proportional to heap or live-data size (a bit of bragging here).

The G1 collector achieves high performance and pause time goals through several techniques (in other words, G1 relies on the tricks below to reach high performance and its maximum pause time target).

The heap is partitioned into a set of equally sized heap regions, each a contiguous range of virtual memory (so G1 splits the heap into regions of equal size). G1 performs a concurrent global marking phase to determine the liveness of objects throughout the heap (G1 uses the concurrent marking phase to find out which objects are alive). After the marking phase completes, G1 knows which regions are mostly empty (once concurrent marking finishes, G1 knows which regions are most likely to be empty). It collects these regions first, which often yields a large amount of free space. This is why this method of garbage collection is called Garbage-First (G1 cleans these regions first, which frees up a large amount of usable heap). As the name suggests, G1 concentrates its collection and compaction activity on the areas of the heap that are likely to be full of reclaimable objects, that is, garbage (as the name says, G1 focuses its collection and compaction work on the regions most likely to be full of reclaimable objects, i.e. garbage). G1 uses a pause prediction model to meet a user-defined pause time target and selects the number of regions to collect based on the specified pause time target (G1 uses a pause prediction model to meet the user-defined maximum pause time target and picks the regions to collect within that target; this also means G1 normally does not collect the whole heap at once, only a few regions at a time).
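
The garbage-first idea above can be sketched as a greedy selection: sort candidate regions by how much garbage they hold, then take regions while the predicted pause still fits the target. This is only a simplified illustration, not HotSpot's actual algorithm. The class GarbageFirstSketch, the Region type, and its garbageBytes and predictedMs fields are all hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Simplified sketch of "garbage first" region selection. All names here are
// hypothetical; HotSpot's real collection-set logic is far more involved.
class GarbageFirstSketch {

    static final class Region {
        final int id;
        final long garbageBytes;   // bytes expected to be reclaimed (assumed known after marking)
        final double predictedMs;  // assumed cost of evacuating this region

        Region(int id, long garbageBytes, double predictedMs) {
            this.id = id;
            this.garbageBytes = garbageBytes;
            this.predictedMs = predictedMs;
        }
    }

    // Visit regions from most garbage to least, and skip any region whose
    // predicted cost no longer fits in the remaining pause budget.
    static List<Region> choose(List<Region> candidates, double pauseTargetMs) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingLong((Region r) -> r.garbageBytes).reversed());

        List<Region> chosen = new ArrayList<>();
        double budget = pauseTargetMs;
        for (Region r : sorted) {
            if (r.predictedMs <= budget) {
                chosen.add(r);
                budget -= r.predictedMs;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region(0, 900_000, 80.0));
        regions.add(new Region(1, 100_000, 150.0));
        regions.add(new Region(2, 700_000, 90.0));
        for (Region r : choose(regions, 200.0)) {
            System.out.println("collect region " + r.id);
        }
    }
}
```

With a 200 ms target, the sketch picks regions 0 and 2 (170 ms predicted in total) and leaves region 1, which holds little garbage and would blow the budget.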

G1 copies objects from one or more regions of the heap to a single region on the heap, and in the process both compacts and frees up memory. This evacuation is performed in parallel on multiprocessors to decrease pause times and increase throughput. Thus, with each garbage collection, G1 continuously works to reduce fragmentation. This is beyond the capability of both of the previous methods. CMS (Concurrent Mark Sweep) garbage collection does not do compaction. Parallel compaction performs only whole-heap compaction, which results in considerable pause times.

It is important to note that G1 is not a real-time collector. It meets the set pause time target with high probability but not absolute certainty. Based on data from previous collections, G1 estimates how many regions can be collected within the target time. Thus, the collector has a reasonably accurate model of the cost of collecting the regions, and it uses this model to determine which and how many regions to collect while staying within the pause time target.
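
To make "based on data from previous collections" a bit more concrete, here is a minimal sketch that keeps a running average of the cost per region and uses it to size the next collection. The class PausePredictorSketch, the alpha weight, and the averaging scheme are my own assumptions for illustration; the real G1 predictor uses richer statistics than this.

```java
// Hypothetical sketch: estimate per-region cost from past pauses with an
// exponentially weighted average, then size the next collection to fit.
class PausePredictorSketch {
    private final double alpha;     // weight of the newest sample (assumed tuning knob)
    private double avgMsPerRegion;  // running cost estimate
    private boolean hasSample;

    PausePredictorSketch(double alpha) {
        this.alpha = alpha;
    }

    // Record a finished collection: its pause length and how many regions it evacuated.
    void record(double pauseMs, int regionsCollected) {
        if (regionsCollected <= 0) {
            return; // nothing to learn from an empty collection
        }
        double sample = pauseMs / regionsCollected;
        avgMsPerRegion = hasSample ? alpha * sample + (1 - alpha) * avgMsPerRegion : sample;
        hasSample = true;
    }

    // Estimate how many regions fit in the target; fall back to 1 without usable history.
    int regionsFor(double pauseTargetMs) {
        if (!hasSample || avgMsPerRegion <= 0) {
            return 1;
        }
        return Math.max(1, (int) (pauseTargetMs / avgMsPerRegion));
    }

    public static void main(String[] args) {
        PausePredictorSketch p = new PausePredictorSketch(0.5);
        p.record(100.0, 10); // 10 ms per region
        p.record(200.0, 10); // 20 ms per region, so the average becomes 15 ms
        System.out.println(p.regionsFor(200.0));
    }
}
```

Because the estimate is only an average over history, a collection can still overshoot the target, which matches the "high probability but not absolute certainty" wording above.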

The first focus of G1 is to provide a solution for users running applications that require large heaps with limited GC latency. This means heap sizes of around 6 GB or larger, and a stable and predictable pause time below 0.5 seconds.

Applications running today with either CMS or parallel compaction would benefit from switching to G1 if the application has one or more of the following traits.

  • More than 50% of the Java heap is occupied with live data.

  • The rate of object allocation or promotion varies significantly.

  • The application is experiencing undesired long garbage collection or compaction pauses (longer than 0.5 to 1 second).

G1 is planned as the long-term replacement for the Concurrent Mark-Sweep Collector (CMS). Comparing G1 with CMS reveals differences that make G1 a better solution. One difference is that G1 is a compacting collector. Also, G1 offers more predictable garbage collection pauses than the CMS collector, and allows users to specify desired pause targets.

As with CMS, G1 is designed for applications that require shorter GC pauses.

G1 divides the heap into fixed-sized regions (the gray boxes) as in Figure 9-1, "Heap Division by G1".

Figure 9-1 Heap Division by G1 (this is what the heap looks like)


G1 is generational in a logical sense (G1 is logically generational, with young regions and old regions). A set of empty regions is designated as the logical young generation. In the figure, the young generation is light blue. Allocations are done out of that logical young generation, and when the young generation is full, that set of regions is garbage collected (a young collection). In some cases, regions outside the set of young regions (old regions in dark blue) can be garbage collected at the same time. This is referred to as a mixed collection. In the figure, the regions being collected are marked by red boxes. The figure illustrates a mixed collection because both young regions and old regions are being collected. The garbage collection is a compacting collection that copies live objects to selected, initially empty regions. Based on the age of a surviving object, the object can be copied to a survivor region (marked by "S") or to an old region (not specifically shown) (so different kinds of regions are collected differently; regions marked "S" are survivor regions). The regions marked by "H" contain humongous objects that are larger than half a region and are treated specially (regions marked "H" hold large objects that exceed half of a region's capacity, and these regions get special handling); see the section Humongous Objects and Humongous Allocations in Garbage-First Garbage Collector.
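
The "larger than half a region" rule for humongous objects is simple arithmetic, sketched below. The class HumongousSketch and its method are hypothetical names, and the 1 MB region size in the demo is just an assumed value for the example.

```java
// Hypothetical sketch of the humongous rule quoted above: an object counts as
// humongous when it is larger than half of one region.
class HumongousSketch {

    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    public static void main(String[] args) {
        long region = 1L << 20; // assume 1 MB regions for the example
        System.out.println(isHumongous(512 * 1024, region));     // exactly half: not humongous
        System.out.println(isHumongous(512 * 1024 + 1, region)); // one byte more: humongous
    }
}
```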

Allocation (Evacuation) Failure

As with CMS, the G1 collector runs parts of its collection while the application continues to run and there is a risk that the application will allocate objects faster than the garbage collector can recover free space. See the section Concurrent Mode Failure in Concurrent Mark Sweep (CMS) Collector for the analogous CMS behavior. In G1, the failure (exhaustion of the Java heap) occurs while G1 is copying live data out of one region (evacuating) into another region (G1 can fail while copying objects from one region to another; when that happens it falls back to a STW collection). The copying is done to compact the live data. If a free (empty) region cannot be found during the evacuation of a region being garbage collected, then an allocation failure occurs (because there is no space to allocate the live objects from the region being evacuated) and a stop-the-world (STW) full collection is done.

Floating Garbage

Objects can die during a G1 collection and not be collected. G1 uses a technique called snapshot-at-the-beginning (SATB) to guarantee that all live objects are found by the garbage collector. SATB states that any object that is live at the start of the concurrent marking (a marking over the entire heap) is considered live for the purpose of the collection. SATB allows floating garbage in a way analogous to that of a CMS incremental update.
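
A toy model may help show why SATB leaves floating garbage. In the sketch below, a write barrier logs the old reference before it is overwritten while marking is active, so everything reachable at the start of marking still gets marked. The class SatbSketch, the Node type, and the queue handling are hypothetical simplifications of mine, not how HotSpot implements its barriers.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Toy model of snapshot-at-the-beginning marking. All names are hypothetical.
class SatbSketch {

    static final class Node {
        final String name;
        Node ref; // a single outgoing reference keeps the example small

        Node(String name) {
            this.name = name;
        }
    }

    boolean marking;
    final Deque<Node> satbQueue = new ArrayDeque<>();

    // Pre-write barrier: while marking runs, remember the value being overwritten.
    void write(Node holder, Node newRef) {
        if (marking && holder.ref != null) {
            satbQueue.add(holder.ref);
        }
        holder.ref = newRef;
    }

    // Trace from the root and from every logged old value, then end marking.
    Set<Node> finishMarking(Node root) {
        Set<Node> live = new HashSet<>();
        Deque<Node> work = new ArrayDeque<>(satbQueue);
        work.add(root);
        while (!work.isEmpty()) {
            Node n = work.poll();
            if (live.add(n) && n.ref != null) {
                work.add(n.ref);
            }
        }
        satbQueue.clear();
        marking = false;
        return live;
    }

    public static void main(String[] args) {
        SatbSketch gc = new SatbSketch();
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        root.ref = a;
        a.ref = b;

        gc.marking = true;  // the snapshot is taken here
        gc.write(a, null);  // b becomes unreachable during marking
        Set<Node> live = gc.finishMarking(root);
        System.out.println("b still marked live: " + live.contains(b));
    }
}
```

Here b dies during marking but is still treated as live because it was reachable in the snapshot; it is floating garbage that a later cycle can reclaim.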

Pauses (that is, STW)

G1 pauses the application to copy live objects to new regions (this copying step is STW). These pauses can either be young collection pauses where only young regions are collected or mixed collection pauses where young and old regions are evacuated. As with CMS there is a final marking or remark pause to complete the marking while the application is stopped. Whereas CMS also had an initial marking pause, G1 does the initial marking work as part of an evacuation pause. G1 has a cleanup phase at the end of a collection which is partly STW and partly concurrent. The STW part of the cleanup phase identifies empty regions and determines old regions that are candidates for the next collection.

Card Tables and Concurrent Phases

If a garbage collector does not collect the entire heap (an incremental collection), the garbage collector needs to know where there are pointers from the uncollected part of the heap into the part of the heap that is being collected. This is typically the case for a generational garbage collector, in which the uncollected part of the heap is usually the old generation and the collected part of the heap is the young generation. The data structure for keeping this information (old generation pointers to young generation objects) is a remembered set. A card table is a particular type of remembered set. Java HotSpot VM uses an array of bytes as a card table. Each byte is referred to as a card. A card corresponds to a range of addresses in the heap. Dirtying a card means changing the value of the byte to a dirty value; a dirty card indicates that the address range it covers might contain a new pointer from the old generation to the young generation.

Processing a card means looking at the card to see if there is an old generation to young generation pointer and perhaps doing something with that information such as transferring it to another data structure.
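
The card table described above can be sketched as a plain byte array indexed by shifted addresses. The class CardTableSketch is hypothetical, the CLEAN and DIRTY byte values are illustrative only, and the 512-byte card size is an assumption chosen to match HotSpot's usual default.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a card table: one byte per fixed-size address range.
class CardTableSketch {
    static final int CARD_SHIFT = 9;  // 512-byte cards (assumed)
    static final byte CLEAN = 0;      // illustrative values, not HotSpot's encoding
    static final byte DIRTY = 1;

    private final long heapBase;
    private final byte[] cards;

    CardTableSketch(long heapBase, long heapBytes) {
        this.heapBase = heapBase;
        long cardSize = 1L << CARD_SHIFT;
        this.cards = new byte[(int) ((heapBytes + cardSize - 1) >>> CARD_SHIFT)];
    }

    // Map a heap address to the index of the card that covers it.
    int cardIndex(long address) {
        return (int) ((address - heapBase) >>> CARD_SHIFT);
    }

    // What a post-write barrier would do after storing a reference at fieldAddress.
    void markDirty(long fieldAddress) {
        cards[cardIndex(fieldAddress)] = DIRTY;
    }

    boolean isDirty(long address) {
        return cards[cardIndex(address)] == DIRTY;
    }

    // "Processing" here just means: find dirty cards, clean them, and report
    // which ones need to be scanned for cross-generation pointers.
    List<Integer> processDirtyCards() {
        List<Integer> toScan = new ArrayList<>();
        for (int i = 0; i < cards.length; i++) {
            if (cards[i] == DIRTY) {
                cards[i] = CLEAN;
                toScan.add(i);
            }
        }
        return toScan;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(0x1000L, 4096L);
        table.markDirty(0x1000L + 513); // falls into the second card
        System.out.println("cards to scan: " + table.processDirtyCards());
    }
}
```

The point of the design is that the write barrier only flips one byte, while the expensive work of scanning the covered address range is deferred to whoever processes the card later.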

G1 has a concurrent marking phase which marks live objects found from the application. The concurrent marking extends from the end of an evacuation pause (where the initial marking work is done) to the remark. The concurrent cleanup phase adds regions emptied by the collection to the list of free regions and clears the remembered sets of those regions. In addition, a concurrent refinement thread runs as needed to process card table entries that have been dirtied by application writes and which may have cross-region references.

Starting a Concurrent Collection Cycle

As mentioned previously, both young and old regions are garbage collected in a mixed collection. To collect old regions, G1 does a complete marking of the live objects in the heap (to reclaim old regions, G1 performs a global concurrent marking). Such a marking is done by a concurrent marking phase. A concurrent marking phase is started when the occupancy of the entire Java heap reaches the value of the parameter InitiatingHeapOccupancyPercent (so when does global concurrent marking start? When old-region occupancy as a share of the total Java heap reaches a threshold percentage; the value can be tuned with the option below, and the default threshold is 45). Set the value of this parameter with the command-line option -XX:InitiatingHeapOccupancyPercent=<NN>. The default value of InitiatingHeapOccupancyPercent is 45.
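
As a worked example of the threshold, the sketch below checks whether a given occupancy has reached the configured percentage of the heap. The class IhopSketch and its method are hypothetical; only the 45 percent default comes from the text above.

```java
// Hypothetical sketch of the occupancy check behind InitiatingHeapOccupancyPercent.
class IhopSketch {

    static boolean shouldStartMarking(long occupiedBytes, long heapBytes, int ihopPercent) {
        return occupiedBytes >= heapBytes * (ihopPercent / 100.0);
    }

    public static void main(String[] args) {
        long heap = 8L << 30; // assume an 8 GB heap for the example
        System.out.println(shouldStartMarking(3L << 30, heap, 45)); // 3 GB is below the 3.6 GB threshold
        System.out.println(shouldStartMarking(4L << 30, heap, 45)); // 4 GB is above it
    }
}
```

Lowering the percentage starts marking earlier and leaves more headroom, at the cost of running marking cycles more often.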

Pause Time Goal

Set a pause time goal for G1 with the flag MaxGCPauseMillis (this parameter is the maximum pause time; the default is 200 milliseconds). G1 uses a prediction model to decide how much garbage collection work can be done within that target pause time (that is, the model decides how much collection work fits into the target pause). At the end of a collection, G1 chooses the regions to be collected in the next collection (the collection set) (at the end of a collection, G1 picks which regions the next collection will process, and these include young regions). The collection set will contain young regions (the sum of whose sizes determines the size of the logical young generation). It is partly through the selection of the number of young regions in the collection set that G1 exerts control over the length of the GC pauses. You can specify the size of the young generation on the command line as with the other garbage collectors, but doing so may hamper the ability of G1 to attain the target pause time. In addition to the pause time goal, you can specify the length of the time period during which the pause can occur. You can specify the minimum mutator usage with this time span (GCPauseIntervalMillis) (this value is the time window for pauses; with the default of 0, a second pause may follow right after the first) along with the pause time goal. The default value for MaxGCPauseMillis is 200 milliseconds. The default value for GCPauseIntervalMillis (0) is the equivalent of no requirement on the time span.
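
To see which values a running JVM actually uses for these flags, something like the sketch below should work on a HotSpot-based JVM. It relies on com.sun.management.HotSpotDiagnosticMXBean, which is HotSpot-specific, so treat that as an assumption about your runtime; the class G1FlagsSketch and the helper flagValue are hypothetical names. The values printed depend on the JDK version and on which collector is active, so none are shown here.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Sketch: read the flags discussed above from the running JVM (HotSpot only).
class G1FlagsSketch {

    // Returns the flag's current value as a string, or null if it cannot be read.
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean == null ? null : bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // For example IllegalArgumentException when the JVM has no such flag.
            return null;
        }
    }

    public static void main(String[] args) {
        String[] flags = {
            "UseG1GC",
            "MaxGCPauseMillis",
            "GCPauseIntervalMillis",
            "InitiatingHeapOccupancyPercent"
        };
        for (String flag : flags) {
            System.out.println(flag + " = " + flagValue(flag));
        }
    }
}
```

Starting the JVM with options such as -XX:+UseG1GC -XX:MaxGCPauseMillis=100 and running this class is a quick way to confirm that the settings were picked up.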

 
