G1 Garbage Collector Internals [A Reading of the Official Documentation]

2022-11-03

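This post continues the reading of the official G1 documentation started in the previous post, https://www.cnblogs.com/webor2006/p/11135005.html. Last time we got as far as this point: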

Without further ado, let's keep reading:

When performing garbage collections, G1 operates in a manner similar to the CMS collector. G1 performs a concurrent global marking phase to determine the liveness of objects throughout the heap. After the mark phase completes, G1 knows which regions are mostly empty. It collects in these regions first, which usually yields a large amount of free space. This is why this method of garbage collection is called Garbage-First. As the name suggests, G1 concentrates its collection and compaction activity on the areas of the heap that are likely to be full of reclaimable objects, that is, garbage. G1 uses a pause prediction model to meet a user-defined pause time target and selects the number of regions to collect based on the specified pause time target.

Interpretation:

“When performing garbage collections, G1 operates in a manner similar to the CMS collector.”

When performing garbage collections, G1 behaves much like the CMS collector.

“G1 performs a concurrent global marking phase to determine the liveness of objects throughout the heap. After the mark phase completes, G1 knows which regions are mostly empty. ”

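G1 runs a concurrent global marking phase to determine which objects across the whole heap are still live. Once marking completes, G1 knows which regions are mostly empty [in other words, after marking it is clear which regions hold many live objects and which hold few].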

“It collects in these regions first, which usually yields a large amount of free space. This is why this method of garbage collection is called Garbage-First. ”

It collects those regions first, which usually frees a large amount of space. That is why this style of garbage collection is called Garbage-First.

“As the name suggests, G1 concentrates its collection and compaction activity on the areas of the heap that are likely to be full of reclaimable objects, that is, garbage. ”

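As the name suggests, G1 concentrates its collection and compaction work on the areas of the heap that are most likely to be full of reclaimable objects, that is, garbage.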

“G1 uses a pause prediction model to meet a user-defined pause time target and selects the number of regions to collect based on the specified pause time target.”

G1 uses a pause prediction model to meet a user-defined pause time target, and it chooses how many regions to collect based on that target.
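To make the "garbage first" idea concrete, here is a minimal Java sketch. It is not HotSpot code: `MarkedRegion`, `GarbageFirstSketch`, and the per-region `predictedCostMs` figure are all hypothetical names and numbers invented for illustration, and ranking purely by garbage bytes is a simplification of whatever G1 really does. The sketch visits regions from most to least garbage and keeps each one that still fits in the pause budget.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of a heap region after marking; not a HotSpot type.
final class MarkedRegion {
    final String name;
    final long capacityBytes;     // total size of the region
    final long liveBytes;         // bytes found reachable by marking
    final double predictedCostMs; // assumed cost of evacuating this region

    MarkedRegion(String name, long capacityBytes, long liveBytes, double predictedCostMs) {
        this.name = name;
        this.capacityBytes = capacityBytes;
        this.liveBytes = liveBytes;
        this.predictedCostMs = predictedCostMs;
    }

    long garbageBytes() {
        return capacityBytes - liveBytes;
    }
}

final class GarbageFirstSketch {
    // Visit regions from most to least garbage and keep each one that still
    // fits in the remaining pause budget.
    static List<String> chooseRegions(List<MarkedRegion> candidates, double pauseTargetMs) {
        List<MarkedRegion> byGarbage = new ArrayList<>(candidates);
        byGarbage.sort(Comparator.comparingLong(MarkedRegion::garbageBytes).reversed());

        List<String> chosen = new ArrayList<>();
        double spentMs = 0.0;
        for (MarkedRegion region : byGarbage) {
            if (spentMs + region.predictedCostMs <= pauseTargetMs) {
                chosen.add(region.name);
                spentMs += region.predictedCostMs;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<MarkedRegion> regions = Arrays.asList(
                new MarkedRegion("A", 1000, 100, 4.0),
                new MarkedRegion("B", 1000, 900, 9.0),
                new MarkedRegion("C", 1000, 300, 5.0));
        System.out.println(chooseRegions(regions, 10.0)); // prints [A, C]
    }
}
```

With a 10 ms budget, region B is skipped: it is mostly live data, so it offers the least garbage and would also blow the budget.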

The regions identified by G1 as ripe for reclamation are garbage collected using evacuation. G1 copies objects from one or more regions of the heap to a single region on the heap, and in the process both compacts and frees up memory. This evacuation is performed in parallel on multi-processors, to decrease pause times and increase throughput. Thus, with each garbage collection, G1 continuously works to reduce fragmentation, working within the user defined pause times. This is beyond the capability of both the previous methods. CMS (Concurrent Mark Sweep ) garbage collector does not do compaction. ParallelOld garbage collection performs only whole-heap compaction, which results in considerable pause times.

Interpretation:

“The regions identified by G1 as ripe for reclamation are garbage collected using evacuation. ”

The regions that G1 identifies as ripe for reclamation are garbage collected by evacuation.

“G1 copies objects from one or more regions of the heap to a single region on the heap, and in the process both compacts and frees up memory. ”

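G1 copies objects from one or more regions of the heap into a single region, compacting and freeing memory in the process. [To illustrate: suppose the live objects in the regions shown in the figure add up to no more than one region's worth of space:

Those three live objects are then copied together into another single region, for example a Survivor region, like so:

As they are placed into S they are compacted, and the space in the original three Eden Space regions is freed and can hold other objects.]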
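The Eden-to-Survivor example can be sketched in a few lines of Java. This is only a toy model under stated assumptions: `EvacRegion` and `EvacuationSketch` are hypothetical names, objects are represented by strings, and a region tracks nothing but its live objects. It shows the two effects the text describes: live objects end up packed together in one destination region, and the source regions become entirely free.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical region that tracks only the names of its live objects.
final class EvacRegion {
    final String name;
    final List<String> liveObjects = new ArrayList<>();

    EvacRegion(String name, String... live) {
        this.name = name;
        liveObjects.addAll(Arrays.asList(live));
    }
}

final class EvacuationSketch {
    // Copy every live object from the source regions into one destination
    // region, then empty the sources so they can be reused as free regions.
    static int evacuate(List<EvacRegion> sources, EvacRegion destination) {
        int moved = 0;
        for (EvacRegion source : sources) {
            destination.liveObjects.addAll(source.liveObjects); // packed back to back
            moved += source.liveObjects.size();
            source.liveObjects.clear(); // the whole source region is now free
        }
        return moved;
    }

    public static void main(String[] args) {
        List<EvacRegion> eden = Arrays.asList(
                new EvacRegion("E1", "a"),
                new EvacRegion("E2", "b"),
                new EvacRegion("E3", "c"));
        EvacRegion survivor = new EvacRegion("S");
        int moved = evacuate(eden, survivor);
        System.out.println(moved + " objects now in " + survivor.name + ": " + survivor.liveObjects);
    }
}
```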

“This evacuation is performed in parallel on multi-processors, to decrease pause times and increase throughput. Thus, with each garbage collection, G1 continuously works to reduce fragmentation, working within the user defined pause times. ”

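This evacuation runs in parallel on multiprocessors, which reduces pause times and increases throughput. So with every garbage collection G1 keeps working to reduce fragmentation, and it does so within the user-defined pause time.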

“This is beyond the capability of both the previous methods. CMS (Concurrent Mark Sweep ) garbage collector does not do compaction. ParallelOld garbage collection performs only whole-heap compaction, which results in considerable pause times.”

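This is beyond the capability of both earlier collectors. The CMS garbage collector does not compact. The ParallelOld collector only performs whole-heap compaction, which results in considerable pause times.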

It is important to note that G1 is not a real-time collector. It meets the set pause time target with high probability but not absolute certainty. Based on data from previous collections, G1 does an estimate of how many regions can be collected within the user specified target time. Thus, the collector has a reasonably accurate model of the cost of collecting the regions, and it uses this model to determine which and how many regions to collect while staying within the pause time target.

Interpretation:

“It is important to note that G1 is not a real-time collector. It meets the set pause time target with high probability but not absolute certainty. ”

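The important point is that G1 is not a real-time collector. It meets the configured pause time target with high probability, but not with absolute certainty.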

“Based on data from previous collections, G1 does an estimate of how many regions can be collected within the user specified target time. ”

Based on data from previous collections, G1 estimates how many regions it can collect within the user-specified target time.

“Thus, the collector has a reasonably accurate model of the cost of collecting the regions, and it uses this model to determine which and how many regions to collect while staying within the pause time target.”

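As a result, the collector has a reasonably accurate model of what collecting regions costs, and it uses this model to decide which regions, and how many, to collect while staying within the pause time target.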
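As a rough illustration of "estimating from previous collections", the Java sketch below averages the cost per region over earlier pauses and divides the pause target by that figure. The class and method names are hypothetical, and a plain mean is my own simplifying assumption; the documentation quoted here does not say which statistics G1 actually keeps.

```java
final class PausePredictionSketch {
    // Average evacuation cost per region over earlier collections.
    static double averageMsPerRegion(double[] pastPauseMs, int[] pastRegionCounts) {
        double totalMs = 0.0;
        long totalRegions = 0;
        for (int i = 0; i < pastPauseMs.length; i++) {
            totalMs += pastPauseMs[i];
            totalRegions += pastRegionCounts[i];
        }
        return totalRegions == 0 ? 0.0 : totalMs / totalRegions;
    }

    // How many regions fit into the pause target at that average cost.
    static int regionsWithinTarget(double msPerRegion, double pauseTargetMs) {
        if (msPerRegion <= 0.0) {
            return 0; // no history yet, so make no prediction
        }
        return (int) Math.floor(pauseTargetMs / msPerRegion);
    }

    public static void main(String[] args) {
        // Two earlier pauses: 40 ms and 60 ms, each collecting 10 regions.
        double perRegion = averageMsPerRegion(new double[] {40.0, 60.0}, new int[] {10, 10});
        System.out.println(regionsWithinTarget(perRegion, 200.0)); // prints 40
    }
}
```

Because the estimate comes from history rather than a guarantee, a collection can still overshoot the target, which is exactly why G1 is described as "not a real-time collector".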

Note: G1 has both concurrent (runs along with application threads, e.g., refinement, marking, cleanup) and parallel (multi-threaded, e.g., stop the world) phases. Full garbage collections are still single threaded, but if tuned properly your applications should avoid full GCs.

Interpretation:

“Note: G1 has both concurrent (runs along with application threads, e.g., refinement, marking, cleanup) and parallel (multi-threaded, e.g., stop the world) phases. ”

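Note: G1 has both concurrent phases (running alongside the application threads, e.g. refinement, marking, cleanup) and parallel phases (multi-threaded, e.g. stop-the-world).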

“Full garbage collections are still single threaded, but if tuned properly your applications should avoid full GCs.”

Full GCs are still single-threaded, but if the JVM is tuned properly your application should avoid full GCs.

Next, let's look at “G1 Footprint”, that is, how much memory G1 itself takes up:

If you migrate from the ParallelOldGC or CMS collector to G1, you will likely see a larger JVM process size. This is largely related to "accounting" data structures such as Remembered Sets and Collection Sets.

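If you migrate from the ParallelOldGC or CMS collector to G1, you will likely see a larger JVM process size. This is largely due to "accounting" data structures such as Remembered Sets (RSets) and Collection Sets (CSets). [This was also covered in the earlier theory posts.]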

Remembered Sets or RSets track object references into a given region. There is one RSet per region in the heap. The RSet enables the parallel and independent collection of a region. The overall footprint impact of RSets is less than 5%.

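Remembered Sets, or RSets, track object references that point into a given region. Every region in the heap has one RSet. The RSet is what makes it possible to collect a region in parallel and independently of the others. The overall footprint impact of RSets is less than 5%.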
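Here is a small Java sketch of the idea behind an RSet. It is a deliberately coarse model and not how HotSpot stores them: `RememberedSetSketch` is a hypothetical class, regions are plain integer ids, and each entry records only which other region holds a reference (an assumption made to keep the example short). The point it illustrates is that a region can be collected by consulting its own RSet instead of scanning the whole heap.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class RememberedSetSketch {
    // One entry per region: target region id -> ids of regions that hold
    // references pointing into it.
    private final Map<Integer, Set<Integer>> incoming = new HashMap<>();

    // Called whenever a reference from one region to another is stored.
    void recordReference(int fromRegion, int toRegion) {
        if (fromRegion == toRegion) {
            return; // references inside a region need no tracking
        }
        incoming.computeIfAbsent(toRegion, id -> new TreeSet<>()).add(fromRegion);
    }

    // The only other regions that must be examined to collect this region.
    Set<Integer> regionsToScan(int region) {
        return incoming.getOrDefault(region, Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        RememberedSetSketch rsets = new RememberedSetSketch();
        rsets.recordReference(1, 3);
        rsets.recordReference(2, 3);
        rsets.recordReference(3, 3);
        rsets.recordReference(1, 2);
        System.out.println(rsets.regionsToScan(3)); // prints [1, 2]
    }
}
```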

Collection Sets or CSets the set of regions that will be collected in a GC. All live data in a CSet is evacuated (copied/moved) during a GC. Sets of regions can be Eden, survivor, and/or old generation. CSets have a less than 1% impact on the size of the JVM.

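Collection Sets, or CSets, are the sets of regions that will be collected in a GC. All live data in a CSet is evacuated (copied/moved) during the GC. The regions in a set can be Eden, survivor, and/or old generation regions. CSets add less than 1% to the size of the JVM.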
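To get a feel for those two percentages, the Java snippet below turns them into upper bounds in megabytes. Note the assumption: I am applying both percentages to the heap size, which the quoted text does not spell out, and `FootprintSketch` is just a hypothetical helper. Treat the result as a back-of-the-envelope ceiling, not a measurement.

```java
final class FootprintSketch {
    // Upper bound for RSets, assuming "less than 5%" is relative to the heap.
    static long rsetUpperBoundMb(long heapMb) {
        return heapMb * 5 / 100;
    }

    // Upper bound for CSets, assuming "less than 1%" is relative to the heap.
    static long csetUpperBoundMb(long heapMb) {
        return heapMb / 100;
    }

    public static void main(String[] args) {
        long heapMb = 6144; // a 6 GB heap, the size the tutorial mentions for G1
        System.out.println("RSets: under " + rsetUpperBoundMb(heapMb) + " MB"); // under 307 MB
        System.out.println("CSets: under " + csetUpperBoundMb(heapMb) + " MB"); // under 61 MB
    }
}
```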

Moving on to “Recommended Use Cases for G1”:

The first focus of G1 is to provide a solution for users running applications that require large heaps with limited GC latency. This means heap sizes of around 6GB or larger, and stable and predictable pause time below 0.5 seconds.

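G1's primary focus is to provide a solution for users whose applications need large heaps with limited GC latency. That means heap sizes of around 6GB or larger, and stable, predictable pause times below 0.5 seconds.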

Applications running today with either the CMS or the ParallelOldGC garbage collector would benefit switching to G1 if the application has one or more of the following traits.

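Applications currently running with either the CMS or the ParallelOldGC collector would benefit from switching to G1 if they have one or more of the following traits: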

      Full GC durations are too long or too frequent.
      Full GCs take too long or happen too often.
      The rate of object allocation rate or promotion varies significantly.
      The object allocation rate or the promotion rate varies significantly.
      Undesired long garbage collection or compaction pauses (longer than 0.5 to 1 second)
      Garbage collection or compaction pauses are undesirably long (longer than 0.5 to 1 second).

Note: If you are using CMS or ParallelOldGC and your application is not experiencing long garbage collection pauses, it is fine to stay with your current collector. Changing to the G1 collector is not a requirement for using the latest JDK.
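Note: If you are using CMS or ParallelOldGC and your application is not seeing long garbage collection pauses, it is fine to stay with your current collector. Switching to the G1 collector is not required in order to use the latest JDK.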
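Before deciding whether to switch, it helps to know which collector a JVM is actually running and how much time it spends collecting. The Java sketch below uses the standard `java.lang.management` API for that; only the class name `CollectorReport` is my own.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class CollectorReport {
    // Names of the collectors registered in the running JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Collection count and accumulated collection time since JVM start.
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(bean.getName()
                    + ": collections=" + bean.getCollectionCount()
                    + ", timeMs=" + bean.getCollectionTime());
        }
    }
}
```

G1 is selected with the `-XX:+UseG1GC` flag, and the pause time target discussed earlier is set with `-XX:MaxGCPauseMillis`. The exact bean names printed depend on the JVM and the collector in use, so I have not listed expected output here.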

That wraps up the reading of these three sections.
