Original document download: IBM Garbage Collection and Storage Allocation Techniques
1 Introduction
This document describes the functions of the Storage (ST) component from release 1.2.2
to 1.4.1, Service Refresh 1. (IBM has not published an updated version of this document for JDK 5.0/6.0.)
The ST component allocates areas of storage in the heap. These areas of storage define
objects, arrays, and classes. When an area of storage has been allocated, an object
continues to be live while a reference (pointer) to it exists somewhere in the active state
of the JVM; thus the object is reachable. When an object ceases to be referenced from
the active state, it becomes garbage and can be reclaimed for reuse. When this
reclamation occurs, the Garbage Collector must process a possible finalizer and also
ensure that any monitor that is associated with the object is returned to the pool of
available monitors (sometimes called the monitor cache). Not all objects are treated
equally by the ST component. Some (ClassClass and Thread) are allocated into special
regions of the heap (pinned clusters); others (Reference and its derivatives) are treated
specially during tracing of the heap. More details on these special cases are given in
section 4.4 “Reference Objects”.
1.1 Object allocation
Object allocation is driven by calls to one of the allocation interfaces; for example,
stCacheAlloc, stAllocObject, stAllocArray, and stAllocClass. (I have not come across these functions before.) These interfaces all allocate a given amount of storage from the heap, but have different parameters and semantics. The
stCacheAlloc routine is specifically designed to deliver optimal allocation performance
for small objects. Objects are allocated directly from a thread-local allocation buffer that
the thread has previously allocated from the heap. A new object is allocated from the end
of this cache without the need to grab the heap lock; therefore, it is very efficient. Objects
that are allocated through the stAllocObject and stAllocArray interfaces are, if small
enough (currently 512 bytes), also allocated from the cache.
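The allocation path described above can be sketched in a few lines of Python. This is only an illustration, not the JVM's actual C code: the names Heap and ThreadLocalCache, the 4096-byte cache size, and the choice to send larger objects straight to the locked heap are assumptions. Only the 512-byte limit, the allocation from the end of the cache, and the lock-free fast path come from the text; the 8-byte rounding borrows the GRAIN mentioned in section 1.3.1.

```python
import threading

SMALL_OBJECT_LIMIT = 512  # bytes; the threshold quoted in the text
GRAIN = 8                 # assumed allocation granularity (see section 1.3.1)


class Heap:
    """Hypothetical shared heap: every allocation takes the heap lock."""

    def __init__(self, size):
        self.size = size
        self.next_free = 0
        self.lock = threading.Lock()
        self.lock_acquisitions = 0  # counted only to make the effect visible

    def allocate(self, nbytes):
        with self.lock:
            self.lock_acquisitions += 1
            if self.next_free + nbytes > self.size:
                raise MemoryError("heap exhausted; a real JVM would start a GC")
            addr = self.next_free
            self.next_free += nbytes
            return addr


class ThreadLocalCache:
    """Hypothetical per-thread buffer; one instance per thread is assumed."""

    def __init__(self, heap, cache_size=4096):
        self.heap = heap
        self.cache_size = cache_size
        self.base = 0
        self.top = 0  # base == top means no buffer has been obtained yet

    def alloc(self, nbytes):
        nbytes = (nbytes + GRAIN - 1) & ~(GRAIN - 1)  # round up to GRAIN
        if nbytes > SMALL_OBJECT_LIMIT:
            return self.heap.allocate(nbytes)  # slow path, takes the lock
        if self.top - self.base < nbytes:
            # Refill: the only point where the fast path touches the heap lock.
            # Any space left in the old buffer is simply abandoned here.
            self.base = self.heap.allocate(self.cache_size)
            self.top = self.base + self.cache_size
        self.top -= nbytes  # carve the object from the end of the cache
        return self.top
```

After the first refill, every small allocation is just a subtraction and a comparison, which is why the text calls this path very efficient.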
1.2 Reachable Objects
The active state of the JVM is made up of the set of stacks that represents the threads, the
statics that are inside Java classes, and the set of local and global JNI references. All
functions that are invoked inside the JVM itself create a frame on the C stack. This
information is used to find the roots. These roots are then used to find references to other
objects. This process is repeated until all reachable objects are found.
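The repeated "follow the references" step is a transitive closure over the object graph. A minimal sketch, assuming the graph is given as a plain dictionary (the function name and the data layout are hypothetical, chosen only for illustration):

```python
def find_reachable(roots, references):
    """Return every object reachable from the roots.

    roots      -- iterable of object ids found in stacks, statics, JNI refs
    references -- dict mapping an object id to the ids it refers to
    """
    reachable = set()
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if obj in reachable:
            continue  # already visited; this also handles cycles
        reachable.add(obj)
        worklist.extend(references.get(obj, ()))
    return reachable
```

For example, with references = {"a": ["b"], "b": ["c", "a"], "d": ["a"]} and roots = ["a"], object "d" is never reached and is therefore garbage, even though it refers to a live object.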
1.3 Garbage Collection
When the JVM cannot allocate an object from the current heap because of lack of space,
the first task is to collect all the garbage that is in the heap. This process starts when any
thread calls stGC either as a result of allocation failure, or by a specific call to
System.gc() (does this trigger a full GC?). The first step is to get all the locks that the garbage collection process needs.
This step ensures that other threads are not suspended while they are holding critical
locks. All the other threads are then suspended through an execution manager (XM)
interface, which guarantees to make the suspended state of the thread accessible to the
calling thread. This state is the top and bottom of the stack and the contents of the
registers at the suspension point. It represents the state that is required to trace for object
references. Garbage collection can then begin. It occurs in three phases:
- Mark
- Sweep
- Compaction (optional)
1.3.1 Mark Phase
In the mark phase, all the objects that are referenced from the thread stacks, statics,
interned strings, and JNI references are identified. This action creates the root set of
objects that the JVM references. Each of those objects might, in turn, reference others.
Therefore, the second part of the process is to scan each object for other references that it
makes. These two processes together generate a vector that defines live objects.
Each bit in the vector (allocbits) corresponds to an 8-byte section of the heap. The
appropriate bit is set when an object is allocated. When the Garbage Collector traces the
stacks, it first compares the pointer against the low and high limits of the heap. It then
ensures that the pointer is pointing to an object that is on an 8-byte boundary (GRAIN)
and that the appropriate allocbit is set to indicate that the pointer is actually pointing at an
object. The Garbage Collector now sets a bit in the markbits vector to indicate that the
object has been referenced.
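The three checks applied to a value found on a stack (heap limits, 8-byte alignment, allocbit set) can be sketched as follows. The class MarkState and its method names are hypothetical, and Python lists of booleans stand in for the real bit vectors:

```python
GRAIN = 8  # bytes covered by one bit, as stated in the text


class MarkState:
    """Hypothetical container for the allocbits and markbits vectors."""

    def __init__(self, heap_low, heap_high):
        self.heap_low = heap_low
        self.heap_high = heap_high
        nbits = (heap_high - heap_low) // GRAIN
        self.allocbits = [False] * nbits
        self.markbits = [False] * nbits

    def bit_index(self, addr):
        return (addr - self.heap_low) // GRAIN

    def record_allocation(self, addr):
        # Done by the allocator: one allocbit per object start.
        self.allocbits[self.bit_index(addr)] = True

    def try_mark(self, candidate):
        """Mark the object if the stack value passes all three checks."""
        if not (self.heap_low <= candidate < self.heap_high):
            return False  # outside the heap: cannot be an object pointer
        if candidate % GRAIN != 0:
            return False  # not on an 8-byte boundary
        i = self.bit_index(candidate)
        if not self.allocbits[i]:
            return False  # no object starts at this address
        self.markbits[i] = True
        return True
```

A value that fails any check is treated as ordinary data (an integer or a float, for instance) and is ignored.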
Finally, the Garbage Collector scans the fields of the object to search for other object
references that the object makes. This scan of the objects is done accurately because the
method pointer that is stored in its first word enables the Garbage Collector to know the
class of the object. The Garbage Collector therefore has access to a vector of offsets that
the classloader builds at class linking time (before the creation of the first instance). The
offsets vector gives the offsets of the fields in the object that contain object
references. (In other words, the Garbage Collector can find an object's reference fields through its class definition.)
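To illustrate how an offsets vector makes the scan accurate, here is a small sketch. The function name, the 8-byte little-endian field layout, and the use of 0 for a null reference are all assumptions made for the example; the text only states that the offsets identify the reference fields.

```python
import struct


def referenced_addresses(heap_bytes, obj_addr, offsets):
    """Read only the fields listed in the class's offsets vector.

    heap_bytes -- bytearray standing in for the heap
    obj_addr   -- start of the object inside heap_bytes
    offsets    -- byte offsets of the reference fields (hypothetical layout)
    """
    refs = []
    for off in offsets:
        (value,) = struct.unpack_from("<Q", heap_bytes, obj_addr + off)
        if value != 0:  # assumed encoding of a null reference
            refs.append(value)
    return refs
```

Fields that are not listed in the offsets vector, such as an int or a double, are never read, so they cannot be mistaken for references.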
1.3.2 Sweep Phase
After the mark phase, the markbits vector contains a bit for every reachable object that is
in the heap. The markbits vector must be a subset of the allocbits vector. The task of the
sweep phase is to identify the difference between these vectors; that is, objects that have been
allocated but are no longer referenced.
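The objects that the sweep looks for have their allocbit set and their markbit clear. With the two vectors modeled as Python integers (an assumption made purely for brevity; the function name is hypothetical), that is a single bitwise expression:

```python
def garbage_bits(allocbits, markbits):
    """Return a bit vector with a 1 for every allocated but unmarked object."""
    # Sanity check taken from the text: every marked object must be allocated.
    assert markbits & ~allocbits == 0, "markbits must be a subset of allocbits"
    return allocbits & ~markbits
```

For instance, with allocbits = 0b101101 and markbits = 0b001001, the objects at bit positions 2 and 5 are garbage.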
The original technique for this sweep phase was to start a scan at the bottom of the heap,
and visit each object in turn. The length of each object was held in the word that
immediately preceded it on the heap. At each object, the appropriate allocbit and markbit
were tested to locate the garbage.
Now, the bitsweep technique avoids the need to scan the objects that are in the heap and
therefore avoids the associated overhead cost for paging. In the bitsweep technique, the
markbits vector is examined directly to look for long sequences of zeros (not marked),
which probably identify free space. When such a long sequence is found, the length of
the object that is at the start of the sequence is examined to determine the amount of free
space that is to be released.
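One possible reading of the bitsweep paragraph is sketched below. Because a markbit is set only at the start of an object, the zeros that follow a marked bit partly cover the body of that live object, so its length has to be consulted before the free space can be measured. The function name, the live_sizes dictionary (standing in for the length word of each object), and the min_run threshold are assumptions, not details taken from the IBM code.

```python
def bitsweep(markbits, live_sizes, min_run):
    """Find free chunks by scanning markbits for long runs of zeros.

    markbits   -- list of booleans, one per grain; True at a live object start
    live_sizes -- dict: grain index of a live object -> its length in grains
    min_run    -- shortest run of zeros worth examining (hypothetical tunable)
    Returns a list of (first_free_grain, length_in_grains) tuples.
    """
    free = []
    n = len(markbits)
    i = 0
    while i < n:
        if markbits[i]:
            i += 1
            continue
        start = i
        while i < n and not markbits[i]:
            i += 1
        if i - start >= min_run:
            if start > 0:
                # The run begins right after a marked bit: skip that object's body.
                free_start = (start - 1) + live_sizes[start - 1]
            else:
                free_start = 0
            if free_start < i:
                free.append((free_start, i - free_start))
    return free
```

Only the objects that precede a long run of zeros have their length read; short runs are skipped without touching the heap at all, which is the source of the paging savings described above.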
1.3.3 Compaction Phase
After the garbage has been removed from the heap, the Garbage Collector can compact
the resulting set of objects to remove the spaces that are between them. Because
compaction can take a long time, it is avoided if possible. Compaction, therefore, is a rare
event. Compaction avoidance is explained in more detail later in section “4.3.1
Compaction Avoidance”.
The process of compaction is complicated because the JVM no longer uses handles. If
any object is moved, the Garbage Collector must change all the references that exist to it.
If one of those references was from a stack, and therefore the Garbage Collector is not
sure that it was an object reference (it might have been a float, for example), the Garbage
Collector cannot move the object. Such objects that are temporarily fixed in position are
referred to as dosed in the code and have the dosed bit set in the header word to indicate
this fact. Similarly, objects can be pinned during some JNI operations. Pinning has the
same effect, but is permanent until the object is explicitly unpinned by JNI. Objects that
remain mobile are compacted in two phases by taking advantage of the fact that the mptr
is known to have the low three bits set to zero. One of these bits can therefore be used to
denote the fact that it has been swapped. Note that this swapped bit is applied in two
places: the link field (where it is known as OLINK_IsSwapped), and also the mptr (where
it is known as GC_FirstSwapped). In both cases, the least significant bit (0x01) is being
set.
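The trick of borrowing a low-order bit of an aligned pointer can be shown with a short sketch. The helper names are hypothetical; only the bit value and the fact that the low three bits of the mptr are zero come from the text.

```python
SWAPPED_BIT = 0x01  # least significant bit, as stated in the text
LOW_BITS = 0x07     # the three low bits that are known to be zero


def set_swapped(word):
    """Tag a pointer-sized word as swapped."""
    return word | SWAPPED_BIT


def is_swapped(word):
    return bool(word & SWAPPED_BIT)


def clear_swapped(word):
    """Recover the original pointer value."""
    return word & ~SWAPPED_BIT
```

Because the pointer is 8-byte aligned, setting and later clearing the bit loses no information: clear_swapped(set_swapped(p)) == p for any p whose low three bits are zero.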
At the end of the compaction phase, the threads are restarted through an XM interface.