An advantage of this scheme is that it can run in small chunks of time closely interwoven with the execution of the program. This characteristic makes it particularly suitable for real-time environments where the program can't be interrupted for very long. A disadvantage of reference counting is that it does not detect cycles. A cycle is two or more objects that refer to one another, for example, a parent object that has a reference to its child object, which has a reference back to its parent. These objects will never have a reference count of zero even though they may be unreachable by the roots of the executing program. Another disadvantage is the overhead of incrementing and decrementing the reference count each time. Because of these disadvantages, reference counting currently is out of favor. It is more likely that the JVMs you encounter in the real world will use a tracing algorithm in their garbage-collected heaps.
Tracing collectors
Tracing garbage collectors trace out the graph of object references starting with the root nodes. Objects that are encountered during the trace are marked in some way. Marking is generally done by either setting flags in the objects themselves or by setting flags in a separate bitmap. After the trace is complete, unmarked objects are known to be unreachable and can be garbage collected.
The basic tracing algorithm is called mark and sweep. This name refers to the two phases of the garbage collection process. In the mark phase, the garbage collector traverses the tree of references and marks each object it encounters. In the sweep phase unmarked objects are freed, and the resulting memory is made available to the executing program. In the JVM the sweep phase must include finalization of objects.
Some Java objects have finalizers, others do not. Objects with finalizers that are left unmarked after the sweep phase must be finalized before they are freed. Unmarked objects without finalizers may be freed immediately unless referred to by an unmarked finalizable object. All objects referred to by a finalizable object must remain on the heap until after the object has been finalized.
Compacting collectors
Garbage collectors of JVMs will likely have a strategy to combat heap fragmentation. Two strategies commonly used by mark and sweep collectors are compacting and copying. Both of these approaches move objects on the fly to reduce heap fragmentation. Compacting collectors slide live objects over free memory space toward one end of the heap. In the process the other end of the heap becomes one large contiguous free area. All references to the moved objects are updated to refer to the new location.
Updating references to moved objects is sometimes made simpler by adding a level of indirection to object references. Instead of referring directly to objects on the heap, object references refer to a table of object handles. The object handles refer to the actual objects on the heap. When an object is moved, only the object handle must be updated with the new location. All references to the object in the executing program will still refer to the updated handle, which did not move. While this approach simplifies the job of heap defragmentation, it adds a performance overhead to every object access.
Copying collectors
Copying garbage collectors move all live objects to a new area. As the objects are moved to the new area, they are placed side by side, thus eliminating any free spaces that may have separated them in the old area. The old area is then known to be all free space. The advantage of this approach is that objects can be copied as they are discovered by the traversal from the root nodes. There are no separate mark and sweep phases. Objects are copied to the new area on the fly, and forwarding pointers are left in their old locations. The forwarding pointers allow objects encountered later in the traversal that refer to already copied objects to know the new location of the copied objects.
A common copying collector is called stop and copy. In this scheme, the heap is divided into two regions. Only one of the two regions is used at any time. Objects are allocated from one of the regions until all the space in that region has been exhausted. At that point program execution is stopped and the heap is traversed. Live objects are copied to the other region as they are encountered by the traversal. When the stop and copy procedure is finished, program execution resumes. Memory will be allocated from the new heap region until it too runs out of space. At that point the program will once again be stopped. The heap will be traversed and live objects will be copied back to the original region. The cost associated with this approach is that twice as much memory is needed for a given amount of heap space because only half of the available memory is used at any time.
Heap Of Fish: a garbage-collected heap in action
The applet below demonstrates a mark and sweep garbage-collected heap that uses compaction. It uses indirect handles to objects instead of direct references to facilitate compaction. It is called Heap Of Fish because the only type of objects stored on the heap for this demonstration are fish objects that are defined as follows:
class YellowFish {
YellowFish myFriend;
}
class BlueFish {
BlueFish myFriend;
YellowFish myLunch;
}
class RedFish {
RedFish myFriend;
BlueFish myLunch;
YellowFish mySnack;
}
As you can see, there are three classes of fish -- red, blue, and yellow. The red fish is the largest as it has three instance variables. The yellow fish, with only one instance variable, is the smallest fish. The blue fish has two instance variables and is therefore medium-sized.
The instance variables of fish objects are references to other fish objects. BlueFish.myLunch, for example, is a reference to a YellowFish object. In this implementation of a garbage-collected heap, a reference to an object occupies four bytes. Therefore, the size of a RedFish object is 12 bytes, the size of a BlueFish object is eight bytes, and the size of a YellowFish object is four bytes.
A big difference between the Heap Of Fish code and the kind of code likely to be found in a real JVM stems from the fact that Java does not have pointers. The heaps of real world JVMs would use pointers where Heap Of Fish uses array indexes. In the sections that follow I describe some of the structure of the Java code that implements the heap in the applet. If you are curious about how the heap is implemented you can consult the source code for the ultimate level of detail. The heap data structures and behavior are implemented in the applet source as class GCHeap.
Swimming fish
Heap Of Fish has five modes, which are selectable via radio buttons at the bottom left of the applet. When the applet starts it is in swim mode. Swim mode is just a gratuitous animation. The animation is vaguely reminiscent of the familiar image of a big fish about to eat a medium-sized fish, which is about to eat a small fish.
The other four modes -- allocate fish, assign references, garbage collect, and compact heap -- allow you to interact with the heap. You can instantiate new fish objects in the allocate fish mode. The new fish objects go on the heap as all Java objects do. In the assign references mode you can build a network of local variables and fish that refer to other fish. In garbage collect mode, a mark and sweep operation will free any unreferenced fish. The compact heap mode allows you to slide heap objects so that they are side by side at one end of the heap, leaving all free memory as one large contiguous block at the other end of the heap.
Allocate fish
The allocate fish mode shows the two parts that make up the heap, the object pool and handle pool. The object pool is a contiguous block of memory from which space is taken for new objects. The object pool is structured as a series of memory blocks. Each memory block has a four-byte header which indicates the length of the memory block and whether it is free. The headers are shown in the applet as black horizontal lines in the object pool.
The object pool in Heap Of Fish is implemented as an array of ints. The first header is always at objectPool[0]. The object pool's series of memory blocks can be traversed by hopping from header to header. Each header gives the length of its memory block, which also reveals where the next header is going to be. The header of the next memory block will be the first int immediately following the current memory block. When a new object is allocated the object pool is traversed until a memory block is encountered with enough space to accommodate the new object. Allocated objects in the object pool are shown as colored bars. YellowFish objects are shown in yellow, BlueFish objects are shown in blue, and RedFish objects are shown in red. Free memory blocks, those that currently contain no fish, are shown in white.
The handle pool in Heap Of Fish is implemented as an array of objects of a class named ObjectHandle. An ObjectHandle contains information about an object, including the vital index into the object pool array. The object pool index functions as a reference to the actual allocated object's instance data in the object pool. The ObjectHandle also reveals information about the class of the fish object. In a real JVM, each allocated object would need to be associated with the information read in from the class file such as the method bytecodes, names of the class, its superclass, any interfaces it implements, its fields, and the type signatures of its methods and fields. In Heap Of Fish, the ObjectHandle associates each allocated object with information such as its class -- whether it is a RedFish, BlueFish, or YellowFish -- and some data used in displaying the fish in the applet user interface.
The handle pool exists to make it easier to defragment the object pool through compaction. References to objects, which can be stored in local variables of a stack or the instance variables of other objects, are not direct indexes into the object pool array. They are instead indexes into the handle pool array. When objects in the object pool are moved for compaction, only the corresponding ObjectHandle must be updated with the object's new object pool array index.
Each handle in the handle pool that refers to a fish object is shown as a horizontal bar painted the same color as the fish to which it refers. A line connects each handle to its fish object in the object pool. Those handles that are not currently in use are drawn in white.
Assign references
The assign references mode allows you to build a network of references between local variables and allocated fish objects. A reference is merely a local or instance variable that contains a valid object reference. There are three local variables which serve as the roots of garbage collection, one for each class of fish. If you do not link any fish to local variables, then all fish will be considered unreachable and freed by the garbage collector.
The assign references mode has three sub-modes -- move fish, link fish, and unlink fish. The sub-mode is selectable via radio buttons at the bottom of the canvas upon which the fish appear. In move fish mode, you can click on a fish and drag it to a new position. You might want to do this so that your links are easier to see or just because you feel like rearranging fish in the sea.
In link fish mode, you can click on a fish or local variable and drag a link to another fish. The fish or local variable you initially drag from will be assigned a reference to the fish you ultimately drop upon. A line will be shown connecting the two items. A line connecting two fish will be drawn between the nose of the fish with the reference to the tail of the referenced fish.
Class YellowFish has only one instance variable, myFriend, which is a reference to a YellowFish object. Therefore, a yellow fish can only be linked to one other yellow fish. When you link two yellow fish the myFriend variable of the "dragged from" fish will be assigned the reference to the "dropped upon" fish. If this action were implemented in Java code, it might look like:
// Fish are allocated somewhere
YellowFish draggedFromFish = new YellowFish();
YellowFish droppedUponFish = new YellowFish();
// Sometime later the assignment takes place
draggedFromFish.myFriend = droppedUponFish;
Class BlueFish has two instance variables, BlueFish myFriend and YellowFish myLunch, therefore a blue fish can be linked to one blue fish and one yellow fish. Class RedFish has three instance variables, RedFish myFriend, BlueFish myLunch, and RedFish mySnack. Red fish can therefore link to one instantiation of each class of fish.
In unlink fish mode, you can disconnect fish by moving the cursor over the line connecting two fish. When the cursor is over the line, the line will turn black. If you click a black line the reference will be set to null and the line will disappear.
Garbage collect
The garbage collect mode allows you to drive the mark and sweep algorithm. The Step button at the bottom of the canvas takes you through the garbage collection process one step at a time. You can reset the garbage collector at any time by clicking the Reset button. However, once the garbage collector has swept, the freed fish are gone forever. No manner of frantic clicking of the Reset button will bring them back.
The garbage collection process is divided into a mark phase and a sweep phase. During the mark phase, the fish objects on the heap are traversed depth-first starting from the local variables. During the sweep phase, all unmarked fish objects are freed.
At the start of the mark phase, all local variables, fish, and links are shown in white. Each press of the Step button advances the depth-first traversal one more node. The current node of the traversal, either a local variable or a fish, is shown in magenta. As the garbage collector traverses down a branch, fish along the branch are changed from white to gray. Gray indicates the fish has been reached by the traversal, but there may yet be fish further down the branch that have not been reached. Once the terminal node of a branch is reached, the color of the terminal fish is changed to black and the traversal retreats back up the branch. Once all links below a fish have been marked black, that fish is marked black and the traversal returns back the way it came.
At the end of the mark phase, all reachable fish are colored black and any unreachable fish are colored white. The sweep phase then frees the memory occupied by the white fish.
Compact heap
The compact heap mode allows you to move one object at a time to one end of the object pool. Each press of the Slide button will move one object. You can see that only the object instance data in the object pool moves; the handle in the handle pool does not move.
The Heap Of Fish applet allows you to allocate new fish objects, link fish, garbage collect, and compact the heap. These activities can be done in any order as much as you please. By playing around with this applet you should be able to get a good idea how a mark and sweep garbage-collected heap works. There is some text at the bottom of the applet that should help you as you go along. Happy clicking.