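The card table (CardTable) is one of the most familiar concepts in CMS. G1 not only keeps it but also introduces the remembered set (RSet). So what exactly is a card table? GC originally introduced the card table to mark reference relationships in memory, so that live objects can be traversed quickly by following those relationships. A simple example: take two regions, A and B, each 1MB in size. If A contains an object objA, B contains an object objB, and objA.field = objB, then the two regions are connected by a reference. But starting from region A, how do we find out that it references region B? There are two approaches:
· Traverse all of region A, moving one word at a time (why by word? because the JVM aligns objects, so there is no need to move byte by byte), and check whether each value in memory points into B. This is far too slow. It can be optimized to move one object at a time (which involves how the JVM identifies objects and how it tells pointers from immediate values), but it is still too slow.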
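· Use an additional data structure to describe the reference relationship, for example something like a bitmap that records references between the memory blocks of A and B, with one bit describing one word. On a 32-bit machine (one word is 32 bits), describing a 1MB region takes 32KB (each bitmap byte covers 8 words, that is 32 bytes, and 32KB × 32 = 1MB). We can then add an extra pointer in region A, where objA lives, that points to the bitmap of the other region B. If objA can be mapped to this pointer, then whenever objA is accessed we also follow the extra pointer, and the bitmap it points to identifies the memory block of region B that objA references. Usually it is enough to check whether the corresponding bit in the bitmap is 1; if it is, a reference is considered to exist.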
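The size arithmetic of the bitmap approach can be sketched as follows. This is a minimal illustration, assuming the 1MB region and 4-byte word from the example; `WordBitmap`, `kRegionBytes`, `kWordBytes`, and `kBitmapBytes` are names made up here and are not HotSpot identifiers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed sizes from the example in the text: a 1MB region and 4-byte words.
constexpr std::size_t kRegionBytes = 1024 * 1024;
constexpr std::size_t kWordBytes   = 4;
constexpr std::size_t kBitmapBytes = kRegionBytes / kWordBytes / 8;
static_assert(kBitmapBytes == 32 * 1024, "one bit per word needs 32KB per 1MB region");

// Hypothetical helper: one bit for every word of a region.
class WordBitmap {
 public:
  WordBitmap() : bits_(kBitmapBytes, 0) {}

  // Record that the word at byte offset `offset` takes part in a reference.
  void set(std::size_t offset) {
    assert(offset < kRegionBytes);
    std::size_t word = offset / kWordBytes;
    bits_[word / 8] |= static_cast<std::uint8_t>(1u << (word % 8));
  }

  // True if the bit for the word containing `offset` is 1.
  bool test(std::size_t offset) const {
    assert(offset < kRegionBytes);
    std::size_t word = offset / kWordBytes;
    return ((bits_[word / 8] >> (word % 8)) & 1u) != 0;
  }

  std::size_t size_in_bytes() const { return bits_.size(); }

 private:
  std::vector<std::uint8_t> bits_;
};
```

All byte offsets inside one word share a bit, which is why the traversal in the first approach can step by word rather than by byte.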
class CardTable: public CHeapObj {
  friend class VMStructs;

public:
  typedef uint8_t CardValue;
  // All code generators assume that the size of a card table entry is one byte.
  // They need to be updated to reflect any change to this.
  // This code can typically be found by searching for the byte_map_base() method.
  STATIC_ASSERT(sizeof(CardValue) == 1);

protected:
  // The declaration order of these const fields is important; see the
  // constructor before changing.
  const MemRegion _whole_heap; // the region covered by the card table
  const size_t _page_size;     // page size used when mapping _byte_map
  size_t _byte_map_size;       // in bytes
  CardValue* _byte_map;        // the card marking array
  CardValue* _byte_map_base;
  // Some barrier sets create tables whose elements correspond to parts of
  // the heap; the CardTableBarrierSet is an example. Such barrier sets will
  // normally reserve space for such tables, and commit parts of the table
  // "covering" parts of the heap that are committed. At most one covered
  // region per generation is needed.
  static constexpr int max_covered_regions = 2;
  // The covered regions should be in address order.
  MemRegion _covered[max_covered_regions];
  // The last card is a guard card; never committed.
  MemRegion _guard_region;
  inline size_t compute_byte_map_size(size_t num_bytes);
  enum CardValues {
    clean_card = (CardValue)-1,
    dirty_card = 0,
    CT_MR_BS_last_reserved = 1
  };
  // a word's worth (row) of clean card values
  static const intptr_t clean_card_row = (intptr_t)(-1);
  // CardTable entry size
  static uint _card_shift;
  static uint _card_size;
  static uint _card_size_in_words;
  size_t last_valid_index() const {
    return cards_required(_whole_heap.word_size()) - 1;
  }

private:
  void initialize_covered_region(void* region0_start, void* region1_start);
  MemRegion committed_for(const MemRegion mr) const;

public:
  CardTable(MemRegion whole_heap);
  virtual ~CardTable() = default;
  void initialize(void* region0_start, void* region1_start);
  // *** Barrier set functions.
  // Initialization utilities; covered_words is the size of the covered region
  // in, um, words.
  inline size_t cards_required(size_t covered_words) const {
    assert(is_aligned(covered_words, _card_size_in_words), "precondition");
    return covered_words / _card_size_in_words;
  }
  // Dirty the bytes corresponding to "mr" (not all of which must be
  // covered.)
  void dirty_MemRegion(MemRegion mr);
  // Clear (to clean_card) the bytes entirely contained within "mr" (not
  // all of which must be covered.)
  void clear_MemRegion(MemRegion mr);
  // Return true if "p" is at the start of a card.
  bool is_card_aligned(HeapWord* p) {
    CardValue* pcard = byte_for(p);
    return (addr_for(pcard) == p);
  }
  // Mapping from address to card marking array entry
  CardValue* byte_for(const void* p) const {
    assert(_whole_heap.contains(p),
           "Attempt to access p = " PTR_FORMAT " out of bounds of "
           " card marking array's _whole_heap = [" PTR_FORMAT "," PTR_FORMAT ")",
           p2i(p), p2i(_whole_heap.start()), p2i(_whole_heap.end()));
    CardValue* result = &_byte_map_base[uintptr_t(p) >> _card_shift];
    assert(result >= _byte_map && result < _byte_map + _byte_map_size,
           "out of bounds accessor for card marking array");
    return result;
  }
  // The card table byte one after the card marking array
  // entry for argument address. Typically used for higher bounds
  // for loops iterating through the card table.
  CardValue* byte_after(const void* p) const {
    return byte_for(p) + 1;
  }
  void invalidate(MemRegion mr);
  // Provide read-only access to the card table array.
  const CardValue* byte_for_const(const void* p) const {
    return byte_for(p);
  }
  const CardValue* byte_after_const(const void* p) const {
    return byte_after(p);
  }
  // Mapping from card marking array entry to address of first word
  HeapWord* addr_for(const CardValue* p) const {
    assert(p >= _byte_map && p < _byte_map + _byte_map_size,
           "out of bounds access to card marking array. p: " PTR_FORMAT
           " _byte_map: " PTR_FORMAT " _byte_map + _byte_map_size: " PTR_FORMAT,
           p2i(p), p2i(_byte_map), p2i(_byte_map + _byte_map_size));
    // As _byte_map_base may be "negative" (the card table has been allocated before
    // the heap in memory), do not use pointer_delta() to avoid the assertion failure.
    size_t delta = p - _byte_map_base;
    HeapWord* result = (HeapWord*) (delta << _card_shift);
    assert(_whole_heap.contains(result),
           "Returning result = " PTR_FORMAT " out of bounds of "
           " card marking array's _whole_heap = [" PTR_FORMAT "," PTR_FORMAT ")",
           p2i(result), p2i(_whole_heap.start()), p2i(_whole_heap.end()));
    return result;
  }
  // Mapping from address to card marking array index.
  size_t index_for(void* p) {
    assert(_whole_heap.contains(p),
           "Attempt to access p = " PTR_FORMAT " out of bounds of "
           " card marking array's _whole_heap = [" PTR_FORMAT "," PTR_FORMAT ")",
           p2i(p), p2i(_whole_heap.start()), p2i(_whole_heap.end()));
    return byte_for(p) - _byte_map;
  }
  CardValue* byte_for_index(const size_t card_index) const {
    return _byte_map + card_index;
  }
  // Resize one of the regions covered by the remembered set.
  void resize_covered_region(MemRegion new_region);
  // *** Card-table-RemSet-specific things.
  static uintx ct_max_alignment_constraint();
  static uint card_shift() {
    return _card_shift;
  }
  static uint card_size() {
    return _card_size;
  }
  static uint card_size_in_words() {
    return _card_size_in_words;
  }
  static constexpr CardValue clean_card_val() { return clean_card; }
  static constexpr CardValue dirty_card_val() { return dirty_card; }
  static intptr_t clean_card_row_val() { return clean_card_row; }
  // Initialize card size
  static void initialize_card_size();
  // Card marking array base (adjusted for heap low boundary)
  // This would be the 0th element of _byte_map, if the heap started at 0x0.
  // But since the heap starts at some higher address, this points to somewhere
  // before the beginning of the actual _byte_map.
  CardValue* byte_map_base() const { return _byte_map_base; }
  virtual bool is_in_young(const void* p) const = 0;
};
class G1CardTable : public CardTable {
  friend class VMStructs;
  friend class G1CardTableChangedListener;
  G1CardTableChangedListener _listener;

public:
  enum G1CardValues {
    g1_young_gen = CT_MR_BS_last_reserved << 1,
    // During evacuation we use the card table to consolidate the cards we need to
    // scan for roots onto the card table from the various sources. Further it is
    // used to record already completely scanned cards to avoid re-scanning them
    // when incrementally evacuating the old gen regions of a collection set.
    // This means that already scanned cards should be preserved.
    //
    // The merge at the start of each evacuation round simply sets cards to dirty
    // that are clean; scanned cards are set to 0x1.
    //
    // This means that the LSB determines what to do with the card during evacuation
    // given the following possible values:
    //
    // 11111111 - clean, do not scan
    // 00000001 - already scanned, do not scan
    // 00000000 - dirty, needs to be scanned.
    //
    g1_card_already_scanned = 0x1
  };
  static const size_t WordAllClean = SIZE_MAX;
  static const size_t WordAllDirty = 0;
  STATIC_ASSERT(BitsPerByte == 8);
  static const size_t WordAlreadyScanned = (SIZE_MAX / 255) * g1_card_already_scanned;
  G1CardTable(MemRegion whole_heap): CardTable(whole_heap), _listener() {
    _listener.set_card_table(this);
  }
  static CardValue g1_young_card_val() { return g1_young_gen; }
  static CardValue g1_scanned_card_val() { return g1_card_already_scanned; }
  void verify_g1_young_region(MemRegion mr) PRODUCT_RETURN;
  void g1_mark_as_young(const MemRegion& mr);
  size_t index_for_cardvalue(CardValue const* p) const {
    return pointer_delta(p, _byte_map, sizeof(CardValue));
  }
  // Mark the given card as Dirty if it is Clean. Returns whether the card was
  // Clean before this operation. This result may be inaccurate as it does not
  // perform the dirtying atomically.
  inline bool mark_clean_as_dirty(CardValue* card);
  // Change Clean cards in a (large) area on the card table as Dirty, preserving
  // already scanned cards. Assumes that most cards in that area are Clean.
  inline void mark_range_dirty(size_t start_card_index, size_t num_cards);
  // Change the given range of dirty cards to "which". All of these cards must be Dirty.
  inline void change_dirty_cards_to(CardValue* start_card, CardValue* end_card, CardValue which);
  inline uint region_idx_for(CardValue* p);
  static size_t compute_size(size_t mem_region_size_in_words) {
    size_t number_of_slots = (mem_region_size_in_words / _card_size_in_words);
    return ReservedSpace::allocation_align_size_up(number_of_slots);
  }
  // Returns how many bytes of the heap a single byte of the Card Table corresponds to.
  static size_t heap_map_factor() { return _card_size; }
  void initialize(G1RegionToSpaceMapper* mapper);
  bool is_in_young(const void* p) const override;
};
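A bit-granularity bitmap can describe the reference status of every word precisely, but a single bit carries too little information: it can express only two states, referenced or not referenced. In practice the JVM needs more states during garbage collection. If each word were described by one byte instead, the map for a 1MB region would need 256KB, which is too much: an overhead of 25%. So one workable approach is to let each entry describe an area rather than a single word. The JVM uses 512 bytes as the unit, that is, one byte describes the reference status of 512 bytes. Choosing an area is not only about space efficiency; it also has a practical meaning. A Java object cannot be described by a single word (a parameter controls the minimum object alignment, 8 bytes by default, and because the JVM attaches extra information to each object, the smallest Java object after alignment is 16 bytes). Many Java objects are tens or hundreds of bytes long, so using one byte to describe an area makes sense. I could not find where the number 512 comes from, though. Why would 512 work best? I found no data supporting it. In the JVM version discussed here the value cannot be configured or changed (the listing above keeps the size in the static field _card_size, filled in by initialize_card_size(), so newer versions appear to set it at startup). Still, there is reason to believe that the 512-byte area was chosen to limit the extra memory overhead: with this value, 1MB of memory needs only 2KB of extra space to describe its reference relationships. This raises another problem. Memory inside a 512-byte area may be referenced several times, so the description is coarse, and the whole 512 bytes have to be traversed when the entry is used.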
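The mapping between heap addresses and card table bytes can be sketched as below. This is a simplified model, not the HotSpot implementation: heap addresses are plain integers, the card size is fixed at 512 bytes, and `MiniCardTable` with its members is a hypothetical name. It mirrors the idea behind `byte_for` and `addr_for` in the listing, where an address is shifted right by the card shift and adjusted for the start of the heap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed constants: 512-byte cards; clean = 0xFF and dirty = 0 as in the listing.
constexpr unsigned     kCardShift = 9;
constexpr std::size_t  kCardSize  = std::size_t{1} << kCardShift;
constexpr std::uint8_t kCleanCard = 0xFF;
constexpr std::uint8_t kDirtyCard = 0x00;

// Hypothetical miniature card table over an integer-addressed heap.
class MiniCardTable {
 public:
  MiniCardTable(std::uintptr_t heap_start, std::size_t heap_bytes)
      : heap_start_(heap_start), cards_(heap_bytes / kCardSize, kCleanCard) {
    assert(heap_start % kCardSize == 0 && heap_bytes % kCardSize == 0);
  }

  // Address -> card index: shift by the card size, then subtract the heap's bias.
  std::size_t index_for(std::uintptr_t addr) const {
    assert(addr >= heap_start_ && addr - heap_start_ < cards_.size() * kCardSize);
    return (addr >> kCardShift) - (heap_start_ >> kCardShift);
  }

  // Card index -> address of the first byte covered by that card.
  std::uintptr_t addr_for(std::size_t index) const {
    assert(index < cards_.size());
    return heap_start_ + (index << kCardShift);
  }

  void dirty(std::uintptr_t addr) { cards_[index_for(addr)] = kDirtyCard; }
  bool is_dirty(std::uintptr_t addr) const { return cards_[index_for(addr)] == kDirtyCard; }
  std::size_t size_in_bytes() const { return cards_.size(); }

 private:
  std::uintptr_t heap_start_;
  std::vector<std::uint8_t> cards_;
};
```

For a 1MB heap the table holds 2048 entries, which matches the 2KB figure, and dirtying any address marks the whole 512-byte card that contains it.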
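Another example: suppose two objects, B and C, both lie inside the same 512-byte area. For ease of processing, a reference relationship is recorded using the start address of the object, and that address is aligned to 512, so the card table pointers of B and C both point to the same card table entry. There are again two ways to process the references:
· Process one heap region at a time and traverse the whole region, stepping by object size and consulting the card table. If the corresponding card table entry is set, the object has a reference relationship with an object in another region. The details are covered later, in the discussion of Refine.
· Use an additional data structure to find the real position of the object, without traversing from the beginning. Concurrent marking, described later, uses this method to find the start of the first object.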
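The first processing method, stepping through a region by object size while consulting the card table, might look roughly like the sketch below. It is only an illustration under simplifying assumptions: objects are laid out back to back from offset 0, each card is a plain flag where non-zero means "set" (unlike HotSpot, where dirty is 0), and `ObjInfo`, `card_of`, and `objects_on_set_cards` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kCardBytes = 512;  // assumed card size

// Hypothetical object descriptor: start offset within the region and size in bytes.
struct ObjInfo {
  std::size_t offset;
  std::size_t size;
};

// The card of an object is derived from its start address only.
inline std::size_t card_of(const ObjInfo& o) { return o.offset / kCardBytes; }

// Walk the region object by object and collect the start offsets of the
// objects whose card is set (non-zero in this sketch).
std::vector<std::size_t> objects_on_set_cards(const std::vector<ObjInfo>& objs,
                                              const std::vector<std::uint8_t>& card_set) {
  std::vector<std::size_t> hits;
  std::size_t cursor = 0;
  for (const ObjInfo& o : objs) {
    assert(o.offset == cursor);  // each step advances by the previous object's size
    if (card_set.at(card_of(o)) != 0) {
      hits.push_back(o.offset);
    }
    cursor += o.size;
  }
  return hits;
}
```

Because the card is derived from the start address alone, several objects can share one entry, so a set card only narrows the search to the objects that start inside it.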
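Besides the card table with 512-byte granularity, G1 also uses bitmaps (bitMap). For example, a bitmap can describe how one region references another. Bitmaps are used in many places in the JVM; they can also describe memory allocation, for instance. G1 uses concurrent marking in its mixed collection algorithm, and during concurrent marking a bitmap describes where objects are allocated. A 1MB region can be described with 16KB of extra space (16KB × ObjectAlignmentInBytes × 8 = 1MB). Here ObjectAlignmentInBytes is 8 bytes and refers to object alignment, and the second 8 is the number of bits in a byte, so each bit describes 64 bits (8 bytes) of the heap. Take an object whose aligned length is 24 bytes. In theory it would occupy 3 bits to show that these 24 bytes are in use, but that is unnecessary: marking only the first of the 3 bits is enough, and together with the object size information in the heap region the object can be located exactly. The main purpose is efficiency. Marking one bit instead of three saves a good deal of time, and the larger the object, the greater the saving. These are all implementation details of the source code and deserve careful attention when reading it.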
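The mark bitmap arithmetic can be sketched as follows. This is a minimal model that assumes a 1MB region and the default 8-byte object alignment; `MiniMarkBitmap` and the `k...` constants are hypothetical names, not the G1 classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kHeapRegionBytes = 1024 * 1024;  // 1MB region, as in the text
constexpr std::size_t kObjAlignment    = 8;            // assumed ObjectAlignmentInBytes
constexpr std::size_t kMarkBitmapBytes = kHeapRegionBytes / kObjAlignment / 8;
static_assert(kMarkBitmapBytes == 16 * 1024, "16KB x 8 x 8 = 1MB");

// Hypothetical mark bitmap: one bit per 8-byte unit of the region.
class MiniMarkBitmap {
 public:
  MiniMarkBitmap() : bits_(kMarkBitmapBytes, 0) {}

  // Mark an object by setting only the bit of its first 8-byte unit.
  void mark(std::size_t obj_offset) {
    assert(obj_offset % kObjAlignment == 0 && obj_offset < kHeapRegionBytes);
    std::size_t bit = obj_offset / kObjAlignment;
    bits_[bit / 8] |= static_cast<std::uint8_t>(1u << (bit % 8));
  }

  bool is_marked(std::size_t obj_offset) const {
    assert(obj_offset < kHeapRegionBytes);
    std::size_t bit = obj_offset / kObjAlignment;
    return ((bits_[bit / 8] >> (bit % 8)) & 1u) != 0;
  }

  // Number of bits currently set in the whole bitmap.
  std::size_t count_set_bits() const {
    std::size_t n = 0;
    for (std::uint8_t byte : bits_) {
      for (; byte != 0; byte = static_cast<std::uint8_t>(byte >> 1)) {
        n += byte & 1u;
      }
    }
    return n;
  }

 private:
  std::vector<std::uint8_t> bits_;
};
```

A 24-byte object starting at offset 24 covers three 8-byte units, yet `mark(24)` sets a single bit; the object's extent is recovered from its size rather than from the bitmap.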