1 Android ART memory model
To understand how the ART virtual machine manages memory, start with how it organizes memory. The memory map of a running app looks like this:
12c00000-12cc0000 rw-p 00000000 00:05 18389 /dev/ashmem/dalvik-main space (region space) (deleted)
12cc0000-13bc0000 ---p 000c0000 00:05 18389 /dev/ashmem/dalvik-main space (region space) (deleted)
13bc0000-32c00000 rw-p 00fc0000 00:05 18389 /dev/ashmem/dalvik-main space (region space) (deleted)
70589000-70c84000 rw-p 00000000 103:10 240 /data/dalvik-cache/arm/system@[email protected]
70c84000-7118a000 r--p 00000000 103:10 135 /data/dalvik-cache/arm/system@[email protected]
7118a000-71d47000 r-xp 00506000 103:10 135 /data/dalvik-cache/arm/system@[email protected]
71d47000-71d4e000 rw-p 00000000 00:00 0 [anon:.bss]
71d4e000-71d4f000 r--p 010c3000 103:10 135 /data/dalvik-cache/arm/system@[email protected]
71d4f000-71d50000 rw-p 010c4000 103:10 135 /data/dalvik-cache/arm/system@[email protected]
71d50000-71e04000 rw-p 00000000 00:05 18386 /dev/ashmem/dalvik-zygote space (deleted)
71e04000-71e05000 rw-p 00000000 00:05 21823 /dev/ashmem/dalvik-non moving space (deleted)
71e05000-75551000 ---p 00001000 00:05 21823 /dev/ashmem/dalvik-non moving space (deleted)
75d50000-75d50000 rw-p 0374d000 00:05 21823 /dev/ashmem/dalvik-non moving space (deleted)
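The range 12c00000-32c00000 is the 512 MB heap (the main space), which matches the [dalvik.vm.dex2oat-Xmx]: [512m] setting. It is followed by the mapping of boot.art, which belongs to the image space, then by the load address of boot.oat, and then by the addresses that Zygote allocated and preloaded (non-moving).
ART's memory management code lives in art/runtime/gc/space. The kinds of memory and the relationships between them can be read from the class inheritance diagram: ImageSpace, LargeObjectSpace, ZygoteSpace, and the DlMallocSpace or RosAllocSpace used by the heap. Different kinds of memory use different allocation strategies. There is also a RegionSpace, which is used by the concurrent_copying collector.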
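The 512 MB figure can be checked directly from the addresses in the map. The following is only a small sketch of that arithmetic; the helper name MappingSizeBytes and the constant kMB are made up for illustration and are not part of ART.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: size in bytes of a mapping with addresses [start, end).
constexpr uint64_t MappingSizeBytes(uint64_t start, uint64_t end) {
  return end - start;
}

constexpr uint64_t kMB = 1024 * 1024;

// The three "dalvik-main space" lines together cover 12c00000-32c00000.
static_assert(MappingSizeBytes(0x12c00000, 0x32c00000) == 512 * kMB,
              "main space spans 512 MB");

// Only the first piece (12c00000-12cc0000, rw-p) is accessible at first;
// it is 768 KB, and the ---p piece after it is reserved but not accessible.
static_assert(MappingSizeBytes(0x12c00000, 0x12cc0000) == 768 * 1024,
              "first rw-p piece is 768 KB");
```

The whole range is reserved in the address space, but only the parts mapped rw-p can be used at that moment.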
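RosAlloc is an allocator that was added later; the name stands for runs-of-slots memory allocator. Its basic allocation unit is the slot. Slot sizes range from 16 bytes to 2048 bytes: 16, 32, 48, ..., n*16 up to 512, then 1024 and 2048. Because it serves requests from fixed size classes, it is probably loosely comparable to the kernel's buddy allocator.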
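The size classes listed above can be turned into a rounding rule. This is a sketch based only on the sizes given in the text, not RosAlloc's actual code; the function name SlotSizeFor and the choice to return 0 for oversized requests are assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: the slot size that would serve a request of `size`
// bytes under the size classes 16, 32, ..., 512, 1024, 2048.
// Returns 0 when the request is too large for any slot.
constexpr size_t SlotSizeFor(size_t size) {
  if (size == 0) return 16;                        // smallest slot
  if (size <= 512) return (size + 15) / 16 * 16;   // round up to a multiple of 16
  if (size <= 1024) return 1024;
  if (size <= 2048) return 2048;
  return 0;                                        // not served from slots
}

static_assert(SlotSizeFor(1) == 16, "1 byte rounds up to the 16-byte slot");
static_assert(SlotSizeFor(513) == 1024, "just above 512 jumps to 1024");
```

For example, a 40-byte object would occupy a 48-byte slot, wasting 8 bytes.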
2 Heap garbage collection algorithms
First, the collector types that ART supports:
enum CollectorType {
kCollectorTypeNone,
kCollectorTypeMS,  // Non-concurrent mark-sweep.
kCollectorTypeCMS,  // Concurrent mark-sweep.
kCollectorTypeSS,  // Semi-space / mark-sweep hybrid.
kCollectorTypeGSS,  // Generational variant of kCollectorTypeSS.
kCollectorTypeMC,  // Mark-compact.
// Heap trimming collector, doesn't do any actual collecting.
kCollectorTypeHeapTrim,
kCollectorTypeCC,  // Concurrent copying.
// The background compaction of the concurrent copying collector.
kCollectorTypeCCBackground,
// Instrumentation critical section fake collector.
kCollectorTypeInstrumentation,
// Fake collector for adding or removing application image spaces.
kCollectorTypeAddRemoveAppImageSpace,
// Fake collector used to implement exclusion between GC and debugger.
kCollectorTypeDebugger,
// A homogeneous space compaction collector used in background transition
// when both foreground and background collector are CMS.
kCollectorTypeHomogeneousSpaceCompact,
// Class linker fake collector.
kCollectorTypeClassLinker,
// JIT Code cache fake collector.
kCollectorTypeJitCodeCache,
// Hprof fake collector.
kCollectorTypeHprof,
// Fake collector for installing/removing a system-weak holder.
kCollectorTypeAddRemoveSystemWeakHolder,
// Fake collector type for GetObjectsAllocated
kCollectorTypeGetObjectsAllocated,
// Fake collector type for ScopedGCCriticalSection
kCollectorTypeCriticalSection,
};
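Broadly, Android supports the following collection algorithms:
Mark-sweep: comes in Sticky, Partial, and Full variants, and each can run as Concurrent or Non-Concurrent, as the is_concurrent parameter of the constructor shows: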
MarkSweep::MarkSweep(Heap* heap, bool is_concurrent, const std::string& name_prefix)
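mark_compact: the mark-compact algorithm
concurrent_copying: the concurrent copying algorithm
semi_space: the semi-space algorithm
The heap creates its allocation space according to the foreground collector type: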
if (foreground_collector_type_ == kCollectorTypeCC) {
region_space_ = space::RegionSpace::Create(kRegionSpaceName, region_space_mem_map);
} else if (IsMovingGc(foreground_collector_type_) &&
foreground_collector_type_ != kCollectorTypeGSS) {
bump_pointer_space_ = space::BumpPointerSpace::CreateFromMemMap("Bump pointer space 1",
main_mem_map_1.release());
} else {
CreateMainMallocSpace(main_mem_map_1.release(), initial_size, growth_limit_, capacity_);
if (foreground_collector_type_ == kCollectorTypeGSS) {
bump_pointer_space_ = space::BumpPointerSpace::Create("Bump pointer space 1",
kGSSBumpPointerSpaceCapacity, nullptr);
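As the code above shows, when the concurrent_copying collector is used, memory is allocated from a RegionSpace.
When a moving collector is used and the foreground collector is not kCollectorTypeGSS, memory is allocated from a BumpPointerSpace. The moving collectors are: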
static bool IsMovingGc(CollectorType collector_type) {
return
collector_type == kCollectorTypeSS ||
collector_type == kCollectorTypeGSS ||
collector_type == kCollectorTypeCC ||
collector_type == kCollectorTypeCCBackground ||
collector_type == kCollectorTypeMC ||
collector_type == kCollectorTypeHomogeneousSpaceCompact;
}
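GSS also uses a BumpPointerSpace, but handles it differently: it is created in addition to the main malloc space, in the last branch of the code above.
The mapping between collectors and allocation spaces follows from those branches.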
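The branches quoted earlier can be condensed into one small function. This is a sketch under simplifying assumptions, not ART code: the shortened enum and the function SpacesFor are made up for illustration, and the returned strings only name the space classes that appear in the excerpt.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for ART's CollectorType (illustration only).
enum CollectorType { kMS, kCMS, kSS, kGSS, kMC, kCC, kCCBackground,
                     kHomogeneousSpaceCompact };

// Mirrors the IsMovingGc() excerpt shown above.
bool IsMovingGc(CollectorType t) {
  return t == kSS || t == kGSS || t == kCC || t == kCCBackground ||
         t == kMC || t == kHomogeneousSpaceCompact;
}

// Hypothetical helper: which space(s) the heap creates for a foreground collector.
std::string SpacesFor(CollectorType foreground) {
  if (foreground == kCC) return "RegionSpace";
  if (IsMovingGc(foreground) && foreground != kGSS) return "BumpPointerSpace";
  if (foreground == kGSS) return "MainMallocSpace + BumpPointerSpace";
  return "MainMallocSpace";  // MS and CMS allocate from the malloc space
}
```

Called with kCMS, the sketch falls through to the last branch, which matches the CreateMainMallocSpace call in the excerpt.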
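3 How ART chooses its garbage collection algorithm
Running kill -s QUIT $PID shows which collector the VM is currently using. The command creates a traces.txt file under /data/anr that contains a line like this: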
Start Dumping histograms for 1 iterations for concurrent copying
This line shows that kCollectorTypeCC is in use.
The code that decides which collector ART uses is:
heap_ = new gc::Heap(runtime_options.GetOrDefault(Opt::MemoryInitialSize),
// Override the collector type to CC if the read barrier config.
kUseReadBarrier ? gc::kCollectorTypeCC : xgc_option.collector_type_,
kUseReadBarrier ? BackgroundGcOption(gc::kCollectorTypeCCBackground)
kUseReadBarrier is defined as:
static constexpr bool kUseReadBarrier =
kUseBakerReadBarrier || kUseBrooksReadBarrier || kUseTableLookupReadBarrier;
If kUseReadBarrier is true, the foreground collector is concurrent copying and the background collector is kCollectorTypeCCBackground.
{
// If not set, background collector type defaults to homogeneous compaction.
// If foreground is GSS, use GSS as background collector.
// If not low memory mode, semispace otherwise.
gc::CollectorType background_collector_type_;
gc::CollectorType collector_type_ = (XGcOption{}).collector_type_; // NOLINT [whitespace/braces] [5]
bool low_memory_mode_ = args.Exists(M::LowMemoryMode);
background_collector_type_ = args.GetOrDefault(M::BackgroundGc);
{
XGcOption* xgc = args.Get(M::GcOption);
if (xgc != nullptr && xgc->collector_type_ != gc::kCollectorTypeNone) {
collector_type_ = xgc->collector_type_;
}
}
if (background_collector_type_ == gc::kCollectorTypeNone) {
if (collector_type_ != gc::kCollectorTypeGSS) {
background_collector_type_ = low_memory_mode_ ?
gc::kCollectorTypeSS : gc::kCollectorTypeHomogeneousSpaceCompact;
} else {
background_collector_type_ = collector_type_;
}
}
args.Set(M::BackgroundGc, BackgroundGcOption { background_collector_type_ });
}
As the comments above indicate, when no options are set the foreground collector is CMS. On a low-memory device the background collector is kCollectorTypeSS, as in:
Start Dumping histograms for 1 iterations for marksweep + semispace
On a device that is not low-memory, the background collector is homogeneous space compaction:
Start Dumping histograms for 1 iterations for sticky concurrent mark sweep
If the foreground collector is GSS, the background collector is also GSS.
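The selection rules in this section can be gathered into one function. This is a sketch under stated assumptions rather than the runtime's real code: the enum Gc, the struct Choice, and the function ChooseCollectors are hypothetical, and the CMS default for an unset foreground collector is taken from the description above.

```cpp
#include <cassert>

// Simplified stand-in for gc::CollectorType (illustration only).
enum class Gc { kNone, kCMS, kSS, kGSS, kCC, kCCBackground,
                kHomogeneousSpaceCompact };

struct Choice {
  Gc foreground;
  Gc background;
};

// Hypothetical helper combining the read-barrier override with the
// background-collector defaults described in this section.
Choice ChooseCollectors(bool use_read_barrier, Gc requested_fg,
                        Gc requested_bg, bool low_memory) {
  if (use_read_barrier) {
    // The read barrier configuration forces concurrent copying.
    return {Gc::kCC, Gc::kCCBackground};
  }
  // Assumption from the text: with no option set, the foreground is CMS.
  Gc fg = (requested_fg == Gc::kNone) ? Gc::kCMS : requested_fg;
  Gc bg = requested_bg;
  if (bg == Gc::kNone) {
    if (fg != Gc::kGSS) {
      bg = low_memory ? Gc::kSS : Gc::kHomogeneousSpaceCompact;
    } else {
      bg = fg;  // GSS in the foreground implies GSS in the background
    }
  }
  return {fg, bg};
}
```

An explicitly requested background collector is kept as-is; the defaults only apply when it is unset.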
How can ART's GC algorithm be set manually?
It is controlled through the dalvik.vm.gctype and dalvik.vm.backgroundgctype properties:
adb shell setprop dalvik.vm.gctype SS,preverify
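4 Managing heap collection
As the figure above shows, card_table, live_bitmap, and mark_bitmap all map onto heap memory. These three structures are used to manage heap collection.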
if (live_bitmap != nullptr && !space->IsRegionSpace()) {
CHECK(mark_bitmap != nullptr);
live_bitmap_->AddContinuousSpaceBitmap(live_bitmap);
mark_bitmap_->AddContinuousSpaceBitmap(mark_bitmap);
}
} else {
CHECK(space->IsDiscontinuousSpace());
space::DiscontinuousSpace* discontinuous_space = space->AsDiscontinuousSpace();
live_bitmap_->AddLargeObjectBitmap(discontinuous_space->GetLiveBitmap());
mark_bitmap_->AddLargeObjectBitmap(discontinuous_space->GetMarkBitmap());
discontinuous_spaces_.push_back(discontinuous_space);
}
// Allocate the card table.
// We currently don't support dynamically resizing the card table.
// Since we don't know where in the low_4gb the app image will be located, make the card table
// cover the whole low_4gb. TODO: Extend the card table in AddSpace.
UNUSED(heap_capacity);
// Start at 64 KB, we can be sure there are no spaces mapped this low since the address range is
// reserved by the kernel.
static constexpr size_t kMinHeapAddress = 4 * KB;
card_table_.reset(accounting::CardTable::Create(reinterpret_cast<uint8_t*>(kMinHeapAddress),
4 * GB - kMinHeapAddress));
According to the comments, card_table covers the whole 4 GB address space. The CardTable structure looks like this:
// Maintain a card table from the the write barrier. All writes of
// non-null values to heap addresses should go through an entry in
// WriteBarrier, and from there to here.
class CardTable {
public:
static constexpr size_t kCardShift = 10;
static constexpr size_t kCardSize = 1 << kCardShift;
static constexpr uint8_t kCardClean = 0x0;
static constexpr uint8_t kCardDirty = 0x70;
static constexpr uint8_t kCardAged = kCardDirty - 1;
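Each card is tracked by one byte in the card table, and the value of that byte is kCardClean or kCardDirty. One byte covers 1 << kCardShift bytes, i.e. 1 KB, so covering roughly 4 GB takes a card table of about 4 MB.
What is the card table for? In a concurrent collector, objects can change while the collector runs. Finding those changes by walking the stacks one by one and traversing everything again is too slow and too expensive. The card table lets the collector quickly locate which objects were modified, so that only those objects are traversed again and marked.
Two structures are built on the card table: ModUnionTable and RememberedSet. ModUnionTable mainly covers the zygote space and the image space, while RememberedSet mainly covers the non-moving space, as the handling in ProcessCards shows.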
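The address-to-card arithmetic can be shown with a toy card table. This is a minimal sketch, not ART's CardTable: the class TinyCardTable and the helper CardTableBytes are hypothetical, and only the constants are taken from the excerpt above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr size_t kCardShift = 10;
constexpr size_t kCardSize = size_t{1} << kCardShift;  // 1 KB per card
constexpr uint8_t kCardClean = 0x0;
constexpr uint8_t kCardDirty = 0x70;

// Hypothetical helper: bytes of card table needed to cover `covered_bytes`.
constexpr uint64_t CardTableBytes(uint64_t covered_bytes) {
  return covered_bytes >> kCardShift;
}
static_assert(CardTableBytes(4ull << 30) == (4ull << 20),
              "covering 4 GB takes 4 MB of cards");

// Toy card table for a heap starting at `heap_begin` (illustration only).
class TinyCardTable {
 public:
  TinyCardTable(uintptr_t heap_begin, size_t heap_bytes)
      : begin_(heap_begin),
        cards_((heap_bytes + kCardSize - 1) >> kCardShift, kCardClean) {}

  // Index of the card that covers `addr`.
  size_t CardIndex(uintptr_t addr) const {
    assert(addr >= begin_);
    return (addr - begin_) >> kCardShift;
  }

  // What a write barrier would do: dirty the card covering the written address.
  void MarkCard(uintptr_t addr) { cards_[CardIndex(addr)] = kCardDirty; }

  bool IsDirty(uintptr_t addr) const {
    return cards_[CardIndex(addr)] == kCardDirty;
  }

 private:
  uintptr_t begin_;
  std::vector<uint8_t> cards_;
};
```

A write anywhere inside a 1 KB window dirties the same card, so the collector later rescans the whole window rather than a single object.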
void Heap::ProcessCards(TimingLogger* timings,
bool use_rem_sets,
bool process_alloc_space_cards,
bool clear_alloc_space_cards) {
TimingLogger::ScopedTiming t(__FUNCTION__, timings);
// Clear cards and keep track of cards cleared in the mod-union table.
for (const auto& space : continuous_spaces_) {
accounting::ModUnionTable* table = FindModUnionTableFromSpace(space);
accounting::RememberedSet* rem_set = FindRememberedSetFromSpace(space);
if (table != nullptr) {
const char* name = space->IsZygoteSpace() ? "ZygoteModUnionClearCards" :
"ImageModUnionClearCards";
TimingLogger::ScopedTiming t2(name, timings);
table->ProcessCards();  // Scan the dirty cards of the mod-union table's space and record them.
} else if (use_rem_sets && rem_set != nullptr) {
DCHECK(collector::SemiSpace::kUseRememberedSet && collector_type_ == kCollectorTypeGSS)
<< static_cast<size_t>(collector_type_);
TimingLogger::ScopedTiming t2("AllocSpaceRemSetClearCards", timings);
rem_set->ClearCards();
} else if (process_alloc_space_cards) {
TimingLogger::ScopedTiming t2("AllocSpaceClearCards", timings);
if (clear_alloc_space_cards) {
uint8_t* end = space->End();
if (space->IsImageSpace()) {
// Image space end is the end of the mirror objects, it is not necessarily page or card
// aligned. Align up so that the check in ClearCardRange does not fail.
end = AlignUp(end, accounting::CardTable::kCardSize);
}
card_table_->ClearCardRange(space->Begin(), end);
} else {
// No mod union table for the AllocSpace. Age the cards so that the GC knows that these
// cards were dirty before the GC started.
// TODO: Need to use atomic for the case where aged(cleaning thread) -> dirty(other thread)
// -> clean(cleaning thread).
// The races are we either end up with: Aged card, unaged card. Since we have the
// checkpoint roots and then we scan / update mod union tables after. We will always
// scan either card. If we end up with the non aged card, we scan it it in the pause.
card_table_->ModifyCardsAtomic(space->Begin(), space->End(), AgeCardVisitor(),
VoidFunctor());
}
}
}
}
1 For the mod-union table, ProcessCards() is executed:
void ModUnionTableCardCache::ProcessCards() {
CardTable* const card_table = GetHeap()->GetCardTable();
ModUnionAddToCardBitmapVisitor visitor(card_bitmap_.get(), card_table);
// Clear dirty cards in the this space and update the corresponding mod-union bits.
card_table->ModifyCardsAtomic(space_->Begin(), space_->End(), AgeCardVisitor(), visitor);
}
Its main job is to record the modified (dirty) cards in bitmap_:
class ModUnionAddToCardBitmapVisitor {
public:
ModUnionAddToCardBitmapVisitor(ModUnionTable::CardBitmap* bitmap, CardTable* card_table)
: bitmap_(bitmap), card_table_(card_table) {}
inline void operator()(uint8_t* card,
uint8_t expected_value,
uint8_t new_value ATTRIBUTE_UNUSED) const {
if (expected_value == CardTable::kCardDirty) {
// We want the address the card represents, not the address of the card.
bitmap_->Set(reinterpret_cast<uintptr_t>(card_table_->AddrFromCard(card)));
}
}
2 For the remembered set, ClearCards() is executed, which adds the dirty cards to dirty_cards_:
void RememberedSet::ClearCards() {
CardTable* card_table = GetHeap()->GetCardTable();
RememberedSetCardVisitor card_visitor(&dirty_cards_);
// Clear dirty cards in the space and insert them into the dirty card set.
card_table->ModifyCardsAtomic(space_->Begin(), space_->End(), AgeCardVisitor(), card_visitor);
}
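3 Looking only at the mark-sweep collector: for the other spaces, ProcessCards runs with clear_alloc_space_cards set to true, so card_table_->ClearCardRange(space->Begin(), end) mainly clears the card table for that range. For sticky mark-sweep, the cards are only aged: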
card_table_->ModifyCardsAtomic(space->Begin(), space->End(), AgeCardVisitor(),
VoidFunctor());
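The difference between clearing and aging can be sketched on a plain array of card bytes. This is an illustration under assumptions, not ART's implementation: the functions AgeCard, AgeCards, and ClearCards are hypothetical, and the aging rule (dirty becomes aged, anything else becomes clean) is assumed from the kCardAged = kCardDirty - 1 constant shown earlier.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint8_t kCardClean = 0x0;
constexpr uint8_t kCardDirty = 0x70;
constexpr uint8_t kCardAged = kCardDirty - 1;

// Assumed aging rule: a dirty card becomes aged, every other card becomes clean.
constexpr uint8_t AgeCard(uint8_t card) {
  return card == kCardDirty ? kCardAged : kCardClean;
}

// Sticky mark-sweep style: remember which cards were dirty before this GC.
void AgeCards(std::vector<uint8_t>& cards) {
  for (uint8_t& card : cards) card = AgeCard(card);
}

// Full mark-sweep style: forget all card state for the range.
void ClearCards(std::vector<uint8_t>& cards) {
  for (uint8_t& card : cards) card = kCardClean;
}
```

After aging, a card that a mutator dirties again during the GC is distinguishable from one that was only dirty before the GC started.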
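As described in section 1, the memory of a Java process falls into three broad groups: the zygote and image spaces, the non-moving space, and the main space. References into the main space are marked by scanning the stacks (MarkRoots). For the zygote and non-moving spaces, references are marked through the card table instead. Finally, the collector compares live_bitmap with mark_bitmap to find the objects that are no longer in use and reclaims them.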
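The final comparison amounts to "live but not marked". A minimal sketch, assuming one bit per object slot; the function FindGarbage and the use of std::vector<bool> are illustrative choices, not ART's bitmap classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the slots that are allocated (bit set in `live`)
// but were not reached during marking (bit clear in `mark`).
std::vector<size_t> FindGarbage(const std::vector<bool>& live,
                                const std::vector<bool>& mark) {
  assert(live.size() == mark.size());
  std::vector<size_t> garbage;
  for (size_t i = 0; i < live.size(); ++i) {
    if (live[i] && !mark[i]) {
      garbage.push_back(i);  // allocated, but unreachable in this GC cycle
    }
  }
  return garbage;
}
```

Slots that are clear in both bitmaps were never allocated, so they are skipped rather than freed.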