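LeakCanary is a well-known open-source library from Square that analyzes memory leaks in an app and presents them in a very intuitive way. This article is based on LeakCanary v1.6.3. The latest version at the time of writing is v2.0-alpha-2, which has been completely rewritten in Kotlin (the networking library okhttp has likewise been rewritten in Kotlin), a sign of how strongly Kotlin is trending.
In the official guide on GitHub, we can see that the entry point for LeakCanary's leak detection is: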
LeakCanary.install(this);
Let's see what this line does:
/**
 * Creates a {@link RefWatcher} that works out of the box, and starts watching activity
 * references (on ICS+).
 */
public static @NonNull RefWatcher install(@NonNull Application application) {
  return refWatcher(application).listenerServiceClass(DisplayLeakService.class)
      .excludedRefs(AndroidExcludedRefs.createAppDefaults().build())
      .buildAndInstall();
}
The code above uses the Builder pattern to construct a RefWatcher object. Let's see what buildAndInstall() does:
public @NonNull RefWatcher buildAndInstall() {
  if (LeakCanaryInternals.installedRefWatcher != null) {
    throw new UnsupportedOperationException("buildAndInstall() should only be called once.");
  }
  RefWatcher refWatcher = build();
  if (refWatcher != DISABLED) {
    if (enableDisplayLeakActivity) {
      LeakCanaryInternals.setEnabledAsync(context, DisplayLeakActivity.class, true);
    }
    if (watchActivities) {
      ActivityRefWatcher.install(context, refWatcher);
    }
    if (watchFragments) {
      FragmentRefWatcher.Helper.install(context, refWatcher);
    }
  }
  LeakCanaryInternals.installedRefWatcher = refWatcher;
  return refWatcher;
}
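Here refWatcher is not DISABLED, and enableDisplayLeakActivity is assigned by:
enableDisplayLeakActivity = DisplayLeakService.class.isAssignableFrom(listenerServiceClass);
A.isAssignableFrom(B) returns true when A is the same class as B or a superclass or superinterface of B. Since install() passed DisplayLeakService.class as listenerServiceClass, enableDisplayLeakActivity is true. watchActivities and watchFragments also default to true, so next let's look at what ActivityRefWatcher.install() does: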
public static void install(@NonNull Context context, @NonNull RefWatcher refWatcher) {
  Application application = (Application) context.getApplicationContext();
  ActivityRefWatcher activityRefWatcher = new ActivityRefWatcher(application, refWatcher);
  application.registerActivityLifecycleCallbacks(activityRefWatcher.lifecycleCallbacks);
}

private final Application.ActivityLifecycleCallbacks lifecycleCallbacks =
    new ActivityLifecycleCallbacksAdapter() {
      @Override public void onActivityDestroyed(Activity activity) {
        refWatcher.watch(activity);
      }
    };
As we can see, this registers the activity lifecycle callbacks. In onActivityDestroyed(), that is, when an activity is destroyed, it calls refWatcher.watch(). Here is RefWatcher's watch():
public void watch(Object watchedReference) {
  watch(watchedReference, "");
}

public void watch(Object watchedReference, String referenceName) {
  if (this == DISABLED) {
    return;
  }
  checkNotNull(watchedReference, "watchedReference");
  checkNotNull(referenceName, "referenceName");
  final long watchStartNanoTime = System.nanoTime();
  String key = UUID.randomUUID().toString();
  retainedKeys.add(key);
  final KeyedWeakReference reference =
      new KeyedWeakReference(watchedReference, key, referenceName, queue);
  ensureGoneAsync(watchStartNanoTime, reference);
}
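Here a random key is generated first and added to the retainedKeys Set, and then a weak reference bound to the reference queue queue is created. A word on WeakReference and ReferenceQueue: under Java's memory management, once GC determines that the object held by a weak reference is only weakly reachable and clears the reference, the weak reference object itself is added to the reference queue it was registered with. Conversely, if the weak reference is not in the queue, the object it refers to has not been released, which means a memory leak may have occurred. It is fair to say that WeakReference + ReferenceQueue is what holds LeakCanary up.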
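The relationship between a weak reference and its queue can be sketched in plain Java. This is a minimal illustration, not LeakCanary code: the class name WeakReferenceQueueDemo is made up, and the call to enqueue() only simulates what the collector would do on its own schedule once the referent becomes weakly reachable.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Hypothetical demo class; not part of LeakCanary.
class WeakReferenceQueueDemo {

  // Returns true if the weak reference shows up in its queue once it is enqueued.
  static boolean enqueuedReferenceIsPolled() {
    ReferenceQueue<Object> queue = new ReferenceQueue<>();
    Object target = new Object();
    WeakReference<Object> ref = new WeakReference<>(target, queue);

    // While 'target' is strongly reachable, the queue stays empty and get() returns it.
    if (queue.poll() != null || ref.get() != target) {
      return false;
    }

    // Assumption for the demo: simulate the collector instead of waiting for a real GC.
    ref.enqueue();

    // The queue hands back the reference object itself, not the referent.
    Reference<?> polled = queue.poll();
    return polled == ref;
  }

  public static void main(String[] args) {
    System.out.println(enqueuedReferenceIsPolled());
  }
}
```

A reference that never appears in the queue is the signal LeakCanary treats as a possible leak.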
At the end of watch(), ensureGoneAsync() is called:
private void ensureGoneAsync(final long watchStartNanoTime, final KeyedWeakReference reference) {
  watchExecutor.execute(new Retryable() {
    @Override public Retryable.Result run() {
      return ensureGone(reference, watchStartNanoTime);
    }
  });
}

@SuppressWarnings("ReferenceEquality") // Explicitly checking for named null.
Retryable.Result ensureGone(final KeyedWeakReference reference, final long watchStartNanoTime) {
  long gcStartNanoTime = System.nanoTime();
  long watchDurationMs = NANOSECONDS.toMillis(gcStartNanoTime - watchStartNanoTime);
  removeWeaklyReachableReferences();
  // True while a debugger is attached (for example when stepping through breakpoints in the IDE).
  // Debugging can produce false leaks, so this case is excluded.
  if (debuggerControl.isDebuggerAttached()) {
    // The debugger can create false leaks.
    return RETRY;
  }
  if (gone(reference)) {
    return DONE;
  }
  // Trigger a GC
  gcTrigger.runGc();
  removeWeaklyReachableReferences();
  if (!gone(reference)) {
    long startDumpHeap = System.nanoTime();
    long gcDurationMs = NANOSECONDS.toMillis(startDumpHeap - gcStartNanoTime);
    // Dump the heap to an hprof file
    File heapDumpFile = heapDumper.dumpHeap();
    if (heapDumpFile == RETRY_LATER) {
      // Could not dump the heap.
      return RETRY;
    }
    long heapDumpDurationMs = NANOSECONDS.toMillis(System.nanoTime() - startDumpHeap);
    HeapDump heapDump = heapDumpBuilder.heapDumpFile(heapDumpFile).referenceKey(reference.key)
        .referenceName(reference.name)
        .watchDurationMs(watchDurationMs)
        .gcDurationMs(gcDurationMs)
        .heapDumpDurationMs(heapDumpDurationMs)
        .build();
    heapdumpListener.analyze(heapDump);
  }
  return DONE;
}
ensureGoneAsync() calls watchExecutor.execute(). The actual implementation of watchExecutor is AndroidWatchExecutor, so let's look at its execute() method:
@Override public void execute(@NonNull Retryable retryable) {
  if (Looper.getMainLooper().getThread() == Thread.currentThread()) {
    waitForIdle(retryable, 0);
  } else {
    postWaitForIdle(retryable, 0);
  }
}

private void postWaitForIdle(final Retryable retryable, final int failedAttempts) {
  mainHandler.post(new Runnable() {
    @Override public void run() {
      waitForIdle(retryable, failedAttempts);
    }
  });
}

private void waitForIdle(final Retryable retryable, final int failedAttempts) {
  // This needs to be called from the main thread.
  Looper.myQueue().addIdleHandler(new MessageQueue.IdleHandler() {
    @Override public boolean queueIdle() {
      postToBackgroundWithDelay(retryable, failedAttempts);
      return false;
    }
  });
}

private void postToBackgroundWithDelay(final Retryable retryable, final int failedAttempts) {
  long exponentialBackoffFactor = (long) Math.min(Math.pow(2, failedAttempts), maxBackoffFactor);
  long delayMillis = initialDelayMillis * exponentialBackoffFactor;
  backgroundHandler.postDelayed(new Runnable() {
    @Override public void run() {
      Retryable.Result result = retryable.run();
      if (result == RETRY) {
        postWaitForIdle(retryable, failedAttempts + 1);
      }
    }
  }, delayMillis);
}
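execute() first checks the current thread: on the main thread it calls waitForIdle(retryable, 0); otherwise it calls postWaitForIdle(retryable, 0). In postWaitForIdle(), mainHandler is the main thread's Handler, so that method switches execution to the main thread. waitForIdle() therefore always runs on the main thread.
On the main thread, waitForIdle() registers a callback through addIdleHandler(). The IdleHandler runs when the main thread becomes idle and then calls postToBackgroundWithDelay(). This method implements LeakCanary's retry mechanism. The failedAttempts parameter is the number of retries so far, counting up from 0, so the exponential backoff factor exponentialBackoffFactor grows from 2^0 (capped at maxBackoffFactor), and the delay delayMillis likewise grows from 5 * 2^0. postToBackgroundWithDelay() then switches to a background thread and executes Retryable.Result result = retryable.run(); the Retryable here is the one created in ensureGoneAsync(), and its run() returns the result of ensureGone(). When ensureGone() returns RETRY (in the code above, when a debugger is attached or the heap could not be dumped), postWaitForIdle(retryable, failedAttempts + 1) is called. The retry count goes up by 1, and the delay before postToBackgroundWithDelay() runs the task again grows exponentially as well.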
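The delay arithmetic in postToBackgroundWithDelay() can be tried out in isolation. The sketch below copies the two lines of arithmetic into a hypothetical helper class named BackoffDelay; the initial delay of 5000 ms and the cap of 8 are illustrative assumptions, not values taken from LeakCanary.

```java
// Hypothetical helper that mirrors the arithmetic in postToBackgroundWithDelay().
class BackoffDelay {

  static long delayMillis(long initialDelayMillis, int failedAttempts, long maxBackoffFactor) {
    // 2^failedAttempts, but never larger than the cap.
    long exponentialBackoffFactor =
        (long) Math.min(Math.pow(2, failedAttempts), maxBackoffFactor);
    return initialDelayMillis * exponentialBackoffFactor;
  }

  public static void main(String[] args) {
    // Assumed values: 5000 ms initial delay, backoff factor capped at 8.
    for (int attempt = 0; attempt < 5; attempt++) {
      System.out.println(attempt + " -> " + delayMillis(5_000, attempt, 8) + " ms");
    }
  }
}
```

With these assumed inputs the delay doubles on each failed attempt until the factor reaches the cap, after which it stays constant.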
Because this method is so important, it gets a section of its own. Let's now analyze ensureGone() from earlier; here is its code again:
Retryable.Result ensureGone(final KeyedWeakReference reference, final long watchStartNanoTime) {
  long gcStartNanoTime = System.nanoTime();
  long watchDurationMs = NANOSECONDS.toMillis(gcStartNanoTime - watchStartNanoTime);
  removeWeaklyReachableReferences();
  if (debuggerControl.isDebuggerAttached()) {
    // The debugger can create false leaks.
    return RETRY;
  }
  if (gone(reference)) {
    return DONE;
  }
  gcTrigger.runGc();
  removeWeaklyReachableReferences();
  if (!gone(reference)) {
    long startDumpHeap = System.nanoTime();
    long gcDurationMs = NANOSECONDS.toMillis(startDumpHeap - gcStartNanoTime);
    File heapDumpFile = heapDumper.dumpHeap();
    if (heapDumpFile == RETRY_LATER) {
      // Could not dump the heap.
      return RETRY;
    }
    long heapDumpDurationMs = NANOSECONDS.toMillis(System.nanoTime() - startDumpHeap);
    HeapDump heapDump = heapDumpBuilder.heapDumpFile(heapDumpFile).referenceKey(reference.key)
        .referenceName(reference.name)
        .watchDurationMs(watchDurationMs)
        .gcDurationMs(gcDurationMs)
        .heapDumpDurationMs(heapDumpDurationMs)
        .build();
    heapdumpListener.analyze(heapDump);
  }
  return DONE;
}
removeWeaklyReachableReferences(), as the name suggests, removes weakly reachable references:
private void removeWeaklyReachableReferences() {
  // WeakReferences are enqueued as soon as the object to which they point to becomes weakly
  // reachable. This is before finalization or garbage collection has actually happened.
  KeyedWeakReference ref;
  while ((ref = (KeyedWeakReference) queue.poll()) != null) {
    retainedKeys.remove(ref.key);
  }
}
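queue is the ReferenceQueue, and retainedKeys is the Set of keys associated with the weak references; watch() adds each key to it. When GC reclaims an object, the weak reference to that object is added to queue. In the source, a non-null result from queue.poll() means the weak reference has been enqueued, which in turn means the object that the weak reference points to has been reclaimed normally by GC. That object no longer needs to be watched, so its key is removed from the set.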
Continuing through ensureGone(), checking whether the object has been reclaimed is now straightforward:
private boolean gone(KeyedWeakReference reference) {
  return !retainedKeys.contains(reference.key);
}
If the key is no longer in the set, the object has been reclaimed.
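The three pieces discussed so far (watch(), removeWeaklyReachableReferences() and gone()) fit together as a small bookkeeping loop. The following is a simplified stand-alone sketch of that loop under stated assumptions: MiniRefWatcher and KeyedRef are hypothetical names, the GC trigger, heap dump and retry logic are left out, and a thread-safe set is chosen here only for the illustration.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.CopyOnWriteArraySet;

// Hypothetical, simplified stand-in for RefWatcher's bookkeeping; not LeakCanary's class.
class MiniRefWatcher {

  // A weak reference that remembers the key it was registered under.
  static final class KeyedRef extends WeakReference<Object> {
    final String key;

    KeyedRef(Object referent, String key, ReferenceQueue<Object> queue) {
      super(referent, queue);
      this.key = key;
    }
  }

  private final Set<String> retainedKeys = new CopyOnWriteArraySet<>();
  private final ReferenceQueue<Object> queue = new ReferenceQueue<>();

  // Registers the object under a fresh random key and returns the keyed weak reference.
  KeyedRef watch(Object watched) {
    String key = UUID.randomUUID().toString();
    retainedKeys.add(key);
    return new KeyedRef(watched, key, queue);
  }

  // Drains the queue: every enqueued reference means its referent is no longer retained.
  void removeWeaklyReachableReferences() {
    KeyedRef ref;
    while ((ref = (KeyedRef) queue.poll()) != null) {
      retainedKeys.remove(ref.key);
    }
  }

  // True once the key has been removed, i.e. the reference was seen in the queue.
  boolean gone(KeyedRef ref) {
    return !retainedKeys.contains(ref.key);
  }
}
```

A caller would watch an object, let a GC happen, call removeWeaklyReachableReferences(), and then treat any reference for which gone() is still false as a leak candidate.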
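If the object still has not been reclaimed after two rounds of removeWeaklyReachableReferences(), a memory leak may have occurred, and execution continues. heapDumper.dumpHeap() generates the hprof file; heapDumper is an AndroidHeapDumper instance: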
@Override @Nullable
public File dumpHeap() {
  // Create an empty file that will hold the hprof data
  File heapDumpFile = leakDirectoryProvider.newHeapDumpFile();
  // If the file could not be created, take the retry path
  if (heapDumpFile == RETRY_LATER) {
    return RETRY_LATER;
  }
  // FutureResult wraps a CountDownLatch that is used to wait for the toast
  FutureResult<Toast> waitingForToast = new FutureResult<>();
  // Switch to the main thread to show the toast (the one with the canary icon)
  showToast(waitingForToast);
  // Wait up to 5 seconds for the toast to be displayed
  if (!waitingForToast.wait(5, SECONDS)) {
    CanaryLog.d("Did not dump heap, too much time waiting for Toast.");
    return RETRY_LATER;
  }
  // Build a notification and show it in the notification bar
  Notification.Builder builder = new Notification.Builder(context)
      .setContentTitle(context.getString(R.string.leak_canary_notification_dumping));
  Notification notification = LeakCanaryInternals.buildNotification(context, builder);
  NotificationManager notificationManager =
      (NotificationManager) context.getSystemService(Context.NOTIFICATION_SERVICE);
  int notificationId = (int) SystemClock.uptimeMillis();
  notificationManager.notify(notificationId, notification);
  Toast toast = waitingForToast.get();
  try {
    // Use the platform's Debug class to dump the hprof data into the given file
    Debug.dumpHprofData(heapDumpFile.getAbsolutePath());
    // Once the dump is done, cancel the toast
    cancelToast(toast);
    notificationManager.cancel(notificationId);
    return heapDumpFile;
  } catch (Exception e) {
    CanaryLog.d(e, "Could not dump heap");
    // Abort heap dump
    return RETRY_LATER;
  }
}
Let's see how showToast() makes sure the toast has been displayed:
private void showToast(final FutureResult<Toast> waitingForToast) {
  mainHandler.post(new Runnable() {
    @Override public void run() {
      if (resumedActivity == null) {
        waitingForToast.set(null);
        return;
      }
      final Toast toast = new Toast(resumedActivity);
      toast.setGravity(Gravity.CENTER_VERTICAL, 0, 0);
      toast.setDuration(Toast.LENGTH_LONG);
      LayoutInflater inflater = LayoutInflater.from(resumedActivity);
      toast.setView(inflater.inflate(R.layout.leak_canary_heap_dump_toast, null));
      toast.show();
      // Waiting for Idle to make sure Toast gets rendered.
      Looper.myQueue().addIdleHandler(new MessageQueue.IdleHandler() {
        @Override public boolean queueIdle() {
          waitingForToast.set(toast);
          return false;
        }
      });
    }
  });
}
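Since ensureGone() runs on a background thread, dumpHeap() does too. showToast() first switches to the main thread. There it shows the toast and then, once the main thread is idle, runs waitingForToast.set(toast), which decrements the CountDownLatch. The latch's count starts at 1, so it is 0 after set().
CountDownLatch's await(long time, TimeUnit unit) blocks the current thread for up to the given time. It stops waiting either when the count reaches 0, in which case it returns true, or when the timeout elapses, in which case it returns false, and waitingForToast.wait(5, SECONDS) reports that outcome. So the check if (!waitingForToast.wait(5, SECONDS)) waits at most 5 s. On timeout the retry path is taken. Otherwise, if the latch was counted down to 0 within those 5 s, the normal flow continues, and we can also infer that the toast has finished displaying (the idle callback runs after the toast was shown, so if it has run, everything before it has run too).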
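The latch-based waiting described above can be reproduced without Android. The class below is a minimal sketch of a result holder backed by a CountDownLatch, written from the behaviour described in this article rather than copied from LeakCanary; the name LatchResult and the method name await are assumptions.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified latch-backed result holder; not LeakCanary's FutureResult.
class LatchResult<T> {
  private final AtomicReference<T> value = new AtomicReference<>();
  private final CountDownLatch latch = new CountDownLatch(1);

  // Blocks for up to the timeout; returns true only if set() was called in time.
  boolean await(long timeout, TimeUnit unit) {
    try {
      return latch.await(timeout, unit);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  // Returns the stored value; only valid after await() has returned true.
  T get() {
    if (latch.getCount() > 0) {
      throw new IllegalStateException("Call await() and check its result first");
    }
    return value.get();
  }

  // Stores the value and releases any waiting thread by bringing the count from 1 to 0.
  void set(T newValue) {
    value.set(newValue);
    latch.countDown();
  }
}
```

A producer thread calls set() when its work is done, while the consumer calls await() with a timeout and takes the retry path when it returns false.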
Next, look at heapdumpListener.analyze(heapDump) near the end of ensureGone().
heapdumpListener is a ServiceHeapDumpListener instance, and it ends up calling HeapAnalyzerService.runAnalysis():
public static void runAnalysis(Context context, HeapDump heapDump,
    Class<? extends AbstractAnalysisResultService> listenerServiceClass) {
  setEnabledBlocking(context, HeapAnalyzerService.class, true);
  setEnabledBlocking(context, listenerServiceClass, true);
  Intent intent = new Intent(context, HeapAnalyzerService.class);
  intent.putExtra(LISTENER_CLASS_EXTRA, listenerServiceClass.getName());
  intent.putExtra(HEAPDUMP_EXTRA, heapDump);
  // This starts a foreground service.
  ContextCompat.startForegroundService(context, intent);
}
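HeapAnalyzerService extends ForegroundService, which in turn extends IntentService. An IntentService does its actual long-running work in the onHandleIntent() callback, which ForegroundService forwards to onHandleIntentInForeground():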
@Override
protected void onHandleIntentInForeground(@Nullable Intent intent) {
  if (intent == null) {
    CanaryLog.d("HeapAnalyzerService received a null intent, ignoring.");
    return;
  }
  String listenerClassName = intent.getStringExtra(LISTENER_CLASS_EXTRA);
  // The HeapDump carries the hprof file
  HeapDump heapDump = (HeapDump) intent.getSerializableExtra(HEAPDUMP_EXTRA);
  HeapAnalyzer heapAnalyzer =
      new HeapAnalyzer(heapDump.excludedRefs, this, heapDump.reachabilityInspectorClasses);
  // checkForLeak uses the haha library to analyze the hprof file
  AnalysisResult result = heapAnalyzer.checkForLeak(heapDump.heapDumpFile, heapDump.referenceKey,
      heapDump.computeRetainedHeapSize);
  AbstractAnalysisResultService.sendResultToListener(this, listenerClassName, heapDump, result);
}
The key step here is checkForLeak(), which analyzes the hprof file:
public @NonNull AnalysisResult checkForLeak(@NonNull File heapDumpFile,
    @NonNull String referenceKey,
    boolean computeRetainedSize) {
  long analysisStartNanoTime = System.nanoTime();
  if (!heapDumpFile.exists()) {
    Exception exception = new IllegalArgumentException("File does not exist: " + heapDumpFile);
    return failure(exception, since(analysisStartNanoTime));
  }
  try {
    // Callback that reports analysis progress
    listener.onProgressUpdate(READING_HEAP_DUMP_FILE);
    HprofBuffer buffer = new MemoryMappedFileBuffer(heapDumpFile);
    HprofParser parser = new HprofParser(buffer);
    listener.onProgressUpdate(PARSING_HEAP_DUMP);
    // Parse the hprof file into a Snapshot (done by haha, another open-source library from Square)
    Snapshot snapshot = parser.parse();
    listener.onProgressUpdate(DEDUPLICATING_GC_ROOTS);
    // Remove duplicate GC roots
    deduplicateGcRoots(snapshot);
    listener.onProgressUpdate(FINDING_LEAKING_REF);
    // Find the leaking reference
    Instance leakingRef = findLeakingReference(referenceKey, snapshot);
    // False alarm, weak reference was cleared in between key check and heap dump.
    // A null leakingRef means there is no leak.
    // Note: as quoted, the next line dereferences leakingRef while it is null, so it would
    // throw a NullPointerException, which the catch block below turns into a failure result.
    if (leakingRef == null) {
      String className = leakingRef.getClassObj().getClassName();
      return noLeak(className, since(analysisStartNanoTime));
    }
    // Find the reference chain that leads to the leak
    return findLeakTrace(analysisStartNanoTime, snapshot, leakingRef, computeRetainedSize);
  } catch (Throwable e) {
    return failure(e, since(analysisStartNanoTime));
  }
}
findLeakingReference() locates the leaking reference. Here is the code:
private Instance findLeakingReference(String key, Snapshot snapshot) {
  // Look up the KeyedWeakReference class among the objects saved in the hprof file
  ClassObj refClass = snapshot.findClass(KeyedWeakReference.class.getName());
  if (refClass == null) {
    throw new IllegalStateException(
        "Could not find the " + KeyedWeakReference.class.getName() + " class in the heap dump.");
  }
  List<String> keysFound = new ArrayList<>();
  // Iterate over all KeyedWeakReference instances
  for (Instance instance : refClass.getInstancesList()) {
    // Get all field values of this instance
    List<ClassInstance.FieldValue> values = classInstanceValues(instance);
    // Find the value of the key field
    Object keyFieldValue = fieldValue(values, "key");
    if (keyFieldValue == null) {
      keysFound.add(null);
      continue;
    }
    // Convert keyFieldValue to a String
    String keyCandidate = asString(keyFieldValue);
    // If this instance's key matches the one we are looking for,
    // return the original object held by the weak reference
    if (keyCandidate.equals(key)) {
      return fieldValue(values, "referent");
    }
    keysFound.add(keyCandidate);
  }
  throw new IllegalStateException(
      "Could not find weak reference with key " + key + " in " + keysFound);
}
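The lookup in findLeakingReference() boils down to scanning a list of keyed records for a matching key and collecting the keys seen so far for the error message. The sketch below shows that pattern on a plain in-memory list; KeyLookupSketch and WeakRefRecord are hypothetical names, and the code deliberately does not use the haha Snapshot API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of the lookup; it does not read a real heap dump.
class KeyLookupSketch {

  // Stand-in for one KeyedWeakReference instance found in a heap dump.
  static final class WeakRefRecord {
    final String key; // may be null, as in the original loop
    final Object referent;

    WeakRefRecord(String key, Object referent) {
      this.key = key;
      this.referent = referent;
    }
  }

  // Returns the referent of the record whose key matches, or throws if none does.
  static Object findReferent(String key, List<WeakRefRecord> records) {
    List<String> keysFound = new ArrayList<>();
    for (WeakRefRecord record : records) {
      if (record.key != null && record.key.equals(key)) {
        return record.referent;
      }
      // Remember every non-matching key so the error message can list them.
      keysFound.add(record.key);
    }
    throw new IllegalStateException(
        "Could not find weak reference with key " + key + " in " + keysFound);
  }
}
```

Returning the referent rather than the reference mirrors the original method, which hands back the object held by the weak reference for the subsequent leak-trace search.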
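That wraps up the analysis of LeakCanary.
[Figure: flowchart of the overall LeakCanary memory-leak analysis process]
Some takeaways:
How to detect a memory leak (WeakReference)
Inheritance with the Builder pattern (implemented with generics)
Getting heap data with Debug.dumpHprofData()
Enabling and disabling components at runtime: setComponentEnabledSetting()
Checking the current process: isInServiceProcess()
Triggering a GC manually: GcTrigger.runGc()
Registering an idle callback on the UI thread: Looper.myQueue().addIdleHandler()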
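The takeaway about Builder inheritance with generics can be illustrated with a self-typed builder. This is a generic sketch of the technique with made-up class names (BaseBuilder, AndroidLikeBuilder); it is not LeakCanary's builder code.

```java
// Hypothetical classes illustrating a self-typed builder; the names are not LeakCanary's.
class BaseBuilder<T extends BaseBuilder<T>> {
  String name = "default";

  // Returns T, not BaseBuilder, so subclass methods remain available when chaining.
  T name(String name) {
    this.name = name;
    return self();
  }

  @SuppressWarnings("unchecked")
  final T self() {
    return (T) this;
  }
}

class AndroidLikeBuilder extends BaseBuilder<AndroidLikeBuilder> {
  boolean watchActivities = true;

  AndroidLikeBuilder watchActivities(boolean watch) {
    this.watchActivities = watch;
    return this;
  }

  // Builds a simple string so the result is easy to inspect in this sketch.
  String build() {
    return name + ":" + watchActivities;
  }
}
```

Because name() returns the subclass type, a chain such as new AndroidLikeBuilder().name("app").watchActivities(false).build() compiles without casts.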