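The company where I am interning recently asked me to research ANR monitoring for Android apps. To make it easier to summarize later, I am recording the whole learning process in this blog.
First I read the Jianshu post 《Android ANR监测诊断以及解决办法》 [1]. Judging by the title it matches our needs well, so let's read through it and see whether it helps:
That article only gave a brief introduction. To dig deeper before studying any specific approach, I also read the Sohu article 《Android ANR监测方案解析》 [5]. Its main points are summarized below:
Later I also looked at an entry on Google's issue tracker, "Possibility to detect ANR dialogs from application", which discusses whether an ANR can be tracked from inside the app. Let's look at that as well:
Open-source project: the ANR-WatchDog project
The watchdog itself is just a simple thread that repeats the following steps in a loop:
The project source ships with a testapp module. After adjusting some basic configuration and running it, you can see that ANR-WatchDog prevents the ANR dialog from appearing; instead, an error is thrown and the app crashes. The log shows: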
2019-01-07 16:19:48.839 1612-1645/anrwatchdog.github.com.testapp E/AndroidRuntime: FATAL EXCEPTION: |ANR-WatchDog|
Process: anrwatchdog.github.com.testapp, PID: 1612
com.github.anrwatchdog.ANRError: Application Not Responding
Caused by: com.github.anrwatchdog.ANRError$$$_Thread: main (state = TIMED_WAITING)
at java.lang.Thread.sleep(Native Method)
at java.lang.Thread.sleep(Thread.java:386)
at java.lang.Thread.sleep(Thread.java:327)
at com.github.anrtestapp.MainActivity.SleepAMinute(MainActivity.java:18)
at com.github.anrtestapp.MainActivity.access$100(MainActivity.java:12)
at com.github.anrtestapp.MainActivity$2.onClick(MainActivity.java:63)
at android.view.View.performClick(View.java:6291)
at android.view.View$PerformClick.run(View.java:24931)
at android.os.Handler.handleCallback(Handler.java:808)
at android.os.Handler.dispatchMessage(Handler.java:101)
at android.os.Looper.loop(Looper.java:166)
at android.app.ActivityThread.main(ActivityThread.java:7529)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.Zygote$MethodAndArgsCaller.run(Zygote.java:245)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:921)
Caused by: com.github.anrwatchdog.ANRError$$$_Thread: FinalizerDaemon (state = WAITING)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:422)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:188)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:209)
at java.lang.Daemons$FinalizerDaemon.runInternal(Daemons.java:235)
at java.lang.Daemons$Daemon.run(Daemons.java:103)
at java.lang.Thread.run(Thread.java:784)
Caused by: com.github.anrwatchdog.ANRError$$$_Thread: FinalizerWatchdogDaemon (state = TIMED_WAITING)
at java.lang.Thread.sleep(Native Method)
at java.lang.Thread.sleep(Thread.java:386)
at java.lang.Thread.sleep(Thread.java:327)
at java.lang.Daemons$FinalizerWatchdogDaemon.sleepFor(Daemons.java:345)
at java.lang.Daemons$FinalizerWatchdogDaemon.waitForFinalization(Daemons.java:371)
at java.lang.Daemons$FinalizerWatchdogDaemon.runInternal(Daemons.java:284)
at java.lang.Daemons$Daemon.run(Daemons.java:103)
at java.lang.Thread.run(Thread.java:784)
Caused by: com.github.anrwatchdog.ANRError$$$_Thread: ReferenceQueueDaemon (state = WAITING)
at java.lang.Object.wait(Native Method)
at java.lang.Daemons$ReferenceQueueDaemon.runInternal(Daemons.java:178)
at java.lang.Daemons$Daemon.run(Daemons.java:103)
at java.lang.Thread.run(Thread.java:784)
Caused by: com.github.anrwatchdog.ANRError$$$_Thread: Thread-7 (state = RUNNABLE)
at libcore.io.Linux.accept(Native Method)
at libcore.io.BlockGuardOs.accept(BlockGuardOs.java:64)
at android.system.Os.accept(Os.java:43)
at android.net.LocalSocketImpl.accept(LocalSocketImpl.java:344)
at android.net.LocalServerSocket.accept(LocalServerSocket.java:90)
at com.android.tools.ir.server.Server$SocketServerThread.run(Server.java:165)
at java.lang.Thread.run(Thread.java:784)
Caused by: com.github.anrwatchdog.ANRError$$$_Thread: queued-work-looper (state = RUNNABLE)
at android.os.MessageQueue.nativePollOnce(Native Method)
at android.os.MessageQueue.next(MessageQueue.java:379)
at android.os.Looper.loop(Looper.java:144)
at android.os.HandlerThread.run(HandlerThread.java:65)
Caused by: com.github.anrwatchdog.ANRError$$$_Thread: |ANR-WatchDog| (state = RUNNABLE)
at dalvik.system.VMStack.getThreadStackTrace(Native Method)
at java.lang.Thread.getStackTrace(Thread.java:1556)
at java.lang.Thread.getAllStackTraces(Thread.java:1606)
at com.github.anrwatchdog.ANRError.New(ANRError.java:72)
at com.github.anrwatchdog.ANRWatchDog.run(ANRWatchDog.java:209)
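This makes the principle clearer: the watchdog detects that the main thread is blocked and throws an error itself, before the system ANR dialog appears. If this Error, named "ANRError", is handled by some other crash-handling mechanism, the app will not crash.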
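As a rough illustration of "handling the Error so that it does not crash", here is a plain-Java sketch. It does not use the library: `StandInAnrError` and `runAndCapture` are hypothetical names, and a per-thread `UncaughtExceptionHandler` stands in for whatever crash-handling mechanism the app already has. It only shows that an Error thrown on a background thread can be intercepted by a handler instead of going unhandled.

```java
import java.util.concurrent.atomic.AtomicReference;

// Plain-Java sketch; not the library's code.
class UncaughtErrorSketch {

    // Hypothetical stand-in for the library's ANRError.
    static final class StandInAnrError extends Error {
        StandInAnrError() {
            super("Application Not Responding (stand-in)");
        }
    }

    // Starts a thread that throws the stand-in error and returns what the handler received.
    static Throwable runAndCapture() {
        AtomicReference<Throwable> captured = new AtomicReference<>();
        Thread watchdogLike = new Thread(() -> {
            throw new StandInAnrError();
        }, "watchdog-like");
        // The handler is invoked on the dying thread with the uncaught Throwable.
        watchdogLike.setUncaughtExceptionHandler((thread, error) -> captured.set(error));
        watchdogLike.start();
        try {
            watchdogLike.join(); // wait until the thread (and its handler) has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return captured.get();
    }

    public static void main(String[] args) {
        System.out.println("handled: " + runAndCapture());
    }
}
```

In a real app the handler would typically log or upload the error rather than just store it.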
Next, let's look at the project source in detail:
The source is actually very simple: essentially just two Java files, ANRError.java and ANRWatchDog.java.
ANRError.java
Starting with the simpler one, let's look at ANRError.java first:
The whole file is only a little over 100 lines, so I'll just paste the full source:
package com.github.anrwatchdog;
import android.os.Looper;
import java.io.Serializable;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
/**
* Error thrown by {@link com.github.anrwatchdog.ANRWatchDog} when an ANR is detected.
* Contains the stack trace of the frozen UI thread.
*
* It is important to notice that, in an ANRError, all the "Caused by" are not really the cause
* of the exception. Each "Caused by" is the stack trace of a running thread. Note that the main
* thread always comes first.
*/
@SuppressWarnings({"Convert2Diamond", "UnusedDeclaration"})
public class ANRError extends Error {
private static class $ implements Serializable {
private final String _name;
private final StackTraceElement[] _stackTrace;
private class _Thread extends Throwable {
private _Thread(_Thread other) {
super(_name, other);
}
@Override
public Throwable fillInStackTrace() {
setStackTrace(_stackTrace);
return this;
}
}
private $(String name, StackTraceElement[] stackTrace) {
_name = name;
_stackTrace = stackTrace;
}
}
private static final long serialVersionUID = 1L;
private ANRError($._Thread st) {
super("Application Not Responding", st);
}
@Override
public Throwable fillInStackTrace() {
setStackTrace(new StackTraceElement[] {});
return this;
}
static ANRError New(String prefix, boolean logThreadsWithoutStackTrace) {
final Thread mainThread = Looper.getMainLooper().getThread();
final Map<Thread, StackTraceElement[]> stackTraces = new TreeMap<Thread, StackTraceElement[]>(new Comparator<Thread>() {
@Override
public int compare(Thread lhs, Thread rhs) {
if (lhs == rhs)
return 0;
if (lhs == mainThread)
return 1;
if (rhs == mainThread)
return -1;
return rhs.getName().compareTo(lhs.getName());
}
});
for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet())
if (
entry.getKey() == mainThread
|| (
entry.getKey().getName().startsWith(prefix)
&& (
logThreadsWithoutStackTrace
||
entry.getValue().length > 0
)
)
)
stackTraces.put(entry.getKey(), entry.getValue());
// Sometimes main is not returned in getAllStackTraces() - ensure that we list it
if (!stackTraces.containsKey(mainThread)) {
stackTraces.put(mainThread, mainThread.getStackTrace());
}
$._Thread tst = null;
for (Map.Entry<Thread, StackTraceElement[]> entry : stackTraces.entrySet())
tst = new $(getThreadTitle(entry.getKey()), entry.getValue()).new _Thread(tst);
return new ANRError(tst);
}
static ANRError NewMainOnly() {
final Thread mainThread = Looper.getMainLooper().getThread();
final StackTraceElement[] mainStackTrace = mainThread.getStackTrace();
return new ANRError(new $(getThreadTitle(mainThread), mainStackTrace).new _Thread(null));
}
private static String getThreadTitle(Thread thread) {
return thread.getName() + " (state = " + thread.getState() + ")";
}
}
Reading through it... hmm... nothing that needs special attention, so let's go straight to ANRWatchDog.java.
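One detail in ANRError.java is worth a small illustration: as its Javadoc says, each "Caused by" entry is not a real cause but the stack trace of one thread. The sketch below reproduces that idea in plain Java under simplifying assumptions. `ThreadDumpSketch`, `ThreadTrace` and `dumpAllThreads` are made-up names; unlike the library, it calls `setStackTrace` after construction instead of overriding `fillInStackTrace`, and it does not sort threads or put the main thread first.

```java
import java.util.Map;

// Illustrative stand-in, not the library's ANRError.
class ThreadDumpSketch {

    // A Throwable that carries another thread's stack trace instead of its own creation site.
    static final class ThreadTrace extends Throwable {
        ThreadTrace(String title, StackTraceElement[] trace, ThreadTrace next) {
            super(title, next);   // "next" becomes the following "Caused by" entry
            setStackTrace(trace); // replace the automatically captured trace
        }
    }

    // Builds an Error whose cause chain lists every live thread with its stack trace.
    static Throwable dumpAllThreads() {
        ThreadTrace chain = null;
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            // Same title format as getThreadTitle() in the pasted source.
            String title = t.getName() + " (state = " + t.getState() + ")";
            chain = new ThreadTrace(title, e.getValue(), chain);
        }
        return new Error("Application Not Responding (sketch)", chain);
    }

    public static void main(String[] args) {
        // Prints one "Caused by: ..." section per thread.
        dumpAllThreads().printStackTrace();
    }
}
```

Reusing the cause chain this way means any crash reporter that already prints `Throwable` causes will show all thread traces without extra work, which is presumably why the log above looks the way it does.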
ANRWatchDog.java
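The author clearly knows what they are doing: the class has the same name as the project, so you can tell at a glance that it is the core class. The code is not long either, about 220 lines in total, so I'll paste it in full as well: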
package com.github.anrwatchdog;
/*
* The MIT License (MIT)
*
* Copyright (c) 2016 Salomon BRYS
*
* Permission is hereby granted, free of charge, to any person obtaining a copy of
* this software and associated documentation files (the "Software"), to deal in
* the Software without restriction, including without limitation the rights to
* use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
* the Software, and to permit persons to whom the Software is furnished to do so,
* subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in all
* copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
* FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
* COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
* IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/
import android.os.Debug;
import android.os.Handler;
import android.os.Looper;
import android.util.Log;
/**
* A watchdog timer thread that detects when the UI thread has frozen.
*/
@SuppressWarnings("UnusedDeclaration")
public class ANRWatchDog extends Thread {
public interface ANRListener {
public void onAppNotResponding(ANRError error);
}
public interface InterruptionListener {
public void onInterrupted(InterruptedException exception);
}
private static final int DEFAULT_ANR_TIMEOUT = 5000;
private static final ANRListener DEFAULT_ANR_LISTENER = new ANRListener() {
@Override public void onAppNotResponding(ANRError error) {
throw error;
}
};
private static final InterruptionListener DEFAULT_INTERRUPTION_LISTENER = new InterruptionListener() {
@Override public void onInterrupted(InterruptedException exception) {
Log.w("ANRWatchdog", "Interrupted: " + exception.getMessage());
}
};
private ANRListener _anrListener = DEFAULT_ANR_LISTENER;
private InterruptionListener _interruptionListener = DEFAULT_INTERRUPTION_LISTENER;
private final Handler _uiHandler = new Handler(Looper.getMainLooper());
private final int _timeoutInterval;
private String _namePrefix = "";
private boolean _logThreadsWithoutStackTrace = false;
private boolean _ignoreDebugger = false;
private volatile int _tick = 0;
private final Runnable _ticker = new Runnable() {
@Override public void run() {
_tick = (_tick + 1) % Integer.MAX_VALUE;
}
};
/**
* Constructs a watchdog that checks the ui thread every {@value #DEFAULT_ANR_TIMEOUT} milliseconds
*/
public ANRWatchDog() {
this(DEFAULT_ANR_TIMEOUT);
}
/**
* Constructs a watchdog that checks the ui thread every given interval
*
* @param timeoutInterval The interval, in milliseconds, between to checks of the UI thread.
* It is therefore the maximum time the UI may freeze before being reported as ANR.
*/
public ANRWatchDog(int timeoutInterval) {
super();
_timeoutInterval = timeoutInterval;
}
/**
* Sets an interface for when an ANR is detected.
* If not set, the default behavior is to throw an error and crash the application.
*
* @param listener The new listener or null
* @return itself for chaining.
*/
public ANRWatchDog setANRListener(ANRListener listener) {
if (listener == null) {
_anrListener = DEFAULT_ANR_LISTENER;
}
else {
_anrListener = listener;
}
return this;
}
/**
* Sets an interface for when the watchdog thread is interrupted.
* If not set, the default behavior is to just log the interruption message.
*
* @param listener The new listener or null.
* @return itself for chaining.
*/
public ANRWatchDog setInterruptionListener(InterruptionListener listener) {
if (listener == null) {
_interruptionListener = DEFAULT_INTERRUPTION_LISTENER;
}
else {
_interruptionListener = listener;
}
return this;
}
/**
* Set the prefix that a thread's name must have for the thread to be reported.
* Note that the main thread is always reported.
* Default "".
*
* @param prefix The thread name's prefix for a thread to be reported.
* @return itself for chaining.
*/
public ANRWatchDog setReportThreadNamePrefix(String prefix) {
if (prefix == null)
prefix = "";
_namePrefix = prefix;
return this;
}
/**
* Set that only the main thread will be reported.
*
* @return itself for chaining.
*/
public ANRWatchDog setReportMainThreadOnly() {
_namePrefix = null;
return this;
}
/**
* Set that all running threads will be reported,
* even those from which no stack trace could be extracted.
* Default false.
*
* @param logThreadsWithoutStackTrace Whether or not all running threads should be reported
* @return itself for chaining.
*/
public ANRWatchDog setLogThreadsWithoutStackTrace(boolean logThreadsWithoutStackTrace) {
_logThreadsWithoutStackTrace = logThreadsWithoutStackTrace;
return this;
}
/**
* Set whether to ignore the debugger when detecting ANRs.
* When ignoring the debugger, ANRWatchdog will detect ANRs even if the debugger is connected.
* By default, it does not, to avoid interpreting debugging pauses as ANRs.
* Default false.
*
* @param ignoreDebugger Whether to ignore the debugger.
* @return itself for chaining.
*/
public ANRWatchDog setIgnoreDebugger(boolean ignoreDebugger) {
_ignoreDebugger = ignoreDebugger;
return this;
}
@Override
public void run() {
setName("|ANR-WatchDog|");
int lastTick;
int lastIgnored = -1;
while (!isInterrupted()) {
lastTick = _tick;
_uiHandler.post(_ticker);
try {
Thread.sleep(_timeoutInterval);
}
catch (InterruptedException e) {
_interruptionListener.onInterrupted(e);
return ;
}
// If the main thread has not handled _ticker, it is blocked. ANR.
if (_tick == lastTick) {
if (!_ignoreDebugger && Debug.isDebuggerConnected()) {
if (_tick != lastIgnored)
Log.w("ANRWatchdog", "An ANR was detected but ignored because the debugger is connected (you can prevent this with setIgnoreDebugger(true))");
lastIgnored = _tick;
continue ;
}
ANRError error;
if (_namePrefix != null)
error = ANRError.New(_namePrefix, _logThreadsWithoutStackTrace);
else
error = ANRError.NewMainOnly();
_anrListener.onAppNotResponding(error);
return;
}
}
}
}
ANRWatchDog.run()
Let's start with the last method, public void run():
@Override
public void run() {
setName("|ANR-WatchDog|");
int lastTick;
int lastIgnored = -1;
while (!isInterrupted()) {
lastTick = _tick;
_uiHandler.post(_ticker);
try {
Thread.sleep(_timeoutInterval);
}
catch (InterruptedException e) {
_interruptionListener.onInterrupted(e);
return ;
}
// If the main thread has not handled _ticker, it is blocked. ANR.
if (_tick == lastTick) {
if (!_ignoreDebugger && Debug.isDebuggerConnected()) {
if (_tick != lastIgnored)
Log.w("ANRWatchdog", "An ANR was detected but ignored because the debugger is connected (you can prevent this with setIgnoreDebugger(true))");
lastIgnored = _tick;
continue ;
}
ANRError error;
if (_namePrefix != null)
error = ANRError.New(_namePrefix, _logThreadsWithoutStackTrace);
else
error = ANRError.NewMainOnly();
_anrListener.onAppNotResponding(error);
return;
}
}
}
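First, a note (assuming the reader is comfortable with multithreading): run() is the core of any Thread subclass, as you surely know. What I want to point out is that start() on this thread (ANRWatchDog) is called in Application.onCreate().
The int variables: lastTick, lastIgnored
// TODO: explain the logic of run() in detail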
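Until that detailed walkthrough is written, the core check in run() can be summarized as: remember the tick counter, post a task to the main thread that increments it, sleep for the timeout, and treat an unchanged counter as a blocked main thread. The following plain-Java sketch illustrates one iteration of that check. It is not the library's code: a single-threaded executor stands in for the Android main Looper, and `TickWatchdogSketch` and `isBlocked` are hypothetical names.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Plain-Java stand-in for the tick check; assumes an executor instead of the main Looper.
class TickWatchdogSketch {

    // Returns true if the "main" executor did not run the ticker within timeoutMs.
    static boolean isBlocked(ExecutorService mainLike, long timeoutMs) {
        AtomicInteger tick = new AtomicInteger();
        int lastTick = tick.get();                // like: lastTick = _tick
        mainLike.execute(tick::incrementAndGet);  // like: _uiHandler.post(_ticker)
        try {
            Thread.sleep(timeoutMs);              // like: Thread.sleep(_timeoutInterval)
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;                         // interrupted: report nothing
        }
        return tick.get() == lastTick;            // unchanged counter: the ticker never ran
    }

    private static void sleepQuietly(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ExecutorService mainLike = Executors.newSingleThreadExecutor();
        try {
            System.out.println("idle: blocked = " + isBlocked(mainLike, 200));
            // Simulate a long task occupying the single "main" thread.
            mainLike.execute(() -> sleepQuietly(1000));
            System.out.println("busy: blocked = " + isBlocked(mainLike, 200));
        } finally {
            mainLike.shutdownNow();
        }
    }
}
```

In the real class, lastIgnored additionally keeps the "debugger is connected" warning from being logged repeatedly for the same tick value.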
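StackTraceElement [6]
The official documentation gives a rough idea; I also found this blog post [7] and the Yiibai tutorial [8]. The explanation of StackTraceElement there includes the following passage:
/*
 * A brief description of StackTrace
 * 1 A StackTrace stores method-call information in the form of a stack.
 * 2 How do we obtain this call information?
 *   Call Thread.currentThread().getStackTrace()
 *   to get the StackTrace of the current thread.
 *   The method returns an array of StackTraceElement.
 * 3 That StackTraceElement array is the content of the StackTrace.
 * 4 Iterating over the array shows the call flow between methods.
 *   For example, if methodA calls methodB in a thread, methodA is pushed first and then methodB.
 * 5 The element at index 2 holds the file name, class name and method name of the current
 *   method; the line number of the call is available as well.
 *   (This index applies to Android, where the first two frames are
 *   VMStack.getThreadStackTrace and Thread.getStackTrace, as in the log above;
 *   it is not guaranteed on other runtimes.)
 * 6 The element at index 3 holds the information of the current method's caller and the
 *   line number of the call.
 */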
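Since the index of the "current method" frame depends on the runtime, a slightly more robust way to experiment with StackTraceElement is to search the array by method name. The sketch below is plain Java written for illustration; `StackTraceSketch`, `findFrame`, `methodA` and `methodB` are made-up names.

```java
// Sketch: inspect the current thread's stack without hard-coding an array index.
class StackTraceSketch {

    // Returns the frame for the method named methodName, or null if it is not on the stack.
    static StackTraceElement findFrame(String methodName) {
        for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
            if (frame.getMethodName().equals(methodName)) {
                return frame;
            }
        }
        return null;
    }

    static StackTraceElement methodB() {
        return findFrame("methodB"); // look up our own frame
    }

    static StackTraceElement methodA() {
        return methodB();            // methodA is pushed first, then methodB
    }

    public static void main(String[] args) {
        StackTraceElement frame = methodA();
        // Class name, method name, file name and line number are all available per frame.
        System.out.println(frame.getClassName() + "." + frame.getMethodName()
                + " (" + frame.getFileName() + ":" + frame.getLineNumber() + ")");
    }
}
```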
According to that document
[1] 《Android ANR监测诊断以及解决办法》, Jianshu, https://www.jianshu.com/p/8ae173c9fb08
[2] 《StrictMode:安卓中的严格模式》, Jianshu, https://www.jianshu.com/p/271474cd1d91
[3] 《Android 编程下的 TraceView 简介及其案例实战》, cnblogs, https://www.cnblogs.com/sunzn/p/3192231.html
[4] 《Android trace文件抓取原理》, Jianshu, https://www.jianshu.com/p/f406d535a8bc
[5] 《Android ANR监测方案解析》, Sohu, https://www.sohu.com/a/220647552_741445
[6] Oracle official Java documentation, StackTraceElement, https://docs.oracle.com/javase/9/docs/api/java/lang/StackTraceElement.html
[7] 《StackTrace简述以及StackTraceElement使用实例》, cnblogs, https://www.cnblogs.com/xiaozz/p/6448622.html
[8] 《java.lang.StackTraceElement类》, Yiibai, https://www.yiibai.com/java/lang/java_lang_stacktraceelement.html