https://www.jianshu.com/p/6d855e984b99
https://blog.csdn.net/dadoneo/article/details/8270107
https://blog.csdn.net/abm1993/article/details/80461752
https://blog.csdn.net/sinat_22657459/article/details/52780759
一、 Log adb pull data/anr/traces.txt
1.弹出一个对话框
2.将ANR信息输出到traces.txt文件中
traces.txt文件是一个ANR记录文件,用于开发人员调试,目录位于/data/anr中,
无需root权限即可通过pull命令获取,下面的命令可以将traces.txt文件拷贝到当前目录下
3.将ANR信息输出到Logcat中
ANR一般有三种类型:
1:KeyDispatchTimeout(5 seconds) --主要类型
按键或触摸事件在特定时间内无响应
主线程对输入事件5秒内没有处理完毕
2:BroadcastTimeout(10 seconds)
前台广播 后台广播
BroadcastReceiver在特定时间内无法处理完成
主线程在执行BroadcastReceiver的onReceive()函数时10秒内没有处理完毕
3:ServiceTimeout(20 seconds) --小概率类型
前台服务,后台服务
Service在特定的时间内无法处理完成
主线程在Service的各个生命周期函数时20秒内没有处理完毕。
PS:KeyDispatchTimeout --- 时间定义的地方
Akey or touch event was not dispatched within the specified time(按键或触摸事件在特定时间内无响应)
具体的超时时间的定义在framework下的
ActivityManagerService.java
//How long we wait until we timeout on key dispatching.
staticfinal int KEY_DISPATCHING_TIMEOUT = 5*1000
二、那么导致ANR的根本原因是什么呢?细分的话,导致ANR的原因有如下几点:
1.耗时的网络访问
2.大量的数据读写
3.数据库操作
4.硬件操作(比如camera)
5.调用thread的join()方法、sleep()方法、wait()方法或者等待线程锁的时候
6.service binder的数量达到上限
7.system server中发生WatchDog ANR
8.service忙导致超时无响应
9.其他线程持有锁,导致主线程等待超时
10.其它线程终止或崩溃导致主线程一直等待。
11.lowmemorykill
PS
1:UI线程尽量只做跟UI相关的工作
2:CPU满负荷, I/O阻塞的 耗时的工作(比如数据库操作,I/O,连接网络或者别的有可能阻碍UI线程的操作)把它放入单独的线程处理
3:尽量用Handler来处理UIthread和别的thread之间的交互
4. 内存不够用的,增大VM内存, 使用largeHeap属性, 排查内存泄露.
---------------------------------------------------------------------------------------------------
Activity的所有生命周期回调都是执行在主线程的.
Service默认是执行在主线程的.
BroadcastReceiver的onReceive回调是执行在主线程的.
没有使用子线程的looper的Handler的handleMessage, post(Runnable)是执行在主线程的.
AsyncTask的回调中除了doInBackground, 其他都是执行在主线程的.
View的post(Runnable)是执行在主线程的.
1.Loader CursorLoader 异步loader
2.Thread AsyncTask HandlerThread
使用Thread和HandlerThread时, 为了使效果更好, 建议设置Thread的优先级偏低一点:
Process.setThreadPriority(THREAD_PRIORITY_BACKGROUND);
因为如果没有做任何优先级设置的话, 你创建的Thread默认和UI Thread是具有同样的优先级的
. 同样的优先级的Thread, CPU调度上还是可能会阻塞掉你的UI Thread, 导致ANR的.
// 启动一个名为new_thread的子线程
HandlerThread thread = new HandlerThread("new_thread");
thread.start();
// 取new_thread赋值给ServiceHandler
private ServiceHandler mServiceHandler;
mServiceLooper = thread.getLooper();
mServiceHandler = new ServiceHandler(mServiceLooper);
private final class ServiceHandler extends Handler {
public ServiceHandler(Looper looper) {
super(looper);
}
@Override
public void handleMessage(Message msg) {
// 此时handleMessage是运行在new_thread这个子线程中了.
}
}
3. IntentService
Service是运行在主线程的, 然而IntentService是运行在子线程的.
实际上IntentService就是实现了一个HandlerThread + ServiceHandler的模式.
--------------------------------------------------------------------------------------------------
三、
3.1 .cpu 满负荷
Process:
...
CPU usage from 3330ms to 814ms ago:----CPU在ANR发生前的使用情况
...
100%TOTAL: 5.9% user + 4.1% kernel + 89% iowait
CPUusage from 3697ms to 4223ms later:-- ANR后CPU的使用量
其中绝大数是被iowait即I/O操作占用了.
此时分析方法调用栈 trace log, 一般来说会发现是方法中有频繁的文件读写或是数据库读写操作放在主线程来做了.
3.2 内存吃紧
Cmdline: xxxx
DALVIK THREADS:
"main"prio=5 tid=3 VMWAIT
...
...
MEMINFO in pid 1360 [android.xxx.xxx] **
native dalvik other total
free: 296 2436 N/A 2732
3.3 线程,等其他问题都需要分析主线程当时在干什么
八:Thread状态(每一段都是一个线程,主要看main线程)
ThreadState (defined at “dalvik/vm/thread.h “)
THREAD_UNDEFINED = -1, /* makes enum compatible with int32_t */
THREAD_ZOMBIE = 0, /* TERMINATED 死亡,终止运行 */
THREAD_RUNNING = 1, /* RUNNABLE or running now 线程可运行或正在运行*/
THREAD_TIMED_WAIT = 2, /* TIMED_WAITING in Object.wait() 执行了带有超时参数的wait、sleep或join函数 */
THREAD_MONITOR = 3, /* BLOCKED on a monitor 线程阻塞,等待获取对象锁 */
THREAD_WAIT = 4, /* WAITING in Object.wait() 执行了无超时参数的wait函数*/
THREAD_INITIALIZING= 5, /* allocated, not yet running 新建,正在初始化,*/
THREAD_STARTING = 6, /* started, not yet on thread list 新建,正在启动 */
THREAD_NATIVE = 7, /* off in a JNI native method or do something wait 正在执行JNI本地函数 */
THREAD_VMWAIT = 8, /* waiting on a VM resource 正在等待VM资源 */
THREAD_SUSPENDED = 9, /* suspended, usually by GC or debugger 线程暂停,通常是由于GC或debug被暂停 */
DALVIK THREADS:
"main" prio=5 tid=1NATIVE
| group="main" sCount=1 dsCount=0obj=0x2aad2248 self=0xcf70
| sysTid=21404 nice=0 sched=0/0cgrp=[fopen-error:2] handle=1876218976
atandroid.os.MessageQueue.nativePollOnce(Native Method)
atandroid.os.MessageQueue.next(MessageQueue.java:119)
atandroid.os.Looper.loop(Looper.java:110)
at android.app.ActivityThread.main(ActivityThread.java:3688)
at java.lang.reflect.Method.invokeNative(Native Method)
atjava.lang.reflect.Method.invoke(Method.java:507)
atcom.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:866)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:624)
at dalvik.system.NativeStart.main(Native Method)
在UI线程进行网络数据的读写
DALVIKTHREADS:
""main"" prio=5 tid=3 NATIVE
|group=""main"" sCount=1 dsCount=0 s=Yobj=0x4001b240 self=0xbda8
| sysTid=2579 nice=0 sched=0/0cgrp=unknown handle=-1343993184
atorg.apache.harmony.luni.platform.OSNetworkSystem.receiveStreamImpl(NativeMethod)
atorg.apache.harmony.luni.platform.OSNetworkSystem.receiveStream(OSNetworkSystem.java:478) ---网络耗时
atorg.apache.harmony.luni.net.PlainSocketImpl.read(PlainSocketImpl.java:565)
atorg.apache.harmony.luni.net.SocketInputStream.read(SocketInputStream.java:87)
atorg.apache.harmony.luni.internal.net.www.protocol.http.HttpURLConnection$LimitedInputStream.read(HttpURLConnection.java:303) ---- 网络
atjava.io.InputStream.read(InputStream.java:133)
DALVIK THREADS:
"main"prio=5 tid=3 VMWAIT
|group="main" sCount=1 dsCount=0 s=N obj=0x40026240self=0xbda8
| sysTid=1815 nice=0 sched=0/0 cgrp=unknownhandle=-1344001376
atdalvik.system.VMRuntime.trackExternalAllocation(NativeMethod) ---VMRuntime.trackExternalAllocation 内存不足
atandroid.graphics.Bitmap.nativeCreate(Native Method)
atandroid.graphics.Bitmap.createBitmap(Bitmap.java:468)
atandroid.view.View.buildDrawingCache(View.java:6324)
atandroid.view.View.getDrawingCache(View.java:6178)
atandroid.view.ViewGroup.drawChild(ViewGroup.java:1541)
3.4 high iowait
0%409/com.android.eventstream.telephonyplugin: 0% user + 0% kernel /faults: 2 minor
0.1% 632/com.android.devicemonitor: 0.1% user + 0%kernel
100%TOTAL: 6.9% user + 8.2% kernel +84%iowait --
DALVIK THREADS:
(mutexes: tll=0 tsl=0tscl=0 ghl=0 hwl=0 hwll=0)
"main" prio=5 tid=1 WAIT
|group="main" sCount=1 dsCount=0 obj=0x2aaca180self=0xcf20
| sysTid=1134 nice=0 sched=
tandroid.content.ContentResolver.query(ContentResolver.java:268) ----数据库操作耗时
atcom.android.email.provider.EmailContent$Message.restoreMessageWithId(EmailContent.java:648)
atcom.android.email.Controller.setMessageRead(Controller.java:658)
atcom.android.email.activity.MessageView.onMarkAsRead(MessageView.java:700)
atcom.android.email.activity.MessageView.access$2500(MessageView.java:98)
atcom.android.email.activity.MessageView$LoadBodyTask.onPostExecute(MessageView.java:1290) --操作
atcom.android.email.activity.MessageView$LoadBodyTask.onPostExecute(MessageView.java:1255)
atandroid.os.AsyncTask.finish(AsyncTask.java:417)
atandroid.os.AsyncTask.access$300(AsyncTask.java:127)
atandroid.os.AsyncTask$InternalHandler.handleMessage(AsyncTask.java:429)
atandroid.os.Handler.dispatchMessage(Handler.java:99)
3.5
Could not read input channel file descriptors from parce
3.6 Input dispatching timed out
(Waiting because no window has focus but there is a focused application that may eventually add a window when it
finishes starting up.)
从trace信息看:
system_server:
Binder:3030_13' prio=5 tid=107 Runnable
Line 459: - locked <0x08b6d1b4> (a java.lang.Object) === 锁住0x08b6d1b4 对象
'android.bg' prio=5 tid=9 Blocked
Line 590: - waiting to lock <0x08b6d1b4> (a java.lang.Object) held by thread 107 ======= 该线程处于等待0x08b6d1b4状态
'Binder:3030_4' prio=5 tid=36 Blocked
Line 1076: - waiting to lock <0x08b6d1b4> (a java.lang.Object) held by thread 107 ==== 如上两条Binder 线程处于waiting 状态
在锁的地方,执行代码:
at com.android.server.usage.UsageStatsDatabase.queryUsageStats(UsageStatsDatabase.java:441)- locked <0x022c9e87> (a java.lang.Object)
at com.android.server.usage.UserUsageStatsService.queryStats(UserUsageStatsService.java:301)
at com.android.server.usage.UserUsageStatsService.queryUsageStats(UserUsageStatsService.java:328)
at com.android.server.usage.UsageStatsService.queryUsageStats(UsageStatsService.java:738) - locked <0x08b6d1b4> (a java.lang.Object)
at com.android.server.usage.UsageStatsService$BinderService.queryUsageStats(UsageStatsService.java:1267)
===== 而在同一时刻,system:ui (android/com.android.internal.app.ResolverActivity) 这条进程的main进程也在执行到此处
at android.app.usage.UsageStatsManager.queryUsageStats(UsageStatsManager.java:133)
at android.app.usage.UsageStatsManager.queryAndAggregateUsageStats(UsageStatsManager.java:204)
at com.android.internal.app.ResolverComparator.(ResolverComparator.java:78)
at com.android.internal.app.ResolverActivity.onCreate(ResolverActivity.java:304)
at com.android.internal.app.ResolverActivity.onCreate(ResolverActivity.java:246)
=== 所以system:ui (android/com.android.internal.app.ResolverActivity) 进程的main线程也需要等待Binder:3030_13' prio=5 tid=107 Runnable 端释放锁,
因等待时间过长导致anr现象。那么是什么原因导致binder长时间锁住0x08b6d1b4对象?继续看锁住0x08b6d1b4对象的线程。
Binder:3030_13' 长时间未释放锁原因:
at java.lang.Exception.(Exception.java:66)
at java.lang.RuntimeException.(RuntimeException.java:62)
at java.lang.IllegalArgumentException.(IllegalArgumentException.java:53)
at java.lang.NumberFormatException.(NumberFormatException.java:55)
===在锁住0x08b6d1b4对象后,java代码抛NumberFormatException异常。
从上面可看到,Binder:3030_13'在锁住0x08b6d1b4对象后,抛异常导致,导致Binder:3030_13挂掉,所以无法及时释放锁,
从而导致另外两个线程处于长时间等待状态。那么现在问题则是需要查看该binder:3030_13为什么会报NumberFormatException异常 ?
从logcat log中搜索NumberFormatException异常可看到:
12-21 08:27:24.143 4220 4220 W System.err: java.lang.NumberFormatException: For input string: 'E'
12-21 08:27:24.143 4220 4220 W System.err: at java.lang.Integer.parseInt(Integer.java:521)
12-21 08:27:24.143 4220 4220 W System.err: at java.lang.Integer.parseInt(Integer.java:556)
12-21 08:27:24.143 4220 4220 W System.err: at com.android.phone.settings.AccountSelectionPreference.setModel(AccountSelectionPreference.java:99)
12-21 08:27:24.143 4220 4220 W System.err: at com.android.phone.MultiSimSettings.populateDefaultOutgoingAccountsModel(MultiSimSettings.java:2567)
12-21 08:27:24.143 4220 4220 W System.err: at com.android.phone.MultiSimSettings.updateVoiceSummary(MultiSimSettings.java:980)
12-21 08:27:24.143 4220 4220 W System.err: at com.android.phone.MultiSimSettings.updateState(MultiSimSettings.java:925)
12-21 08:27:24.143 4220 4220 W System.err: at com.android.phone.MultiSimSettings.-wrap11(MultiSimSettings.java)
12-21 08:27:24.143 4220 4220 W System.err: at com.android.phone.MultiSimSettings$1.handleMessage(MultiSimSettings.java:1352)
12-21 08:27:24.143 4220 4220 W System.err: at android.os.Handler.dispatchMessage(Handler.java:102)
12-21 08:27:24.143 4220 4220 W System.err: at android.os.Looper.loop(Looper.java:159)
12-21 08:27:24.143 4220 4220 W System.err: at android.app.ActivityThread.main(ActivityThread.java:6146)
12-21 08:27:24.143 4220 4220 W System.err: at java.lang.reflect.Method.invoke(Native Method)
12-21 08:27:24.143 4220 4220 W System.err: at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:886)
12-21 08:27:24.143 4220 4220 W System.err: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:776)
四、ANR机制的实现原理 文章:http://gityuan.com/2016/07/02/android-anr/
从源码角度详细的分析了ANR机制实现的原理。对于上一章讲到的1-4中情况,分别找到了其源码中是如何实现的,
对于每一种大概原理如下:1.在进行相关操作调用hander.sendMessageAtTime()发送一个ANR的消息,延时时间为ANR发生的时间(如前台Service是当前时间20s之后)。2.进行相关的操作3.操作结束后向remove掉该条message。如果相关的操作在规定时间没有执行完成,该条message将被handler取出并执行,就发生了ANR。
下面以BroadcastReceiver为例详细介绍:
BroadcastQueue.processNextBroadcast()
final void processNextBroadcast(boolean fromMsg) { … synchronized (mService) { … do { if (mOrderedBroadcasts.size() == 0) { … if (mService.mProcessesReady && r.dispatchTime > 0) { long now = SystemClock.uptimeMillis(); if ((numReceivers > 0) && (now > r.dispatchTime + (2 * mTimeoutPeriod * numReceivers))) { //1.发送延时消息 broadcastTimeoutLocked(false); // forcibly finish this broadcast forceReceive = true; r.state = BroadcastRecord.IDLE; } } if (r.state != BroadcastRecord.IDLE) { if (DEBUG_BROADCAST) Slog.d(TAG, “processNextBroadcast(” + mQueueName + “) called when not idle (state=” + r.state + “)”); return; } if (r.receivers == null || r.nextReceiver >= numReceivers || r.resultAbort || forceReceive) { // No more receivers for this broadcast! Send the final // result if requested… if (r.resultTo != null) { try { //2. 处理广播消息 performReceiveLocked(r.callerApp, r.resultTo, new Intent(r.intent), r.resultCode, r.resultData, r.resultExtras, false, false, r.userId); // Set this to null so that the reference // (local and remote) isn’t kept in the mBroadcastHistory. r.resultTo = null; } catch (RemoteException e) { … } } //3.取消延时消息 cancelBroadcastTimeoutLocked(); … } } while (r == null) ; … } } }
1.发送延时消息:broadcastTimeoutLocked(false): final void broadcastTimeoutLocked(boolean fromMsg) { … long now = SystemClock.uptimeMillis(); if (fromMsg) { if (mService.mDidDexOpt) { // Delay timeouts until dexopt finishes. mService.mDidDexOpt = false; long timeoutTime = SystemClock.uptimeMillis() + mTimeoutPeriod; setBroadcastTimeoutLocked(timeoutTime); return; } if (!mService.mProcessesReady) { return; } long timeoutTime = r.receiverTime + mTimeoutPeriod; if (timeoutTime > now) { setBroadcastTimeoutLocked(timeoutTime); return; } }
他调用了setBroadcastTimeoutLocked(long timeoutTime): final void setBroadcastTimeoutLocked(long timeoutTime) { if (! mPendingBroadcastTimeoutMessage) { Message msg = mHandler.obtainMessage(BROADCAST_TIMEOUT_MSG, this); mHandler.sendMessageAtTime(msg, timeoutTime); mPendingBroadcastTimeoutMessage = true; } }
传入setBroadcastTimeoutLocked(long timeoutTime)的时间xxx + mTimeoutPeriod,mTimeoutPeriod就是onRecieve()可以执行的时间,在BroadcastQueue初始化时候被赋值,前台队列为10s后台队列为60s:
ActivityManagerService.java: public ActivityManagerService(Context systemContext) { … static final int BROADCAST_FG_TIMEOUT = 10 * 1000; static final int BROADCAST_BG_TIMEOUT = 60 * 1000; … mFgBroadcastQueue = new BroadcastQueue(this, mHandler, “foreground”, BROADCAST_FG_TIMEOUT, false); mBgBroadcastQueue = new BroadcastQueue(this, mHandler, “background”, BROADCAST_BG_TIMEOUT, true); … }
performReceiveLocked()为广播的实际处理,就不展开了
cancelBroadcastTimeoutLocked() :
该方法的主要工作是当service启动完成,则移除服务超时消息SERVICE_TIMEOUT_MSG。
final void cancelBroadcastTimeoutLocked() {
if (mPendingBroadcastTimeoutMessage) {
mHandler.removeMessages(BROADCAST_TIMEOUT_MSG, this);
mPendingBroadcastTimeoutMessage = false;
}