Android ANR log and trace analysis: worked examples

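I. What an ANR is and what causes it

1.1 Overview

ANR stands for Application Not Responding: the application has stopped responding.

1.2 Causes

On Android, ActivityManagerService (AMS) and WindowManagerService (WMS) monitor how long an app takes to respond. If the app cannot respond to a screen touch or key input event within a set time, or a particular event is not finished in time, an ANR is raised.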

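Any of the following four conditions can trigger an ANR:

  • InputDispatching Timeout: a screen touch or key input event gets no response within 5 seconds.
  • BroadcastQueue Timeout: a BroadcastReceiver's onReceive() does not finish within 10 seconds for a foreground broadcast, or 60 seconds for a background broadcast.
  • Service Timeout: a foreground service does not finish within 20 seconds, or a background service within 200 seconds.
  • ContentProvider Timeout: the ContentProvider's publish does not complete within 10 seconds.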

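1.3 Avoiding ANRs

Avoid time-consuming operations on the main (UI) thread wherever possible.

Move that work to a worker thread instead.
For more on multithreading, see the article "Android多线程:理解和简单使用总结" (Android multithreading: understanding and basic usage).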
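
As a rough illustration of moving slow work off the calling thread, here is a minimal plain-Java sketch. The class name BackgroundWork and the method slowLoad are hypothetical stand-ins (slowLoad simulates I/O or a database query with a short sleep); in an Android app the result would additionally have to be handed back to the UI thread, which this sketch does not show.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Sketch: run slow work on a worker thread so the caller is not blocked. */
final class BackgroundWork {
    // One worker thread; marked daemon so it never keeps the JVM alive.
    private static final ExecutorService WORKER = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "worker");
        t.setDaemon(true);
        return t;
    });

    /** Hypothetical slow task standing in for I/O or a database query. */
    static String slowLoad() throws InterruptedException {
        Thread.sleep(200);
        return Thread.currentThread().getName();
    }

    /** Submits the slow task and returns immediately with a Future. */
    static Future<String> loadAsync() {
        Callable<String> task = BackgroundWork::slowLoad;
        return WORKER.submit(task);
    }

    public static void main(String[] args) throws Exception {
        Future<String> result = loadAsync();
        System.out.println("caller is free while the task runs");
        // get() blocks; a UI thread would use a callback instead of waiting here.
        System.out.println("task ran on thread: " + result.get());
    }
}
```

The point of the design is only that the submitting thread returns at once; how the result is delivered back depends on the app.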

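II. General steps for analyzing an ANR

1. In main.log, find the approximate time at which the process hit the ANR, for example by searching the log for "ANR in".

2. Using the name of the process that hit the ANR, find the traces.txt file in the anr folder. From that file, first determine whether the ANR is caused by the app itself or by the system. If it is an app problem, fix it based on the corresponding call stack.

3. If it is not an app problem, determine the ANR type from main.log. For a key dispatch timeout, for instance, go back 5 seconds from the exact ANR time and check what the process was doing then (the exact ANR time can be found by searching for "anr" in event.log).

4. If the trace shows nothing obviously wrong, consider the following:

a. Is I/O or database work driving CPU usage so high that other app processes cannot get a CPU time slice?
b. Is low memory causing the ANR? (Under low memory, system.log shows processes being killed, with lines saying that a given process died.)
c. Is badly handled interaction with the input method preventing a return and causing the ANR?
d. Is the ANR caused by waiting on a lock, or by a deadlock?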
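
Step 1 can be automated with a small log scanner. The sketch below is plain Java and assumes the logcat line shapes of the kind `E/ActivityManager( 4271): ANR in <process>` followed later by `E/ActivityManager( 4271): PID: <pid>`. The class name AnrLogScanner and its methods are hypothetical, and the regular expressions may need adjusting for other Android versions or log formats.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: pull ANR process names and the PID out of logcat lines. */
final class AnrLogScanner {
    // Assumed shape: "...ActivityManager( 4271): ANR in com.example.app(...)"
    private static final Pattern ANR_IN =
            Pattern.compile("ActivityManager\\(\\s*\\d+\\):\\s*ANR in\\s+([\\w.:]+)");
    // Assumed shape: "...ActivityManager( 4271): PID: 20663"
    private static final Pattern PID_LINE =
            Pattern.compile("ActivityManager\\(\\s*\\d+\\):\\s*PID:\\s*(\\d+)");

    /** Returns the process name from every "ANR in" line, in log order. */
    static List<String> findAnrProcesses(List<String> logLines) {
        List<String> processes = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = ANR_IN.matcher(line);
            if (m.find()) {
                processes.add(m.group(1));
            }
        }
        return processes;
    }

    /** Returns the first PID reported after an "ANR in" line, or -1 if none. */
    static int findAnrPid(List<String> logLines) {
        boolean seenAnr = false;
        for (String line : logLines) {
            if (ANR_IN.matcher(line).find()) {
                seenAnr = true;
                continue;
            }
            if (seenAnr) {
                Matcher m = PID_LINE.matcher(line);
                if (m.find()) {
                    return Integer.parseInt(m.group(1));
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList(
                "08-25 18:11:51.547 E/ActivityManager( 4271): ANR in com.example.app(com.example.app/.MainActivity)",
                "08-25 18:11:51.547 E/ActivityManager( 4271): PID: 1234");
        System.out.println(findAnrProcesses(log) + " pid=" + findAnrPid(log));
    }
}
```

With the process name and PID in hand, the matching thread dump in traces.txt can be located by its `----- pid <pid> at ... -----` header.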

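III. Worked examples

Example 1

See the article "Android ANR 实例分析" (Android ANR case analysis).

Example 2

See the article "Android ANR log trace日志文件分析" (Android ANR log and trace file analysis).

Example 3

Open the /data/anr/traces.txt file
and go straight to the main thread: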

----- pid 20663 at 2020-08-25 18:11:47 -----
Cmd line: com.xxxxxing.xxxxxsettings
Build fingerprint: 'Android/xxx/xxx:7.1.2/NHG47L/xxx:user/test-keys'
ABI: 'arm'
Build type: optimized
Zygote loaded classes=4381 post zygote classes=88
Intern table: 41765 strong; 330 weak
JNI: CheckJNI is off; globals=458 (plus 132 weak)
// ...
"main" prio=5 tid=1 Sleeping
  | group="main" sCount=1 dsCount=0 obj=0x74751f58 self=0xf4785400
  | sysTid=20663 nice=-10 cgrp=default sched=0/0 handle=0xf7437534
  | state=S schedstat=( 179097380 5860498 155 ) utm=9 stm=8 core=2 HZ=100
  | stack=0xff77c000-0xff77e000 stackSize=8MB
  | held mutexes=
  at java.lang.Thread.sleep!(Native method)
  - sleeping on <0x09d961ef> (a java.lang.Object)
  at java.lang.Thread.sleep(Thread.java:371)
  - locked <0x09d961ef> (a java.lang.Object)
  at java.lang.Thread.sleep(Thread.java:313)
  at com.xxxxxing.xxxxxsettings.CrashHandler.uncaughtException(CrashHandler.java:81)
  at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:1068)
  at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:1063)

This log makes the problem fairly easy to spot: the main thread stopped responding because of a Thread.sleep call made from this frame:

 at com.xxxxxing.xxxxxsettings.CrashHandler.uncaughtException(CrashHandler.java:81)
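
Reading the main thread by eye works for one trace; for many traces, a small helper can report the main thread's state and the first frame that belongs to the app. The following plain-Java sketch assumes the traces.txt layout shown above (a `"main" prio=… tid=1 <State>` header, `at …` frames, and a blank line ending each thread block). The class name MainThreadSummary and the package-prefix parameter are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: summarize the "main" thread block of an ANR trace. */
final class MainThreadSummary {
    // Header such as: "main" prio=5 tid=1 Sleeping
    private static final Pattern MAIN_HEADER =
            Pattern.compile("^\"main\" prio=\\d+ tid=1 (\\w+)");
    // Java frame such as:   at com.example.Foo.bar(Foo.java:81)
    private static final Pattern FRAME =
            Pattern.compile("^\\s*at ([\\w.$]+)!?\\(");

    /** Returns the main thread state (e.g. "Sleeping", "Native"), or null. */
    static String mainThreadState(List<String> traceLines) {
        for (String line : traceLines) {
            Matcher m = MAIN_HEADER.matcher(line);
            if (m.find()) {
                return m.group(1);
            }
        }
        return null;
    }

    /** Returns the topmost main-thread frame under packagePrefix, or null. */
    static String firstAppFrame(List<String> traceLines, String packagePrefix) {
        boolean inMain = false;
        for (String line : traceLines) {
            if (!inMain) {
                inMain = MAIN_HEADER.matcher(line).find();
                continue;
            }
            if (line.trim().isEmpty()) {
                break; // assumed: a blank line ends the thread block
            }
            Matcher m = FRAME.matcher(line);
            if (m.find() && m.group(1).startsWith(packagePrefix)) {
                return m.group(1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> trace = Arrays.asList(
                "\"main\" prio=5 tid=1 Sleeping",
                "  at java.lang.Thread.sleep(Thread.java:371)",
                "  at com.example.app.CrashHandler.uncaughtException(CrashHandler.java:81)");
        System.out.println(mainThreadState(trace) + " in "
                + firstAppFrame(trace, "com.example."));
    }
}
```

The first app-owned frame is usually the quickest pointer to the code that needs a closer look, though it is not always the root cause.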

Next, check logcat by searching the log for the keyword ANR:

[2020/8/25 18:13:42] (20663): Wrote stack traces to '/data/anr/traces.txt'
[2020/8/25 18:13:42] 08-25 18:11:48.197 I/art     ( 4713): Wrote stack traces to '/data/anr/traces.txt'
[2020/8/25 18:13:42] 08-25 18:11:48.339 I/art     ( 4701): Wrote stack traces to '/data/anr/traces.txt'
[2020/8/25 18:13:43] 08-25 18:11:48.411 I/art     ( 4688): Wrote stack traces to '/data/anr/traces.txt'
[2020/8/25 18:13:43] 08-25 18:11:48.467 I/art     ( 4668): Wrote stack traces to '/data/anr/traces.txt'
[2020/8/25 18:13:43] 08-25 18:11:48.530 I/art     ( 4653): Wrote stack traces to '/data/anr/traces.txt'
[2020/8/25 18:13:43] 08-25 18:11:48.588 I/art     ( 4638): Wrote stack traces to '/data/anr/traces.txt'
[2020/8/25 18:13:43] 08-25 18:11:48.701 I/art     ( 4461): Wrote stack traces to '/data/anr/traces.txt'
[2020/8/25 18:13:43] 08-25 18:11:48.878 I/art     ( 4379): Wrote stack traces to '/data/anr/traces.txt'
[2020/8/25 18:13:43] o=700386 scontext=u:r:untrusted_app:s0:c512,c768 tcontext=u:object_r:anr_data_file:s0 tclass=file permissive=1
[2020/8/25 18:13:43] mcblk0p14" ino=700386 scontext=u:r:untrusted_app:s0:c512,c768 tcontext=u:object_r:anr_data_file:s0 tclass=file permissive=1
[2020/8/25 18:13:44] 08-25 18:11:51.547 E/ActivityManager( 4271): ANR in com.xxxxxing.xxxxxsettings(com.xxxxxing.xxxxxsettings/.ConnectionInfoActivity)

Look at the log lines around the ANR:

[2020/8/25 18:13:44] 08-25 18:11:51.547 E/ActivityManager( 4271): PID: 20663

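Searching for what process 20663 was doing gives the log below, which shows that it was killed with signal 9 (SIGKILL). For more on signals, see the FreeBSD Manual Pages.

Note also that the ANR stack dump for pid 20663 (around 18:11:47.492) is logged before the kill signal (18:11:47.844), which indicates that the ANR occurred first and the process was killed afterwards.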

    Line 1637: [2020/8/25 18:13:42] 08-25 18:11:47.492 I/art     (20663): 
    Line 1642: [2020/8/25 18:13:42] (20663): Wrote stack traces to '/data/anr/traces.txt'
    Line 1651: [2020/8/25 18:13:42] 08-25 18:11:47.844 I/Process (20663): Sending signal. PID: 20663 SIG: 9
    Line 1663: [2020/8/25 18:13:42] 08-25 18:11:48.114 I/ActivityManager( 4271): Process com.xxxxxing.xxxxxsettings (pid 20663) has died
    Line 1664: [2020/8/25 18:13:42] 08-25 18:11:48.115 D/ActivityManager( 4271): cleanUpApplicationRecord -- 20663
    Line 1866: [2020/8/25 18:13:44] 08-25 18:11:51.547 E/ActivityManager( 4271): PID: 20663

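Combining the trace, logcat, and the source code, the conclusion is:
1. The app wraps exception handling in its own class; when an exception is caught, control enters the overridden function.
For usage, see "UncaughtException 的使用" (using UncaughtException).

2. The overridden function calls sleep, which causes the ANR, and then kills the process: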

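    @Override
    public void uncaughtException(Thread thread, Throwable ex) {
        if (!handleException(thread) && mDefaultHandler != null) {
            // Not handled by the app: pass to the system default exception handler.
            mDefaultHandler.uncaughtException(thread, ex);
        } else {
            try {
                // CrashHandler.java:81 in the trace: when the crash is on the
                // main thread, this sleep blocks it and the ANR is reported.
                Thread.sleep(3000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            android.os.Process.killProcess(android.os.Process.myPid());
            System.exit(1);
        }
    }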
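
One possible way to avoid this ANR is to keep the handler cheap: record what is needed and hand the exception on without sleeping. The sketch below is plain Java and is not the app's actual fix. NonBlockingCrashHandler and lastCrash are hypothetical names, and the delegate is assumed to be whatever handler was previously installed (for example, the value of Thread.getDefaultUncaughtExceptionHandler() captured before installing this one).

```java
/** Sketch: an uncaught-exception handler that never blocks the crashing thread. */
final class NonBlockingCrashHandler implements Thread.UncaughtExceptionHandler {
    private final Thread.UncaughtExceptionHandler delegate; // may be null
    private volatile Throwable lastCrash;

    NonBlockingCrashHandler(Thread.UncaughtExceptionHandler delegate) {
        this.delegate = delegate;
    }

    @Override
    public void uncaughtException(Thread thread, Throwable ex) {
        // Keep this path short: no sleep and no waiting on I/O.
        lastCrash = ex;
        if (delegate != null) {
            // Let the previously installed handler finish the job.
            delegate.uncaughtException(thread, ex);
        }
    }

    /** Last exception seen by this handler, or null if none yet. */
    Throwable lastCrash() {
        return lastCrash;
    }

    public static void main(String[] args) {
        NonBlockingCrashHandler handler = new NonBlockingCrashHandler(
                (t, e) -> System.out.println("delegate got: " + e.getMessage()));
        handler.uncaughtException(Thread.currentThread(), new IllegalStateException("demo"));
    }
}
```

If the app really needs a delay (for example, to show a message before exiting), that wait would have to happen somewhere other than the main thread.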

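Example 4: ANR handling and Handler

Background: during media playback a seek got stuck, and pressing other keys then caused an ANR.

There are two questions to analyze: why the seek got stuck, and why a stuck seek leads to an ANR at all, since it should not.

First, the main thread from the ANR trace: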

"main" prio=5 tid=1 Native
 | group="main" sCount=1 dsCount=0 flags=1 obj=0x74d45000 self=0xed1d1000
 | sysTid=13857 nice=-10 cgrp=default sched=0/0 handle=0xf13ee494
 | state=S schedstat=( 0 0 0 ) utm=156 stm=22 core=1 HZ=100
 | stack=0xff34a000-0xff34c000 stackSize=8MB
 | held mutexes=
 kernel: __switch_to+0xa4/0xc4
 kernel: binder_thread_read+0x3e0/0x1104
 kernel: binder_ioctl+0x8e0/0xac4
 kernel: compat_SyS_ioctl+0xd4/0xed8
 kernel: el0_svc_naked+0x34/0x38
 native: #00 pc 00053b8c  /system/lib/libc.so (__ioctl+8)
 native: #01 pc 00021b63  /system/lib/libc.so (ioctl+30)
 native: #02 pc 0003d3f5  /system/lib/libbinder.so (android::IPCThreadState::talkWithDriver(bool)+204)
 native: #03 pc 0003dde3  /system/lib/libbinder.so (android::IPCThreadState::waitForResponse(android::Parcel*, int*)+26)
 native: #04 pc 0003713d  /system/lib/libbinder.so (android::BpBinder::transact(unsigned int, android::Parcel const&, android::Parcel*, unsigned int)+36)
 native: #05 pc 00039f19  /system/lib/libmedia.so (android::BpMediaPlayer::seekTo(int, android::MediaTrack::ReadOptions::SeekMode)+84)
 native: #06 pc 00033025  /system/lib/libmedia.so (android::MediaPlayer::seekTo_l(int, android::MediaTrack::ReadOptions::SeekMode)+184)
 native: #07 pc 00033081  /system/lib/libmedia.so (android::MediaPlayer::seekTo(int, android::MediaTrack::ReadOptions::SeekMode)+40)
 native: #08 pc 0003049b  /system/lib/libmedia_jni.so (android_media_MediaPlayer_seekTo(_JNIEnv*, _jobject*, long long, int)+82)
 at android.media.MediaPlayer._seekTo(Native method)
 at android.media.MediaPlayer.seekTo(MediaPlayer.java:1961)
 at android.media.MediaPlayer.seekTo(MediaPlayer.java:1973)
 at com.xxx.media.xxxAndroidMediaPlayer.seekTo(xxxAndroidMediaPlayer.kt:310)
 at com.xxx.media.xxx.seekTo(xxxVideoView.kt:883)
 at com.xxx.media.MediaController$5.handleMessage(MediaController.java:587)
 at android.os.Handler.dispatchMessage(Handler.java:106)
 at android.os.Looper.loop(Looper.java:193)
 at android.app.ActivityThread.main(ActivityThread.java:6669)
 at java.lang.reflect.Method.invoke(Native method)
 at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:493)
 at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:866)

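The APK code was written by someone else, and at first glance nothing looked wrong: when a key event arrives, it is handled by posting a message through a Handler.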

    private Handler mHandler = new Handler() {
        @Override
        public void handleMessage(Message msg) {
            super.handleMessage(msg);
            switch (msg.what) {
                case SEEK_MSG:
                    if (mPlayer != null) {
                        Log.d(LOG_TAG, "msg.arg1==" + msg.arg1);
                        mDragging = false;
                        seekSave = -1;
                        mPlayer.seekTo(msg.arg1); // player seek
                    }
                    break;
            }
        }
    };

    @Override
    public boolean dispatchKeyEvent(KeyEvent event) {
        if (event.getAction() == KeyEvent.ACTION_DOWN) {
            Log.d(LOG_TAG, "dispatchKeyEvent: SEEK_MSG");
            mHandler.removeMessages(SEEK_MSG);
            mHandler.sendMessageDelayed(mHandler.obtainMessage(SEEK_MSG, mNewPosition, 0), 1000);
        }
        // The original excerpt omitted the return; delegating to super is an assumption.
        return super.dispatchKeyEvent(event);
    }

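Reproducing the problem is simple: add a 30-second sleep just before `mPlayer.seekTo(msg.arg1);`, trigger a seek, and then press keys.

This reproduces the problem, and the stack is essentially the same: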

"main" prio=5 tid=1 Sleeping
 | group="main" sCount=1 dsCount=0 flags=1 obj=0x742ee000 self=0xeffd1000
 | sysTid=9639 nice=-10 cgrp=default sched=0/0 handle=0xf41e2494
 | state=S schedstat=( 0 0 0 ) utm=38 stm=12 core=3 HZ=100
 | stack=0xff1b0000-0xff1b2000 stackSize=8MB
 | held mutexes=
 at java.lang.Thread.sleep(Native method)
 - sleeping on <0x06bc3a7d> (a java.lang.Object)
 at java.lang.Thread.sleep(Thread.java:373)
 - locked <0x06bc3a7d> (a java.lang.Object)
 at java.lang.Thread.sleep(Thread.java:314)
 at com.xxx.media.MediaController$5.handleMessage(MediaController.java:591)
 at android.os.Handler.dispatchMessage(Handler.java:106)
 at android.os.Looper.loop(Looper.java:193)
 at android.app.ActivityThread.main(ActivityThread.java:6669)
 at java.lang.reflect.Method.invoke(Native method)
 at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:493)
 at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:866)

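After going through it again more carefully, the real issue was my own incomplete understanding of Android Handler. The APK code above is at fault: although a Handler processes the seek message, that Handler is bound to the main thread's Looper, so the time-consuming operation still runs on the main thread, and key events that arrive in the meantime cannot be handled.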
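
One way to address the second question is to run the seek on a dedicated worker thread while keeping the "drop the pending seek, schedule a new one" behavior of removeMessages plus sendMessageDelayed. The sketch below shows that idea in plain Java with a ScheduledExecutorService; it is not the app's actual fix. DebouncedSeeker and seekAction are hypothetical names, with seekAction standing in for `mPlayer.seekTo(position)`. In Android code, a comparable structure would be a Handler created on a HandlerThread's Looper instead of the main Looper.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.IntConsumer;

/** Sketch: debounce seek requests and run them off the calling thread. */
final class DebouncedSeeker {
    // Single daemon worker thread that performs the (possibly slow) seek.
    private final ScheduledExecutorService worker =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "seek-worker");
                t.setDaemon(true);
                return t;
            });
    private final IntConsumer seekAction; // stand-in for mPlayer.seekTo(position)
    private final long delayMs;
    private ScheduledFuture<?> pending;

    DebouncedSeeker(IntConsumer seekAction, long delayMs) {
        this.seekAction = seekAction;
        this.delayMs = delayMs;
    }

    /** Cancels any seek still waiting and schedules a new one; never blocks. */
    synchronized void requestSeek(int positionMs) {
        if (pending != null) {
            pending.cancel(false); // like removeMessages(SEEK_MSG)
        }
        // like sendMessageDelayed(..., delayMs), but on the worker thread
        pending = worker.schedule(
                () -> seekAction.accept(positionMs), delayMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        DebouncedSeeker seeker = new DebouncedSeeker(
                pos -> System.out.println("seek to " + pos + " on "
                        + Thread.currentThread().getName()), 100);
        seeker.requestSeek(1000);
        seeker.requestSeek(2000); // replaces the pending request above
        Thread.sleep(500);        // give the worker time to run before exit
    }
}
```

With this shape, a seek that hangs only stalls the worker thread, so the thread that receives key events stays responsive. Why the seek itself hangs remains a separate question.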

For more on Handler, see: https://www.tqwba.com/x_d/jishu/269458.html

https://blog.csdn.net/ly502541243/article/details/52062179/

https://blog.csdn.net/javazejian/article/details/52426353
