Traces文件分析ANR问题

Traces文件分析ANR问题

我们在开发调试过程中遇到ANR问题大多是可以通过DDMS方法来分析问题原因的,但是所有的ANR问题不一定会在开发阶段出现,如果在测试或者发版之后出现了ANR问题,那么就需要通过traces文件来分析。根据之前的分析我们知道,traces文件位于/data/anr目录下,即便是没有root的手机也是可以通过adb命令将该文件pull出来,一个traces文件中包含了出现ANR时当前系统的所有活动进程的情况,其中每一个进程会包含所有线程的情况,因此文件的内容量往往比较大。但是一般造成ANR的进程会被记录在头一段,因此尽可能详细的分析头一段进程是解析traces文件的重要方法。

1) 文件内容解析

在traces文件中我们会看到很多段类似于如下文本的内容,其中每一段是一个进程,N段表示N个进程,共同描述了出现ANR时系统进程的状况

----- pid 4280 at 2016-05-30 00:17:13 -----
Cmd line: com.quicinc.cne.CNEService
Build fingerprint: 'Xiaomi/virgo/virgo:6.0.1/MMB29M/6.3.21:user/release-keys'
ABI: 'arm'
Build type: optimized
Zygote loaded classes=4124 post zygote classes=18
Intern table: 51434 strong; 17 weak
JNI: CheckJNI is off; globals=286 (plus 277 weak)
Libraries: /system/lib/libandroid.so /system/lib/libcompiler_rt.so /system/lib/libjavacrypto.so /system/lib/libjnigraphics.so /system/lib/libmedia_jni.so /system/lib/libmiuinative.so /system/lib/libsechook.so /system/lib/libwebviewchromium_loader.so libjavacore.so (9)
Heap: 50% free, 16MB/33MB; 33690 objects
Dumping cumulative Gc timings
Total number of allocations 33690
Total bytes allocated 16MB
Total bytes freed 0B
Free memory 16MB
Free memory until GC 16MB
Free memory until OOME 111MB
Total memory 33MB
Max memory 128MB
Zygote space size 1624KB
Total mutator paused time: 0
Total time waiting for GC to complete: 0
Total GC count: 0
Total GC time: 0
Total blocking GC count: 0
Total blocking GC time: 0
suspend all histogram:  Sum: 102us 99% C.I. 3us-25us Avg: 8.500us Max: 25us
DALVIK THREADS (10):
"Signal Catcher" daemon prio=5 tid=2 Runnable
  | group="system" sCount=0 dsCount=0 obj=0x12c470a0 self=0xaeb8b000
  | sysTid=4319 nice=0 cgrp=default sched=0/0 handle=0xb424f930
  | state=R schedstat=( 111053493 34114006 33 ) utm=6 stm=5 core=0 HZ=100
  | stack=0xb4153000-0xb4155000 stackSize=1014KB
  | held mutexes= "mutator lock"(shared held)
  native: #00 pc 00370e89  /system/lib/libart.so (art::DumpNativeStack(std::__1::basic_ostream >&, int, char const*, art::ArtMethod*, void*)+160)
  native: #01 pc 003504f7  /system/lib/libart.so (art::Thread::Dump(std::__1::basic_ostream >&) const+150)
  native: #02 pc 0035a3fb  /system/lib/libart.so (art::DumpCheckpoint::Run(art::Thread*)+442)
  native: #03 pc 0035afb9  /system/lib/libart.so (art::ThreadList::RunCheckpoint(art::Closure*)+212)
  native: #04 pc 0035b4e7  /system/lib/libart.so (art::ThreadList::Dump(std::__1::basic_ostream >&)+142)
  native: #05 pc 0035bbf7  /system/lib/libart.so (art::ThreadList::DumpForSigQuit(std::__1::basic_ostream >&)+334)
  native: #06 pc 00333d3f  /system/lib/libart.so (art::Runtime::DumpForSigQuit(std::__1::basic_ostream >&)+74)
  native: #07 pc 0033b0a5  /system/lib/libart.so (art::SignalCatcher::HandleSigQuit()+928)
  native: #08 pc 0033b989  /system/lib/libart.so (art::SignalCatcher::Run(void*)+340)
  native: #09 pc 0003f54f  /system/lib/libc.so (__pthread_start(void*)+30)
  native: #10 pc 00019c2f  /system/lib/libc.so (__start_thread+6)
  (no managed stack frames)
"main" prio=5 tid=1 Native
  | group="main" sCount=1 dsCount=0 obj=0x7541b3c0 self=0xb4cf6500
  | sysTid=4280 nice=-1 cgrp=default sched=0/0 handle=0xb6f5cb34
  | state=S schedstat=( 52155108 81807757 159 ) utm=2 stm=3 core=0 HZ=100
  | stack=0xbe121000-0xbe123000 stackSize=8MB
  | held mutexes=
  native: #00 pc 00040984  /system/lib/libc.so (__epoll_pwait+20)
  native: #01 pc 00019f5b  /system/lib/libc.so (epoll_pwait+26)
  native: #02 pc 00019f69  /system/lib/libc.so (epoll_wait+6)
  native: #03 pc 00012c57  /system/lib/libutils.so (android::Looper::pollInner(int)+102)
  native: #04 pc 00012ed3  /system/lib/libutils.so (android::Looper::pollOnce(int, int*, int*, void**)+130)
  native: #05 pc 00082bed  /system/lib/libandroid_runtime.so (android::NativeMessageQueue::pollOnce(_JNIEnv*, _jobject*, int)+22)
  native: #06 pc 0000055d  /data/dalvik-cache/arm/system@[email protected] (Java_android_os_MessageQueue_nativePollOnce__JI+96)
  at android.os.MessageQueue.nativePollOnce(Native method)
  at android.os.MessageQueue.next(MessageQueue.java:323)
  at android.os.Looper.loop(Looper.java:135)
  at android.app.ActivityThread.main(ActivityThread.java:5435)
  at java.lang.reflect.Method.invoke!(Native method)
  at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:735)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:625)


// ...


----- end 4280 -----

a) 进程头部信息

----- pid 4280 at 2016-05-30 00:17:13 -----
Cmd line: com.quicinc.cne.CNEService

这里记录了出现ANR时该进程的pid号以及当前时间,进程名是com.quicinc.cne.CNEService

b) 进程资源状态信息

Build fingerprint: 'Xiaomi/virgo/virgo:6.0.1/MMB29M/6.3.21:user/release-keys'
ABI: 'arm'
Build type: optimized
Zygote loaded classes=4124 post zygote classes=18
Intern table: 51434 strong; 17 weak
JNI: CheckJNI is off; globals=286 (plus 277 weak)
Libraries: /system/lib/libandroid.so /system/lib/libcompiler_rt.so /system/lib/libjavacrypto.so /system/lib/libjnigraphics.so /system/lib/libmedia_jni.so /system/lib/libmiuinative.so /system/lib/libsechook.so /system/lib/libwebviewchromium_loader.so libjavacore.so (9)
Heap: 50% free, 16MB/33MB; 33690 objects
Dumping cumulative Gc timings
Total number of allocations 33690
Total bytes allocated 16MB
Total bytes freed 0B
Free memory 16MB
Free memory until GC 16MB
Free memory until OOME 111MB
Total memory 33MB
Max memory 128MB
Zygote space size 1624KB
Total mutator paused time: 0
Total time waiting for GC to complete: 0
Total GC count: 0
Total GC time: 0
Total blocking GC count: 0
Total blocking GC time: 0

这里打印了一大段关于硬件状态的信息,虽然目前我还没用到这里的信息,不过我觉得在某些时候这里的数据是会有作用的

c) 每条线程的信息

"main" prio=5 tid=1 Native // 输出了线程名,优先级,线程号,线程状态,带有『deamon』字样的线程表示守护线程,即DDMS中『*』线程
  | group="main" sCount=1 dsCount=0 obj=0x7541b3c0 self=0xb4cf6500 // 输出了线程组名,sCount被挂起次数,dsCount被调试器挂起次数,obj表示线程对象的地址,self表示线程本身的地址
  | sysTid=4280 nice=-1 cgrp=default sched=0/0 handle=0xb6f5cb34 // sysTid是Linux下的内核线程id,nice是线程的调度优先级,sched分别标志了线程的调度策略和优先级,cgrp是调度属组,handle是线程的处理函数地址。
  | state=S schedstat=( 52155108 81807757 159 ) utm=2 stm=3 core=0 HZ=100 // state是调度状态;schedstat从 /proc/[pid]/task/[tid]/schedstat读出,三个值分别表示线程在cpu上执行的时间、线程的等待时间和线程执行的时间片长度,有的android内核版本不支持这项信息,得到的三个值都是0;utm是线程用户态下使用的时间值(单位是jiffies);stm是内核态下的调度时间值;core是最后执行这个线程的cpu核的序号。
  | stack=0xbe121000-0xbe123000 stackSize=8MB
  | held mutexes=
  native: #00 pc 00040984  /system/lib/libc.so (__epoll_pwait+20)
  native: #01 pc 00019f5b  /system/lib/libc.so (epoll_pwait+26)
  native: #02 pc 00019f69  /system/lib/libc.so (epoll_wait+6)
  native: #03 pc 00012c57  /system/lib/libutils.so (android::Looper::pollInner(int)+102)
  native: #04 pc 00012ed3  /system/lib/libutils.so (android::Looper::pollOnce(int, int*, int*, void**)+130)
  native: #05 pc 00082bed  /system/lib/libandroid_runtime.so (android::NativeMessageQueue::pollOnce(_JNIEnv*, _jobject*, int)+22)
  native: #06 pc 0000055d  /data/dalvik-cache/arm/system@[email protected] (Java_android_os_MessageQueue_nativePollOnce__JI+96)
  at android.os.MessageQueue.nativePollOnce(Native method)
  at android.os.MessageQueue.next(MessageQueue.java:323)
  at android.os.Looper.loop(Looper.java:135)
  at android.app.ActivityThread.main(ActivityThread.java:5435)
  at java.lang.reflect.Method.invoke!(Native method)
  at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:735)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:625)

该有的描述上面已经用注释的方式放入了,最后的部分是调用栈信息。

2) 再举个梨子

----- pid 12838 at 2016-05-30 10:41:04 -----
Cmd line: 略
// 进程状态信息省略

suspend all histogram:  Sum: 1.456ms 99% C.I. 3us-508.799us Avg: 97.066us Max: 523us
DALVIK THREADS (19):
"Signal Catcher" daemon prio=5 tid=2 Runnable
  | group="system" sCount=0 dsCount=0 obj=0x32c02100 self=0xb82f1d40
  | sysTid=12843 nice=0 cgrp=bg_non_interactive sched=0/0 handle=0xb39ec930
  | state=R schedstat=( 10914800 1156480 11 ) utm=0 stm=0 core=2 HZ=100
  | stack=0xb38f0000-0xb38f2000 stackSize=1014KB
  | held mutexes= "mutator lock"(shared held)
  native: #00 pc 00371069  /system/lib/libart.so (_ZN3art15DumpNativeStackERNSt3__113basic_ostreamIcNS0_11char_traitsIcEEEEiPKcPNS_9ArtMethodEPv+160)
  native: #01 pc 003508c3  /system/lib/libart.so (_ZNK3art6Thread4DumpERNSt3__113basic_ostreamIcNS1_11char_traitsIcEEEE+150)
  native: #02 pc 0035a5bb  /system/lib/libart.so (_ZN3art14DumpCheckpoint3RunEPNS_6ThreadE+442)
  native: #03 pc 0035b179  /system/lib/libart.so (_ZN3art10ThreadList13RunCheckpointEPNS_7ClosureE+212)
  native: #04 pc 0035b6a7  /system/lib/libart.so (_ZN3art10ThreadList4DumpERNSt3__113basic_ostreamIcNS1_11char_traitsIcEEEE+142)
  native: #05 pc 0035bdb7  /system/lib/libart.so (_ZN3art10ThreadList14DumpForSigQuitERNSt3__113basic_ostreamIcNS1_11char_traitsIcEEEE+334)
  native: #06 pc 00331179  /system/lib/libart.so (_ZN3art7Runtime14DumpForSigQuitERNSt3__113basic_ostreamIcNS1_11char_traitsIcEEEE+72)
  native: #07 pc 0033b27d  /system/lib/libart.so (_ZN3art13SignalCatcher13HandleSigQuitEv+928)
  native: #08 pc 0033bb61  /system/lib/libart.so (_ZN3art13SignalCatcher3RunEPv+340)
  native: #09 pc 00041737  /system/lib/libc.so (_ZL15__pthread_startPv+30)
  native: #10 pc 00019433  /system/lib/libc.so (__start_thread+6)
  (no managed stack frames)
"main" prio=5 tid=1 Blocked
  | group="main" sCount=1 dsCount=0 obj=0x759002c0 self=0xb737fee8
  | sysTid=12838 nice=-1 cgrp=bg_non_interactive sched=0/0 handle=0xb6f1eb38
  | state=S schedstat=( 743081924 64813008 709 ) utm=50 stm=23 core=4 HZ=100
  | stack=0xbe54e000-0xbe550000 stackSize=8MB
  | held mutexes=
  kernel: (couldn't read /proc/self/task/12838/stack)
  native: #00 pc 00016aa4  /system/lib/libc.so (syscall+28)
  native: #01 pc 000f739d  /system/lib/libart.so (_ZN3art17ConditionVariable4WaitEPNS_6ThreadE+96)
  native: #02 pc 002bcd8d  /system/lib/libart.so (_ZN3art7Monitor4LockEPNS_6ThreadE+408)
  native: #03 pc 002bed73  /system/lib/libart.so (_ZN3art7Monitor4WaitEPNS_6ThreadExibNS_11ThreadStateE+922)
  native: #04 pc 002bfbaf  /system/lib/libart.so (_ZN3art7Monitor4WaitEPNS_6ThreadEPNS_6mirror6ObjectExibNS_11ThreadStateE+142)
  native: #05 pc 002d1403  /system/lib/libart.so (_ZN3artL11Object_waitEP7_JNIEnvP8_jobject+38)
  native: #06 pc 0000036f  /data/dalvik-cache/arm/system@[email protected] (Java_java_lang_Object_wait__+74)
  at java.lang.Object.wait!(Native method)
  - waiting to lock <0x0520de84> (a java.lang.Object) held by thread 22
  at com.xx(unavailable:-1)
  - locked <0x00e3266d> 
  - locked <0x0520de84> (a java.lang.Object)
  at com.xx.R(unavailable:-1)
  at com.xx.ux(unavailable:-1)
  // 其余栈略
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:684)
// 其他线程省略
"Thread-654" prio=5 tid=22 Blocked
  | group="main" sCount=1 dsCount=0 obj=0x32c027c0 self=0xb83e9750
  | sysTid=12891 nice=0 cgrp=bg_non_interactive sched=0/0 handle=0x9cf1c930
  | state=S schedstat=( 50601200 1215760 62 ) utm=4 stm=0 core=7 HZ=100
  | stack=0x9ce1a000-0x9ce1c000 stackSize=1038KB
  | held mutexes=
  at com.yy(unavailable:-1)
  - waiting to lock <0x00e3266d> held by thread 1
  at com.yy.MX(unavailable:-1)
  at com.yy.run(unavailable:-1)
  - locked <0x0520de84> (a java.lang.Object)
  at java.lang.Thread.run(Thread.java:833)

从traces文件种可以很明显的看到我们的主线程处于Blcoked状态,详细查看Blcoked的原因知道,它在等待一个被22号线程持有的对象锁,于是我们查看tid=22的线程,可以看出这个线程的确锁住了一个对象,该对象正是主线程正在等待上锁的对象,那这个线程为何没有释放锁呢,因为它在等一个被1号线程持有的对象锁,因此死锁问题导致了ANR现象。

你可能感兴趣的:(log分析)