在分析ANR问题时,第一步就是把/data/anr/traces.txt这个文件adb pull出来分析, 它记录了手机发生ANR时, 各个进程里的所有线程在当时的状态.
典型的情况是:
----- pid 9644 at 2015-12-18 18:06:11 -----
Cmd line: com.tencent.androidqqmail:Push
DALVIK THREADS:
(mutexes: tll=0 tsl=0 tscl=0 ghl=0)
"main" prio=5 tid=1 MONITOR
| group="main" sCount=1 dsCount=0 obj=0x41e42b50 self=0x41e321f0
| sysTid=9644 nice=0 sched=0/0 cgrp=apps handle=1074578908
| state=S schedstat=( 211181635 61197479253 655 ) utm=13 stm=8 core=0
at com.tencent.qqmail.utilities.qmnetwork.service.QMPushService.onStartCommand(SourceFile:~288)
- waiting to lock <0x421809c0> (a com.tencent.qqmail.utilities.qmnetwork.service.QMPushService) held by tid=15 (Thread-2740)
at android.app.ActivityThread.handleServiceArgs(ActivityThread.java:2679)
at android.app.ActivityThread.access$1900(ActivityThread.java:149)
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1343)
at android.os.Handler.dispatchMessage(Handler.java:99)
at android.os.Looper.loop(Looper.java:213)
at android.app.ActivityThread.main(ActivityThread.java:5092)
at java.lang.reflect.Method.invokeNative(Native Method)
at java.lang.reflect.Method.invoke(Method.java:511)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:797)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:564)
at dalvik.system.NativeStart.main(Native Method)
"Timer-0" daemon prio=5 tid=23 TIMED_WAIT
| group="main" sCount=1 dsCount=0 obj=0x42313248 self=0x72be2018
| sysTid=9693 nice=0 sched=0/0 cgrp=apps handle=1074262912
| state=S schedstat=( 178436271 61426971433 856 ) utm=7 stm=10 core=0
at java.lang.Object.wait(Native Method)
- waiting on <0x42313248> (a java.util.Timer$TimerImpl)
at java.lang.Object.wait(Object.java:401)
at java.util.Timer$TimerImpl.run(Timer.java:238)
"QMThreadPool #3" daemon prio=3 tid=22 WAIT
| group="main" sCount=1 dsCount=0 obj=0x42310418 self=0x700656f8
| sysTid=9692 nice=13 sched=0/0 cgrp=apps/bg_non_interactive handle=1924222744
| state=S schedstat=( 25787332 62892517091 453 ) utm=0 stm=2 core=0
at java.lang.Object.wait(Native Method)
- waiting on <0x42310538> (a java.lang.VMThread) held by tid=22 (QMThreadPool #3)
at java.lang.Thread.parkFor(Thread.java:1231)
at sun.misc.Unsafe.park(Unsafe.java:323)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:159)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2019)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:413)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1013)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1073)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:573)
at java.lang.Thread.run(Thread.java:856)
"WifiManager" prio=5 tid=20 NATIVE
| group="main" sCount=1 dsCount=0 obj=0x42213ba0 self=0x7363b008
| sysTid=9686 nice=0 sched=0/0 cgrp=apps handle=1923316488
| state=S schedstat=( 41564937 61108032226 478 ) utm=1 stm=3 core=0
#00 pc 00017fe8 /system/lib/libc.so (epoll_wait+12)
#01 pc 00014b29 /system/lib/libutils.so (android::Looper::pollInner(int)+96)
#02 pc 00014d91 /system/lib/libutils.so (android::Looper::pollOnce(int, int*, int*, void**)+104)
#03 pc 0006725b /system/lib/libandroid_runtime.so (android::NativeMessageQueue::pollOnce(_JNIEnv*, int)+22)
#04 pc 00020050 /system/lib/libdvm.so (dvmPlatformInvoke+112)
#05 pc 0004f8d1 /system/lib/libdvm.so (dvmCallJNIMethod(unsigned int const*, JValue*, Method const*, Thread*)+396)
#06 pc 000294e0 /system/lib/libdvm.so
#07 pc 0002d7a0 /system/lib/libdvm.so (dvmInterpret(Thread*, Method const*, JValue*)+184)
#08 pc 000620ed /system/lib/libdvm.so (dvmCallMethodV(Thread*, Method const*, Object*, bool, JValue*, std::__va_list)+272)
#09 pc 00062117 /system/lib/libdvm.so (dvmCallMethod(Thread*, Method const*, Object*, JValue*, ...)+20)
#10 pc 00056c8f /system/lib/libdvm.so
#11 pc 0000e4b8 /system/lib/libc.so (__thread_entry+72)
#12 pc 0000dba4 /system/lib/libc.so (pthread_create+160)
at android.os.MessageQueue.nativePollOnce(Native Method)
at android.os.MessageQueue.next(MessageQueue.java:125)
at android.os.Looper.loop(Looper.java:197)
at android.os.HandlerThread.run(HandlerThread.java:60)
"Signal Catcher" daemon prio=5 tid=3 RUNNABLE
| group="system" sCount=0 dsCount=0 obj=0x4211bac8 self=0x7306f9e8
| sysTid=9649 nice=0 sched=0/0 cgrp=apps handle=1074354432
| state=R schedstat=( 35980238 61091796864 464 ) utm=0 stm=3 core=0
at dalvik.system.NativeStart.run(Native Method)
这里的MONITOR,NATIVE等线程状态很是让我们迷惑, 因为我们学习到的java thread的6种线程状态并不包含它们.
java的6种线程状态定义在/java/lang/Thread.java中:
//Thread.java
public class Thread implements Runnable {
...
public enum State {
/**
* The thread has been created, but has never been started.
*/
NEW,
/**
* The thread may be run.
*/
RUNNABLE,
/**
* The thread is blocked and waiting for a lock.
*/
BLOCKED,
/**
* The thread is waiting.
*/
WAITING,
/**
* The thread is waiting for a specified amount of time.
*/
TIMED_WAITING,
/**
* The thread has been terminated.
*/
TERMINATED
}
...
}
我们却发现这里的MONITOR并不在这6种状态中, 那么MONITOR到底代表的是什么意思呢? 就需要我们深入源码去找答案了.
"main" prio=5 tid=1 MONITOR
在VMThread.java中, 可以看到下面的代码, native thread有10种状态, 对应着java thread的6种状态.
//VMThread.java
/**
* Holds a mapping from native Thread statuses to Java one. Required for
* translating back the result of getStatus().
*/
static final Thread.State[] STATE_MAP = new Thread.State[] {
Thread.State.TERMINATED, // ZOMBIE
Thread.State.RUNNABLE, // RUNNING
Thread.State.TIMED_WAITING, // TIMED_WAIT
Thread.State.BLOCKED, // MONITOR
Thread.State.WAITING, // WAIT
Thread.State.NEW, // INITIALIZING
Thread.State.NEW, // STARTING
Thread.State.RUNNABLE, // NATIVE
Thread.State.WAITING, // VMWAIT
Thread.State.RUNNABLE // SUSPENDED
};
这段代码就给出了答案, traces.txt文件中的MONITOR状态就对应着java的BLOCKED状态, 也就是"The thread is blocked and waiting for a lock."
在之前的文章中, 已经分析了android thread的底层实现其实就是linux下的pthread.
我们再看一下native层的Thread.cpp, 有这段代码:
http://osxr.org/android/source/dalvik/vm/Thread.cpp
const char* dvmGetThreadStatusStr(ThreadStatus status)
{
switch (status) {
case THREAD_ZOMBIE: return "ZOMBIE";
case THREAD_RUNNING: return "RUNNABLE";
case THREAD_TIMED_WAIT: return "TIMED_WAIT";
case THREAD_MONITOR: return "MONITOR";
case THREAD_WAIT: return "WAIT";
case THREAD_INITIALIZING: return "INITIALIZING";
case THREAD_STARTING: return "STARTING";
case THREAD_NATIVE: return "NATIVE";
case THREAD_VMWAIT: return "VMWAIT";
case THREAD_SUSPENDED: return "SUSPENDED";
default: return "UNKNOWN";
}
}
实际上, 写入traces.txt中的线程状态值就是这个函数返回的字符串.
所以我们就知道了如下的事实:
"main" prio=5 tid=1 MONITOR 其实就是java中的BLOCKED状态.
"Timer-0" daemon prio=5 tid=23 TIMED_WAIT 其实就是java中的TIMED_WAITING状态
"QMThreadPool #3" daemon prio=3 tid=22 WAIT 其实就是java中的WAITING状态
"WifiManager" prio=5 tid=20 NATIVE 其实就是java中的RUNNABLE状态
"Signal Catcher" daemon prio=5 tid=3 RUNNABLE 其实就是java中的RUNNABLE状态
到这里, 我们也同样理解了DDMS中各个线程状态的含义.
refer :
https://docs.oracle.com/javase/7/docs/api/java/lang/Thread.State.html