Android ANR分析-笔记

1.什么是ANR

ANR:Application Not Responding,即应用无响应

2.ANR的类型

1.KeyDispatchTimeout(5 seconds)。按键或触摸事件在5s内无响应。

2.BroadcastTimeout(10 seconds)。BroadcastReceiver在10s内无法完成操作。

3.ServiceTimeout(20 seconds) 。Service在20s内无法处理完成任务。

3. 产生ANR原因

         主线程耗时操做函数

         主线程被其它线程Block

         系统级响应阻塞

         系统或进程自身可用内存紧张

         CPU资源被抢占

4.ANR 日志分析

_exp_main.text

Backtrace: 
Process: com.xxxx.xxxx //包名
PID: 8541
UID: 10166
Flags: 0x28d83ec5
Package: com.xxxx.xxxx v2181 (V5.9.4.1) //包名,版本号
Foreground: Yes
Process-Runtime: 13711627
Activity: com.xxxx.xxxx/.repair.MainActivity  //发生ANR时,应用所在的Activity
Subject: Input dispatching timed out (6f8f132 com.xxxx.xxxx/comxxxx.xxxx.repair.MainActivity (server) is not responding. Waited 5000ms for KeyEvent) //ANR 类型
Build: TECNO/xxxx/xxxxx/xxxxx/xxx:user/release-keys

在_exp_main.text最上方的一段Log描述:应用进程、pid、uid、应用包名、版本号、ANR类型(触摸事件在5s内无响应)。

CPU usage from 246337ms to 0ms ago (2022-09-26 23:52:10.872 to 2022-09-26 23:56:17.209):
  15% 676/surfaceflinger: 9% user + 6.3% kernel / faults: 3411 minor
  13% 1308/system_server: 8% user + 5% kernel / faults: 31673 minor
  9.3% 8541/com.transsion.carlcare: 6.6% user + 2.6% kernel / faults: 136116 minor
  5.7% 1764/com.android.systemui: 3.4% user + 2.3% kernel / faults: 8443 minor
  5.7% 627/[email protected]: 2.4% user + 3.2% kernel / faults: 8238 minor
  2.3% 2420/com.google.android.inputmethod.latin: 1.6% user + 0.6% kernel / faults: 30125 minor
  1.9% 943/[email protected]: 1% user + 0.8% kernel
  1.6% 4093/com.transsion.smartpanel: 1.2% user + 0.3% kernel / faults: 7857 minor
  1.5% 453/logd: 0.5% user + 1% kernel / faults: 18 minor
  1.4% 891/mobile_log_d: 0.3% user + 1% kernel / faults: 3578 minor
  1% 3033/com.google.android.gms: 0.7% user + 0.3% kernel / faults: 6351 minor
  0.9% 644/[email protected]: 0.2% user + 0.7% kernel
  0.7% 6365/com.transsion.hilauncher:BranchSearch: 0.4% user + 0.2% kernel / faults: 995 minor
  0.6% 5541/kworker/u17:3: 0% user + 0.6% kernel
  0.6% 7858/com.hoffnung.mobile.service: 0.3% user + 0.2% kernel / faults: 4051 minor
  0.6% 797/adbd: 0.1% user + 0.5% kernel / faults: 14573 minor
  0.6% 10976/kworker/u16:0: 0% user + 0.6% kernel / faults: 274 minor
  0.5% 2585/com.google.android.gms.persistent: 0.3% user + 0.1% kernel / faults: 2747 minor
  0.4% 15641/kworker/u16:1: 0% user + 0.4% kernel / faults: 274 minor
  0.4% 200/pbm: 0% user + 0.4% kernel
  0.4% 23315/com.android.commands.monkey: 0.2% user + 0.1% kernel / faults: 444 minor
  0.3% 26534/kworker/u16:2: 0% user + 0.3% kernel / faults: 227 minor
  0.3% 25937/kworker/u17:0: 0% user + 0.3% kernel
  0.3% 1877/com.android.networkstack.process: 0.3% user + 0% kernel / faults: 4200 minor
  0.3% 26562/kworker/u16:3: 0% user + 0.3% kernel / faults: 172 minor
  0.3% 5467/kworker/u16:6: 0% user + 0.3% kernel / faults: 175 minor
  0.2% 327/ipi_cpu_dvfs_rt: 0% user + 0.2% kernel
  0.2% 8/rcu_preempt: 0% user + 0.2% kernel
  0.2% 1745/hif_thread: 0% user + 0.2% kernel
  0.2% 1743/main_thread: 0% user + 0.2% kernel
  0.2% 18676/com.transsion.carlcare:ccWebview: 0.1% user + 0% kernel / faults: 3164 minor
  0.2% 27300/kworker/u17:2: 0% user + 0.2% kernel
  0.1% 27240/kworker/0:2: 0% user + 0.1% kernel
  0.1% 611/statsd: 0% user + 0% kernel / faults: 339 minor
  0.1% 134/ion_mm_heap_for: 0% user + 0.1% kernel
  0.1% 455/servicemanager: 0% user + 0% kernel
  0.1% 19980/kworker/1:1: 0% user + 0.1% kernel
  0.1% 298/irq/39-5-0034: 0% user + 0.1% kernel
  0.1% 269/disp_idlemgr: 0% user + 0.1% kernel
  0.1% 268/present_fence_w: 0% user + 0.1% kernel
  0.1% 660/audioserver: 0% user + 0% kernel
  0.1% 2012/com.transsion.hilauncher: 0% user + 0% kernel / faults: 587 minor
  0.1% 27670/kworker/4:1: 0% user + 0.1% kernel
  0.1% 1050/emdlogger1: 0% user + 0% kernel / faults: 44 minor
  0.1% 5125/kworker/u16:5: 0% user + 0.1% kernel / faults: 74 minor
  0.1% 23352/kworker/6:0: 0% user + 0.1% kernel
  0.1% 266/disp_delay_trig: 0% user + 0.1% kernel
  0.1% 626/[email protected]: 0% user + 0.1% kernel / faults: 4544 minor
  0.1% 25345/kworker/2:1: 0% user + 0.1% kernel
  0% 25427/kworker/3:0: 0% user + 0% kernel
  0% 906/[email protected]: 0% user + 0% kernel
  0% 629/[email protected]: 0% user + 0% kernel
  0% 352/charger_thread: 0% user + 0% kernel
  0% 1980/com.android.phone: 0% user + 0% kernel / faults: 794 minor
  0% 1746/rx_thread: 0% user + 0% kernel
  0% 31074/kworker/5:0: 0% user + 0% kernel
  0% 454/lmkd: 0% user + 0% kernel
  0% 1198/mtkfusionrild: 0% user + 0% kernel
  0% 216/chre_kthread: 0% user + 0% kernel
  0% 2516/iomanager: 0% user + 0% kernel
  0% 284/resource_monito: 0% user + 0% kernel
  0% 9/rcu_sched: 0% user + 0% kernel
  0% 5699/com.google.android.gms.unstable: 0% user + 0% kernel / faults: 12 minor
  0% 23267/kworker/7:0: 0% user + 0% kernel
  0% 129/dlpt_notify_thr: 0% user + 0% kernel
  0% 6548/com.transsion.plat.appupdate: 0% user + 0% kernel / faults: 592 minor
  0% 859/connsyslogger: 0% user + 0% kernel
  0% 899/wificond: 0% user + 0% kernel
  0% 11/migration/0: 0% user + 0% kernel
  0% 201/btif_rxd: 0% user + 0% kernel
  0% 261/disp_check: 0% user + 0% kernel
  0% 509/[email protected]: 0% user + 0% kernel
  0% 612/netd: 0% user + 0% kernel / faults: 40 minor
  0% 657/android.hardware.vibrator-service.mediatek: 0% user + 0% kernel
  0% 854/mtk_wmtd: 0% user + 0% kernel
  0% 914/thermal: 0% user + 0% kernel
  0% 14/migration/1: 0% user + 0% kernel
  0% 19/migration/2: 0% user + 0% kernel
  0% 136/ion_history: 0% user + 0% kernel
  0% 422/ueventd: 0% user + 0% kernel
  0% 684/fuelgauged: 0% user + 0% kernel
  0% 1051/gsm0710muxd: 0% user + 0% kernel
  0% 1739/[email protected]: 0% user + 0% kernel
  0% 8455/trace: 0% user + 0% kernel
  0% 7/ksoftirqd/0: 0% user + 0% kernel
  0% 15/ksoftirqd/1: 0% user + 0% kernel
  0% 24/migration/3: 0% user + 0% kernel
  0% 40/ksoftirqd/6: 0% user + 0% kernel
  0% 160/kswapd0: 0% user + 0% kernel
  0% 312/battery_thread: 0% user + 0% kernel
  0% 364/kworker/6:1H: 0% user + 0% kernel
  0% 852/mtk_stp_psm: 0% user + 0% kernel
  0% 880/media.metrics: 0% user + 0% kernel
  0% 2715/com.google.android.providers.media.module: 0% user + 0% kernel / faults: 6 minor
  0% 8896/android.process.acore: 0% user + 0% kernel / faults: 9 minor
  0% 20/ksoftirqd/2: 0% user + 0% kernel
  0% 25/ksoftirqd/3: 0% user + 0% kernel
  0% 29/migration/4: 0% user + 0% kernel
  0% 30/ksoftirqd/4: 0% user + 0% kernel
  0% 34/migration/5: 0% user + 0% kernel
  0% 45/ksoftirqd/7: 0% user + 0% kernel
  0% 222/ccci_poll1: 0% user + 0% kernel
  0% 254/teei_switch_thr: 0% user + 0% kernel
  0% 276/krtatm: 0% user + 0% kernel
  0% 357/wdtk-3: 0% user + 0% kernel
  0% 363/kworker/7:1H: 0% user + 0% kernel
  0% 367/kworker/1:1H: 0% user + 0% kernel
  0% 368/kworker/0:1H: 0% user + 0% kernel
  0% 416/kworker/3:1H: 0% user + 0% kernel
  0% 456/hwservicemanager: 0% user + 0% kernel
  0% 628/[email protected]: 0% user + 0% kernel
  0% 800/traced_probes: 0% user + 0% kernel
  0% 802/tne: 0% user + 0% kernel / faults: 48 minor
  0% 1812/com.transsnet.store: 0% user + 0% kernel / faults: 3 minor
  0% 2330/com.transsion.trancare:subservice: 0% user + 0% kernel / faults: 6 minor
  0% 2517/pnpmgr: 0% user + 0% kernel
  0% 2519/tranlog: 0% user + 0% kernel
  0% 2693/com.google.android.googlequicksearchbox:interactor: 0% user + 0% kernel / faults: 3 minor
  0% 2818/com.transsion.camera: 0% user + 0% kernel
 +0% 27811/kworker/6:2: 0% user + 0% kernel
 +0% 27970/kworker/2:2: 0% user + 0% kernel
 +0% 28296/kworker/0:1: 0% user + 0% kernel
8.8% TOTAL: 4.8% user + 3.9% kernel + 0% iowait + 0% softirq

 从上面的log可以发现:

第一行表明负载信息抓取在ANR发生之后的0~246337ms。同时也指明了ANR的时间段:2022-09-26 23:52:10.872 to 2022-09-26 23:56:17.209。

然后下面是各个进程占用的CPU的详细情况。

最后一行是各个进程合计占用的CPU信息。

涉及的一些重要名词:

ser:用户态,kernel:内核态
​faults:内存缺页,minor——轻微的,major——重度,需要从磁盘拿数据
​iowait:IO使用(等待)占比
​irq:硬中断,softirq:软中断

  • 进程创建到现在一共创建了多少对象
Total number of allocations 87509197
  • 进程创建到现在一共申请了多少内存
Total bytes allocated 5510MB
  •  Total bytes freed 52MB
Total bytes freed 5374MB
  • 不扩展堆的情况下可用的内存 
Free memory 23MB
  • GC前的可用内存 
Free memory until GC 23MB
  • OOM之前的可用内存
Free memory until OOME 376MB
  • 当前总内存(已用+可用) 
Total memory 159MB
  • 进程最多能申请的内存
Max memory 512MB

注意:Free memory until OOME 的值很小的时候,已经处于内存紧张状态。应用可能是占用了过多内存。

除了trace文件中有内存信息,普通的eventlog日志中,也有内存信息。

SYS_ANDROID_EVENT_LOG.txt

09-26 23:56:41.661  1308  7845 I am_meminfo: [3266568192,240267264,77824,245788672,910894080]

以上四个值分别指的是:Cached,Free,Zram,Kernel,Native。

Cached+Free的内存代表着当前整个手机的可用内存,如果值很小,意味着处于内存紧张状态。一般低内存的判定阈值为:4G 内存手机以下阀值:350MB,以上阀值则为:450MB。如果ANR时间点前后,日志里有打印onTrimMemory,也可以作为内存紧张的一个参考判断。

5.堆栈消息

堆栈信息是最重要的一个信息,展示了ANR发生的进程当前所有线程的状态。

线程名称:"main"

线程优先级:“prio”,此时优先级5。

JVM线程id: tid = 1

线程状态:Runnable

suspend all histogram:	Sum: 86.658ms 99% C.I. 5.132us-1021.385us Avg: 70.856us Max: 33536us
DALVIK THREADS (67):
"main" prio=5 tid=1 Runnable
  | group="main" sCount=0 dsCount=0 flags=0 obj=0x731f53f8 self=0xb400007b21471be0
  | sysTid=8541 nice=-10 cgrp=default sched=0/0 handle=0x7c484324f8
  | state=R schedstat=( 591843734006 11890654357 895335 ) utm=43825 stm=15359 core=7 HZ=100
  | stack=0x7ff99af000-0x7ff99b1000 stackSize=8192KB
  | held mutexes= "mutator lock"(shared held)
  at android.icu.impl.SoftCache.getInstance(SoftCache.java:51)
  at android.icu.util.ULocale.forLocale(ULocale.java:396)
  at android.text.TextUtils.getLayoutDirectionFromLocale(TextUtils.java:2066)
  at android.content.res.Configuration.setLayoutDirection(Configuration.java:2181)
  at android.content.res.Configuration.setLocales(Configuration.java:2127)
  at android.content.res.ResourcesImpl.calcConfigChanges(ResourcesImpl.java:533)
  at android.content.res.ResourcesImpl.updateConfiguration(ResourcesImpl.java:414)
  - locked <0x079019e2> (a java.lang.Object)
  at android.content.res.Resources.updateConfiguration(Resources.java:2020)
  at android.content.res.Resources.updateConfiguration(Resources.java:2012)
  at com.transsion.common.activity.SionBaseActivity.getResources(SourceFile:4)
  at android.view.LayoutInflater.inflate(LayoutInflater.java:520)
  at k.y.a.c.m(SourceFile:1)
  at k.y.a.c.onCreateViewHolder(SourceFile:1)
  at androidx.recyclerview.widget.RecyclerView$g.createViewHolder(SourceFile:2)
  at androidx.recyclerview.widget.RecyclerView$u.I(SourceFile:28)
  at androidx.recyclerview.widget.RecyclerView$u.p(SourceFile:1)
  at androidx.recyclerview.widget.RecyclerView$u.o(SourceFile:1)
  at androidx.recyclerview.widget.LinearLayoutManager$c.d(SourceFile:3)
  at androidx.recyclerview.widget.LinearLayoutManager.layoutChunk(SourceFile:1)
  at androidx.recyclerview.widget.LinearLayoutManager.fill(SourceFile:9)
  at androidx.recyclerview.widget.LinearLayoutManager.onLayoutChildren(SourceFile:71)
  at androidx.recyclerview.widget.RecyclerView.dispatchLayoutStep2(SourceFile:8)
  at androidx.recyclerview.widget.RecyclerView.dispatchLayout(SourceFile:9)
  at androidx.recyclerview.widget.RecyclerView.consumePendingUpdateOperations(SourceFile:21)
  at androidx.recyclerview.widget.RecyclerView$b0.run(SourceFile:5)
  at android.view.Choreographer$CallbackRecord.run(Choreographer.java:1030)
  at android.view.Choreographer.doCallbacks(Choreographer.java:840)
  at android.view.Choreographer.doFrame(Choreographer.java:763)
  at android.view.Choreographer$FrameDisplayEventReceiver.run(Choreographer.java:1002)
  at android.os.Handler.handleCallback(Handler.java:938)
  at android.os.Handler.dispatchMessage(Handler.java:99)
  at android.os.Looper.loop(Looper.java:268)
  at android.app.ActivityThread.main(ActivityThread.java:8000)
  at java.lang.reflect.Method.invoke(Native method)
  at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:627)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:997)

"OkHttp ConnectionPool" daemon prio=5 tid=42 TimedWaiting
  | group="main" sCount=1 dsCount=0 flags=1 obj=0x12c81c20 self=0xb400007b214ab1b0
  | sysTid=8604 nice=0 cgrp=default sched=0/0 handle=0x792bca0cc0
  | state=S schedstat=( 256337769 38916225 1410 ) utm=13 stm=11 core=7 HZ=100
  | stack=0x792bb9d000-0x792bb9f000 stackSize=1043KB
  | held mutexes=
  kernel: __switch_to+0x134/0x150
  kernel: futex_wait_queue_me+0xc8/0x114
  kernel: futex_wait+0x11c/0x310
  kernel: do_futex+0x120/0x1b60
  kernel: SyS_futex+0x148/0x1c8
  kernel: __sys_trace+0x48/0x48
  at java.lang.Object.wait(Native method)
  - waiting on <0x014b83b7> (a com.android.okhttp.ConnectionPool)
  at com.android.okhttp.ConnectionPool$1.run(ConnectionPool.java:106)
  - locked <0x014b83b7> (a com.android.okhttp.ConnectionPool)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
  at java.lang.Thread.run(Thread.java:923)

在上面两行log中,一个是主线程main,它的状态是Runnable。另一个是OkHttp ConnectionPool,它的状态是TimeWaiting。

6.Thread状态分析

线程的状态是一个很重要的东西,因此thread dump中会显示这些状态,通过对这些状态的分析,能够得出线程的运行状况,进而发现可能存在的问题。

Java中的线程状态有6种:

  public static enum State {
        NEW,
        RUNNABLE,
        BLOCKED,
        WAITING,
        TIMED_WAITING,
        TERMINATED;

        private State() {
        }
    }
public enum State   
{  
       /** 
        * Thread state for a thread which has not yet started. 
        */  
       NEW,  
         
       /** 
        * Thread state for a runnable thread.  A thread in the runnable 
        * state is executing in the Java virtual machine but it may 
        * be waiting for other resources from the operating system 
        * such as processor. 
        */  
       RUNNABLE,  
         
       /** 
        * Thread state for a thread blocked waiting for a monitor lock. 
        * A thread in the blocked state is waiting for a monitor lock 
        * to enter a synchronized block/method or  
        * reenter a synchronized block/method after calling 
        * {@link Object#wait() Object.wait}. 
        */  
       BLOCKED,  
     
       /** 
        * Thread state for a waiting thread. 
        * A thread is in the waiting state due to calling one of the  
        * following methods: 
        * 
    *
  • {@link Object#wait() Object.wait} with no timeout
  • *
  • {@link #join() Thread.join} with no timeout
  • *
  • {@link LockSupport#park() LockSupport.park}
  • *
* *

A thread in the waiting state is waiting for another thread to * perform a particular action. * * For example, a thread that has called Object.wait() * on an object is waiting for another thread to call * Object.notify() or Object.notifyAll() on * that object. A thread that has called Thread.join() * is waiting for a specified thread to terminate. */ WAITING, /** * Thread state for a waiting thread with a specified waiting time. * A thread is in the timed waiting state due to calling one of * the following methods with a specified positive waiting time: *

    *
  • {@link #sleep Thread.sleep}
  • *
  • {@link Object#wait(long) Object.wait} with timeout
  • *
  • {@link #join(long) Thread.join} with timeout
  • *
  • {@link LockSupport#parkNanos LockSupport.parkNanos}
  • *
  • {@link LockSupport#parkUntil LockSupport.parkUntil}
  • *
*/ TIMED_WAITING, /** * Thread state for a terminated thread. * The thread has completed execution. */ TERMINATED; }

状态 解释
New 线程创建,但是还没有start()。每一个线程,在堆内存中都有一个对应的Thread对象。Thread t = new Thread();当刚刚在堆内存中创建Thread对象,还没有调用t.start()方法之前,线程就处在NEW状态。在这个状态上,线程与普通的java对象没有什么区别,就仅仅是一个堆内存中的对象。
RUNNABLE 线程执行或者在等待系统资源,例如CPU资源。该状态表示线程具备所有运行条件,在运行队列中准备操作系统的调度,或者正在运行。 这个状态的线程比较正常,但如果线程长时间停留在在这个状态就不正常了,这说明线程运行的时间很长(存在性能问题),或者是线程一直得不得执行的机会(存在线程饥饿的问题)。
BLOCKED 线程在等待其它线程释放一个锁,则会进入 Blocked 状态。例如在等待一个同步代码块的锁,获得锁则进入 Runnable 状态。线程正在等待获取java对象的监视器(也叫内置锁),即线程正在等待进入由synchronized保护的方法或者代码块。synchronized用来保证原子性,任意时刻最多只能由一个线程进入该临界区域,其他线程只能排队等待。
WAITING

线程在等待一个其它线程唤醒它。可能是调用了 Object.wait(),Thread.join() ,LockSupport.part()。

处在该线程的状态,正在等待某个事件的发生,只有特定的条件满足,才能获得执行机会。而产生这个特定的事件,通常都是另一个线程。也就是说,如果不发生特定的事件,那么处在该状态的线程一直等待,不能获取执行的机会。比如:

A线程调用了obj对象的obj.wait()方法,如果没有线程调用obj.notify或obj.notifyAll,那么A线程就没有办法恢复运行;(WAITING)
如果A线程调用了LockSupport.park(),没有别的线程调用LockSupport.unpark(A),那么A没有办法恢复运行。(WAITING)

TIMED_WAITING 同上,只不过方法换成了 Object.wait(timeOut) ,Thread.join(millis) ,LockSupport.part(deadline)。J.U.C中很多与线程相关类,都提供了限时版本和不限时版本的API。TIMED_WAITING意味着线程调用了限时版本的API,正在等待时间流逝。当等待时间过去后,线程一样可以恢复运行。如果线程进入了WAITING状态,一定要特定的事件发生才能恢复运行;而处在TIMED_WAITING的线程,如果特定的事件发生或者是时间流逝完毕,都会恢复运行。
TERMINATED 线程销毁。线程执行完毕,执行完run方法正常返回,或者抛出了运行时异常而结束,线程都会停留在这个状态。这个时候线程只剩下Thread对象了,没有什么用了。

trace文件中的状态是是CPP代码中定义的状态(native状态)。

java thread状态 C++ 层Thread 状态 解释
NEW INITIALIZING 新建,正在分配资源
NEW STARING 新建,正在启动
RUNNABLE RUNING/RUNNABLE 线程创建或者正在运行
RUNNABLE NATIVE 正在执行 JNI 方法
RUNNABLE SUSPENDED 现在挂起,一般是GC导致
BLOCKED MOINITOR 线程阻塞,等待其它线程释放对象锁
WAITING WAIT 执行了无超时参数的 wait() 等方法
WAITING VMWAIT 正在等待 VM 资源
TERMINATED ZOMBIE 线程销毁

如果主线程是MOINITOR状态,那么 ANR 的原因就是主线程被阻塞。如果主线程是WAIT状态,那么 ANR 的原因就是等待子线程释放锁。

7.Java thread statck trace

suspend all histogram:	Sum: 86.658ms 99% C.I. 5.132us-1021.385us Avg: 70.856us Max: 33536us
DALVIK THREADS (67):
"main" prio=5 tid=1 Runnable
  | group="main" sCount=0 dsCount=0 flags=0 obj=0x731f53f8 self=0xb400007b21471be0
  | sysTid=8541 nice=-10 cgrp=default sched=0/0 handle=0x7c484324f8
  | state=R schedstat=( 591843734006 11890654357 895335 ) utm=43825 stm=15359 core=7 HZ=100
  | stack=0x7ff99af000-0x7ff99b1000 stackSize=8192KB
  | held mutexes= "mutator lock"(shared held)
  at android.icu.impl.SoftCache.getInstance(SoftCache.java:51)
  at android.icu.util.ULocale.forLocale(ULocale.java:396)
  at android.text.TextUtils.getLayoutDirectionFromLocale(TextUtils.java:2066)
  at android.content.res.Configuration.setLayoutDirection(Configuration.java:2181)
  at android.content.res.Configuration.setLocales(Configuration.java:2127)
  at android.content.res.ResourcesImpl.calcConfigChanges(ResourcesImpl.java:533)
  at android.content.res.ResourcesImpl.updateConfiguration(ResourcesImpl.java:414)
  - locked <0x079019e2> (a java.lang.Object)
  at android.content.res.Resources.updateConfiguration(Resources.java:2020)
  at android.content.res.Resources.updateConfiguration(Resources.java:2012)
  at com.transsion.common.activity.SionBaseActivity.getResources(SourceFile:4)
  at android.view.LayoutInflater.inflate(LayoutInflater.java:520)
  at k.y.a.c.m(SourceFile:1)
  at k.y.a.c.onCreateViewHolder(SourceFile:1)
  at androidx.recyclerview.widget.RecyclerView$g.createViewHolder(SourceFile:2)
  at androidx.recyclerview.widget.RecyclerView$u.I(SourceFile:28)
  at androidx.recyclerview.widget.RecyclerView$u.p(SourceFile:1)
  at androidx.recyclerview.widget.RecyclerView$u.o(SourceFile:1)
  at androidx.recyclerview.widget.LinearLayoutManager$c.d(SourceFile:3)
  at androidx.recyclerview.widget.LinearLayoutManager.layoutChunk(SourceFile:1)
  at androidx.recyclerview.widget.LinearLayoutManager.fill(SourceFile:9)
  at androidx.recyclerview.widget.LinearLayoutManager.onLayoutChildren(SourceFile:71)
  at androidx.recyclerview.widget.RecyclerView.dispatchLayoutStep2(SourceFile:8)
  at androidx.recyclerview.widget.RecyclerView.dispatchLayout(SourceFile:9)
  at androidx.recyclerview.widget.RecyclerView.consumePendingUpdateOperations(SourceFile:21)
  at androidx.recyclerview.widget.RecyclerView$b0.run(SourceFile:5)
  at android.view.Choreographer$CallbackRecord.run(Choreographer.java:1030)
  at android.view.Choreographer.doCallbacks(Choreographer.java:840)
  at android.view.Choreographer.doFrame(Choreographer.java:763)
  at android.view.Choreographer$FrameDisplayEventReceiver.run(Choreographer.java:1002)
  at android.os.Handler.handleCallback(Handler.java:938)
  at android.os.Handler.dispatchMessage(Handler.java:99)
  at android.os.Looper.loop(Looper.java:268)
  at android.app.ActivityThread.main(ActivityThread.java:8000)
  at java.lang.reflect.Method.invoke(Native method)
  at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:627)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:997)

Java stack trace提供了大部分信息来精确定位问题根源。

例子:

java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
-waiting on <0xb3885f60> (a java.util.TaskQueue)     # 继续wait 
at java.util.TimerThread.mainLoop(Timer.java:509)
-locked <0xb3885f60> (a java.util.TaskQueue)         # 已经locked
at java.util.TimerThread.run(Timer.java:462)

堆栈信息应该逆向解读:

- locked <0xb3885f60> (a java.util.ArrayList)
- waiting on <0xb3885f60> (a java.util.ArrayList) 

 对象先上锁,锁住对象0xb3885f60,然后释放该对象锁,进入waiting状态。对应代码如下:

synchronized(obj) {  
   .........  
   obj.wait();  
   .........  
}

 先用 synchronized 获得了这个对象的 Monitor-------> locked <0xb3885f60>

当执行到 obj.wait(),线程即放弃了 Monitor的所有权,进入 “wait set”队列(-------> waiting on <0xb3885f60> )

8. 关键状态分析

1.当前线程等待另一个条件的发生,来把自己唤醒,或者当前线程调用了 sleep(n)。

此时线程处于的状态:

java.lang.Thread.State: WAITING (parking):一直等那个条件发生;
java.lang.Thread.State: TIMED_WAITING (parking或sleeping):定时的,那个条件不到来,也将定时唤醒自己。
 

2.Waiting for Monitor Entry 和 in Object.wait()。

Monitor是Java中用以实现线程之间的互斥与协作的主要手段,它可以看成是对象或者Class的锁。每一个对象都有,也仅有一个 Monitor。

synchronized(obj) {
   .........
}

一般CPU很忙时,则关注runnable的线程,CPU很闲时,则关注waiting for monitor entry的线程。

9.Android性能指标CPU关注点 

1.CPU总体使用率
2.应用程序CPU占用率

获取App CPU指标值的方式:

1.proc文件获取方式

/proc文件系统是一个伪文件系统,它只存在内存当中,而不占用外存空间。用户和应用程序可以通过/proc得到系统的信息,并可以改变内核的某些参数。安卓性能指标cpu总体使用率和应用程序cpu占用率主要与两个proc文件相关,分别是/proc/stat和/proc/进程id/stat文件.。
通过adb shell进入到手机内部shell模式,再通过cat /proc/stat 查看结果如下:

Android ANR分析-笔记_第1张图片

 前面几行cpu cpu0 ... cpu7是我们需要关注的重点,cpu0~cpu7表示当前CPU的核心(多核),cpu为总的Jiffies,这里引入了Jiffies(时间片)的概念,Jiffies的介绍如下:

Jiffies 为Linux核心变数,是一个unsigned long类型的变量,它被用来记录系统自开机以来,已经过了多少tick。每发生一次timer interrupt,Jiffies变数会被加1。

上面每一列的数值含义如下:
user :从系统启动开始累计到当前时刻,用户态的jiffies ,不包含nice值为负进程;
nice :从系统启动开始累计到当前时刻,nice值为负的进程所占用的jiffies;
system :从系统启动开始累计到当前时刻,系统态的jiffies;
idle :从系统启动开始累计到当前时刻,除硬盘IO等待时间以外其它等待的jiffies;
iowait : 从系统启动开始累计到当前时刻,硬盘IO等待的jiffies;
irq : 从系统启动开始累计到当前时刻,硬中断的jiffies;
softirq :从系统启动开始累计到当前时刻,软中断的jiffies。

总的Jiffies就是上面所有项加起来的总和。因此我们计算一段时间的CPU占用率的时候就可以使用:
total=user+system+nice+idle+iowait+irq+softirq
cpu usage=[(user_end +sys_end+nice_end) - (user_begin + sys_begin+nice_begin)]/(total_end - total_begin)*100

上述方法统计的是当前系统所有进程的CPU总和使用率。

如果我们要统计某个App的使用率可以进入到/proc/进程id/stat目录,其中就有包含某进程的CPU信息。

首先,我们需要查询到App的进程ID:

其中的进程ID为5451,再查询stat文件信息:

在第14行、15行有记录当前进程的Jiffies信息
utime=1565 该任务在用户态运行的时间,单位为jiffies
stime=455 该任务在核心态运行的时间,单位为jiffies
所以当前进程的Jiffies计算方式为utime+stime
通过shell脚本获取的方式:
cat /proc/进程id/stat | awk -F " " '{print $14,$15}'
要计算出一段时间内该进程的CPU使用率信息即可通过:
utime+stime(当前时间点)-utime+stime(旧时间点)/ total cpu Jiffies

top命令获取CPU使用率
top命令是Linux下常用的性能分析工具,能够实时显示系统中各个进程的资源占用状况,类似于Windows的任务管理器。top是一个动态显示过程,即可以通过用户按键来不断刷新当前状态。如果在前台执行该命令,它将独占前台,直到用户终止该程序为止。top命令提供了实时的对系统处理器的状态监视。
使用方式:
可查看占用cpu最高的前10个程序(-t 显示进程名称,-s 按指定行排序,-n 在退出前刷新几次,-d 刷新间隔,-m 显示最大数量):top -m 10 -s cpu
如果你想筛选出你自己的应用的话可以用下面这一命令:
adb shell top -n 1| grep PackageName

10. 问题场景

1.CPU飙高,load高,响应很慢

一个请求过程中多次dump;
对比多次dump文件的runnable线程,如果执行的方法有比较大变化,说明比较正常。如果在执行同一个方法,就有一些问题了;

2.查找占用CPU最多的线程

使用命令:top -H -p pid(pid为被测系统的进程号),找到导致CPU高的线程ID,对应thread dump信息中线程的nid,只不过一个是十进制,一个是十六进制;
在thread dump中,根据top命令查找的线程id,查找对应的线程堆栈信息;

3.CPU使用率不高但是响应很慢

进行dump,查看是否有很多thread struck在了i/o、数据库等地方,定位瓶颈原因;

4.请求无法响应

多次dump,对比是否所有的runnable线程都一直在执行相同的方法,如果是的,恭喜你,锁住了!

5. 死锁

死锁经常表现为程序的停顿,或者不再响应用户的请求。从操作系统上观察,对应进程的CPU占用率为零,很快会从top或prstat的输出中消失。

比如在下面这个示例中,是个较为典型的死锁情况:

"Thread-1" prio=5 tid=0x00acc490 nid=0xe50 waiting for monitor entry [0x02d3f000 ..0x02d3fd68] 
at deadlockthreads.TestThread.run(TestThread.java:31) 
- waiting to lock <0x22c19f18> (a java.lang.Object) 
- locked <0x22c19f20> (a java.lang.Object) 

"Thread-0" prio=5 tid=0x00accdb0 nid=0xdec waiting for monitor entry [0x02cff000 ..0x02cff9e8] 
at deadlockthreads.TestThread.run(TestThread.java:31) 
- waiting to lock <0x22c19f20> (a java.lang.Object) 
- locked <0x22c19f18> (a java.lang.Object) 

在 JAVA 5中加强了对死锁的检测。线程 Dump中可以直接报告出 Java级别的死锁,如下所示:


Found one Java-level deadlock: 
============================= 
"Thread-1": 
waiting to lock monitor 0x0003f334 (object 0x22c19f18, a java.lang.Object), 
which is held by "Thread-0" 

"Thread-0": 
waiting to lock monitor 0x0003f314 (object 0x22c19f20, a java.lang.Object), 
which is held by "Thread-1" 

6.热锁

热锁,也往往是导致系统性能瓶颈的主要因素。其表现特征为:由于多个线程对临界区,或者锁的竞争,可能出现:

1.频繁的线程的上下文切换:从操作系统对线程的调度来看,当线程在等待资源而阻塞的时候,操作系统会将之切换出来,放到等待的队列,
当线程获得资源之后,调度算法会将这个线程切换进去,放到执行队列中。

2.大量的系统调用:因为线程的上下文切换,以及热锁的竞争,或者临界区的频繁的进出,都可能导致大量的系统调用。

3.大部分CPU开销用在“系统态”:线程上下文切换,和系统调用,都会导致 CPU在 “系统态 ”运行,换而言之,虽然系统很忙碌,但是CPU用在 “用户态 ”的比例较小,应用程序得不到充分的 CPU资源。

4.随着CPU数目的增多,系统的性能反而下降。因为CPU数目多,同时运行的线程就越多,可能就会造成更频繁的线程上下文切换和系统态的CPU开销,从而导致更糟糕的性能。


上面的描述,都是一个 scalability(可扩展性)很差的系统的表现。从整体的性能指标看,由于线程热锁的存在,程序的响应时间会变长,吞吐量会降低。

参考:

啃碎并发(四):Java线程Dump分析 - 掘金 (juejin.cn)

你可能感兴趣的:(Android,Java线程Dump分析,Android,ANR分析)