ANR原因分析及解决方法

 

转载自:http://wenku.baidu.com/link?url=fUXG2wO3ECS2nZorr6MtTH8_PNCQESk-h8UpQdt1NH0_AjtgwbwW2b7PIO7oJ0OyJJzr62uB0202rXyNBK3tQtqGV3hxr_WVw6UrB7imO_a


内容目录


1 ANR是如何产生的................................................................................................................................. 5

1.1 ANR产生条件............................................................................................................................ 5

1.2 引起ANR的根本原因............................................................................................................ 5

2 ANR原因分析方法................................................................................................................................. 5

2.1 通过logcat日志分析原因.................................................................................................... 5

2.2 通过trace文件分析原因...................................................................................................... 9

3 ANR解决方法......................................................................................................................................... 12

4 参考资料.................................................................................................................................................. 12


 

 

 

 

1       ANR是如何产生的

1.1       ANR产生条件

 

ANR的产生需要同时满足三个条件:

  • 主线程:只有应用程序进程的主线程响应超时才会产生ANR;
  • 超时时间:产生ANR的上下文不同,超时时间也不同,但只要超过这个时间上限没有响应就会产生ANR;
  • 输入事件/特定操作:输入事件是指按键、触屏等设备输入事件,特定操作是指BroadcastReceiver和Service的生命周期中的各个函数调用。

 

产生ANR的上下文不同,导致ANR原因也不同,主要有以下三种情况:

  • 应用进程的主线程对输入事件在5s内没有处理完毕;
  • 应用进程的主线程在执行BroadcastRecevier的onReceive函数时10s内没有处理完毕;
  • 应用进程的主线程在执行Service的各个生命周期函数时20s内没有处理完毕;
1.2       引起ANR的根本原因

引起ANR的根本原因,总的来说可以归纳为两类:

  • 应用进程自身引起的,比如:主线程阻塞、挂起、死循环,执行耗时操作等;
  • 其他进程引起的,比如:其他进程CPU占用率过高,导致当前应用进程无法抢占到CPU时间片。常见的问题如文件读写频繁,io进程CPU占用率过高,导致当前应用出现ANR;

2       ANR原因分析方法

2.1       通过logcat日志分析原因

举一个开发过程中的ANR实例,《从云盘上下载视频到本地,进入下载中心删除任务和文件,进入文件管理提示‘文件管理无响应’》,在logcat日志中输出了以下ANR信息:

01-21 23:58:40.265   975   991 E ActivityManager: ANR in com.letv.filemanager

01-21 23:58:40.265   975   991 E ActivityManager: Reason: Executing service com.letv.filemanager/.netstorage.service.DlnaService

01-21 23:58:40.265   975   991 E ActivityManager: Load: 5.62 / 4.3 / 3.55

01-21 23:58:40.265   975   991 E ActivityManager: CPU usage from 30331ms to 0ms ago:

01-21 23:58:40.265   975   991 E ActivityManager:   26% 1760/com.letv.airplay: 24% user + 1.6% kernel / faults: 1094 minor

01-21 23:58:40.265   975   991 E ActivityManager:   15% 787/cdnclient: 0% user + 15% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   13% 298/kswapd0: 0% user + 13% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   9.5% 1437/com.letv.dmr: 9% user + 0.5% kernel / faults: 572 minor

01-21 23:58:40.265   975   991 E ActivityManager:   9.1% 953/flush-179:0: 0% user + 9.1% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   5.8% 788/collector: 0.8% user + 4.9% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   5% 419/mmcqd: 0% user + 5% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   2.9% 814/ld-linux.so.3: 1.1% user + 1.8% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   2.5% 822/dd: 0.2% user + 2.2% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   1.4% 2533/com.letv.threeScreen: 0.7% user + 0.7% kernel / faults: 134 minor

01-21 23:58:40.265   975   991 E ActivityManager:   0.5% 817/logcat: 0% user + 0.4% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   0.4% 7/events/0: 0% user + 0.4% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   0.4% 1254/com.letv.t2.launcher: 0.4% user + 0% kernel / faults: 20 minor

01-21 23:58:40.265   975   991 E ActivityManager:   0.2% 975/system_server: 0% user + 0.1% kernel / faults: 34 minor 1 major

01-21 23:58:40.265   975   991 E ActivityManager:   0.2% 1452/com.letv.downloads: 0.1% user + 0.1% kernel / faults: 40 minor

01-21 23:58:40.265   975   991 E ActivityManager:   0.1% 808/virtualkeypad: 0% user + 0.1% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   0.1% 1292/RTW_CMD_THREAD: 0% user + 0.1% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   0% 753/jbd2/mmcblk9-8: 0% user + 0% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   0% 8/events/1: 0% user + 0% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   0% 177/bdi-default: 0% user + 0% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   0% 786/letvmanager: 0% user + 0% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   0% 1472/com.letv.filemanager: 0% user + 0% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   0% 1737/com.letv.medialog: 0% user + 0% kernel / faults: 48 minor

01-21 23:58:40.265   975   991 E ActivityManager: 62% TOTAL: 7.2% user + 10% kernel + 43% iowait + 1% softirq

01-21 23:58:40.265   975   991 E ActivityManager: CPU usage from 3195ms to 3733ms later:

01-21 23:58:40.265   975   991 E ActivityManager:   15% 975/system_server: 3.7% user + 11% kernel / faults: 1 minor

01-21 23:58:40.265   975   991 E ActivityManager:     15% 991/ActivityManager: 3.7% user + 11% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   7.5% 1760/com.letv.airplay: 7.5% user + 0% kernel

01-21 23:58:40.265   975   991 E ActivityManager:     5.6% 2615/SocketListener(: 5.6% user + 0% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   5.5% 419/mmcqd: 0% user + 5.5% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   3.7% 298/kswapd0: 0% user + 3.7% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   3.7% 788/collector: 0% user + 3.7% kernel

01-21 23:58:40.265   975   991 E ActivityManager:     3.7% 866/collector: 0% user + 3.7% kernel

01-21 23:58:40.265   975   991 E ActivityManager:     1.8% 865/collector: 0% user + 1.8% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   3.7% 814/ld-linux.so.3: 1.8% user + 1.8% kernel

01-21 23:58:40.265   975   991 E ActivityManager:     1.8% 1009/814Linux hotplu: 1.8% user + 0% kernel

01-21 23:58:40.265   975   991 E ActivityManager:     1.8% 1013/IRQThread_157: 0% user + 1.8% kernel

01-21 23:58:40.265   975   991 E ActivityManager:   0.3% 1254/com.letv.t2.launcher: 0.3% user + 0% kernel / faults: 1 minor

01-21 23:58:40.265   975   991 E ActivityManager:     0.3% 1254/etv.t2.launcher: 0.3% user + 0% kernel

01-21 23:58:40.265   975   991 E ActivityManager: 54% TOTAL: 2.9% user + 5.9% kernel + 45% iowait

 

我们可以通过ActivityManagerService的appNotResponding方法看一下ANR信息是如何记录到logcat日志中的:

 

.\framework\base\services\java\com\android\server\am\ActivityManagerService.java

    final void appNotResponding(ProcessRecord app, ActivityRecord activity,

            ActivityRecord parent, boolean aboveSystem, final String annotation) {

        ......

        synchronized (this) {

            // PowerManager.reboot() can block for a long time, so ignore ANRs while shutting down.

            if (mShuttingDown) {

                //重启时忽略ANR

                Slog.i(TAG, "During shutdown skipping ANR: " + app + " " + annotation);

                return;

            } else if (app.notResponding) {

                //之前已处理过则不再处理

                Slog.i(TAG, "Skipping duplicate ANR: " + app + " " + annotation);

                return;

            } else if (app.crashing) {

                //已经Crash的进程则不再处理ANR

                Slog.i(TAG, "Crashing app skipping ANR: " + app + " " + annotation);

                return;

            }

            

            //设置ANR标志,用于防止重复处理

            // In case we come through here for the same app before completing

            // this one, mark as anring now so we will bail out.

            app.notResponding = true;

           

            // Log the ANR to the event log.

            EventLog.writeEvent(EventLogTags.AM_ANR, app.userId, app.pid,

                    app.processName, app.info.flags, annotation);

           

            //将发生ANR的应用pid放入firstPids集合中

            // Dump thread traces as quickly as we can, starting with "interesting" processes.

            firstPids.add(app.pid);

           

            //将父PID也放入firstPids集合

            int parentPid = app.pid;

            if (parent != null && parent.app != null && parent.app.pid > 0) parentPid = parent.app.pid;

            if (parentPid != app.pid) firstPids.add(parentPid);

           

            //将主PID放入firstPids集合

            if (MY_PID != app.pid && MY_PID != parentPid) firstPids.add(MY_PID);

           

            //将persistent进程放入firstPids集合

            for (int i = mLruProcesses.size() - 1; i >= 0; i--) {

                ProcessRecord r = mLruProcesses.get(i);

                if (r != null && r.thread != null) {

                    int pid = r.pid;

                    if (pid > 0 && pid != app.pid && pid != parentPid && pid != MY_PID) {

                        if (r.persistent) {

                            firstPids.add(pid);

                        } else {

                            lastPids.put(pid, Boolean.TRUE);

                        }

                    }

                }

            }

        }

        // 将ANR信息写入log

        // Log the ANR to the main log.

        StringBuilder info = new StringBuilder();

        info.setLength(0);

        info.append("ANR in ").append(app.processName);

        if (activity != null && activity.shortComponentName != null) {

            info.append(" (").append(activity.shortComponentName).append(")");

        }

        info.append("\n");

        info.append("PID: ").append(app.pid).append("\n");

        if (annotation != null) {

            info.append("Reason: ").append(annotation).append("\n");

        }

        if (parent != null && parent != activity) {

            info.append("Parent: ").append(parent.shortComponentName).append("\n");

        }

 

        final ProcessCpuTracker processCpuTracker = new ProcessCpuTracker(true);

 

        File tracesFile = dumpStackTraces(true, firstPids, processCpuTracker, lastPids,

                NATIVE_STACKS_OF_INTEREST);

 

        String cpuInfo = null;

        if (MONITOR_CPU_USAGE) {

            updateCpuStatsNow();

            synchronized (mProcessCpuThread) {

                cpuInfo = mProcessCpuTracker.printCurrentState(anrTime);

            }

            info.append(processCpuTracker.printCurrentLoad());

            info.append(cpuInfo);

        }

 

        info.append(processCpuTracker.printCurrentState(anrTime));

 

        Slog.e(TAG, info.toString());

        ......

    }

 

 

AMS在发生ANR时,会按以下格式将ANR信息记录到logcat日志中:

ActivityManager: ANR in [Process Name]([short component name])

ActivityManager: PID: [Application Pid]

ActivityManager: Reason: [Annotation]

ActivityManager: Parent: [short component name of parent]

[Process CPU state]

 

其中的Reason信息可以给出ANR产生原因的一些详细信息,

·        输入事件处理引起的ANR,提示信息格式为:“Inputdispatching timed out [anr reason]

·        Service处理引起的ANR,提示信息格式为:“Executingservice [Package name]/[Short class name]

·        Broadcast处理引起的ANR,提示信息格式为:“Broadcast of [Intent focused byBroadcast receiver]

其中的Process CPU state 信息格式如下:

ActivityManager: Load: [Load1] / [Load5] / [Load15]

ActivityManager: CPU usage from [上次采样与现在的时间差] ms to [当前采样与现在的时间差] ms ago/later:

ActivityManager: [总的CPU时间占用率]% [PID]/[Process Name]: [用户CPU时间占用率]% user + [系统CPU时间占用率]% kernel + [IO等待CPU时间占用率]% iowait + [硬中断CPU时间占用率]% irq + [软中断CPU时间占用率]% softirq / faults: [次要页错误/主要页错误] minor/major

ActivityManager: .....

ActivityManager: [CPU时间占用率合计统计]

 

其中Load1, Load5, Load15分别为CPU 1分钟平均任务负载数,5分钟平均任务负载数,15分钟平均任务负载数,平均任务负载数和CPU占用率并没有必然联系,可以作为参考信息。可以通过分析各进程的CPU时间占用率,来判断是否为某些进程长期占用CPU导致该进程无法获取到足够的CPU处理时间,而导致发生ANR。

这里需要重点关注的是Load1,各进程总的CPU时间占用率,用户CPU时间占用率,系统CPU时间占用率,以及iowait CPU时间占用率。

2.2       通过trace文件分析原因

ActivityManagerService的appNotResponding方法在写入logcat日志的同时,还会将ANR时的stack trace信息写入到trace文件,具体代码流程如下:

 

 final void appNotResponding(ProcessRecord app, ActivityRecord activity,

            ActivityRecord parent, boolean aboveSystem, final String annotation) {

        ......

          File tracesFile = dumpStackTraces(true, firstPids, processCpuTracker, lastPids,

                NATIVE_STACKS_OF_INTEREST);

          ......

}

public static File dumpStackTraces(boolean clearTraces, ArrayList firstPids,

            ProcessCpuTracker processCpuTracker, SparseArray lastPids, String[] nativeProcs) {

        //判断系统属性,确认trace文件路径

        String tracesPath = SystemProperties.get("dalvik.vm.stack-trace-file", null);

        if (tracesPath == null || tracesPath.length() == 0) {

            return null;

        }

        //若trace文件不存在则创建,并设置可读写属性

        File tracesFile = new File(tracesPath);

        try {

            File tracesDir = tracesFile.getParentFile();

            if (!tracesDir.exists()) {

                tracesFile.mkdirs();

                if (!SELinux.restorecon(tracesDir)) {

                    return null;

                }

            }

            FileUtils.setPermissions(tracesDir.getPath(), 0775, -1, -1);  // drwxrwxr-x

 

            if (clearTraces && tracesFile.exists()) tracesFile.delete();

            tracesFile.createNewFile();

            FileUtils.setPermissions(tracesFile.getPath(), 0666, -1, -1); // -rw-rw-rw-

        } catch (IOException e) {

            Slog.w(TAG, "Unable to prepare ANR traces file: " + tracesPath, e);

            return null;

        }

        //执行dump操作

        dumpStackTraces(tracesPath, firstPids, processCpuTracker, lastPids, nativeProcs);

        return tracesFile;

    }

 

    private static void dumpStackTraces(String tracesPath, ArrayList firstPids,

            ProcessCpuTracker processCpuTracker, SparseArray lastPids, String[] nativeProcs) {

        //设置file observer

        // Use a FileObserver to detect when traces finish writing.

        // The order of traces is considered important to maintain for legibility.

        FileObserver observer = new FileObserver(tracesPath, FileObserver.CLOSE_WRITE) {

            @Override

            public synchronized void onEvent(int event, String path) { notify(); }

        };

 

        try {

            observer.startWatching();

          //向firstPids进程集合发送关闭信号,虚拟机接收到SIGNAL_QUIT信号后会将进程中各个线程的函//数堆栈信息输出到traces.txt文件中

            // First collect all of the stacks of the most important pids.

            if (firstPids != null) {

                try {

                    int num = firstPids.size();

                    for (int i = 0; i < num; i++) {

                        synchronized (observer) {

                            Process.sendSignal(firstPids.get(i), Process.SIGNAL_QUIT);

                            observer.wait(200);  // Wait for write-close, give up after 200msec

                        }

                    }

                } catch (InterruptedException e) {

                    Log.wtf(TAG, e);

                }

            }

 

            // Next collect the stacks of the native pids

            if (nativeProcs != null) {

                int[] pids = Process.getPidsForCommands(nativeProcs);

                if (pids != null) {

                    for (int pid : pids) {

                        Debug.dumpNativeBacktraceToFile(pid, tracesPath);

                    }

                }

            }

            //进行CPU占用率采样,并将占用率排在前几位的进程的stack trace信息dump到文件

            // Lastly, measure CPU usage.

            if (processCpuTracker != null) {

                processCpuTracker.init();

                System.gc();

                processCpuTracker.update();

                try {

                    synchronized (processCpuTracker) {

                        processCpuTracker.wait(500); // measure over 1/2 second.

                    }

                } catch (InterruptedException e) {

                }

                processCpuTracker.update();

 

                // We'll take the stack crawls of just the top apps using CPU.

                final int N = processCpuTracker.countWorkingStats();

                int numProcs = 0;

                for (int i=0; i

                    ProcessCpuTracker.Stats stats = processCpuTracker.getWorkingStats(i);

                    if (lastPids.indexOfKey(stats.pid) >= 0) {

                        numProcs++;

                        try {

                            synchronized (observer) {

                                Process.sendSignal(stats.pid, Process.SIGNAL_QUIT);

                                observer.wait(200);  // Wait for write-close, give up after 200msec

                            }

                        } catch (InterruptedException e) {

                            Log.wtf(TAG, e);

                        }

 

                    }

                }

            }

        } finally {

            observer.stopWatching();

        }

    }

 

其中firstPids的Pid内容可以参考appNotResponding方法中的实现逻辑。

3       ANR解决方法

因为ANR主要是因为主线程由于耗时操作被阻塞而产生的,所以常见的解决方法是不在主线程做耗时操作,具体实现时需要注意以下几点:

·        主线程需要做耗时操作时,比如网络访问、数据库操作及位图变换等,必须启动一子线程处理,并利用handler来更新UI;

·        子线程尽量使用Android提供的API,比如HandlerThread,AsyncTask,AsyncQueryHandler等,这些API都提供了对于线程的系统级管理。如果应用直接使用Thread实现的话,则需要对这些子线程进行显式管理,比如线程池及线程周期的控制,以防止系统资源和内存泄漏;

·        Broadcast Receiver中如果有耗时操作,可以放到service中来处理;

·        在后台子线程处理耗时操作时,为了提高用户体验,可以在前台界面显示某些动画或者progress bar;

4      参考资料

http://rayleeya.iteye.com/blog/1955652

http://rayleeya.iteye.com/blog/1955657

http://rayleeya.iteye.com/blog/1956056

你可能感兴趣的:(ANR)