Android System (245): How the SystemServer Process Is Created

Android Process Series, Part 3: How the SystemServer Process Is Created

1. Content Preview

[Figure: SystemServer process startup (SystemServer进程的启动.png)]

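2. Overview

Two articles in this process series have already been published. This one (based on the Android O source) covers the first half of the topic, the creation flow of the SystemServer process. The second half will go through the startup stages after the SystemServer process has been created and the core services it runs.
Android Process Series, Part 1: Process Basics
Android Process Series, Part 2: How the Zygote Process Is Created

A brief recap of the key points of the previous article

  • The Zygote process is essentially a C/S architecture: Zygote acts as the server and handles process-creation requests that clients from all over the system send through a socket;
  • It summarized the socket communication framework: the init process adds the socket fd, Zygote gets that fd and creates a LocalServerSocket from it;
  • It summarized why Zygote serves as the parent of all application processes;
  • It summarized how Zygote preloads resources, and why Zygote cannot preload them on a child thread.

This article is about the creation of the SystemServer process. SystemServer is Zygote's eldest disciple: it is the first process that Zygote forks. Together, Zygote and SystemServer hold up the Java world, and the death of either one brings the Java world down. Most of the crash-and-reboot problems we see also happen inside the SystemServer process. SystemServer runs dozens of core services. To keep application processes from damaging the system, applications have no permission to access system resources directly and can only reach them through SystemServer acting as a proxy. All of this shows how important the SystemServer process is.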

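3. The SystemServer Creation Flow

[Figure: sequence diagram of SystemServer process creation (SystemServer进程的创建.png)]

3.1 The main method of ZygoteInit

The figure above is the sequence diagram of SystemServer creation. We again start from the main method of ZygoteInit and show the "template" code below once more.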

frameworks/base/core/java/com/android/internal/os/ZygoteInit.java
    public static void main(String argv[]) {
        // 1. Create the ZygoteServer
        ZygoteServer zygoteServer = new ZygoteServer();
        try {
            // 2. Create the server-side socket
            zygoteServer.registerServerSocket(socketName);
            // 3. Preload the resources and classes shared by all processes
            preload(bootTimingsTraceLog);
            if (startSystemServer) {
                // 4. Start the SystemServer process; this is the first division of the zygote
                startSystemServer(abiList, socketName, zygoteServer);
            }
            // 5. Enter an endless loop that listens for messages from clients
            zygoteServer.runSelectLoop(abiList);
            // 6. Close the Zygote server socket
            zygoteServer.closeServerSocket();
        } catch (Zygote.MethodAndArgsCaller caller) {
            // 7. Catch this exception and call the run method of MethodAndArgsCaller
            caller.run();
        } catch (Throwable ex) {
            Log.e(TAG, "System zygote died with exception", ex);
            zygoteServer.closeServerSocket();
            throw ex;
        }
    }

The main method of ZygoteInit has 7 key points. Points 1, 2 and 3 were covered in the previous article, so we start the analysis from point 4.
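Point 7 may look odd at first: an exception is used to hand control back to main. The following is a minimal plain-Java sketch of that idea, not the framework code. The class TrampolineSketch, its nested MethodAndArgsCaller stand-in, and fakeMain are all hypothetical names introduced for illustration.

```java
import java.lang.reflect.Method;

final class TrampolineSketch {
    /** Hypothetical stand-in for Zygote.MethodAndArgsCaller. */
    static final class MethodAndArgsCaller extends Exception implements Runnable {
        private final Method method;
        private final String[] args;

        MethodAndArgsCaller(Method method, String[] args) {
            this.method = method;
            this.args = args;
        }

        @Override
        public void run() {
            try {
                // The target is static, so the receiver is null.
                method.invoke(null, (Object) args);
            } catch (ReflectiveOperationException e) {
                throw new RuntimeException(e);
            }
        }
    }

    static final StringBuilder trace = new StringBuilder();

    /** Hypothetical entry point playing the role of SystemServer.main. */
    public static void fakeMain(String[] args) {
        trace.append("fakeMain:").append(String.join(",", args));
    }

    /** Nested setup code that finishes by throwing the caller object. */
    static void deepSetup() throws MethodAndArgsCaller {
        try {
            Method m = TrampolineSketch.class.getMethod("fakeMain", String[].class);
            throw new MethodAndArgsCaller(m, new String[] {"a", "b"});
        } catch (NoSuchMethodException e) {
            throw new RuntimeException(e);
        }
    }

    static String runOnce() {
        trace.setLength(0);
        try {
            deepSetup();
        } catch (MethodAndArgsCaller caller) {
            // The setup frames have been unwound; run() starts from this frame.
            caller.run();
        }
        return trace.toString();
    }

    public static void main(String[] argv) {
        System.out.println(runOnce());
    }
}
```

The effect is that the target method runs directly under the catching frame, with all the intermediate setup frames removed from the stack.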


    /**
     * Prepare the arguments and fork for the system server process.
     */
    private static boolean startSystemServer(String abiList, String socketName, ZygoteServer zygoteServer)
            throws Zygote.MethodAndArgsCaller, RuntimeException {
        .........
        /* Hardcoded command line to start the system server */
        String args[] = {
            "--setuid=1000",
            "--setgid=1000",
            "--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1021,1023,1032,3001,3002,3003,3006,3007,3009,3010",
            "--capabilities=" + capabilities + "," + capabilities,
            "--nice-name=system_server",
            "--runtime-args",
            "com.android.server.SystemServer",
        };
        ZygoteConnection.Arguments parsedArgs = null;

        int pid;

        try {
            parsedArgs = new ZygoteConnection.Arguments(args);
            ZygoteConnection.applyDebuggerSystemProperty(parsedArgs);
            ZygoteConnection.applyInvokeWithSystemProperty(parsedArgs);

            // Create the system server process; this ends up calling fork(), see section 3.2
            pid = Zygote.forkSystemServer(
                    parsedArgs.uid, parsedArgs.gid,
                    parsedArgs.gids,
                    parsedArgs.debugFlags,
                    null,
                    parsedArgs.permittedCapabilities,
                    parsedArgs.effectiveCapabilities);
        } catch (IllegalArgumentException ex) {
            throw new RuntimeException(ex);
        }

        // fork() returns twice; pid == 0 means we are now running in the child,
        // that is, in the newly created system_server process
        if (pid == 0) {
            // If the device has a secondary zygote (for example the 32-bit one),
            // wait until it can be connected to
            if (hasSecondZygote(abiList)) {
                waitForSecondaryZygote(socketName);
            }
            // Close the server socket inherited from the Zygote process
            zygoteServer.closeServerSocket();
            // Handle the remaining work of the SystemServer process, see section 3.4
            handleSystemServerProcess(parsedArgs);
        }

        return true;
    }
  • 1. The args array is converted into a ZygoteConnection.Arguments object, which simply means assigning values to the member variables of ZygoteConnection.Arguments. So what do these arguments mean?
        String args[] = {
            "--setuid=1000",
            "--setgid=1000",
            "--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1021,1023,1032,3001,3002,3003,3006,3007,3009,3010",
            "--capabilities=" + capabilities + "," + capabilities,
            "--nice-name=system_server",
            "--runtime-args",
            "com.android.server.SystemServer",
        };

The uid and gid of the SystemServer process are both set to 1000, setgroups lists the supplementary groups the process belongs to, capabilities sets the capabilities (privileges) of the process, nice-name is the process name, and the class to execute is com.android.server.SystemServer.
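To make the "assigning values to member variables" step concrete, here is a simplified sketch of such a parser. It is not the real ZygoteConnection.Arguments, which handles many more options; the class ForkArgsSketch and its fields are hypothetical, and options this sketch does not know about (such as --capabilities) are simply skipped.

```java
import java.util.Arrays;

final class ForkArgsSketch {
    int uid = -1;
    int gid = -1;
    int[] gids = new int[0];
    String niceName;
    String[] remainingArgs = new String[0];

    /** Returns the text after "key" if arg starts with it, otherwise null. */
    private static String valueOf(String arg, String key) {
        return arg.startsWith(key) ? arg.substring(key.length()) : null;
    }

    static ForkArgsSketch parse(String[] args) {
        ForkArgsSketch out = new ForkArgsSketch();
        int i = 0;
        for (; i < args.length; i++) {
            String arg = args[i];
            String v;
            if ((v = valueOf(arg, "--setuid=")) != null) {
                out.uid = Integer.parseInt(v);
            } else if ((v = valueOf(arg, "--setgid=")) != null) {
                out.gid = Integer.parseInt(v);
            } else if ((v = valueOf(arg, "--setgroups=")) != null) {
                out.gids = Arrays.stream(v.split(",")).mapToInt(Integer::parseInt).toArray();
            } else if ((v = valueOf(arg, "--nice-name=")) != null) {
                out.niceName = v;
            } else if (arg.startsWith("--")) {
                // Flag or option that this sketch does not model; skip it.
            } else {
                break; // First non-option argument: the class to run.
            }
        }
        out.remainingArgs = Arrays.copyOfRange(args, i, args.length);
        return out;
    }

    public static void main(String[] argv) {
        ForkArgsSketch parsed = parse(new String[] {
            "--setuid=1000", "--setgid=1000", "--setgroups=1001,1002,1003",
            "--nice-name=system_server", "--runtime-args",
            "com.android.server.SystemServer",
        });
        System.out.println(parsed.uid + " " + parsed.niceName + " " + parsed.remainingArgs[0]);
    }
}
```

Everything that follows the last option ends up in remainingArgs, which is how the class name com.android.server.SystemServer travels on to the later stages.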

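  • 2. forkSystemServer is called to fork the system process. Underneath it still calls the C-level fork function (which relies on copy-on-write). A return value of pid == 0 means the code is now running inside the newly forked system process.
  • 3. When Zygote forks a new process, the child starts with a copy-on-write copy of Zygote's address space and inherits its open file descriptors. The server socket created in the Zygote process is of no use to the new process, so the new process calls zygoteServer.closeServerSocket() to close its copy of that server socket.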

3.2 The forkSystemServer method of Zygote

/frameworks/base/core/java/com/android/internal/os/Zygote.java
    public static int forkSystemServer(int uid, int gid, int[] gids, int debugFlags,
            int[][] rlimits, long permittedCapabilities, long effectiveCapabilities) {
        VM_HOOKS.preFork();
        // Resets nice priority for zygote process.
        resetNicePriority();
        int pid = nativeForkSystemServer(
                uid, gid, gids, debugFlags, rlimits, permittedCapabilities, effectiveCapabilities);
        // Enable tracing as soon as we enter the system_server.
        if (pid == 0) {
            Trace.setTracingEnabled(true);
        }
        VM_HOOKS.postForkCommon();
        return pid;
    }

nativeForkSystemServer is a JNI method. It is registered from AndroidRuntime.cpp, which calls register_com_android_internal_os_Zygote() in com_android_internal_os_Zygote.cpp to map the Java native methods to their native implementations.

/frameworks/base/core/jni/com_android_internal_os_Zygote.cpp
static jint com_android_internal_os_Zygote_nativeForkSystemServer(
        JNIEnv* env, jclass, uid_t uid, gid_t gid, jintArray gids,
        jint debug_flags, jobjectArray rlimits, jlong permittedCapabilities,
        jlong effectiveCapabilities) {
  pid_t pid = ForkAndSpecializeCommon(env, uid, gid, gids,
                                      debug_flags, rlimits,
                                      permittedCapabilities, effectiveCapabilities,
                                      MOUNT_EXTERNAL_DEFAULT, NULL, NULL, true, NULL,
                                      NULL, NULL, NULL);
  if (pid > 0) {
      // The zygote process checks whether the child process has died or not.
      ALOGI("System server process %d has been created", pid);
      gSystemServerPid = pid;
      // There is a slight window that the system server process has crashed
      // but it went unnoticed because we haven't published its pid yet. So
      // we recheck here just to make sure that all is well.
      int status;
      if (waitpid(pid, &status, WNOHANG) == pid) {
          ALOGE("System server process %d has died. Restarting Zygote!", pid);
          RuntimeAbort(env, __LINE__, "System server process has died. Restarting Zygote!");
      }
  }
  return pid;
}

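The waitpid function needs some explanation here.

  • By default, if the child being waited for has already stopped or terminated when waitpid() is called, waitpid() returns immediately. If the child has not yet stopped or terminated, the parent that called waitpid() blocks and is suspended.

  • The status argument receives the state information of the child. With it the parent can find out why the child exited, whether it exited normally or because of an error. The state information is written only if status is not a null pointer.

  • The third argument of waitpid() takes option flags, two of which matter here. The first is WNOHANG: if the child specified by pid has not terminated, waitpid() returns 0 immediately instead of blocking; if it has terminated, waitpid() returns the pid of that child. The second is WUNTRACED: waitpid() also returns as soon as a child enters the stopped state.

So when waitpid(pid, &status, WNOHANG) == pid holds, the SystemServer process has already died and the Zygote process has to be restarted. Next, let's look at the ForkAndSpecializeCommon function.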


// Utility routine to fork zygote and specialize the child process.
static pid_t ForkAndSpecializeCommon(JNIEnv* env, uid_t uid, gid_t gid, jintArray javaGids,
                                     jint debug_flags, jobjectArray javaRlimits,
                                     jlong permittedCapabilities, jlong effectiveCapabilities,
                                     jint mount_external,
                                     jstring java_se_info, jstring java_se_name,
                                     bool is_system_server, jintArray fdsToClose,
                                     jintArray fdsToIgnore,
                                     jstring instructionSet, jstring dataDir) {
  // Install the SIGCHLD signal handler, see section 3.3
  SetSigChldHandler();
  ......
  // Fork the child process
  pid_t pid = fork();

  if (pid == 0) {
    // The child process.
    ......
    if (!is_system_server) {
        int rc = createProcessGroup(uid, getpid());
        if (rc != 0) {
            if (rc == -EROFS) {
                ALOGW("createProcessGroup failed, kernel missing CONFIG_CGROUP_CPUACCT?");
            } else {
                ALOGE("createProcessGroup(%d, %d) failed: %s", uid, pid, strerror(-rc));
            }
        }
    }

    SetGids(env, javaGids);  // set the supplementary groups

    SetRLimits(env, javaRlimits);  // set the resource limits

    int rc = setresgid(gid, gid, gid);
    if (rc == -1) {
      ALOGE("setresgid(%d) failed: %s", gid, strerror(errno));
      RuntimeAbort(env, __LINE__, "setresgid failed");
    }

    rc = setresuid(uid, uid, uid);  // set the uid
    .......

    SetCapabilities(env, permittedCapabilities, effectiveCapabilities, permittedCapabilities);

    SetSchedulerPolicy(env);  // set the scheduling policy

    .......
    // Set the SELinux context
    rc = selinux_android_setcontext(uid, is_system_server, se_info_c_str, se_name_c_str);
    .......
  } else if (pid > 0) {
    .......
    }
  }
  return pid;
}
}  // anonymous namespace

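Note that SetSigChldHandler is called before the fork. SetSigChldHandler installs the signal handler SigChldHandler, so when a SIGCHLD signal arrives, execution enters the signal handler shown in section 3.3.

3.3 SystemServer and Zygote live and die together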

// Configures the SIGCHLD handler for the zygote process. This is configured
// very late, because earlier in the runtime we may fork() and exec()
// other processes, and we want to waitpid() for those rather than
// have them be harvested immediately.
//
// This ends up being called repeatedly before each fork(), but there's
// no real harm in that.
static void SetSigChldHandler() {
  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = SigChldHandler;

  int err = sigaction(SIGCHLD, &sa, NULL);
  if (err < 0) {
    ALOGW("Error setting SIGCHLD handler: %s", strerror(errno));
  }
}

// This signal handler is for zygote mode, since the zygote must reap its children
static void SigChldHandler(int /*signal_number*/) {
  pid_t pid;
  int status;

  // It's necessary to save and restore the errno during this function.
  // Since errno is stored per thread, changing it here modifies the errno
  // on the thread on which this signal handler executes. If a signal occurs
  // between a call and an errno check, it's possible to get the errno set
  // here.
  // See b/23572286 for extra information.
  int saved_errno = errno;

  while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
     // Log process-death status that we care about.  In general it is
     // not safe to call LOG(...) from a signal handler because of
     // possible reentrancy.  However, we know a priori that the
     // current implementation of LOG() is safe to call from a SIGCHLD
     // handler in the zygote process.  If the LOG() implementation
     // changes its locking strategy or its use of syscalls within the
     // lazy-init critical section, its use here may become unsafe.
    if (WIFEXITED(status)) {
      if (WEXITSTATUS(status)) {
        ALOGI("Process %d exited cleanly (%d)", pid, WEXITSTATUS(status));
      }
    } else if (WIFSIGNALED(status)) {
      if (WTERMSIG(status) != SIGKILL) {
        ALOGI("Process %d exited due to signal (%d)", pid, WTERMSIG(status));
      }
      if (WCOREDUMP(status)) {
        ALOGI("Process %d dumped core.", pid);
      }
    }

    // If the just-crashed process is the system_server, bring down zygote
    // so that it is restarted by init and system server will be restarted
    // from there.
    if (pid == gSystemServerPid) {
      ALOGE("Exit zygote because system server (%d) has terminated", pid);
      kill(getpid(), SIGKILL);
    }
  }

  // Note that we shouldn't consider ECHILD an error because
  // the secondary zygote might have no children left to wait for.
  if (pid < 0 && errno != ECHILD) {
    ALOGW("Zygote SIGCHLD error in waitpid: %s", strerror(errno));
  }

  errno = saved_errno;
}

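As noted in the overview, system_server is the first process that zygote forks, and the death of either process brings the Java world down. The handler above is where this is enforced: if the child process SystemServer dies, Zygote kills itself with SIGKILL, and init then restarts Zygote, which in turn starts SystemServer again. In other words, Zygote and SystemServer live and die together.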
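The macros WIFEXITED, WEXITSTATUS, WIFSIGNALED, WTERMSIG and WCOREDUMP used in the handler only unpack bits of the status integer. The sketch below imitates them in Java, assuming the usual Linux encoding (low 7 bits hold the terminating signal, bit 7 the core-dump flag, bits 8 to 15 the exit code). The class WaitStatusSketch and its method names are hypothetical, and the encoding is an assumption about the platform rather than something the listing above states.

```java
final class WaitStatusSketch {
    /** WIFEXITED: no terminating signal recorded in the low 7 bits. */
    static boolean exitedNormally(int status) {
        return (status & 0x7f) == 0;
    }

    /** WEXITSTATUS: exit code stored in bits 8..15. */
    static int exitCode(int status) {
        return (status >> 8) & 0xff;
    }

    /** WIFSIGNALED: low 7 bits hold a signal number (0x7f means "stopped"). */
    static boolean killedBySignal(int status) {
        int low = status & 0x7f;
        return low != 0 && low != 0x7f;
    }

    /** WTERMSIG: the terminating signal number. */
    static int termSignal(int status) {
        return status & 0x7f;
    }

    /** WCOREDUMP: bit 7 is set when a core file was written. */
    static boolean dumpedCore(int status) {
        return (status & 0x80) != 0;
    }

    static String describe(int pid, int status) {
        if (exitedNormally(status)) {
            return "pid " + pid + " exited with code " + exitCode(status);
        }
        if (killedBySignal(status)) {
            String text = "pid " + pid + " killed by signal " + termSignal(status);
            return dumpedCore(status) ? text + " (core dumped)" : text;
        }
        return "pid " + pid + " changed state";
    }

    public static void main(String[] argv) {
        System.out.println(describe(42, 1 << 8));
        System.out.println(describe(42, 9));
    }
}
```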

3.4 handleSystemServerProcess: handling the newly forked process

/frameworks/base/core/java/com/android/internal/os/ZygoteInit.java
    /**
     * Finish remaining work for the newly forked system server process.
     */
    private static void handleSystemServerProcess(
            ZygoteConnection.Arguments parsedArgs)
            throws Zygote.MethodAndArgsCaller {

        // set umask to 0077 so new files and directories will default to owner-only permissions.
        Os.umask(S_IRWXG | S_IRWXO);
        // Set the name of the new process
        if (parsedArgs.niceName != null) {
            Process.setArgV0(parsedArgs.niceName);
        }
        // Read systemServerClasspath
        final String systemServerClasspath = Os.getenv("SYSTEMSERVERCLASSPATH");
        if (systemServerClasspath != null) {
            // Optimize the dex files on systemServerClasspath, see the further reading below
            performSystemServerDexOpt(systemServerClasspath);
            // Capturing profiles is only supported for debug or eng builds since selinux normally
            // prevents it.
            boolean profileSystemServer = SystemProperties.getBoolean(
                    "dalvik.vm.profilesystemserver", false);
            if (profileSystemServer && (Build.IS_USERDEBUG || Build.IS_ENG)) {
                try {
                    File profileDir = Environment.getDataProfilesDePackageDirectory(
                            Process.SYSTEM_UID, "system_server");
                    File profile = new File(profileDir, "primary.prof");
                    profile.getParentFile().mkdirs();
                    profile.createNewFile();
                    String[] codePaths = systemServerClasspath.split(":");
                    VMRuntime.registerAppInfo(profile.getPath(), codePaths);
                } catch (Exception e) {
                    Log.wtf(TAG, "Failed to set up system server profile", e);
                }
            }
        }
        // invokeWith is null here, so the else branch is taken
        if (parsedArgs.invokeWith != null) {
            String[] args = parsedArgs.remainingArgs;
            // If we have a non-null system server class path, we'll have to duplicate the
            // existing arguments and append the classpath to it. ART will handle the classpath
            // correctly when we exec a new process.
            if (systemServerClasspath != null) {
                String[] amendedArgs = new String[args.length + 2];
                amendedArgs[0] = "-cp";
                amendedArgs[1] = systemServerClasspath;
                System.arraycopy(args, 0, amendedArgs, 2, args.length);
                args = amendedArgs;
            }

            WrapperInit.execApplication(parsedArgs.invokeWith,
                    parsedArgs.niceName, parsedArgs.targetSdkVersion,
                    VMRuntime.getCurrentInstructionSet(), null, args);
        } else {
            ClassLoader cl = null;
            if (systemServerClasspath != null) {
                cl = createPathClassLoader(systemServerClasspath, parsedArgs.targetSdkVersion);

                Thread.currentThread().setContextClassLoader(cl);
            }

            /*
             * Pass the remaining arguments to SystemServer. See section 3.5
             */
            ZygoteInit.zygoteInit(parsedArgs.targetSdkVersion, parsedArgs.remainingArgs, cl);
        }

        /* should never reach here */
    }
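The first statement of handleSystemServerProcess, Os.umask(S_IRWXG | S_IRWXO), is easy to skim past. The small sketch below works through the arithmetic: a umask removes the bits it contains from the mode requested at file creation. The class UmaskSketch is hypothetical; the two constants use the standard POSIX octal values.

```java
final class UmaskSketch {
    static final int S_IRWXG = 0070; // read, write, execute for the group
    static final int S_IRWXO = 0007; // read, write, execute for others

    /** Mode a new file actually gets: requested bits minus the umask bits. */
    static int effectiveMode(int requestedMode, int umask) {
        return requestedMode & ~umask;
    }

    public static void main(String[] argv) {
        int umask = S_IRWXG | S_IRWXO;
        System.out.println("umask     = 0" + Integer.toOctalString(umask));
        System.out.println("file 0666 -> 0" + Integer.toOctalString(effectiveMode(0666, umask)));
        System.out.println("dir  0777 -> 0" + Integer.toOctalString(effectiveMode(0777, umask)));
    }
}
```

With this umask a file requested as 0666 ends up as 0600 and a directory requested as 0777 ends up as 0700, which is the "owner-only" default the source comment mentions.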

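Further reading:
On Android, by default all the code of an app lives in a single Dex file. Dex is an archive format, similar to a Jar, that stores all of the compiled code. Because Android uses the Dalvik virtual machine, the class files produced by the Java compiler have to be converted into a form that Dalvik can execute, so a Dex file holds the Dalvik bytecode that corresponds to the Java code. When the system starts an app, one of the steps is to optimize the Dex. A dedicated tool called DexOpt does this the first time the Dex file is loaded, and the result is an ODEX file (Optimized Dex). Executing the ODEX is considerably more efficient than executing the Dex directly. Early Android versions had a problem here, though: method ids are indexed with a 16-bit (short) value, so the number of method ids cannot exceed 65,536. For a sufficiently large project that ceiling is clearly not enough. Newer Android versions have addressed this, but older systems still need compatibility handling.

Android provides dexopt, a tool dedicated to verifying and optimizing dex files; its source is located in the dalvik/dexopt directory of the Android source tree. The content of the classpath is as follows: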

systemServerClasspath = /system/framework/services.jar:/system/framework/ethernet-service.jar:/system/framework/wifi-service.jar

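The three jars are then extracted from this path, and each is checked to see whether it needs dexopt. If it does, installer is called to perform the optimization.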
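Extracting the jars from the path is a plain split on ':'. A minimal sketch, using the example value shown above; the class ClasspathSketch and the method entries are hypothetical names, and the dexopt check itself is not modeled.

```java
import java.util.ArrayList;
import java.util.List;

final class ClasspathSketch {
    /** Splits a ':'-separated classpath into its non-empty entries. */
    static List<String> entries(String classpath) {
        List<String> out = new ArrayList<>();
        if (classpath == null) {
            return out;
        }
        for (String entry : classpath.split(":")) {
            if (!entry.isEmpty()) {
                out.add(entry);
            }
        }
        return out;
    }

    public static void main(String[] argv) {
        String cp = "/system/framework/services.jar:"
                + "/system/framework/ethernet-service.jar:"
                + "/system/framework/wifi-service.jar";
        for (String jar : entries(cp)) {
            // A real implementation would decide here whether the jar needs dexopt.
            System.out.println(jar);
        }
    }
}
```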

3.5 The zygoteInit method

/frameworks/base/core/java/com/android/internal/os/ZygoteInit.java
    /**
     * The main function called when started through the zygote process. This
     * could be unified with main(), if the native code in nativeFinishInit()
     * were rationalized with Zygote startup.
     *
     * Current recognized args:
     *
     *   • [--] <start class name> <args>
     *
     * @param targetSdkVersion target SDK version
     * @param argv arg strings
     */
    public static final void zygoteInit(int targetSdkVersion, String[] argv,
            ClassLoader classLoader) throws Zygote.MethodAndArgsCaller {
        if (RuntimeInit.DEBUG) {
            Slog.d(RuntimeInit.TAG, "RuntimeInit: Starting application from zygote");
        }

        Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "ZygoteInit");
        // See 3.5.1
        RuntimeInit.redirectLogStreams();
        // See 3.5.2
        RuntimeInit.commonInit();
        // See 3.5.3
        ZygoteInit.nativeZygoteInit();
        // See 3.5.4
        RuntimeInit.applicationInit(targetSdkVersion, argv, classLoader);
    }

3.5.1 The redirectLogStreams method of RuntimeInit

/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java
    /**
     * Redirect System.out and System.err to the Android log.
     */
    public static void redirectLogStreams() {
        System.out.close();
        System.setOut(new AndroidPrintStream(Log.INFO, "System.out"));
        System.err.close();
        System.setErr(new AndroidPrintStream(Log.WARN, "System.err"));
    }

This initializes the Android log output streams: System.out and System.err are closed and then both are redirected to the Android log.
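The same redirection idea can be tried in plain Java. The sketch below is only an approximation of what AndroidPrintStream does: it forwards each completed line to a sink instead of the Android log, and unlike the method above it restores the original stream rather than closing it. LogRedirectSketch, LineSinkStream and the "I/System.out: " prefix are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class LogRedirectSketch {
    /** Buffers bytes and hands every completed line to the sink. */
    static final class LineSinkStream extends OutputStream {
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        private final Consumer<String> sink;

        LineSinkStream(Consumer<String> sink) {
            this.sink = sink;
        }

        @Override
        public synchronized void write(int b) {
            if (b == '\n') {
                sink.accept(buffer.toString());
                buffer.reset();
            } else if (b != '\r') {
                buffer.write(b);
            }
        }
    }

    /** Runs body with System.out redirected and returns the captured log lines. */
    static List<String> capture(Runnable body) {
        List<String> lines = new ArrayList<>();
        PrintStream original = System.out;
        PrintStream redirected = new PrintStream(
                new LineSinkStream(line -> lines.add("I/System.out: " + line)), true);
        System.setOut(redirected);
        try {
            body.run();
        } finally {
            redirected.flush();
            System.setOut(original); // restore so later output is visible again
        }
        return lines;
    }

    public static void main(String[] argv) {
        List<String> lines = capture(() -> System.out.println("hello from system_server"));
        lines.forEach(System.out::println);
    }
}
```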

3.5.2 The commonInit method of RuntimeInit

/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java

    protected static final void commonInit() {
        if (DEBUG) Slog.d(TAG, "Entered RuntimeInit!");

        /*
         * set handlers; these apply to all threads in the VM. Apps can replace
         * the default handler, but not the pre handler.
         */
        // Install the pre-handler for uncaught exceptions. LoggingHandler logs the
        // stack trace of the failure. See 3.5.2.1
        Thread.setUncaughtExceptionPreHandler(new LoggingHandler());
        // Install the default handler, which enters the crash handling flow and
        // asks AMS to show the crash dialog. See 3.5.2.2
        Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler());

        /*
         * Install a TimezoneGetter subclass for ZoneInfo.db (sets the time zone)
         */
        TimezoneGetter.setInstance(new TimezoneGetter() {
            @Override
            public String getId() {
                return SystemProperties.get("persist.sys.timezone");
            }
        });
        TimeZone.setDefault(null);

        /*
         * Sets handler for java.util.logging to use Android log facilities.
         * The odd "new instance-and-then-throw-away" is a mirror of how
         * the "java.util.logging.config.class" system property works. We
         * can't use the system property here since the logger has almost
         * certainly already been initialized.
         */
        LogManager.getLogManager().reset();
        new AndroidConfig();

        /*
         * Sets the default HTTP User-Agent used by HttpURLConnection.
         */
        String userAgent = getDefaultUserAgent();
        System.setProperty("http.agent", userAgent);

        /*
         * Wire socket tagging to traffic stats.
         */
        NetworkManagementSocketTagger.install();

        /*
         * If we're running in an emulator launched with "-trace", put the
         * VM into emulator trace profiling mode so that the user can hit
         * F9/F10 at any time to capture traces.  This has performance
         * consequences, so it's not something you want to do always.
         */
        String trace = SystemProperties.get("ro.kernel.android.tracing");
        if (trace.equals("1")) {
            Slog.i(TAG, "NOTE: emulator trace profiling enabled");
            Debug.enableEmulatorTraceOutput();
        }

        initialized = true;
    }

3.5.2.1 How the crash stack trace of a process is captured

/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java
    /**
     * Logs a message when a thread encounters an uncaught exception. By
     * default, {@link KillApplicationHandler} will terminate this process later,
     * but apps can override that behavior.
     */
    private static class LoggingHandler implements Thread.UncaughtExceptionHandler {
        @Override
        public void uncaughtException(Thread t, Throwable e) {
            // Don't re-enter if KillApplicationHandler has already run
            if (mCrashing) return;
            if (mApplicationObject == null) {
                // The "FATAL EXCEPTION" string is still used on Android even though
                // apps can set a custom UncaughtExceptionHandler that renders uncaught
                // exceptions non-fatal.
                Clog_e(TAG, "*** FATAL EXCEPTION IN SYSTEM PROCESS: " + t.getName(), e);
            } else {
                StringBuilder message = new StringBuilder();
                // The "FATAL EXCEPTION" string is still used on Android even though
                // apps can set a custom UncaughtExceptionHandler that renders uncaught
                // exceptions non-fatal.
                message.append("FATAL EXCEPTION: ").append(t.getName()).append("\n");
                final String processName = ActivityThread.currentProcessName();
                if (processName != null) {
                    message.append("Process: ").append(processName).append(", ");
                }
                message.append("PID: ").append(Process.myPid());
                Clog_e(TAG, message.toString(), e);
            }
        }
    }

A Java crash in an application is logged with a line that starts with FATAL EXCEPTION, for example:

01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: FATAL EXCEPTION: main
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: Process: com.xiaomi.scanner, PID: 17635
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: java.lang.IllegalArgumentException: View=DecorView@77ff3a0[] not attached to window manager
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at android.view.WindowManagerGlobal.findViewLocked(WindowManagerGlobal.java:491)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at android.view.WindowManagerGlobal.removeView(WindowManagerGlobal.java:400)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at android.view.WindowManagerImpl.removeViewImmediate(WindowManagerImpl.java:125)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at android.app.Dialog.dismissDialog(Dialog.java:374)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at android.app.Dialog.dismiss(Dialog.java:357)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at com.alibaba.imagesearch.ui.SearchResultActivity.b(Unknown Source:14)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at com.alibaba.imagesearch.a.d.c(Unknown Source:39)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at com.alibaba.imagesearch.a.d.a(Unknown Source:53)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at com.alibaba.imagesearch.ui.SearchResultActivity.b(Unknown Source:30)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at com.alibaba.imagesearch.ui.SearchResultActivity.b(Unknown Source:0)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at com.alibaba.imagesearch.ui.SearchResultActivity$6.onJsPrompt(Unknown 

A Java crash in the system process is logged with a line that starts with FATAL EXCEPTION IN SYSTEM PROCESS, for example:

logcat.log.01:2211: 08-27 16:41:16.664  2999  3026 E AndroidRuntime: *** FATAL EXCEPTION IN SYSTEM PROCESS: android.bg
logcat.log.01:2212: 08-27 16:41:16.664  2999  3026 E AndroidRuntime: java.lang.NullPointerException: Attempt to get length of null array
logcat.log.01:2213: 08-27 16:41:16.664  2999  3026 E AndroidRuntime:     at com.android.server.net.NetworkPolicyManagerService.isUidIdle(NetworkPolicyManagerService.java:2318)
logcat.log.01:2214: 08-27 16:41:16.664  2999  3026 E AndroidRuntime:     at com.android.server.net.NetworkPolicyManagerService.updateRuleForAppIdleLocked(NetworkPolicyManagerService.java:2244)
logcat.log.01:2215: 08-27 16:41:16.664  2999  3026 E AndroidRuntime:     at com.android.server.net.NetworkPolicyManagerService.updateRulesForTempWhitelistChangeLocked(NetworkPolicyManagerService.java:2298)
logcat.log.01:2216: 08-27 16:41:16.664  2999  3026 E AndroidRuntime:     at com.android.server.net.NetworkPolicyManagerService$3.run(NetworkPolicyManagerService.java:572)
logcat.log.01:2217: 08-27 16:41:16.664  2999  3026 E AndroidRuntime:     at android.os.Handler.handleCallback(Handler.java:739)
logcat.log.01:2218: 08-27 16:41:16.664  2999  3026 E AndroidRuntime:     at android.os.Handler.dispatchMessage(Handler.java:95)
logcat.log.01:2219: 08-27 16:41:16.664  2999  3026 E AndroidRuntime:     at android.os.Looper.loop(Looper.java:148)
logcat.log.01:2220: 08-27 16:41:16.664  2999  3026 E AndroidRuntime:     at android.os.HandlerThread.run(HandlerThread.java:61)
logcat.log.01:2221: 08-27 16:41:16.665  2999  3026 I am_crash: [2999,0,system_server,-1,java.lang.NullPointerException,Attempt to get length of null array,NetworkPolicyManagerService.java,2318]
logcat.log.01:2224: 08-27 16:41:16.696  2999  3026 I MitvActivityManagerService: handleApplicationCrash, processName: system_server
logcat.log.01:2225: 08-27 16:41:16.696  2999  3026 I Process : Sending signal. PID: 2999 SIG: 9

3.5.2.2 When a Java exception (JE) occurs, show a dialog to notify the user

    private static class KillApplicationHandler implements Thread.UncaughtExceptionHandler {
        public void uncaughtException(Thread t, Throwable e) {
            try {
                // Don't re-enter -- avoid infinite loops if crash-reporting crashes.
                if (mCrashing) return;
                mCrashing = true;

                // Try to end profiling. If a profiler is running at this point, and we kill the
                // process (below), the in-memory buffer will be lost. So try to stop, which will
                // flush the buffer. (This makes method trace profiling useful to debug crashes.)
                if (ActivityThread.currentActivityThread() != null) {
                    ActivityThread.currentActivityThread().stopProfiling();
                }

                // Bring up crash dialog, wait for it to be dismissed (asks AMS to show the dialog)
                ActivityManager.getService().handleApplicationCrash(
                        mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
            } catch (Throwable t2) {
                if (t2 instanceof DeadObjectException) {
                    // System process is dead; ignore
                } else {
                    try {
                        Clog_e(TAG, "Error reporting crash", t2);
                    } catch (Throwable t3) {
                        // Even Clog_e() fails!  Oh well.
                    }
                }
            } finally {
                // Try everything to make sure this process goes away.
                Process.killProcess(Process.myPid());
                System.exit(10);
            }
        }
    }
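The log-first, then-kill arrangement of section 3.5.2 can be imitated in plain Java. Standard Java has no pre-handler hook, so this sketch simply chains two stages inside one per-thread handler, and it records events instead of killing the process so that it can be run safely. CrashHandlerSketch and all of its members are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class CrashHandlerSketch {
    static final List<String> events = Collections.synchronizedList(new ArrayList<>());
    private static volatile boolean crashing = false;

    /** Stage 1: only logs, in the spirit of LoggingHandler. */
    static void logStage(Thread t, Throwable e) {
        if (crashing) return; // do not log again once the kill stage has started
        events.add("FATAL EXCEPTION: " + t.getName() + ": " + e.getMessage());
    }

    /** Stage 2: reports the crash, in the spirit of KillApplicationHandler. */
    static void killStage(Thread t, Throwable e) {
        if (crashing) return; // guard against re-entry if reporting itself fails
        crashing = true;
        events.add("report crash, then terminate");
        // The real handler ends with Process.killProcess(...) and System.exit(10);
        // both are left out so the sketch can run inside a test.
    }

    /** Starts a worker that throws, waits for it, and returns the recorded events. */
    static List<String> crashOnWorker() {
        events.clear();
        crashing = false;
        Thread worker = new Thread(() -> {
            throw new IllegalStateException("boom");
        }, "worker-1");
        worker.setUncaughtExceptionHandler((t, e) -> {
            logStage(t, e);
            killStage(t, e);
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(ie);
        }
        return new ArrayList<>(events);
    }

    public static void main(String[] argv) {
        crashOnWorker().forEach(System.out::println);
    }
}
```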

3.5.3 The nativeZygoteInit method of ZygoteInit

nativeZygoteInit is a JNI method that is registered in AndroidRuntime.cpp.

/frameworks/base/core/jni/AndroidRuntime.cpp

static const RegJNIRec gRegJNI[] = {
    REG_JNI(register_com_android_internal_os_RuntimeInit),
    REG_JNI(register_com_android_internal_os_ZygoteInit),
  .....
/frameworks/base/core/jni/AndroidRuntime.cpp
int register_com_android_internal_os_ZygoteInit(JNIEnv* env)
{
    const JNINativeMethod methods[] = {
        { "nativeZygoteInit", "()V",
            (void*) com_android_internal_os_ZygoteInit_nativeZygoteInit },
    };
    return jniRegisterNativeMethods(env, "com/android/internal/os/ZygoteInit",
        methods, NELEM(methods));
}

So the function that actually gets called is com_android_internal_os_ZygoteInit_nativeZygoteInit.

/frameworks/base/core/jni/AndroidRuntime.cpp
static void com_android_internal_os_ZygoteInit_nativeZygoteInit(JNIEnv* env, jobject clazz)
{
    gCurRuntime->onZygoteInit();
}

com_android_internal_os_ZygoteInit_nativeZygoteInit calls the onZygoteInit function of AndroidRuntime. onZygoteInit is a virtual function, however, and its implementation is in app_main.cpp.

/frameworks/base/cmds/app_process/app_main.cpp
    virtual void onZygoteInit()
    {
        sp<ProcessState> proc = ProcessState::self();
        ALOGV("App process: starting thread pool.\n");
        // Start the Binder thread pool
        proc->startThreadPool();
    }
/frameworks/native/libs/binder/ProcessState.cpp
void ProcessState::startThreadPool()
{
    AutoMutex _l(mLock);
    if (!mThreadPoolStarted) {
        mThreadPoolStarted = true;
        spawnPooledThread(true);
    }
}

/frameworks/native/libs/binder/ProcessState.cpp
void ProcessState::spawnPooledThread(bool isMain)
{
    if (mThreadPoolStarted) {
        String8 name = makeBinderThreadName();
        ALOGV("Spawning new pooled thread, name=%s\n", name.string());
        sp<Thread> t = new PoolThread(isMain);
        t->run(name.string());
    }
}
/frameworks/native/libs/binder/ProcessState.cpp
String8 ProcessState::makeBinderThreadName() {
    int32_t s = android_atomic_add(1, &mThreadPoolSeq);
    pid_t pid = getpid();
    String8 name;
    name.appendFormat("Binder:%d_%X", pid, s);
    return name;
}
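The naming scheme in makeBinderThreadName explains the thread names seen in traces, such as Binder:2999_1. Here is a small Java sketch of the same formatting. It assumes the sequence counter starts at 1, which the listing above does not show, and BinderThreadNameSketch is a hypothetical class.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class BinderThreadNameSketch {
    // Assumption: the counter starts at 1; the listing does not show its initial value.
    private final AtomicInteger seq = new AtomicInteger(1);
    private final long pid;

    BinderThreadNameSketch(long pid) {
        this.pid = pid;
    }

    /** Returns the next name; the sequence number is printed in upper-case hex. */
    String next() {
        int s = seq.getAndIncrement(); // returns the old value, then adds 1
        return String.format("Binder:%d_%X", pid, s);
    }

    public static void main(String[] argv) {
        BinderThreadNameSketch names = new BinderThreadNameSketch(2999);
        for (int i = 0; i < 12; i++) {
            System.out.println(names.next());
        }
    }
}
```

Because of the %X conversion, the tenth thread is named with the suffix _A rather than _10.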

3.5.4、RuntimeInit的applicationInit方法

/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java
    protected static void applicationInit(int targetSdkVersion, String[] argv, ClassLoader classLoader)
            throws Zygote.MethodAndArgsCaller {
        // If the application calls System.exit(), terminate the process
        // immediately without running any shutdown hooks.  It is not possible to
        // shutdown an Android application gracefully.  Among other things, the
        // Android runtime shutdown hooks close the Binder driver, which can cause
        // leftover running threads to crash before the process actually exits.
        nativeSetExitWithoutCleanup(true);

        // We want to be fairly aggressive about heap utilization, to avoid
        // holding on to a lot of memory that isn't needed.
        VMRuntime.getRuntime().setTargetHeapUtilization(0.75f);
        VMRuntime.getRuntime().setTargetSdkVersion(targetSdkVersion);

        final Arguments args;
        try {
            // Parsing argv sets startClass to com.android.server.SystemServer
            args = new Arguments(argv);
        } catch (IllegalArgumentException ex) {
            Slog.e(TAG, ex.getMessage());
            // let the process exit
            return;
        }

        // The end of the RuntimeInit event (see #zygoteInit).
        Trace.traceEnd(Trace.TRACE_TAG_ACTIVITY_MANAGER);

        // Remaining arguments are passed to the start class's static main
        invokeStaticMain(args.startClass, args.startArgs, classLoader);
    }

After the Arguments constructor in applicationInit has run, args.startClass holds com.android.server.SystemServer.
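
To make the argument split concrete, here is a simplified Java sketch of how a class like Arguments might separate the start class from the remaining arguments. It is an assumption-laden stand-in, not the AOSP source: the class name StartArguments and the flag "--example-flag" are hypothetical, and the rule "skip leading -- options, then take the first plain argument as the class name" is my reading of the behavior described above.

```java
import java.util.Arrays;

// Hypothetical stand-in for RuntimeInit.Arguments; a simplified sketch, not the AOSP source.
final class StartArguments {
    final String startClass;
    final String[] startArgs;

    StartArguments(String[] argv) {
        int cur = 0;
        // Skip leading "--option" flags; a bare "--" ends the option list.
        for (; cur < argv.length; cur++) {
            String arg = argv[cur];
            if (arg.equals("--")) {
                cur++;
                break;
            } else if (!arg.startsWith("--")) {
                break;
            }
        }
        if (cur == argv.length) {
            throw new IllegalArgumentException("Missing classname argument");
        }
        // The first non-option argument is the class whose main() will be run.
        startClass = argv[cur++];
        // Everything after it is passed through to that main().
        startArgs = Arrays.copyOfRange(argv, cur, argv.length);
    }

    public static void main(String[] args) {
        StartArguments parsed = new StartArguments(
                new String[] {"--example-flag", "com.android.server.SystemServer", "arg0"});
        System.out.println(parsed.startClass + " " + Arrays.toString(parsed.startArgs));
    }
}
```

In the sketch, a missing class name raises IllegalArgumentException, which lines up with the catch block in applicationInit that logs the message and lets the process exit.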

 /frameworks/base/core/java/com/android/internal/os/RuntimeInit.java
    private static void invokeStaticMain(String className, String[] argv, ClassLoader classLoader)
            throws Zygote.MethodAndArgsCaller {
        Class<?> cl;

        try {
            cl = Class.forName(className, true, classLoader);
        } catch (ClassNotFoundException ex) {
            throw new RuntimeException(
                    "Missing class when invoking static main " + className,
                    ex);
        }

        Method m;
        try {
            m = cl.getMethod("main", new Class[] { String[].class });
        } catch (NoSuchMethodException ex) {
            throw new RuntimeException(
                    "Missing static main on " + className, ex);
        } catch (SecurityException ex) {
            throw new RuntimeException(
                    "Problem getting static main on " + className, ex);
        }

        int modifiers = m.getModifiers();
        if (! (Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
            throw new RuntimeException(
                    "Main method is not public and static on " + className);
        }

        /*
         * This throw gets caught in ZygoteInit.main(), which responds
         * by invoking the exception's run() method. This arrangement
         * clears up all the stack frames that were required in setting
         * up the process.
         */
        throw new Zygote.MethodAndArgsCaller(m, argv);
    }

invokeStaticMain loads the bytecode of com.android.server.SystemServer, looks up the class's main method by reflection to obtain a Method object, and then throws a Zygote.MethodAndArgsCaller exception. That brings us back to the ZygoteInit main method where we started. The full call chain is ZygoteInit.main-->ZygoteInit.startSystemServer-->Zygote.forkSystemServer-->com_android_internal_os_Zygote_nativeForkSystemServer-->ForkAndSpecializeCommon-->fork-->ZygoteInit.handleSystemServerProcess-->ZygoteInit.zygoteInit-->RuntimeInit.applicationInit-->RuntimeInit.invokeStaticMain. At the end of the chain, invokeStaticMain throws Zygote.MethodAndArgsCaller, and ZygoteInit.main catches it.

  frameworks/base/core/java/com/android/internal/os/ZygoteInit.java
 public static void main(String argv[]) {
        // 1. Create the ZygoteServer
        ZygoteServer zygoteServer = new ZygoteServer();
        try {
            // 2. Create the server-side socket
            zygoteServer.registerServerSocket(socketName);
            // 3. Preload the resources and classes the process needs
            preload(bootTimingsTraceLog);
            if (startSystemServer) {
                // 4. Start the SystemServer process; this is the zygote's first split
                startSystemServer(abiList, socketName, zygoteServer);
            }
            // 5. Enter an endless loop that listens for messages from clients
            zygoteServer.runSelectLoop(abiList);
            // 6. Close the Zygote server socket
            zygoteServer.closeServerSocket();
        } catch (Zygote.MethodAndArgsCaller caller) {
            // 7. Catch the exception here and call MethodAndArgsCaller's run method
            caller.run();
        } catch (Throwable ex) {
            Log.e(TAG, "System zygote died with exception", ex);
            zygoteServer.closeServerSocket();
            throw ex;
        }
    }
 /frameworks/base/core/java/com/android/internal/os/Zygote.java
    public static class MethodAndArgsCaller extends Exception
            implements Runnable {
        /** method to call */
        private final Method mMethod;

        /** argument array */
        private final String[] mArgs;

        public MethodAndArgsCaller(Method method, String[] args) {
            mMethod = method; // For SystemServer, this stores its main method in mMethod
            mArgs = args;
        }

        public void run() {
            try {
                // Invoke the stored method, which enters SystemServer's main method
                mMethod.invoke(null, new Object[] { mArgs });
            } catch (IllegalAccessException ex) {
                throw new RuntimeException(ex);
            } catch (InvocationTargetException ex) {
                Throwable cause = ex.getCause();
                if (cause instanceof RuntimeException) {
                    throw (RuntimeException) cause;
                } else if (cause instanceof Error) {
                    throw (Error) cause;
                }
                throw new RuntimeException(ex);
            }
        }
    }
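  • Think about it: why is SystemServer's main method invoked by throwing an exception?
    By the time execution reaches invokeStaticMain, the chain of calls that started in ZygoteInit.main and went through the fork has piled up many stack frames on the thread. Throwing the exception unwinds the stack back to ZygoteInit.main and discards all of those setup frames, so SystemServer's main starts on a nearly clean stack.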
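
The effect of this trick can be observed in a small, self-contained Java program. This is only a sketch of the pattern under my own assumptions: TrampolineDemo, MethodCaller, target, setup and launch are all hypothetical names, and the 20 nested calls merely stand in for the real setup chain. It records the stack depth at the point of the throw and again inside the method that is finally invoked.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical demo of the "throw to unwind, then call" pattern; all names are made up.
final class TrampolineDemo {
    static int depthAtThrow;  // stack depth where the exception is thrown
    static int depthInTarget; // stack depth seen by the target method

    // Plays the role of Zygote.MethodAndArgsCaller.
    static final class MethodCaller extends Exception implements Runnable {
        private final Method method;
        private final String[] args;

        MethodCaller(Method method, String[] args) {
            this.method = method;
            this.args = args;
        }

        @Override
        public void run() {
            try {
                method.invoke(null, new Object[] {args});
            } catch (IllegalAccessException | InvocationTargetException ex) {
                throw new RuntimeException(ex);
            }
        }
    }

    // Plays the role of SystemServer.main.
    public static void target(String[] args) {
        depthInTarget = Thread.currentThread().getStackTrace().length;
    }

    // Simulates the nested setup calls made before invokeStaticMain.
    static void setup(int remaining) throws MethodCaller {
        if (remaining > 0) {
            setup(remaining - 1);
            return;
        }
        depthAtThrow = Thread.currentThread().getStackTrace().length;
        Method m;
        try {
            m = TrampolineDemo.class.getMethod("target", String[].class);
        } catch (NoSuchMethodException ex) {
            throw new RuntimeException(ex);
        }
        throw new MethodCaller(m, new String[0]);
    }

    // Plays the role of ZygoteInit.main: catch the exception, then run it.
    static void launch() {
        try {
            setup(20);
        } catch (MethodCaller caller) {
            caller.run(); // the setup frames have already been unwound here
        }
    }

    public static void main(String[] args) {
        launch();
        System.out.println("depth at throw: " + depthAtThrow
                + ", depth in target: " + depthInTarget);
    }
}
```

The exact numbers depend on the JVM, but the depth seen in target should be clearly smaller than the depth at the throw, because only launch and the reflection frames sit below it once the exception has been caught.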

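4. Summary

This article walked through how the SystemServer process is started, which is the first split of the zygote process. A few key points are worth remembering.

  • 1. The special use of the waitpid method
  • 2. SystemServer and Zygote live and die together
  • 3. How the stack trace of a crashing process is output, and how the error dialog is shown
  • 4. Why SystemServer's main method is invoked by throwing an exception

The next article will go through what SystemServer's main method actually does.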
