Android 5 Zygote and SystemServer Startup Flow Analysis

  • Android 5 Zygote and SystemServer Startup Flow Analysis
    • Preface
    • The zygote process
      • Parsing zygote.rc
      • Starting SystemServer
      • Running ZygoteInit.runSelectLoop
    • The SystemServer startup process
    • Analysis of Zygote's native fork methods
      • forkSystemServer
        • ZygoteHooks.preFork
        • Creating the system_server process
        • ZygoteHooks.postForkCommon
      • forkAndSpecialize

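Preface

The startup flow of Android 5.0.1 differs little from earlier versions. There are some changes, for example the init1() and init2() steps of SystemServer startup are gone, but the main flow is the same. Once the Linux kernel has loaded, it starts the init process. init parses init.rc and, following its contents, mounts the Android file systems, creates system directories, initializes the property system, and starts a number of daemons, the most important of which is the Zygote process. When Zygote initializes, it creates the virtual machine (ART by default in Android 5.0, where it replaces Dalvik) and preloads system resources and Java classes. Every user process forked from Zygote inherits and shares these preloaded resources. init is the first user-space process in Android, while Zygote is the root of all application processes. SystemServer is the first process Zygote forks and is the core process of the whole Android system.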

The zygote process

Parsing zygote.rc

The file /system/core/rootdir/init.rc imports the zygote rc file:

import /init.${ro.zygote}.rc

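${ro.zygote} is a platform-dependent property, so the import resolves to one of init.zygote32.rc, init.zygote64.rc, init.zygote64_32.rc, or init.zygote32_64.rc. The first two start a single app_process (or app_process64) process, while the last two start two app_process processes. The second of these is called the secondary zygote, and the code later on shows where the corresponding secondary socket comes into play. To keep things simple, the two-process case is not considered here.

Take /system/core/rootdir/init.zygote32.rc as an example: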
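As a rough illustration of how that import line resolves, the sketch below expands ${...} references against a property table. It is written in Java to match the other examples in this article. init itself does this in C, and the names ZygoteRcSketch and resolveImport are hypothetical, not taken from the Android sources.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ZygoteRcSketch {
    // Matches ${name} and captures the property name.
    private static final Pattern PROPERTY_REF = Pattern.compile("\\$\\{([^}]+)\\}");

    // Expands every ${name} in the template using the given property table.
    static String resolveImport(String template, Map<String, String> props) {
        Matcher m = PROPERTY_REF.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = props.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("undefined property: " + m.group(1));
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("ro.zygote", "zygote64_32"); // sample value for a 64-bit primary zygote
        System.out.println(resolveImport("/init.${ro.zygote}.rc", props));
    }
}
```

With ro.zygote set to zygote32, the same call yields /init.zygote32.rc, the file examined in the rest of this section.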

    service zygote /system/bin/app_process -Xzygote /system/bin --zygote --start-system-server
    class main
    socket zygote stream 660 root system
    onrestart write /sys/android_power/request_state wake
    onrestart write /sys/power/state on
    onrestart restart media
    onrestart restart netd

The first line defines a service named zygote. The process is started through the main function of app_process, with "-Xzygote /system/bin --zygote --start-system-server" as its arguments.

The code for app_process is in frameworks/base/cmds/app_process/app_main.cpp. In its main function:

    AppRuntime runtime(argv[0], computeArgBlockSize(argc, argv));
    ...

    if (zygote) {
        runtime.start("com.android.internal.os.ZygoteInit", args);
    } else if (className) {
        runtime.start("com.android.internal.os.RuntimeInit", args);
    }

From the arguments we know that zygote is true and that args contains "start-system-server".

AppRuntime inherits from AndroidRuntime, so execution continues in AndroidRuntime::start.

void AndroidRuntime::start(const char* className, const Vector<String8>& options)
{
    /* start the virtual machine */
    JniInvocation jni_invocation;
    jni_invocation.Init(NULL);
    JNIEnv* env;
    if (startVm(&mJavaVM, &env) != 0) {
        return;
    }
    onVmCreated(env);

    ...
    // Call the static main() of the class named by className.
    char* slashClassName = toSlashClassName(className);
    jclass startClass = env->FindClass(slashClassName);
    jmethodID startMeth = env->GetStaticMethodID(startClass, "main",
        "([Ljava/lang/String;)V");
    env->CallStaticVoidMethod(startClass, startMeth, strArray);
    ...
}

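The start function does two main things: it creates the virtual machine, and it calls the main function of the class whose name was passed in. Execution therefore reaches the main function of com.android.internal.os.ZygoteInit next.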

    public static void main(String argv[]) {
        try {
            boolean startSystemServer = false;
            String socketName = "zygote";
            for (int i = 1; i < argv.length; i++) {
                if ("start-system-server".equals(argv[i])) {
                    startSystemServer = true;
                }
                ...
            }

            registerZygoteSocket(socketName);
            ...
            preload();
            ...

            if (startSystemServer) {
                startSystemServer(abiList, socketName);
            }

            Log.i(TAG, "Accepting command socket connections");
            runSelectLoop(abiList);

            closeServerSocket();
        } catch (MethodAndArgsCaller caller) {
            caller.run();
        } catch (RuntimeException ex) {
            Log.e(TAG, "Zygote died with exception", ex);
            closeServerSocket();
            throw ex;
        }
    }

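It does three main things:
1. It calls registerZygoteSocket to create a socket for communicating with ActivityManagerService;
2. It calls startSystemServer to start SystemServer;
3. It calls runSelectLoop to enter an infinite loop, waiting on the socket created earlier for ActivityManagerService to request new application processes.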

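Note the line catch (MethodAndArgsCaller caller): here Android handles normal control flow by throwing an exception.

The socket that registerZygoteSocket wraps comes from the socket option in init.zygote32.rc:

    socket zygote stream 660 root system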

The boot script init.rc is interpreted by the init process, whose source lives in the system/core/init directory. In init.c, the service_start function starts the services declared by the service commands in init.zygote32.rc:

    void service_start(struct service *svc, const char *dynamic_args)
    {
        ...
        pid = fork();

        if (pid == 0) {
            struct socketinfo *si;
            ...

            for (si = svc->sockets; si; si = si->next) {
                int socket_type = (
                        !strcmp(si->type, "stream") ? SOCK_STREAM :
                            (!strcmp(si->type, "dgram") ? SOCK_DGRAM : SOCK_SEQPACKET));
                int s = create_socket(si->name, socket_type,
                                      si->perm, si->uid, si->gid, si->socketcon ?: scon);
                if (s >= 0) {
                    publish_socket(si->name, s);
                }
            }
            ...
        }
        ...
    }

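For every service command, init calls fork to create a new process. In the new process it walks the service's socket options and, for each one, calls create_socket to create a file under /dev/socket. For zygote the option is "socket zygote stream 660 root system", so the file is /dev/socket/zygote. The resulting file descriptor is then written into an environment variable by publish_socket: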

    static void publish_socket(const char *name, int fd)
    {
        char key[64] = ANDROID_SOCKET_ENV_PREFIX;
        char val[64];

        strlcpy(key + sizeof(ANDROID_SOCKET_ENV_PREFIX) - 1,
                name,
                sizeof(key) - sizeof(ANDROID_SOCKET_ENV_PREFIX));
        snprintf(val, sizeof(val), "%d", fd);
        add_environment(key, val);

        /* make sure we don't close-on-exec */
        fcntl(fd, F_SETFD, 0);
    }

Here the name argument is "zygote", and ANDROID_SOCKET_ENV_PREFIX is defined in system/core/include/cutils/sockets.h as:

#define ANDROID_SOCKET_ENV_PREFIX   "ANDROID_SOCKET_"
#define ANDROID_SOCKET_DIR          "/dev/socket"

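So the file descriptor obtained above is stored in the environment variable whose key is "ANDROID_SOCKET_zygote". create_socket runs in the child that init forked, and that same process then executes app_process, so the descriptor and the environment variable are still present when ZygoteInit.registerZygoteSocket runs (publish_socket clears close-on-exec for exactly this reason). registerZygoteSocket can therefore use the descriptor directly to create a Java-level LocalServerSocket object. Any other process that wants to talk to zygote has to connect to this LocalServerSocket by name, through the /dev/socket/zygote file.

In other words, once the zygote socket exists, ActivityManagerService can use it to ask zygote to fork new processes. Every application process in Android is created by forking zygote in this way. startProcessLocked in ActivityManagerService calls Process.start(), which in turn calls Process.startViaZygote and Process.openZygoteSocketIfNeeded.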
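The hand-off through the environment can be sketched as follows. This is a simplified model, not the framework code: a Map stands in for the process environment, the descriptor number 9 is made up, and returning -1 for a missing entry is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

class ZygoteSocketEnvSketch {
    static final String ANDROID_SOCKET_ENV_PREFIX = "ANDROID_SOCKET_";

    // init side: what publish_socket() does, reduced to building the key/value pair.
    static void publish(Map<String, String> env, String name, int fd) {
        env.put(ANDROID_SOCKET_ENV_PREFIX + name, Integer.toString(fd));
    }

    // zygote side: look the descriptor number up again by socket name.
    // Returns -1 if the variable is missing or not a number (a simplification).
    static int lookupFd(Map<String, String> env, String name) {
        String value = env.get(ANDROID_SOCKET_ENV_PREFIX + name);
        if (value == null) {
            return -1;
        }
        try {
            return Integer.parseInt(value);
        } catch (NumberFormatException ex) {
            return -1;
        }
    }

    public static void main(String[] args) {
        Map<String, String> env = new HashMap<>(); // stand-in for the real environment
        publish(env, "zygote", 9);                 // 9 is an arbitrary example descriptor
        System.out.println("ANDROID_SOCKET_zygote -> fd " + lookupFd(env, "zygote"));
    }
}
```

The point of the design is that only a small integer has to cross the fork and exec boundary. The open socket itself is inherited by the process, and the environment variable just tells the Java code which descriptor number it is.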

Starting SystemServer

Once the socket has been created, startSystemServer is called to start the SystemServer process.

    private static boolean startSystemServer(String abiList, String socketName)
    {
        long capabilities = posixCapabilitiesAsBits(
            OsConstants.CAP_BLOCK_SUSPEND,
            OsConstants.CAP_KILL,
            OsConstants.CAP_NET_ADMIN,
            OsConstants.CAP_NET_BIND_SERVICE,
            OsConstants.CAP_NET_BROADCAST,
            OsConstants.CAP_NET_RAW,
            OsConstants.CAP_SYS_MODULE,
            OsConstants.CAP_SYS_NICE,
            OsConstants.CAP_SYS_RESOURCE,
            OsConstants.CAP_SYS_TIME,
            OsConstants.CAP_SYS_TTY_CONFIG
        );
        /* Hardcoded command line to start the system server */
        String args[] = {
            "--setuid=1000",
            "--setgid=1000",
            "--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1032,3001,3002,3003,3006,3007",
            "--capabilities=" + capabilities + "," + capabilities,
            "--runtime-init",
            "--nice-name=system_server",
            "com.android.server.SystemServer",
        };
        ZygoteConnection.Arguments parsedArgs = null;

        int pid;

        try {
            parsedArgs = new ZygoteConnection.Arguments(args);
            ...

            /* Request to fork the system server process */
            pid = Zygote.forkSystemServer(
                    parsedArgs.uid, parsedArgs.gid,
                    parsedArgs.gids,
                    parsedArgs.debugFlags,
                    null,
                    parsedArgs.permittedCapabilities,
                    parsedArgs.effectiveCapabilities);
        } catch (IllegalArgumentException ex) {
            throw new RuntimeException(ex);
        }

        /* For child process */
        if (pid == 0) {
            if (hasSecondZygote(abiList)) {
                waitForSecondaryZygote(socketName);
            }

            handleSystemServerProcess(parsedArgs);
        }

        return true;
    }
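The capabilities value passed in the --capabilities argument above is a bitmask with one bit per Linux capability. A minimal sketch of that packing, assuming the capability numbers from linux/capability.h (the class name CapabilityBitsSketch and the range check are illustrative, not the framework's exact code):

```java
class CapabilityBitsSketch {
    // Capability numbers as defined in <linux/capability.h>.
    static final int CAP_KILL = 5;
    static final int CAP_NET_ADMIN = 12;
    static final int CAP_WAKE_ALARM = 35;
    static final int CAP_BLOCK_SUSPEND = 36;

    // Packs capability numbers into a 64-bit mask, one bit per capability.
    static long capabilitiesAsBits(int... capabilities) {
        long result = 0;
        for (int capability : capabilities) {
            if (capability < 0 || capability > 63) {
                throw new IllegalArgumentException("bad capability: " + capability);
            }
            result |= (1L << capability); // 1L, not 1: numbers above 31 need a long shift
        }
        return result;
    }

    public static void main(String[] args) {
        long bits = capabilitiesAsBits(CAP_KILL, CAP_NET_ADMIN, CAP_BLOCK_SUSPEND);
        System.out.println("0x" + Long.toHexString(bits));
    }
}
```

Because CAP_BLOCK_SUSPEND is number 36, the mask does not fit in an int, which is why the value travels as a long all the way down to the native fork call.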

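From the arguments we can infer that a process named "system_server" is created and that its entry point is the main function of com.android.server.SystemServer. The zygote process calls Zygote.forkSystemServer to create the new process that hosts SystemServer. The branch where the return value pid equals 0 is the path the new process takes, that is, the new process runs handleSystemServerProcess. hasSecondZygote covers the init.zygote64_32.rc and init.zygote32_64.rc cases, which are skipped here. Next, look at handleSystemServerProcess: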

    /**
     * Finish remaining work for the newly forked system server process.
     */
    private static void handleSystemServerProcess(
            ZygoteConnection.Arguments parsedArgs)
            throws ZygoteInit.MethodAndArgsCaller
    {
        closeServerSocket();

        // set umask to 0077 so new files and directories will default to owner-only permissions.
        Os.umask(S_IRWXG | S_IRWXO);

        if (parsedArgs.niceName != null) {
            Process.setArgV0(parsedArgs.niceName);
        }

        final String systemServerClasspath = Os.getenv("SYSTEMSERVERCLASSPATH");

        ClassLoader cl = null;
        if (systemServerClasspath != null) {
            cl = new PathClassLoader(systemServerClasspath, ClassLoader.getSystemClassLoader());
            Thread.currentThread().setContextClassLoader(cl);
        }

        /*
         * Pass the remaining arguments to SystemServer.
         */
        RuntimeInit.zygoteInit(parsedArgs.targetSdkVersion, parsedArgs.remainingArgs, cl);

        /* should never reach here */
    }

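handleSystemServerProcess throws the MethodAndArgsCaller exception which, as mentioned earlier, carries normal control flow and acts like a callback. A child forked from zygote inherits the socket file descriptor that zygote created earlier, and the child has no use for it, so closeServerSocket is called to close it. SYSTEMSERVERCLASSPATH is an environment variable listing the jars that make up the system server (on a typical Android 5 build these are /system/framework/services.jar and related service jars, while framework.jar itself is on BOOTCLASSPATH). It is defined in system/core/rootdir/init.environ.rc.in: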

    on init
        export PATH /sbin:/vendor/bin:/system/sbin:/system/bin:/system/xbin
        export ANDROID_BOOTLOGO 1
        export ANDROID_ROOT /system
        export SYSTEMSERVERCLASSPATH %SYSTEMSERVERCLASSPATH%
        export LD_PRELOAD libsigchain.so

handleSystemServerProcess then calls RuntimeInit.zygoteInit to continue starting SystemServer.

    public static final void zygoteInit(int targetSdkVersion, String[] argv, ClassLoader classLoader)
            throws ZygoteInit.MethodAndArgsCaller {

        commonInit();
        nativeZygoteInit();

        applicationInit(targetSdkVersion, argv, classLoader);
    }

commonInit sets the default uncaught-exception handler for threads, the time zone, and so on. The JNI method nativeZygoteInit is implemented in frameworks/base/core/jni/AndroidRuntime.cpp:

static AndroidRuntime* gCurRuntime = NULL;

static void com_android_internal_os_RuntimeInit_nativeZygoteInit(JNIEnv* env, jobject clazz)
{
    gCurRuntime->onZygoteInit();
}

AndroidRuntime is a base class with virtual functions; the real implementation is AppRuntime in app_main.cpp:

class AppRuntime : public AndroidRuntime
{
    virtual void onStarted()
    {
        sp<ProcessState> proc = ProcessState::self();
        ALOGV("App process: starting thread pool.\n");
        proc->startThreadPool();

        AndroidRuntime* ar = AndroidRuntime::getRuntime();
        ar->callMain(mClassName, mClass, mArgs);

        IPCThreadState::self()->stopProcess();
    }

    virtual void onZygoteInit()
    {
        // Re-enable tracing now that we're no longer in Zygote.
        atrace_set_tracing_enabled(true);

        sp<ProcessState> proc = ProcessState::self();
        ALOGV("App process: starting thread pool.\n");
        proc->startThreadPool();
    }

    virtual void onExit(int code)
    {
        if (mClassName.isEmpty()) {
            // if zygote
            IPCThreadState::self()->stopProcess();
        }

        AndroidRuntime::onExit(code);
    }
};

Once AppRuntime::onZygoteInit has run, the Binder inter-process communication infrastructure of this process is ready; see frameworks/native/libs/binder/ProcessState.cpp.

Next, look at applicationInit:

    private static void applicationInit(int targetSdkVersion, String[] argv, ClassLoader classLoader)
            throws ZygoteInit.MethodAndArgsCaller {

        final Arguments args;
        try {
            args = new Arguments(argv);
        } catch (IllegalArgumentException ex) {
            Slog.e(TAG, ex.getMessage());
            // let the process exit
            return;
        }

        // Remaining arguments are passed to the start class's static main
        invokeStaticMain(args.startClass, args.startArgs, classLoader);
    }

applicationInit simply forwards to invokeStaticMain:

    private static void invokeStaticMain(String className, String[] argv, ClassLoader classLoader)
            throws ZygoteInit.MethodAndArgsCaller
    {
        Class<?> cl;
        cl = Class.forName(className, true, classLoader);
        Method m;
        m = cl.getMethod("main", new Class[] { String[].class });


        /*
         * This throw gets caught in ZygoteInit.main(), which responds
         * by invoking the exception's run() method. This arrangement
         * clears up all the stack frames that were required in setting
         * up the process.
         */
        throw new ZygoteInit.MethodAndArgsCaller(m, argv);
    }

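invokeStaticMain is also simple: it uses reflection to find the static main method of the class named by className, then wraps that method and its arguments in a ZygoteInit.MethodAndArgsCaller object and throws it as an exception. The exception is caught in ZygoteInit.main, which calls the object's run method.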

    /**
     * Helper exception class which holds a method and arguments and
     * can call them. This is used as part of a trampoline to get rid of
     * the initial process setup stack frames.
     */
    public static class MethodAndArgsCaller extends Exception
            implements Runnable {

        public void run() {
            ...
            mMethod.invoke(null, new Object[] { mArgs });
            ...
        }
    }

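All this jumping around does one simple thing: it reflectively calls the static main method of the class named by className. As the source comment notes, the detour through an exception exists to clear the stack frames built up while setting up the process. The class name is com.android.server.SystemServer, hard-coded in ZygoteInit.startSystemServer, so execution enters SystemServer.main().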
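The trampoline can be reproduced outside Android in a few lines. The sketch below is a stand-alone imitation, not the framework classes: TrampolineSketch, Target, deepSetup, and the log list are hypothetical names, and deepSetup merely simulates several nested setup frames.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class TrampolineSketch {
    static final List<String> log = new ArrayList<>();

    // Exception that carries a method and its arguments up the stack.
    static class MethodAndArgsCaller extends Exception implements Runnable {
        private final Method method;
        private final String[] args;

        MethodAndArgsCaller(Method method, String[] args) {
            this.method = method;
            this.args = args;
        }

        @Override
        public void run() {
            try {
                method.invoke(null, new Object[] { args });
            } catch (IllegalAccessException | InvocationTargetException ex) {
                throw new RuntimeException(ex);
            }
        }
    }

    // Stand-in for com.android.server.SystemServer.
    static class Target {
        public static void main(String[] args) {
            log.add("main:" + args[0]);
        }
    }

    // Simulates nested setup frames, then throws instead of calling main directly.
    static void deepSetup(int depth, String className, String[] argv)
            throws MethodAndArgsCaller, ReflectiveOperationException {
        if (depth > 0) {
            deepSetup(depth - 1, className, argv);
            return;
        }
        Class<?> cl = Class.forName(className);
        Method m = cl.getMethod("main", String[].class);
        throw new MethodAndArgsCaller(m, argv);
    }

    // Plays the role of ZygoteInit.main: catch the exception, then run it.
    static void start(String className, String[] argv) {
        try {
            deepSetup(3, className, argv);
        } catch (MethodAndArgsCaller caller) {
            caller.run(); // the deepSetup frames have already been unwound here
        } catch (ReflectiveOperationException ex) {
            throw new RuntimeException(ex);
        }
    }

    public static void main(String[] args) {
        start(Target.class.getName(), new String[] { "hello" });
        System.out.println(log);
    }
}
```

By the time run() invokes the target's main, the stack holds only start and run, which is the effect the framework comment describes.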

Running ZygoteInit.runSelectLoop

In startSystemServer, after the system_server process has been created, the new process (where pid equals 0) goes on to run SystemServer.main, while the zygote process returns and runs ZygoteInit.runSelectLoop:

    private static void runSelectLoop(String abiList) throws MethodAndArgsCaller {
        ArrayList<FileDescriptor> fds = new ArrayList<FileDescriptor>();
        ArrayList<ZygoteConnection> peers = new ArrayList<ZygoteConnection>();
        FileDescriptor[] fdArray = new FileDescriptor[4];

        fds.add(sServerSocket.getFileDescriptor());
        peers.add(null);

        int loopCount = GC_LOOP_COUNT;
        while (true) {
            int index;

            /*
             * Call gc() before we block in select().
             * It's work that has to be done anyway, and it's better
             * to avoid making every child do it.  It will also
             * madvise() any free memory as a side-effect.
             *
             * Don't call it every time, because walking the entire
             * heap is a lot of overhead to free a few hundred bytes.
             */
            if (loopCount <= 0) {
                gc();
                loopCount = GC_LOOP_COUNT;
            } else {
                loopCount--;
            }

            try {
                fdArray = fds.toArray(fdArray);
                index = selectReadable(fdArray);
            } catch (IOException ex) {
                throw new RuntimeException("Error in select()", ex);
            }

            if (index < 0) {
                throw new RuntimeException("Error in select()");
            } else if (index == 0) {
                ZygoteConnection newPeer = acceptCommandPeer(abiList);
                peers.add(newPeer);
                fds.add(newPeer.getFileDescriptor());
            } else {
                boolean done;
                done = peers.get(index).runOnce();

                if (done) {
                    peers.remove(index);
                    fds.remove(index);
                }
            }
        }
    }

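The logic of runSelectLoop is fairly simple and comes down to two points:
1. Handling client connections and requests. The LocalServerSocket object created earlier is stored in sServerSocket, and selectReadable waits on it for ActivityManagerService (AMS for short) to get in touch. selectReadable is a native function that calls select internally and returns the index of the descriptor that became readable. A value < 0 means an internal error; 0 means the server socket itself is readable, that is, a client is connecting for the first time; a value > 0 means an already connected client has sent data. Each connection is represented in the zygote process by a ZygoteConnection object.

2. Client requests are handled by ZygoteConnection.runOnce. This method also throws MethodAndArgsCaller, which leads into MethodAndArgsCaller.run, where the main method of the class named in the request is invoked by reflection.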

    private String[] readArgumentList() throws IOException
    {
        int argc;

        try {
            String s = mSocketReader.readLine();

            if (s == null) {
                // EOF reached.
                return null;
            }
            argc = Integer.parseInt(s);
        } catch (NumberFormatException ex) {
            Log.e(TAG, "invalid Zygote wire format: non-int at argc");
            throw new IOException("invalid wire format");
        }

        String[] result = new String[argc];
        for (int i = 0; i < argc; i++) {
            result[i] = mSocketReader.readLine();
            if (result[i] == null) {
                // We got an unexpected EOF.
                throw new IOException("truncated request");
            }
        }

        return result;
    }

    boolean runOnce() throws ZygoteInit.MethodAndArgsCaller {
        String args[];
        Arguments parsedArgs = null;
        args = readArgumentList();
        parsedArgs = new Arguments(args);

        ...
        pid = Zygote.forkAndSpecialize(parsedArgs.uid, parsedArgs.gid, parsedArgs.gids,
                parsedArgs.debugFlags, rlimits, parsedArgs.mountExternal, parsedArgs.seInfo,
                parsedArgs.niceName, fdsToClose, parsedArgs.instructionSet,
                parsedArgs.appDataDir);
        ...
    }
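readArgumentList implies a very simple wire format: one line with the argument count, followed by one line per argument. A round-trip sketch of that format, assuming newline-terminated text as in the code above (ZygoteWireFormatSketch, encode, and decode are hypothetical names; the real client side lives in Process.java and is not reproduced here):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class ZygoteWireFormatSketch {
    // Client side: the count on the first line, then one argument per line.
    static String encode(String[] args) {
        StringBuilder sb = new StringBuilder();
        sb.append(args.length).append('\n');
        for (String arg : args) {
            if (arg.indexOf('\n') >= 0) {
                throw new IllegalArgumentException("embedded newlines not allowed");
            }
            sb.append(arg).append('\n');
        }
        return sb.toString();
    }

    // Server side: same steps as readArgumentList, but reading from a String.
    // Returns null on immediate EOF; throws IllegalArgumentException on bad input.
    static String[] decode(String wire) {
        BufferedReader reader = new BufferedReader(new StringReader(wire));
        try {
            String s = reader.readLine();
            if (s == null) {
                return null; // EOF reached
            }
            int argc = Integer.parseInt(s); // NumberFormatException is an IllegalArgumentException
            if (argc < 0) {
                throw new IllegalArgumentException("negative argc");
            }
            String[] result = new String[argc];
            for (int i = 0; i < argc; i++) {
                result[i] = reader.readLine();
                if (result[i] == null) {
                    throw new IllegalArgumentException("truncated request");
                }
            }
            return result;
        } catch (IOException ex) {
            throw new IllegalArgumentException("unreadable request", ex);
        }
    }

    public static void main(String[] args) {
        // Sample payload reusing flags that appear earlier in this article.
        String wire = encode(new String[] {
            "--setuid=1000", "--nice-name=system_server", "com.android.server.SystemServer" });
        System.out.print(wire);
        System.out.println(decode(wire).length + " arguments decoded");
    }
}
```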

The SystemServer startup process

As described in the section on starting SystemServer, the main() function of com.android.server.SystemServer is called by reflection, which begins the SystemServer initialization flow.

SystemServer.main()

    /**
     * The main entry point from zygote.
     */
    public static void main(String[] args) {
        new SystemServer().run();
    }

The main function creates a SystemServer object and calls its run() method.

    private void run() {
        // If a device's clock is before 1970 (before 0), a lot of
        // APIs crash dealing with negative numbers, notably
        // java.io.File#setLastModified, so instead we fake it and
        // hope that time from cell towers or NTP fixes it shortly.
        if (System.currentTimeMillis() < EARLIEST_SUPPORTED_TIME) {
            Slog.w(TAG, "System clock is before 1970; setting to 1970.");
            SystemClock.setCurrentTimeMillis(EARLIEST_SUPPORTED_TIME);
        } // check the clock setting

        // Here we go!
        Slog.i(TAG, "Entered the Android system server!");
        EventLog.writeEvent(EventLogTags.BOOT_PROGRESS_SYSTEM_RUN, SystemClock.uptimeMillis());

        // In case the runtime switched since last boot (such as when
        // the old runtime was removed in an OTA), set the system
        // property so that it is in sync. We can't do this in
        // libnativehelper's JniInvocation::Init code where we already
        // had to fallback to a different runtime because it is
        // running as root and we need to be the system user to set
        // the property. http://b/11463182
        SystemProperties.set("persist.sys.dalvik.vm.lib.2", VMRuntime.getRuntime().vmLibrary());

        // Enable the sampling profiler.
        if (SamplingProfilerIntegration.isEnabled()) {
            SamplingProfilerIntegration.start();
            mProfilerSnapshotTimer = new Timer();
            mProfilerSnapshotTimer.schedule(new TimerTask() {
                @Override
                public void run() {
                    SamplingProfilerIntegration.writeSnapshot("system_server", null);
                }
            }, SNAPSHOT_INTERVAL, SNAPSHOT_INTERVAL);
        } // start the sampling profiler

        // Mmmmmm... more memory!
        VMRuntime.getRuntime().clearGrowthLimit();

        // The system server has to run all of the time, so it needs to be
        // as efficient as possible with its memory usage.
        VMRuntime.getRuntime().setTargetHeapUtilization(0.8f);

        // Some devices rely on runtime fingerprint generation, so make sure
        // we've defined it before booting further.
        Build.ensureFingerprintProperty();

        // Within the system server, it is an error to access Environment paths without
        // explicitly specifying a user.
        Environment.setUserRequired(true);

        // Ensure binder calls into the system always run at foreground priority.
        BinderInternal.disableBackgroundScheduling(true);

        // Prepare the main looper thread (this thread).
        android.os.Process.setThreadPriority(
                android.os.Process.THREAD_PRIORITY_FOREGROUND);
        android.os.Process.setCanSelfBackground(false);
        Looper.prepareMainLooper(); // prepare the main thread's looper

        // Initialize native services.
        System.loadLibrary("android_servers");
        nativeInit();

        // Check whether we failed to shut down last time we tried.
        // This call may not return.
        performPendingShutdown();

        // Initialize the system context.
        createSystemContext();

        // Create the system service manager.
        mSystemServiceManager = new SystemServiceManager(mSystemContext);
        LocalServices.addService(SystemServiceManager.class, mSystemServiceManager);

        // Start services.
        try {
            startBootstrapServices();
            startCoreServices();
            startOtherServices();
        } catch (Throwable ex) {
            Slog.e("System", "******************************************");
            Slog.e("System", "************ Failure starting system services", ex);
            throw ex;
        }

        // For debug builds, log event loop stalls to dropbox for analysis.
        if (StrictMode.conditionallyEnableDebugLogging()) {
            Slog.i(TAG, "Enabled StrictMode for system server main thread.");
        }

        // Loop forever.
        Looper.loop();  // run the message loop and wait for messages
        throw new RuntimeException("Main thread loop unexpectedly exited");
    }

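The run method does three main things: it creates the system context and the system service manager, starts the system services, and enters the main thread's message loop.

Analysis of Zygote's native fork methods

Now let's look closely at the two methods Zygote.forkSystemServer and Zygote.forkAndSpecialize.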

forkSystemServer

    private static final ZygoteHooks VM_HOOKS = new ZygoteHooks();

    public static int forkSystemServer(int uid, int gid, int[] gids, int debugFlags,
            int[][] rlimits, long permittedCapabilities, long effectiveCapabilities) {
        VM_HOOKS.preFork();
        int pid = nativeForkSystemServer(
                uid, gid, gids, debugFlags, rlimits, permittedCapabilities, effectiveCapabilities);
        VM_HOOKS.postForkCommon();
        return pid;
    }

Before and after calling nativeForkSystemServer to create the system_server process, ZygoteHooks is invoked for pre- and post-processing.
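The shape of that wrapper can be shown with a small stand-alone sketch. Nothing is actually forked here: the fork is a caller-supplied function, and the names ForkHooksSketch, Hooks, RecordingHooks, and forkWithHooks are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

class ForkHooksSketch {
    interface Hooks {
        void preFork();
        void postForkCommon();
    }

    // Test double that records the order in which the hooks are called.
    static class RecordingHooks implements Hooks {
        final List<String> events = new ArrayList<>();

        @Override
        public void preFork() {
            events.add("preFork");
        }

        @Override
        public void postForkCommon() {
            events.add("postForkCommon");
        }
    }

    // Same structure as Zygote.forkSystemServer: hook, fork, hook, return pid.
    static int forkWithHooks(Hooks hooks, IntSupplier nativeFork) {
        hooks.preFork();
        int pid = nativeFork.getAsInt();
        hooks.postForkCommon();
        return pid;
    }

    public static void main(String[] args) {
        RecordingHooks hooks = new RecordingHooks();
        int pid = forkWithHooks(hooks, () -> {
            hooks.events.add("fork");
            return 1234; // pretend to be the parent, which sees the child's pid
        });
        System.out.println(hooks.events + " pid=" + pid);
    }
}
```

In the real code both the parent and the child return from the native fork, so postForkCommon runs once in each process. The sketch only models one side.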

ZygoteHooks.preFork

The pre-fork step, ZygoteHooks.preFork:

    public void preFork() {
        Daemons.stop();
        waitUntilAllThreadsStopped();
        token = nativePreFork();
    }

Daemons.stop() stops the virtual machine's daemon threads: the reference queue, the finalizer, the GC, and so on.

    public static void stop() {
        ReferenceQueueDaemon.INSTANCE.stop();
        FinalizerDaemon.INSTANCE.stop();
        FinalizerWatchdogDaemon.INSTANCE.stop();
        HeapTrimmerDaemon.INSTANCE.stop();
        GCDaemon.INSTANCE.stop();
    }

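waitUntilAllThreadsStopped makes sure the process is single-threaded at the moment it forks. fork() duplicates only the calling thread into the copy-on-write child, so a lock held or a data structure being updated by any other thread at that instant would be left in an inconsistent state in the child; stopping the daemons and waiting for them to exit avoids that. In spirit this resembles the closeServerSocket call in the new system_server process: both keep the child from carrying over state it should not have.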

    /**
     * We must not fork until we're single-threaded again. Wait until /proc shows we're
     * down to just one thread.
     */
    private static void waitUntilAllThreadsStopped() {
        File tasks = new File("/proc/self/task");
        while (tasks.list().length > 1) {
            try {
                // Experimentally, booting and playing about with a stingray, I never saw us
                // go round this loop more than once with a 10ms sleep.
                Thread.sleep(10);
            } catch (InterruptedException ignored) {
            }
        }
    }
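The check relies on the fact that, on Linux, /proc/self/task contains one directory per thread of the current process. A small sketch of the counting step, with the directory passed in so it can be tried on any folder (ThreadCountSketch and countTasks are hypothetical names):

```java
import java.io.File;

class ThreadCountSketch {
    // Returns the number of entries in taskDir, or -1 if it cannot be listed.
    static int countTasks(File taskDir) {
        String[] entries = taskDir.list();
        return entries == null ? -1 : entries.length;
    }

    public static void main(String[] args) {
        // On Linux this is the current thread count; elsewhere the directory is missing.
        int threads = countTasks(new File("/proc/self/task"));
        System.out.println(threads < 0 ? "no /proc/self/task here" : "threads: " + threads);
    }
}
```

Unlike this sketch, the framework code does not guard against list() returning null; it assumes /proc is always available on an Android device.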

The native method nativePreFork is implemented in art/runtime/native/dalvik_system_ZygoteHooks.cc.

    static jlong ZygoteHooks_nativePreFork(JNIEnv* env, jclass) {
      Runtime* runtime = Runtime::Current();
      CHECK(runtime->IsZygote()) << "runtime instance not started with -Xzygote";

      runtime->PreZygoteFork();

      // Grab thread before fork potentially makes Thread::pthread_key_self_ unusable.
      Thread* self = Thread::Current();
      return reinterpret_cast<jlong>(self);
    }

ZygoteHooks_nativePreFork calls Runtime::PreZygoteFork to do some preparation of the GC heap before the fork; this code is in art/runtime/runtime.cc:

    heap_ = new gc::Heap(...);
    void Runtime::PreZygoteFork() {
        heap_->PreZygoteFork();
    }

Creating the system_server process

nativeForkSystemServer is implemented in frameworks/base/core/jni/com_android_internal_os_Zygote.cpp:

    static jint com_android_internal_os_Zygote_nativeForkSystemServer(
            JNIEnv* env, jclass, uid_t uid, gid_t gid, jintArray gids,
            jint debug_flags, jobjectArray rlimits, jlong permittedCapabilities,
            jlong effectiveCapabilities) {
        pid_t pid = ForkAndSpecializeCommon(env, uid, gid, gids,
                        debug_flags, rlimits,
                        permittedCapabilities, effectiveCapabilities,
                        MOUNT_EXTERNAL_NONE, NULL, NULL, true, NULL,
                        NULL, NULL);
        if (pid > 0) {
            // The zygote process checks whether the child process has died or not.
            ALOGI("System server process %d has been created", pid);
            gSystemServerPid = pid;
            // There is a slight window that the system server process has crashed
            // but it went unnoticed because we haven't published its pid yet. So
            // we recheck here just to make sure that all is well.
            int status;
            if (waitpid(pid, &status, WNOHANG) == pid) {
                ALOGE("System server process %d has died. Restarting Zygote!", pid);
                RuntimeAbort(env);
            }
        }
        return pid;
    }

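It delegates to ForkAndSpecializeCommon to create the new process and then checks that system_server is still alive. If system_server has already died, zygote aborts so that it gets restarted, because nothing useful can happen without system_server. ForkAndSpecializeCommon is implemented as follows: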

    static const char kZygoteClassName[] = "com/android/internal/os/Zygote";
    gZygoteClass = (jclass) env->NewGlobalRef(env->FindClass(kZygoteClassName));
    gCallPostForkChildHooks = env->GetStaticMethodID(gZygoteClass, "callPostForkChildHooks",
                                           "(ILjava/lang/String;)V");

    // Utility routine to fork zygote and specialize the child process.
    static pid_t ForkAndSpecializeCommon(JNIEnv* env, uid_t uid, gid_t gid, jintArray javaGids,
                     jint debug_flags, jobjectArray javaRlimits,
                     jlong permittedCapabilities, jlong effectiveCapabilities,
                     jint mount_external,
                     jstring java_se_info, jstring java_se_name,
                     bool is_system_server, jintArray fdsToClose,
                     jstring instructionSet, jstring dataDir)
    {
        SetSigChldHandler();

        pid_t pid = fork();

        if (pid == 0) {
            // The child process.
            ...
            rc = selinux_android_setcontext(uid, is_system_server, se_info_c_str, se_name_c_str);
            ...
            UnsetSigChldHandler();
            ...
            env->CallStaticVoidMethod(gZygoteClass, gCallPostForkChildHooks, debug_flags,
                          is_system_server ? NULL : instructionSet);
        }
        else if (pid > 0) {
            // the parent process
        }

        return pid;
    }

ForkAndSpecializeCommon first installs the SIGCHLD handler and then forks the new process. In the new process it sets the SELinux context, removes the SIGCHLD handler again, and then calls Zygote.callPostForkChildHooks.

    private static void callPostForkChildHooks(int debugFlags, String instructionSet) {
        long startTime = SystemClock.elapsedRealtime();
        VM_HOOKS.postForkChild(debugFlags, instructionSet);
        checkTime(startTime, "Zygote.callPostForkChildHooks");
    }

callPostForkChildHooks in turn calls ZygoteHooks.postForkChild:

    public void postForkChild(int debugFlags, String instructionSet) {
        nativePostForkChild(token, debugFlags, instructionSet);
    }

The native method nativePostForkChild leads back into dalvik_system_ZygoteHooks.cc:

    static void ZygoteHooks_nativePostForkChild(JNIEnv* env, jclass, jlong token, jint debug_flags,
                                            jstring instruction_set) {
        Thread* thread = reinterpret_cast<Thread*>(token);
        // Our system thread ID, etc, has changed so reset Thread state.
        thread->InitAfterFork();
        EnableDebugFeatures(debug_flags);

        if (instruction_set != nullptr) {
            ScopedUtfChars isa_string(env, instruction_set);
            InstructionSet isa = GetInstructionSetFromString(isa_string.c_str());
            Runtime::NativeBridgeAction action = Runtime::NativeBridgeAction::kUnload;
            if (isa != kNone && isa != kRuntimeISA) {
                action = Runtime::NativeBridgeAction::kInitialize;
            }
            Runtime::Current()->DidForkFromZygote(env, action, isa_string.c_str());
        } else {
            Runtime::Current()->DidForkFromZygote(env, Runtime::NativeBridgeAction::kUnload, nullptr);
        }
    }

thread->InitAfterFork() is implemented in art/runtime/thread.cc and resets the thread ID (tid) of the new process's main thread. DidForkFromZygote is implemented in runtime.cc:

    void Runtime::DidForkFromZygote(JNIEnv* env, NativeBridgeAction action, const char* isa) {
        is_zygote_ = false;

        switch (action) {
        case NativeBridgeAction::kUnload:
            UnloadNativeBridge();
            break;

        case NativeBridgeAction::kInitialize:
            InitializeNativeBridge(env, isa);
            break;
        }

        // Create the thread pool.
        heap_->CreateThreadPool();

        StartSignalCatcher();

        // Start the JDWP thread. If the command-line debugger flags specified "suspend=y",
        // this will pause the runtime, so we probably want this to come last.
        Dbg::StartJdwp();
    }

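First, depending on the action argument, it unloads or initializes the native bridge library used to run code built for a different instruction set. It then starts the GC heap's thread pool. StartSignalCatcher sets up signal handling; its code is in signal_catcher.cc.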

ZygoteHooks.postForkCommon

The post-fork step, ZygoteHooks.postForkCommon:

    public void postForkCommon() {
        Daemons.start();
    }

postForkCommon calls Daemons.start to start the virtual machine's reference queue, finalizer, and GC daemon threads again.

    public static void start() {
        ReferenceQueueDaemon.INSTANCE.start();
        FinalizerDaemon.INSTANCE.start();
        FinalizerWatchdogDaemon.INSTANCE.start();
        HeapTrimmerDaemon.INSTANCE.start();
        GCDaemon.INSTANCE.start();
    }

forkAndSpecialize

The Zygote.forkAndSpecialize method:

    public static int forkAndSpecialize(int uid, int gid, int[] gids, int debugFlags,
          int[][] rlimits, int mountExternal, String seInfo, String niceName, int[] fdsToClose,
          String instructionSet, String appDataDir) {
        long startTime = SystemClock.elapsedRealtime();
        VM_HOOKS.preFork();
        checkTime(startTime, "Zygote.preFork");
        int pid = nativeForkAndSpecialize(
                  uid, gid, gids, debugFlags, rlimits, mountExternal, seInfo, niceName, fdsToClose,
                  instructionSet, appDataDir);
        checkTime(startTime, "Zygote.nativeForkAndSpecialize");
        VM_HOOKS.postForkCommon();
        checkTime(startTime, "Zygote.postForkCommon");
        return pid;
    }

The pre- and post-processing are the same as in forkSystemServer, so they are skipped here. The native method nativeForkAndSpecialize is implemented in frameworks/base/core/jni/com_android_internal_os_Zygote.cpp:

static jint com_android_internal_os_Zygote_nativeForkAndSpecialize(
        JNIEnv* env, jclass, jint uid, jint gid, jintArray gids,
        jint debug_flags, jobjectArray rlimits,
        jint mount_external, jstring se_info, jstring se_name,
        jintArray fdsToClose, jstring instructionSet, jstring appDataDir) {
    // Grant CAP_WAKE_ALARM to the Bluetooth process.
    jlong capabilities = 0;
    if (uid == AID_BLUETOOTH) {
        capabilities |= (1LL << CAP_WAKE_ALARM);
    }

    return ForkAndSpecializeCommon(env, uid, gid, gids, debug_flags,
            rlimits, capabilities, capabilities, mount_external, se_info,
            se_name, false, fdsToClose, instructionSet, appDataDir);
}

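This function is very similar to com_android_internal_os_Zygote_nativeForkSystemServer. It computes the capability set itself (granting CAP_WAKE_ALARM only to the Bluetooth uid) and it lacks the step that checks the child process was created successfully.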