在Android中,zygote是整个系统创建新进程的核心进程。zygote进程在内部会先启动Dalvik虚拟机,继而加载一些必要的系统资源和系统类,最后进入一种监听状态。在之后的运作中,当其他系统模块(比如 AMS)希望创建新进程时,只需向zygote进程发出请求,zygote进程监听到该请求后,会相应地fork出新的进程,于是这个新进程在初生之时,就先天具有了自己的Dalvik虚拟机以及系统资源。
关键类 | 路径 |
---|---|
init.rc | system/core/rootdir/init.rc |
init.cpp | system/core/init/init.cpp |
init.zygote64.rc | system/core/rootdir/init.zygote64.rc |
builtins.cpp | system/core/init/builtins.cpp |
service.cpp | system/core/init/service.cpp |
app_main.cpp | frameworks/base/cmds/app_process/app_main.cpp |
AndroidRuntime.cpp | frameworks/base/core/jni/AndroidRuntime.cpp |
JniInvocation.cpp | libnativehelper/JniInvocation.cpp |
ZygoteInit.java | frameworks/base/core/java/com/android/internal/os/ZygoteInit.java |
ZygoteServer.java | frameworks/base/core/java/com/android/internal/os/ZygoteServer.java |
在Android系统中,JavaVM(Java虚拟机)、应用程序进程以及运行系统的关键服务的SystemServer进程都是由Zygote进程来创建的,我们也将它称为孵化器。它通过fock(复制进程)的形式来创建应用程序进程和SystemServer进程,由于Zygote进程在启动时会创建JavaVM,因此通过fock而创建的应用程序进程和SystemServer进程可以在内部获取一个JavaVM的实例拷贝。
在分析init进程时,我们知道init进程启动后,会解析init.rc文件,然后创建和加载service字段指定的进程。zygote进程就是以这种方式,被init进程加载的。
在system/core/rootdir/init.rc的开始部分,可以看到:
import /init.environ.rc
import /init.usb.rc
import /init.${ro.hardware}.rc
import /vendor/etc/init/hw/init.${ro.hardware}.rc
import /init.usb.configfs.rc
import /init.${ro.zygote}.rc // ${ro.zygote}由厂商定义,与平台相关
on early-init
# Set init and its forked children's oom_adj.
write /proc/1/oom_score_adj -1000
从之前分析的init篇中我们知道,在不同的平台(32、64及64_32)上,init.rc将包含不同的zygote.rc文件。在system/core/rootdir目录下,有init.zygote32_64.rc、init.zyote64.rc、 init.zyote32.rc、init.zygote64_32.rc。
✨ init.zygote32.rc:zygote 进程对应的执行程序是 app_process (纯 32bit 模式)
✨ init.zygote64.rc:zygote 进程对应的执行程序是 app_process64 (纯 64bit 模式)
✨ init.zygote32_64.rc:启动两个 zygote 进程 (名为 zygote 和 zygote_secondary),对应的执行程序分别是 app_process32 (主模式)、app_process64
✨ init.zygote64_32.rc:启动两个 zygote 进程 (名为 zygote 和 zygote_secondary),对应的执行程序分别是 app_process64 (主模式)、app_process32
为什么要定义这么多种情况呢?直接定义一个不就好了,这主要是因为Android 5.0以后开始支持64位程序,为了兼容32位和64位才这样定义。不同的zygote.rc内容大致相同,主要区别体现在启动的是32位,还是64位的进程。init.zygote32_64.rc和init.zygote64_32.rc会启动两个进程,且存在主次之分。
这里拿64位处理器为例,init.zygote64_32.rc的代码如下所示:
// 进程名称是zygote,运行的二进制文件在/system/bin/app_process64
// 启动参数是 -Xzygote /system/bin --zygote --start-system-server --socket-name=zygote
service zygote /system/bin/app_process64 -Xzygote /system/bin --zygote --start-system-server --socket-name=zygote
class main
priority -20
user root
group root readproc
socket zygote stream 660 root system // 创建一个socket,名字叫zygote,以tcp形式
onrestart write /sys/android_power/request_state wake // onrestart 指当进程重启时执行后面的命令
onrestart write /sys/power/state on
onrestart restart audioserver
onrestart restart cameraserver
onrestart restart media
onrestart restart netd
onrestart restart wificond
writepid /dev/cpuset/foreground/tasks // 创建子进程时,向 /dev/cpuset/foreground/tasks 写入pid
// 另一个service ,名字 zygote_secondary
service zygote_secondary /system/bin/app_process32 -Xzygote /system/bin --zygote --socket-name=zygote_secondary --enable-lazy-preload
class main
priority -20
user root
group root readproc
socket zygote_secondary stream 660 root system
onrestart restart zygote
writepid /dev/cpuset/foreground/tasks
定义了service,肯定有地方调用 start zygote。在之前init解析的博客中,我们分析过init进程的启动。init进程启动的最后,会产生”late-init”事件。
std::string bootmode = GetProperty("ro.bootmode", "");
if (bootmode == "charger") {
am.QueueEventTrigger("charger");
} else {
am.QueueEventTrigger("late-init");
}
对应于system/core/rootdir/init.rc配置文件中,我们找到如下代码:
on late-init
trigger early-fs
# Mount fstab in init.{$device}.rc by mount_all command. Optional parameter
# '--early' can be specified to skip entries with 'latemount'.
# /system and /vendor must be mounted by the end of the fs stage,
# while /data is optional.
trigger fs
trigger post-fs
# Mount fstab in init.{$device}.rc by mount_all with '--late' parameter
# to only mount entries with 'latemount'. This is needed if '--early' is
# specified in the previous mount_all command on the fs stage.
# With /system mounted and properties form /system + /factory available,
# some services can be started.
trigger late-fs
# Now we can mount /data. File encryption requires keymaster to decrypt
# /data, which in turn can only be loaded when system properties are present.
trigger post-fs-data
# Now we can start zygote for devices with file based encryption
trigger zygote-start
# Load persist properties and override properties (if enabled) from /data.
trigger load_persist_props_action
# Remove a file to wake up anything waiting for firmware.
trigger firmware_mounts_complete
trigger early-boot
trigger boot
对应于init.rc配置文件中,我们找到如下代码
# It is recommended to put unnecessary data/ initialization from post-fs-data
# to start-zygote in device's init.rc to unblock zygote start.
on zygote-start && property:ro.crypto.state=unencrypted
# A/B update verifier that marks a successful boot.
exec_start update_verifier_nonencrypted
start netd
start zygote
start zygote_secondary
on zygote-start && property:ro.crypto.state=unsupported
# A/B update verifier that marks a successful boot.
exec_start update_verifier_nonencrypted
start netd
start zygote
start zygote_secondary
on zygote-start && property:ro.crypto.state=encrypted && property:ro.crypto.type=file
# A/B update verifier that marks a successful boot.
exec_start update_verifier_nonencrypted
start netd
start zygote
start zygote_secondary
start命令有一个对应的执行函数do_start,定义在platform/system/core/init/builtins.cpp中
const BuiltinFunctionMap::Map& BuiltinFunctionMap::map() const {
constexpr std::size_t kMax = std::numeric_limits::max();
// clang-format off
static const Map builtin_functions = {
{"bootchart", {1, 1, do_bootchart}},
{"chmod", {2, 2, do_chmod}},
{"chown", {2, 3, do_chown}},
{"class_reset", {1, 1, do_class_reset}},
{"class_restart", {1, 1, do_class_restart}},
{"class_start", {1, 1, do_class_start}},
{"class_stop", {1, 1, do_class_stop}},
{"copy", {2, 2, do_copy}},
{"domainname", {1, 1, do_domainname}},
{"enable", {1, 1, do_enable}},
{"exec", {1, kMax, do_exec}},
{"exec_start", {1, 1, do_exec_start}},
{"export", {2, 2, do_export}},
{"hostname", {1, 1, do_hostname}},
{"ifup", {1, 1, do_ifup}},
{"init_user0", {0, 0, do_init_user0}},
{"insmod", {1, kMax, do_insmod}},
{"installkey", {1, 1, do_installkey}},
{"load_persist_props", {0, 0, do_load_persist_props}},
{"load_system_props", {0, 0, do_load_system_props}},
{"loglevel", {1, 1, do_loglevel}},
{"mkdir", {1, 4, do_mkdir}},
{"mount_all", {1, kMax, do_mount_all}},
{"mount", {3, kMax, do_mount}},
{"umount", {1, 1, do_umount}},
{"restart", {1, 1, do_restart}},
{"restorecon", {1, kMax, do_restorecon}},
{"restorecon_recursive", {1, kMax, do_restorecon_recursive}},
{"rm", {1, 1, do_rm}},
{"rmdir", {1, 1, do_rmdir}},
{"setprop", {2, 2, do_setprop}},
{"setrlimit", {3, 3, do_setrlimit}},
{"start", {1, 1, do_start}},
{"stop", {1, 1, do_stop}},
{"swapon_all", {1, 1, do_swapon_all}},
{"symlink", {2, 2, do_symlink}},
{"sysclktz", {1, 1, do_sysclktz}},
{"trigger", {1, 1, do_trigger}},
{"verity_load_state", {0, 0, do_verity_load_state}},
{"verity_update_state", {0, 0, do_verity_update_state}},
{"wait", {1, 2, do_wait}},
{"wait_for_prop", {2, 2, do_wait_for_prop}},
{"write", {2, 2, do_write}},
};
// clang-format on
return builtin_functions;
}
我们来看下do_start():
static int do_start(const std::vector& args) {
Service* svc = ServiceManager::GetInstance().FindServiceByName(args[1]);
if (!svc) {
LOG(ERROR) << "do_start: Service " << args[1] << " not found";
return -1;
}
if (!svc->Start())
return -1;
return 0;
}
do_start首先是通过FindServiceByName去service数组中遍历,根据名字匹配出对应的service,然后调用service的Start函数。
最后,我们来看看service.cpp中定义Start函数:
pid_t pid = -1;
if (namespace_flags_) {
pid = clone(nullptr, nullptr, namespace_flags_ | SIGCHLD, nullptr);
} else {
pid = fork();
}
Start函数主要是fork出一个新进程,然后执行service对应的二进制文件,并将参数传递进去。那么下面我们以init.zygote64.rc为例进行分析。
从上面我们分析的init.zygote64.rc可以看出,zygote64启动文件的地址为app_process64。app_process64对应的代码定义在frameworks/base/cmds/app_process中,
我们来看看对应的Android.mk: frameworks/base/cmds/app_process
LOCAL_MODULE:= app_process
LOCAL_MULTILIB := both
LOCAL_MODULE_STEM_32 := app_process32
LOCAL_MODULE_STEM_64 := app_process64
其实不管是app_process、app_process32还是app_process64,对应的源文件都是app_main.cpp。
接下来我们就看看app_process对应的main函数,该函数定义于app_main.cpp中。
在app_main.cpp的main函数中,主要做的事情就是参数解析. 这个函数有两种启动模式:
✨ 一种是zygote模式,也就是初始化zygote进程,传递的参数有--start-system-server --socket-name=zygote,前者表示启动SystemServer,后者指定socket的名称(Zygote64_32)。
✨ 一种是application模式,也就是启动普通应用程序,传递的参数有class名字以及class带的参数。
两者最终都是调用AppRuntime对象的start函数,加载ZygoteInit或RuntimeInit两个Java类,并将之前整理的参数传入进去。
我们这里暂时只讲解ZygoteInit的加载流程。
if (!LOG_NDEBUG) {
String8 argv_String;
for (int i = 0; i < argc; ++i) {
argv_String.append("\"");
argv_String.append(argv[i]);
argv_String.append("\" ");
}
ALOGV("app_process main with argv: %s", argv_String.string());
}
AppRuntime runtime(argv[0], computeArgBlockSize(argc, argv));
// Process command line arguments
// ignore argv[0]
argc--;
argv++;
// Everything up to '--' or first non '-' arg goes to the vm.
//
// The first argument after the VM args is the "parent dir", which
// is currently unused.
//
// After the parent dir, we expect one or more the following internal
// arguments :
//
// --zygote : Start in zygote mode
// --start-system-server : Start the system server.
// --application : Start in application (stand alone, non zygote) mode.
// --nice-name : The nice name for this process.
//
// For non zygote starts, these arguments will be followed by
// the main class name. All remaining arguments are passed to
// the main method of this class.
//
// For zygote starts, all remaining arguments are passed to the zygote.
// main function.
//
// Note that we must copy argument string values since we will rewrite the
// entire argument block when we apply the nice name to argv0.
//
// As an exception to the above rule, anything in "spaced commands"
// goes to the vm even though it has a space in it.
const char* spaced_commands[] = { "-cp", "-classpath" };
// Allow "spaced commands" to be succeeded by exactly 1 argument (regardless of -s).
bool known_command = false;
int i;
for (i = 0; i < argc; i++) {
if (known_command == true) {
runtime.addOption(strdup(argv[i]));
// The static analyzer gets upset that we don't ever free the above
// string. Since the allocation is from main, leaking it doesn't seem
// problematic. NOLINTNEXTLINE
ALOGV("app_process main add known option '%s'", argv[i]);
known_command = false;
continue;
}
for (int j = 0;
j < static_cast(sizeof(spaced_commands) / sizeof(spaced_commands[0]));
++j) {
if (strcmp(argv[i], spaced_commands[j]) == 0) {
known_command = true;
ALOGV("app_process main found known command '%s'", argv[i]);
}
}
if (argv[i][0] != '-') {
break;
}
if (argv[i][1] == '-' && argv[i][2] == 0) {
++i; // Skip --.
break;
}
runtime.addOption(strdup(argv[i]));
// The static analyzer gets upset that we don't ever free the above
// string. Since the allocation is from main, leaking it doesn't seem
// problematic. NOLINTNEXTLINE
ALOGV("app_process main add option '%s'", argv[i]);
}
// Parse runtime arguments. Stop at first unrecognized option.
bool zygote = false;
bool startSystemServer = false;
bool application = false;
String8 niceName;
String8 className;
++i; // Skip unused "parent dir" argument.
while (i < argc) {
const char* arg = argv[i++];
if (strcmp(arg, "--zygote") == 0) {
zygote = true;
niceName = ZYGOTE_NICE_NAME;
} else if (strcmp(arg, "--start-system-server") == 0) {
startSystemServer = true;
} else if (strcmp(arg, "--application") == 0) {
application = true;
} else if (strncmp(arg, "--nice-name=", 12) == 0) {
niceName.setTo(arg + 12);
} else if (strncmp(arg, "--", 2) != 0) {
className.setTo(arg);
break;
} else {
--i;
break;
}
}
Vector args;
if (!className.isEmpty()) {
// We're not in zygote mode, the only argument we need to pass
// to RuntimeInit is the application argument.
//
// The Remainder of args get passed to startup class main(). Make
// copies of them before we overwrite them with the process name.
args.add(application ? String8("application") : String8("tool"));
runtime.setClassNameAndArgs(className, argc - i, argv + i);
if (!LOG_NDEBUG) {
String8 restOfArgs;
char* const* argv_new = argv + i;
int argc_new = argc - i;
for (int k = 0; k < argc_new; ++k) {
restOfArgs.append("\"");
restOfArgs.append(argv_new[k]);
restOfArgs.append("\" ");
}
ALOGV("Class name = %s, args = %s", className.string(), restOfArgs.string());
}
} else {
// We're in zygote mode.
maybeCreateDalvikCache();
if (startSystemServer) {
args.add(String8("start-system-server"));
}
char prop[PROP_VALUE_MAX];
if (property_get(ABI_LIST_PROPERTY, prop, NULL) == 0) {
LOG_ALWAYS_FATAL("app_process: Unable to determine ABI list from property %s.",
ABI_LIST_PROPERTY);
return 11;
}
String8 abiFlag("--abi-list=");
abiFlag.append(prop);
args.add(abiFlag);
// In zygote mode, pass all remaining arguments to the zygote
// main() method.
for (; i < argc; ++i) {
args.add(String8(argv[i]));
}
}
if (!niceName.isEmpty()) {
runtime.setArgv0(niceName.string(), true /* setProcName */);
}
if (zygote) {
runtime.start("com.android.internal.os.ZygoteInit", args, zygote);
} else if (className) {
runtime.start("com.android.internal.os.RuntimeInit", args, zygote);
} else {
fprintf(stderr, "Error: no class name or --zygote supplied.\n");
app_usage();
LOG_ALWAYS_FATAL("app_process: no class name or --zygote supplied.");
}
由于AppRuntime继承自AndroidRuntime,且没有重写start方法,因此zygote的流程进入到了frameworks/base/core/jni/AndroidRuntime.cpp
接下来,我们来看看AndroidRuntime的start函数的流程。
static const String8 startSystemServer("start-system-server");
/*
* 'startSystemServer == true' means runtime is obsolete and not run from
* init.rc anymore, so we print out the boot start event here.
*/
for (size_t i = 0; i < options.size(); ++i) {
if (options[i] == startSystemServer) {
/* track our progress through the boot sequence */
const int LOG_BOOT_PROGRESS_START = 3000;
LOG_EVENT_LONG(LOG_BOOT_PROGRESS_START, ns2ms(systemTime(SYSTEM_TIME_MONOTONIC)));
}
}
const char* rootDir = getenv("ANDROID_ROOT");
if (rootDir == NULL) {
rootDir = "/system";
if (!hasDir("/system")) {
LOG_FATAL("No root directory specified, and /android does not exist.");
return;
}
setenv("ANDROID_ROOT", rootDir, 1);
}
//const char* kernelHack = getenv("LD_ASSUME_KERNEL");
//ALOGD("Found LD_ASSUME_KERNEL='%s'\n", kernelHack);
/* start the virtual machine */
JniInvocation jni_invocation;
jni_invocation.Init(NULL);
JNIEnv* env;
if (startVm(&mJavaVM, &env, zygote) != 0) {
return;
}
onVmCreated(env);
这边我们跟一下jni_invocation.Init():libnativehelper/JniInvocation.cpp
Init函数主要作用是初始化JNI,具体工作是首先通过dlopen加载libart.so获得其句柄,然后调用dlsym从libart.so中找到JNI_GetDefaultJavaVMInitArgs、JNI_CreateJavaVM、JNI_GetCreatedJavaVMs三个函数地址,赋值给对应成员属性,这三个函数会在后续虚拟机创建中调用。
#ifdef __ANDROID__
char buffer[PROP_VALUE_MAX];
#else
char* buffer = NULL;
#endif
library = GetLibrary(library, buffer);
// Load with RTLD_NODELETE in order to ensure that libart.so is not unmapped when it is closed.
// This is due to the fact that it is possible that some threads might have yet to finish
// exiting even after JNI_DeleteJavaVM returns, which can lead to segfaults if the library is
// unloaded.
const int kDlopenFlags = RTLD_NOW | RTLD_NODELETE;
handle_ = dlopen(library, kDlopenFlags);
if (handle_ == NULL) {
if (strcmp(library, kLibraryFallback) == 0) {
// Nothing else to try.
ALOGE("Failed to dlopen %s: %s", library, dlerror());
return false;
}
// Note that this is enough to get something like the zygote
// running, we can't property_set here to fix this for the future
// because we are root and not the system user. See
// RuntimeInit.commonInit for where we fix up the property to
// avoid future fallbacks. http://b/11463182
ALOGW("Falling back from %s to %s after dlopen error: %s",
library, kLibraryFallback, dlerror());
library = kLibraryFallback;
handle_ = dlopen(library, kDlopenFlags);
if (handle_ == NULL) {
ALOGE("Failed to dlopen %s: %s", library, dlerror());
return false;
}
}
if (!FindSymbol(reinterpret_cast(&JNI_GetDefaultJavaVMInitArgs_),
"JNI_GetDefaultJavaVMInitArgs")) {
return false;
}
if (!FindSymbol(reinterpret_cast(&JNI_CreateJavaVM_),
"JNI_CreateJavaVM")) {
return false;
}
if (!FindSymbol(reinterpret_cast(&JNI_GetCreatedJavaVMs_),
"JNI_GetCreatedJavaVMs")) {
return false;
}
return true;
其次,我们再跟一下startVm():
这个函数特别长,但是里面做的事情很单一,其实就是从各种系统属性中读取一些参数,然后通过addOption设置到AndroidRuntime的mOptions数组中存起来,另外就是调用之前从libart.so中找到JNI_CreateJavaVM函数,并将这些参数传入。
/*
* Register android functions.
*/
if (startReg(env) < 0) {
ALOGE("Unable to register all android natives\n");
return;
}
startReg首先是设置了Android创建线程的处理函数,然后创建了一个200容量的局部引用作用域,用于确保不会出现OutOfMemoryException,最后就是调用register_jni_procs进行JNI注册。
我们跟进startReg():
ATRACE_NAME("RegisterAndroidNatives");
/*
* This hook causes all future threads created in this process to be
* attached to the JavaVM. (This needs to go away in favor of JNI
* Attach calls.)
*/
androidSetCreateThreadFunc((android_create_thread_fn) javaCreateThreadEtc);
ALOGV("--- registering native functions ---\n");
/*
* Every "register" function calls one or more things that return
* a local reference (e.g. FindClass). Because we haven't really
* started the VM yet, they're all getting stored in the base frame
* and never released. Use Push/Pop to manage the storage.
*/
env->PushLocalFrame(200);
if (register_jni_procs(gRegJNI, NELEM(gRegJNI), env) < 0) {
env->PopLocalFrame(NULL);
return -1;
}
env->PopLocalFrame(NULL);
//createJavaThread("fubar", quickTest, (void*) "hello");
return 0;
从上述代码可以看出,startReg函数中主要是通过register_jni_procs来注册JNI函数。其中,gRegJNI是一个全局数组,该数组的定义如下:
static const RegJNIRec gRegJNI[] = {
REG_JNI(register_com_android_internal_os_RuntimeInit),
REG_JNI(register_com_android_internal_os_ZygoteInit_nativeZygoteInit),
REG_JNI(register_android_os_SystemClock),
REG_JNI(register_android_util_EventLog),
REG_JNI(register_android_util_Log),
REG_JNI(register_android_util_MemoryIntArray),
REG_JNI(register_android_util_PathParser),
REG_JNI(register_android_app_admin_SecurityLog),
REG_JNI(register_android_content_AssetManager),
......
我们挑一个register_com_android_internal_os_ZygoteInit_nativeZygoteInit,这实际上是自定义JNI函数并进行动态注册的标准写法,
内部是调用JNI的RegisterNatives,这样注册后,Java类ZygoteInit的native方法nativeZygoteInit就会调用com_android_internal_os_ZygoteInit_nativeZygoteInit函数。
int register_com_android_internal_os_ZygoteInit_nativeZygoteInit(JNIEnv* env)
{
const JNINativeMethod methods[] = {
{ "nativeZygoteInit", "()V",
(void*) com_android_internal_os_ZygoteInit_nativeZygoteInit },
};
return jniRegisterNativeMethods(env, "com/android/internal/os/ZygoteInit",
methods, NELEM(methods));
}
REG_JNI对应的宏定义及RegJNIRec结构体的定义为:
#define REG_JNI(name) { name, #name }
struct RegJNIRec {
int (*mProc)(JNIEnv*);
const char* mName;
};
根据宏定义可以看出,宏REG_JNI将得到函数名;定义RegJNIRec数组时,函数名被赋值给RegJNIRec结构体,于是每个函数名被强行转换为函数指针。
因此,register_jni_procs的参数就是一个函数指针数组,数组的大小和JNIEnv。
我们来跟进一下register_jni_procs函数:
static int register_jni_procs(const RegJNIRec array[], size_t count, JNIEnv* env)
{
for (size_t i = 0; i < count; i++) {
if (array[i].mProc(env) < 0) {
#ifndef NDEBUG
ALOGD("----------!!! %s failed to load\n", array[i].mName);
#endif
return -1;
}
}
return 0;
}
结合前面的分析,容易知道register_jni_procs函数,实际上就是调用函数指针(mProc)对应的函数,以进行实际的JNI函数注册。
继续分析AndroidRuntime.cpp的start函数:
/*
* Start VM. This thread becomes the main thread of the VM, and will
* not return until the VM exits.
*/
char* slashClassName = toSlashClassName(className != NULL ? className : "");
jclass startClass = env->FindClass(slashClassName);
if (startClass == NULL) {
ALOGE("JavaVM unable to locate class '%s'\n", slashClassName);
/* keep going */
} else {
jmethodID startMeth = env->GetStaticMethodID(startClass, "main",
"([Ljava/lang/String;)V");
if (startMeth == NULL) {
ALOGE("JavaVM unable to find main() in '%s'\n", className);
/* keep going */
} else {
env->CallStaticVoidMethod(startClass, startMeth, strArray);
}
}
free(slashClassName);
ALOGD("Shutting down VM\n");
if (mJavaVM->DetachCurrentThread() != JNI_OK)
ALOGW("Warning: unable to detach main thread\n");
if (mJavaVM->DestroyJavaVM() != 0)
ALOGW("Warning: VM did not shut down cleanly\n");
可以看到,在AndroidRuntime的最后,将通过反射调用ZygoteInit的main函数。至此,zygote进程进入了java世界。
其实我们仔细想一想,就会觉得zygote的整个流程实际上是非常符合实际情况的。
✨✨ 在Android中,每个进程都运行在对应的虚拟机上,因此zygote首先就负责创建出虚拟机。
✨✨ 然后,为了反射调用java代码,必须有对应的JNI函数,于是zygote进行了JNI函数的注册。
✨✨ 当一切准备妥当后,zygote进程才进入到了java世界。
现在我们跟进ZygoteInit.java的main函数。
public static void main(String argv[]) {
ZygoteServer zygoteServer = new ZygoteServer();
// Mark zygote start. This ensures that thread creation will throw
// an error.
//调用native 确保当前没有其他线程运行,处于安全考虑
ZygoteHooks.startZygoteNoThreadCreation();
// Zygote goes into its own process group.
try {
Os.setpgid(0, 0);
} catch (ErrnoException ex) {
throw new RuntimeException("Failed to setpgid(0,0)", ex);
}
final Runnable caller;
try {
// Report Zygote start time to tron unless it is a runtime restart
if (!"1".equals(SystemProperties.get("sys.boot_completed"))) {
MetricsLogger.histogram(null, "boot_zygote_init",
(int) SystemClock.elapsedRealtime());
}
String bootTimeTag = Process.is64Bit() ? "Zygote64Timing" : "Zygote32Timing";
TimingsTraceLog bootTimingsTraceLog = new TimingsTraceLog(bootTimeTag,
Trace.TRACE_TAG_DALVIK);
bootTimingsTraceLog.traceBegin("ZygoteInit");
RuntimeInit.enableDdms();
boolean startSystemServer = false;
String socketName = "zygote";
String abiList = null;
boolean enableLazyPreload = false;
//解析参数,得到上述变量的值
for (int i = 1; i < argv.length; i++) {
if ("start-system-server".equals(argv[i])) {
startSystemServer = true;
} else if ("--enable-lazy-preload".equals(argv[i])) {
enableLazyPreload = true;
} else if (argv[i].startsWith(ABI_LIST_ARG)) {
abiList = argv[i].substring(ABI_LIST_ARG.length());
} else if (argv[i].startsWith(SOCKET_NAME_ARG)) {
socketName = argv[i].substring(SOCKET_NAME_ARG.length());
} else {
throw new RuntimeException("Unknown command line argument: " + argv[i]);
}
}
if (abiList == null) {
throw new RuntimeException("No ABI list supplied.");
}
//注册zygoteServer,用于通信
zygoteServer.registerServerSocket(socketName);
// In some configurations, we avoid preloading resources and classes eagerly.
// In such cases, we will preload things prior to our first fork.
if (!enableLazyPreload) {
bootTimingsTraceLog.traceBegin("ZygotePreload");
EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_START,
SystemClock.uptimeMillis());
preload(bootTimingsTraceLog); //默认情况,预加载信息,第一次fork才会预加载
EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_END,
SystemClock.uptimeMillis());
bootTimingsTraceLog.traceEnd(); // ZygotePreload
} else {
Zygote.resetNicePriority();
}
// Do an initial gc to clean up after startup
bootTimingsTraceLog.traceBegin("PostZygoteInitGC");
gcAndFinalize();
bootTimingsTraceLog.traceEnd(); // PostZygoteInitGC
bootTimingsTraceLog.traceEnd(); // ZygoteInit
// Disable tracing so that forked processes do not inherit stale tracing tags from
// Zygote.
Trace.setTracingEnabled(false, 0);
// Zygote process unmounts root storage spaces.
Zygote.nativeUnmountStorageOnInit();
// Set seccomp policy
// 加载seccomp的过滤规则,所有android进程
//该过滤器安装到zygote进程中,由于所有Android应用均衍生自该进程
Seccomp.setPolicy();
//允许有其它线程了
ZygoteHooks.stopZygoteNoThreadCreation();
if (startSystemServer) {
Runnable r = forkSystemServer(abiList, socketName, zygoteServer);
// {@code r == null} in the parent (zygote) process, and {@code r != null} in the
// child (system_server) process.
if (r != null) {
r.run();
return;
}
}
Log.i(TAG, "Accepting command socket connections");
// The select loop returns early in the child process after a fork and
// loops forever in the zygote.
//zygote进程进入无限循环,处理请求
caller = zygoteServer.runSelectLoop(abiList);
} catch (Throwable ex) {
Log.e(TAG, "System zygote died with exception", ex);
throw ex;
} finally {
zygoteServer.closeServerSocket();
}
// We're in the child process and have exited the select loop. Proceed to execute the
// command.
if (caller != null) {
caller.run();
}
}
上面是ZygoteInit的main函数的主干部分,除了安全相关的内容外,最主要的工作就是注册server socket、预加载、启动system server及进入无限循环处理请求消息。
接下来我们分四部分分别讨论!
Android O将server socket相关的工作抽象到/frameworks/base/core/java/com/android/internal/os/ZygoteServer.java中了。我们来看看其中的registerZygoteSocket函数:
void registerServerSocket(String socketName) {
if (mServerSocket == null) {
int fileDesc;
//此处的socket name,就是zygote
final String fullSocketName = ANDROID_SOCKET_PREFIX + socketName;
try {
//在进程被创建时,就会创建对应的文件描述符,并加入到环境变量中
//因此,此时可以取出对应的环境变量
String env = System.getenv(fullSocketName);
fileDesc = Integer.parseInt(env);
} catch (RuntimeException ex) {
throw new RuntimeException(fullSocketName + " unset or invalid", ex);
}
try {
// 获取zygote socket的文件描述符
FileDescriptor fd = new FileDescriptor();
fd.setInt$(fileDesc);
mServerSocket = new LocalServerSocket(fd);
} catch (IOException ex) {
throw new RuntimeException(
"Error binding to local socket '" + fileDesc + "'", ex);
}
}
}
我们跟踪/framework//core/java/android/net/LocalServerSocket.java中的LocalServerSocket():
public LocalServerSocket(String name) throws IOException
{
impl = new LocalSocketImpl();
impl.create(LocalSocket.SOCKET_STREAM);
localAddress = new LocalSocketAddress(name);
impl.bind(localAddress);
impl.listen(LISTEN_BACKLOG);
}
我们看看预加载(zygoteinit.java)的内容:
static void preload(TimingsTraceLog bootTimingsTraceLog) {
Log.d(TAG, "begin preload");
bootTimingsTraceLog.traceBegin("BeginIcuCachePinning");
beginIcuCachePinning();
bootTimingsTraceLog.traceEnd(); // BeginIcuCachePinning
bootTimingsTraceLog.traceBegin("PreloadClasses");
preloadClasses(); //加载一些类,一般由厂商定义,这也就是启动慢的原因之一
bootTimingsTraceLog.traceEnd(); // PreloadClasses
bootTimingsTraceLog.traceBegin("PreloadResources");
preloadResources();
bootTimingsTraceLog.traceEnd(); // PreloadResources
Trace.traceBegin(Trace.TRACE_TAG_DALVIK, "PreloadAppProcessHALs");
nativePreloadAppProcessHALs();
Trace.traceEnd(Trace.TRACE_TAG_DALVIK);
Trace.traceBegin(Trace.TRACE_TAG_DALVIK, "PreloadOpenGL");
preloadOpenGL(); //图形相关
Trace.traceEnd(Trace.TRACE_TAG_DALVIK);
preloadSharedLibraries(); //一些必要的库
preloadTextResources(); //字符相关
// Ask the WebViewFactory to do any initialization that must run in the zygote process,
// for memory sharing purposes.
WebViewFactory.prepareWebViewInZygote();
endIcuCachePinning();
warmUpJcaProviders(); //安全相关
Log.d(TAG, "end preload");
sPreloadComplete = true;
}
为了让系统实际运行时更加流畅,在zygote启动时候,调用preload函数进行了一些预加载操作。Android 通过zygote fork的方式创建子进程。zygote进程预加载这些类和资源,在fork子进程时,仅需要做一个复制即可。
这样可以节约子进程的启动时间。同时,根据fork的copy-on-write机制可知,有些类如果不做改变,甚至都不用复制,子进程可以和父进程共享这部分数据,从而省去不少内存的占用。
再来看看启动System Server (zygoteinit.java)的流程:
private static Runnable forkSystemServer(String abiList, String socketName,
ZygoteServer zygoteServer) {
long capabilities = posixCapabilitiesAsBits(
OsConstants.CAP_IPC_LOCK,
OsConstants.CAP_KILL,
OsConstants.CAP_NET_ADMIN,
OsConstants.CAP_NET_BIND_SERVICE,
OsConstants.CAP_NET_BROADCAST,
OsConstants.CAP_NET_RAW,
OsConstants.CAP_SYS_MODULE,
OsConstants.CAP_SYS_NICE,
OsConstants.CAP_SYS_PTRACE,
OsConstants.CAP_SYS_TIME,
OsConstants.CAP_SYS_TTY_CONFIG,
OsConstants.CAP_WAKE_ALARM
);
/* Containers run without this capability, so avoid setting it in that case */
if (!SystemProperties.getBoolean(PROPERTY_RUNNING_IN_CONTAINER, false)) {
capabilities |= posixCapabilitiesAsBits(OsConstants.CAP_BLOCK_SUSPEND);
}
/* Hardcoded command line to start the system server */
String args[] = {
"--setuid=1000",
"--setgid=1000",
"--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1021,1023,1032,3001,3002,3003,3006,3007,3009,3010",
"--capabilities=" + capabilities + "," + capabilities,
"--nice-name=system_server",
"--runtime-args",
"com.android.server.SystemServer",
};
ZygoteConnection.Arguments parsedArgs = null;
int pid;
try {
//将上面准备的参数,按照ZygoteConnection的风格进行封装
parsedArgs = new ZygoteConnection.Arguments(args);
ZygoteConnection.applyDebuggerSystemProperty(parsedArgs);
ZygoteConnection.applyInvokeWithSystemProperty(parsedArgs);
/* Request to fork the system server process */
pid = Zygote.forkSystemServer(
//通过fork"分裂"出system_server
parsedArgs.uid, parsedArgs.gid,
parsedArgs.gids,
parsedArgs.debugFlags,
null,
parsedArgs.permittedCapabilities,
parsedArgs.effectiveCapabilities);
} catch (IllegalArgumentException ex) {
throw new RuntimeException(ex);
}
/* For child process */
if (pid == 0) {
if (hasSecondZygote(abiList)) {
//waitForSecondaryZygote
waitForSecondaryZygote(socketName);
}
//fork时会copy socket,system server需要主动关闭
zygoteServer.closeServerSocket();
// system server进程处理自己的工作
return handleSystemServerProcess(parsedArgs);
}
return null;
}
创建出SystemServer进程后,zygote进程调用ZygoteServer中的函数runSelectLoop,处理server socket收到的命令。
Runnable runSelectLoop(String abiList) {
ArrayList fds = new ArrayList();
ArrayList peers = new ArrayList();
//先将server socket 添加到fds
fds.add(mServerSocket.getFileDescriptor());
peers.add(null);
while (true) {
// 每次循环,都重新创建需要监听的pollFds
StructPollfd[] pollFds = new StructPollfd[fds.size()];
for (int i = 0; i < pollFds.length; ++i) {
pollFds[i] = new StructPollfd();
pollFds[i].fd = fds.get(i);
pollFds[i].events = (short) POLLIN;
}
try {
Os.poll(pollFds, -1); //等待事件的到来
} catch (ErrnoException ex) {
throw new RuntimeException("poll failed", ex);
}
// 注意这里是倒序的,即优先处理已建立链接的信息,后处理新建链接的请求
for (int i = pollFds.length - 1; i >= 0; --i) {
if ((pollFds[i].revents & POLLIN) == 0) {
continue;
}
//// server socket最先加入fds, 因此这里是server socket收到数据
if (i == 0) {
// 收到新的建立通信的请求,建立通信连接
ZygoteConnection newPeer = acceptCommandPeer(abiList);
// 加入到peers和fds, 即下一次也开始监听
peers.add(newPeer);
fds.add(newPeer.getFileDesciptor());
} else {
try {
ZygoteConnection connection = peers.get(i);
final Runnable command = connection.processOneCommand(this);
if (mIsForkChild) {
// We're in the child. We should always have a command to run at this
// stage if processOneCommand hasn't called "exec".
if (command == null) {
throw new IllegalStateException("command == null");
}
return command;
} else {//其它通信连接收到数据
// We're in the server - we should never have any commands to run.
if (command != null) {
throw new IllegalStateException("command != null");
}
// We don't know whether the remote side of the socket was closed or
// not until we attempt to read from it from processOneCommand. This shows up as
// a regular POLLIN event in our regular processing loop.
if (connection.isClosedByPeer()) {
connection.closeSocket();
peers.remove(i);
fds.remove(i);
}
}
} catch (Exception e) {
if (!mIsForkChild) {
// We're in the server so any exception here is one that has taken place
// pre-fork while processing commands or reading / writing from the
// control socket. Make a loud noise about any such exceptions so that
// we know exactly what failed and why.
Slog.e(TAG, "Exception executing zygote command: ", e);
// Make sure the socket is closed so that the other end knows immediately
// that something has gone wrong and doesn't time out waiting for a
// response.
ZygoteConnection conn = peers.remove(i);
conn.closeSocket();
fds.remove(i);
} else {
// We're in the child so any exception caught here has happened post
// fork and before we execute ActivityThread.main (or any other main()
// method). Log the details of the exception and bring down the process.
Log.e(TAG, "Caught post-fork exception in child process.", e);
throw e;
}
}
}
}
}
}
从上面代码可知,初始时fds中仅有server socket,因此当有数据到来时,将执行i等于0的分支。此时,显然是需要创建新的通信连接,因此acceptCommandPeer将被调用。
我们看看acceptCommandPeer函数:
private ZygoteConnection acceptCommandPeer(String abiList) {
try {
return createNewConnection(mServerSocket.accept(), abiList);
} catch (IOException ex) {
throw new RuntimeException(
"IOException during accept()", ex);
}
}
从上面的代码,可以看出acceptCommandPeer调用了server socket的accpet函数。于是当新的连接建立时,zygote将会创建出一个新的socket与其通信,并将该socket加入到fds中。因此,一旦通信连接建立后,fds中将会包含有多个socket。
当poll监听到这一组sockets上有数据到来时,就会从阻塞中恢复。于是,我们需要判断到底是哪个socket收到了数据。
在runSelectLoop中采用倒序的方式轮询。由于server socket第一个被加入到fds,因此最后轮询到的socket才需要处理新建连接的操作;其它socket收到数据时,仅需要调用zygoteConnection的runonce函数执行数据对应的操作。若一个连接处理完所有对应消息后,该连接对应的socket和连接等将被移除。
在来看看zygoteinit.java中的forksystemserver函数中的handle
private static Runnable handleSystemServerProcess(ZygoteConnection.Arguments parsedArgs) {
// set umask to 0077 so new files and directories will default to owner-only permissions.
Os.umask(S_IRWXG | S_IRWXO);
if (parsedArgs.niceName != null) {
Process.setArgV0(parsedArgs.niceName);
}
final String systemServerClasspath = Os.getenv("SYSTEMSERVERCLASSPATH");
if (systemServerClasspath != null) {
performSystemServerDexOpt(systemServerClasspath);
// Capturing profiles is only supported for debug or eng builds since selinux normally
// prevents it.
boolean profileSystemServer = SystemProperties.getBoolean(
"dalvik.vm.profilesystemserver", false);
if (profileSystemServer && (Build.IS_USERDEBUG || Build.IS_ENG)) {
try {
File profileDir = Environment.getDataProfilesDePackageDirectory(
Process.SYSTEM_UID, "system_server");
File profile = new File(profileDir, "primary.prof");
profile.getParentFile().mkdirs();
profile.createNewFile();
String[] codePaths = systemServerClasspath.split(":");
VMRuntime.registerAppInfo(profile.getPath(), codePaths);
} catch (Exception e) {
Log.wtf(TAG, "Failed to set up system server profile", e);
}
}
}
if (parsedArgs.invokeWith != null) {
String[] args = parsedArgs.remainingArgs;
// If we have a non-null system server class path, we'll have to duplicate the
// existing arguments and append the classpath to it. ART will handle the classpath
// correctly when we exec a new process.
if (systemServerClasspath != null) {
String[] amendedArgs = new String[args.length + 2];
amendedArgs[0] = "-cp";
amendedArgs[1] = systemServerClasspath;
System.arraycopy(args, 0, amendedArgs, 2, args.length);
args = amendedArgs;
}
WrapperInit.execApplication(parsedArgs.invokeWith,
parsedArgs.niceName, parsedArgs.targetSdkVersion,
VMRuntime.getCurrentInstructionSet(), null, args);
throw new IllegalStateException("Unexpected return from WrapperInit.execApplication");
} else {
ClassLoader cl = null;
if (systemServerClasspath != null) {
cl = createPathClassLoader(systemServerClasspath, parsedArgs.targetSdkVersion);
Thread.currentThread().setContextClassLoader(cl);
}
/*
* Pass the remaining arguments to SystemServer.
*/
return ZygoteInit.zygoteInit(parsedArgs.targetSdkVersion, parsedArgs.remainingArgs, cl);
}
/* should never reach here */
}
我们看到程序将sdk版本,分裂进程的参数等等参数传入,然后调用zygoteinit函数,我们跟踪一下该函数
public static final Runnable zygoteInit(int targetSdkVersion, String[] argv, ClassLoader classLoader) {
if (RuntimeInit.DEBUG) {
Slog.d(RuntimeInit.TAG, "RuntimeInit: Starting application from zygote");
}
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "ZygoteInit");
RuntimeInit.redirectLogStreams();
RuntimeInit.commonInit();
ZygoteInit.nativeZygoteInit();
return RuntimeInit.applicationInit(targetSdkVersion, argv, classLoader);
}
该函数最终调用的是runtimeinit中的applicationinit函数
protected static Runnable applicationInit(int targetSdkVersion, String[] argv,
ClassLoader classLoader) {
// If the application calls System.exit(), terminate the process
// immediately without running any shutdown hooks. It is not possible to
// shutdown an Android application gracefully. Among other things, the
// Android runtime shutdown hooks close the Binder driver, which can cause
// leftover running threads to crash before the process actually exits.
nativeSetExitWithoutCleanup(true);
// We want to be fairly aggressive about heap utilization, to avoid
// holding on to a lot of memory that isn't needed.
VMRuntime.getRuntime().setTargetHeapUtilization(0.75f);
VMRuntime.getRuntime().setTargetSdkVersion(targetSdkVersion);
final Arguments args = new Arguments(argv);
// The end of of the RuntimeInit event (see #zygoteInit).
Trace.traceEnd(Trace.TRACE_TAG_ACTIVITY_MANAGER);
// Remaining arguments are passed to the start class's static main
return findStaticMain(args.startClass, args.startArgs, classLoader);
}
private static Runnable findStaticMain(String className, String[] argv,
ClassLoader classLoader) {
Class> cl;
try {
cl = Class.forName(className, true, classLoader);
} catch (ClassNotFoundException ex) {
throw new RuntimeException(
"Missing class when invoking static main " + className,
ex);
}
Method m;
try {
m = cl.getMethod("main", new Class[] { String[].class });
} catch (NoSuchMethodException ex) {
throw new RuntimeException(
"Missing static main on " + className, ex);
} catch (SecurityException ex) {
throw new RuntimeException(
"Problem getting static main on " + className, ex);
}
int modifiers = m.getModifiers();
if (! (Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
throw new RuntimeException(
"Main method is not public and static on " + className);
}
以上就是整个启动过程,参考被人的代码,并适当补充一部分内容。以供学习引用,有不当之处请大家见谅。