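Today's task took me from morning until now to finish. I thought it would be simple: build a demo that wraps the real business logic into a .so library placed in the system layer, exposing nothing to the upper layers, with only a single layer of JNI calls in between. Because of one bad habit of mine, it ended up costing a whole day. I am recording the whole process in detail, both as a lesson for myself and in the hope that it helps others.
The JNI wrapper itself is simple, so I will not post the code. Adding a shared library under the Android 7.0 source tree has also been covered by others; the most important step is to modify public.libraries.txt and add your target library to it. As the figure below shows, the file exists in three places, two under system and one under vendor, which end up in the compiled img files flashed to the phone: system/etc/public.libraries.txt, system/vendor/etc/public.libraries.txt, and vendor/etc/public.libraries.txt. An app can only load a library that is declared there; otherwise the call from the application layer fails because of linker namespace restrictions.
I first finished the business-logic code, compiled it into a .so library, and pushed it to the phone. The phone then stayed stuck on the boot animation showing the "Android" text and never finished booting, so I started capturing logs. The normal and abnormal logs are as follows:
1. Normal boot log. Once preloading completes, the startSystemServer method runs right away to start the system_server process, and the corresponding log lines all appear. Once system_server has started successfully, the UI shows up shortly afterwards.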
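To make the role of public.libraries.txt concrete, here is a minimal sketch of the kind of check involved: deciding whether a library name is declared in the file. It assumes the simple format of one library name per line, with blank lines and lines starting with `#` ignored; the function names and the library name `libmydemo.so` are hypothetical, and the real loader does more than this.

```cpp
#include <sstream>
#include <string>

// Strip leading and trailing ASCII whitespace from one line.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const auto begin = s.find_first_not_of(ws);
    if (begin == std::string::npos) return "";
    const auto end = s.find_last_not_of(ws);
    return s.substr(begin, end - begin + 1);
}

// Returns true if `soname` appears as an entry in the given file contents.
// Assumed format: one library name per line, '#' starts a comment line.
bool isListedAsPublic(const std::string& fileContents, const std::string& soname) {
    std::istringstream in(fileContents);
    std::string line;
    while (std::getline(in, line)) {
        line = trim(line);
        if (line.empty() || line[0] == '#') continue;  // skip blanks and comments
        if (line == soname) return true;
    }
    return false;
}
```

For example, with contents `"libc.so\nlibmydemo.so\n"` the call `isListedAsPublic(contents, "libmydemo.so")` returns true, while a name that only appears in a comment line is not treated as declared.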
Line 1694: 01-03 22:23:11.820 361 361 D Zygote : begin preload
Line 1695: 01-03 22:23:11.820 361 361 I Zygote : Installing ICU cache reference pinning...
Line 1696: 01-03 22:23:11.820 361 361 I Zygote : Preloading ICU data...
Line 1867: 01-03 22:23:12.087 361 361 I Zygote : Preloading classes...
Line 1868: 01-03 22:23:12.094 361 361 W Zygote : Class not found for preloading: [Landroid.view.Display$ColorTransform;
Line 2304: 01-03 22:23:13.874 361 361 W Zygote : Class not found for preloading: android.view.Display$ColorTransform
Line 2305: 01-03 22:23:13.875 361 361 W Zygote : Class not found for preloading: android.view.Display$ColorTransform$1
Line 2347: 01-03 22:23:14.546 361 361 I Zygote : ...preloaded 4158 classes in 2459ms.
Line 2375: 01-03 22:23:14.734 361 361 I Zygote : Preloading resources...
Line 2423: 01-03 22:23:15.081 361 361 I Zygote : ...preloaded 114 resources in 347ms.
Line 2426: 01-03 22:23:15.097 361 361 I Zygote : ...preloaded 41 resources in 17ms.
Line 2439: 01-03 22:23:15.183 361 361 I Zygote : Preloading shared libraries...
Line 2444: 01-03 22:23:15.227 361 361 I Zygote : Uninstalled ICU cache reference pinning...
Line 2445: 01-03 22:23:15.239 361 361 I Zygote : Installed AndroidKeyStoreProvider in 12ms.
Line 2446: 01-03 22:23:15.264 361 361 I Zygote : Warmed up JCA providers in 26ms.
Line 2447: 01-03 22:23:15.265 361 361 D Zygote : end preload
Line 2462: 01-03 22:23:15.448 361 361 I Zygote : System server process 2252 has been created
Line 2463: 01-03 22:23:15.455 361 361 I Zygote : Accepting command socket connections
Line 10398: 01-03 22:25:40.307 2252 2574 I Zygote : Process: zygote socket opened, supported ABIS: armeabi-v7a,armeabi
2. Abnormal boot log. It is clear from the log that after Zygote finishes preloading, system_server is never started at all, so the phone cannot boot.
Line 1743: 01-03 22:16:43.383 347 347 D Zygote : begin preload
Line 1744: 01-03 22:16:43.383 347 347 I Zygote : Installing ICU cache reference pinning...
Line 1745: 01-03 22:16:43.384 347 347 I Zygote : Preloading ICU data...
Line 1914: 01-03 22:16:43.600 347 347 I Zygote : Preloading classes...
Line 1915: 01-03 22:16:43.607 347 347 W Zygote : Class not found for preloading: [Landroid.view.Display$ColorTransform;
Line 2347: 01-03 22:16:45.403 347 347 W Zygote : Class not found for preloading: android.view.Display$ColorTransform
Line 2348: 01-03 22:16:45.404 347 347 W Zygote : Class not found for preloading: android.view.Display$ColorTransform$1
Line 2376: 01-03 22:16:46.015 347 347 I Zygote : ...preloaded 4158 classes in 2414ms.
Line 2396: 01-03 22:16:46.191 347 347 I Zygote : Preloading resources...
Line 2438: 01-03 22:16:46.567 347 347 I Zygote : ...preloaded 114 resources in 375ms.
Line 2441: 01-03 22:16:46.583 347 347 I Zygote : ...preloaded 41 resources in 17ms.
Line 2472: 01-03 22:16:46.682 347 347 I Zygote : Preloading shared libraries...
Line 2477: 01-03 22:16:46.740 347 347 I Zygote : Uninstalled ICU cache reference pinning...
Line 2478: 01-03 22:16:46.757 347 347 I Zygote : Installed AndroidKeyStoreProvider in 16ms.
Line 2479: 01-03 22:16:46.780 347 347 I Zygote : Warmed up JCA providers in 24ms.
Line 2480: 01-03 22:16:46.781 347 347 D Zygote : end preload
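The difference between the two logs comes down to whether a "System server process ... has been created" line follows "end preload". As a rough sketch, that comparison could be automated as below; the enum and function names are hypothetical, and the matching relies only on the message texts visible in the logs above.

```cpp
#include <string>
#include <vector>

enum class ZygoteBootState {
    PreloadNotFinished,   // "end preload" never appeared
    StuckAfterPreload,    // preload finished, but system_server was not created
    SystemServerCreated   // the normal case
};

// Classify a captured log by scanning the Zygote-tagged lines.
ZygoteBootState classifyZygoteLog(const std::vector<std::string>& lines) {
    bool preloadEnded = false;
    for (const auto& line : lines) {
        if (line.find("Zygote") == std::string::npos) continue;
        if (line.find("end preload") != std::string::npos) {
            preloadEnded = true;
        }
        if (line.find("System server process") != std::string::npos &&
            line.find("has been created") != std::string::npos) {
            return ZygoteBootState::SystemServerCreated;
        }
    }
    return preloadEnded ? ZygoteBootState::StuckAfterPreload
                        : ZygoteBootState::PreloadNotFinished;
}
```

Fed the abnormal log above, this sketch reports `StuckAfterPreload`, which matches the observation that Zygote finished preloading but never announced a system_server process.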
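The code that starts system_server begins at the main function in frameworks\base\cmds\app_process\app_main.cpp; one level above that is the init.rc script. main assembles the startup arguments and then calls start, which AppRuntime inherits from its parent class AndroidRuntime. The code is as follows: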
int main(int argc, char* const argv[])
{
if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) < 0) {
// Older kernels don't understand PR_SET_NO_NEW_PRIVS and return
// EINVAL. Don't die on such kernels.
if (errno != EINVAL) {
LOG_ALWAYS_FATAL("PR_SET_NO_NEW_PRIVS failed: %s", strerror(errno));
return 12;
}
}
AppRuntime runtime(argv[0], computeArgBlockSize(argc, argv));
// Process command line arguments
// ignore argv[0]
argc--;
argv++;
// Everything up to '--' or first non '-' arg goes to the vm.
//
// The first argument after the VM args is the "parent dir", which
// is currently unused.
//
// After the parent dir, we expect one or more the following internal
// arguments :
//
// --zygote : Start in zygote mode
// --start-system-server : Start the system server.
// --application : Start in application (stand alone, non zygote) mode.
// --nice-name : The nice name for this process.
//
// For non zygote starts, these arguments will be followed by
// the main class name. All remaining arguments are passed to
// the main method of this class.
//
// For zygote starts, all remaining arguments are passed to the zygote.
// main function.
//
// Note that we must copy argument string values since we will rewrite the
// entire argument block when we apply the nice name to argv0.
int i;
for (i = 0; i < argc; i++) {
if (argv[i][0] != '-') {
break;
}
if (argv[i][1] == '-' && argv[i][2] == 0) {
++i; // Skip --.
break;
}
runtime.addOption(strdup(argv[i]));
}
// Parse runtime arguments. Stop at first unrecognized option.
bool zygote = false;
bool startSystemServer = false;
bool application = false;
String8 niceName;
String8 className;
++i; // Skip unused "parent dir" argument.
while (i < argc) {
const char* arg = argv[i++];
if (strcmp(arg, "--zygote") == 0) {
zygote = true;
niceName = ZYGOTE_NICE_NAME;
} else if (strcmp(arg, "--start-system-server") == 0) {
startSystemServer = true;
} else if (strcmp(arg, "--application") == 0) {
application = true;
} else if (strncmp(arg, "--nice-name=", 12) == 0) {
niceName.setTo(arg + 12);
} else if (strncmp(arg, "--", 2) != 0) {
className.setTo(arg);
break;
} else {
--i;
break;
}
}
Vector<String8> args;
if (!className.isEmpty()) {
// We're not in zygote mode, the only argument we need to pass
// to RuntimeInit is the application argument.
//
// The Remainder of args get passed to startup class main(). Make
// copies of them before we overwrite them with the process name.
args.add(application ? String8("application") : String8("tool"));
runtime.setClassNameAndArgs(className, argc - i, argv + i);
} else {
// We're in zygote mode.
maybeCreateDalvikCache();
if (startSystemServer) {
args.add(String8("start-system-server"));
}
char prop[PROP_VALUE_MAX];
if (property_get(ABI_LIST_PROPERTY, prop, NULL) == 0) {
LOG_ALWAYS_FATAL("app_process: Unable to determine ABI list from property %s.",
ABI_LIST_PROPERTY);
return 11;
}
String8 abiFlag("--abi-list=");
abiFlag.append(prop);
args.add(abiFlag);
// In zygote mode, pass all remaining arguments to the zygote
// main() method.
for (; i < argc; ++i) {
args.add(String8(argv[i]));
}
}
if (!niceName.isEmpty()) {
runtime.setArgv0(niceName.string());
set_process_name(niceName.string());
}
if (zygote) {
runtime.start("com.android.internal.os.ZygoteInit", args, zygote);
} else if (className) {
runtime.start("com.android.internal.os.RuntimeInit", args, zygote);
} else {
fprintf(stderr, "Error: no class name or --zygote supplied.\n");
app_usage();
LOG_ALWAYS_FATAL("app_process: no class name or --zygote supplied.");
return 10;
}
}
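The argument handling in main above is easier to follow in isolation. The following is a simplified, standalone sketch of the same parsing order using only the standard library; `LaunchPlan` and `parseAppProcessArgs` are hypothetical names, the nice name `"zygote"` is assumed for ZYGOTE_NICE_NAME, and the input is the argument list after argv[0].

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct LaunchPlan {
    bool zygote = false;
    bool startSystemServer = false;
    bool application = false;
    std::string niceName;
    std::string className;
    std::vector<std::string> vmOptions;
};

LaunchPlan parseAppProcessArgs(const std::vector<std::string>& args) {
    LaunchPlan plan;
    std::size_t i = 0;
    // Leading '-' arguments go to the VM; a bare "--" ends them.
    for (; i < args.size(); ++i) {
        if (args[i].empty() || args[i][0] != '-') break;
        if (args[i] == "--") { ++i; break; }
        plan.vmOptions.push_back(args[i]);
    }
    ++i;  // skip the unused "parent dir" argument
    while (i < args.size()) {
        const std::string& arg = args[i++];
        if (arg == "--zygote") {
            plan.zygote = true;
            plan.niceName = "zygote";  // assumed value of ZYGOTE_NICE_NAME
        } else if (arg == "--start-system-server") {
            plan.startSystemServer = true;
        } else if (arg == "--application") {
            plan.application = true;
        } else if (arg.compare(0, 12, "--nice-name=") == 0) {
            plan.niceName = arg.substr(12);
        } else if (arg.compare(0, 2, "--") != 0) {
            plan.className = arg;  // first non-option is the class to run
            break;
        } else {
            break;  // unrecognized option: stop parsing
        }
    }
    return plan;
}
```

With the arguments `-Xzygote /system/bin --zygote --start-system-server`, which is how a 32-bit zygote service is typically launched from init.rc, the sketch yields `zygote == true`, `startSystemServer == true`, and an empty class name, so the ZygoteInit branch is the one taken.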
Here runtime is an object of type AppRuntime, a subclass of AndroidRuntime, and the start method it calls is the parent's. The code of AndroidRuntime::start is as follows:
void AndroidRuntime::start(const char* className, const Vector<String8>& options, bool zygote)
{
ALOGD(">>>>>> START %s uid %d <<<<<<\n",
className != NULL ? className : "(unknown)", getuid());
static const String8 startSystemServer("start-system-server");
/*
* 'startSystemServer == true' means runtime is obsolete and not run from
* init.rc anymore, so we print out the boot start event here.
*/
for (size_t i = 0; i < options.size(); ++i) {
if (options[i] == startSystemServer) {
/* track our progress through the boot sequence */
const int LOG_BOOT_PROGRESS_START = 3000;
LOG_EVENT_LONG(LOG_BOOT_PROGRESS_START, ns2ms(systemTime(SYSTEM_TIME_MONOTONIC)));
}
}
const char* rootDir = getenv("ANDROID_ROOT");
if (rootDir == NULL) {
rootDir = "/system";
if (!hasDir("/system")) {
LOG_FATAL("No root directory specified, and /android does not exist.");
return;
}
setenv("ANDROID_ROOT", rootDir, 1);
}
//const char* kernelHack = getenv("LD_ASSUME_KERNEL");
//ALOGD("Found LD_ASSUME_KERNEL='%s'\n", kernelHack);
/* start the virtual machine */
JniInvocation jni_invocation;
jni_invocation.Init(NULL);
JNIEnv* env;
if (startVm(&mJavaVM, &env, zygote) != 0) {
return;
}
onVmCreated(env);
/*
* Register android functions.
*/
if (startReg(env) < 0) {
ALOGE("Unable to register all android natives\n");
return;
}
/*
* We want to call main() with a String array with arguments in it.
* At present we have two arguments, the class name and an option string.
* Create an array to hold them.
*/
jclass stringClass;
jobjectArray strArray;
jstring classNameStr;
stringClass = env->FindClass("java/lang/String");
assert(stringClass != NULL);
strArray = env->NewObjectArray(options.size() + 1, stringClass, NULL);
assert(strArray != NULL);
classNameStr = env->NewStringUTF(className);
assert(classNameStr != NULL);
env->SetObjectArrayElement(strArray, 0, classNameStr);
for (size_t i = 0; i < options.size(); ++i) {
jstring optionsStr = env->NewStringUTF(options.itemAt(i).string());
assert(optionsStr != NULL);
env->SetObjectArrayElement(strArray, i + 1, optionsStr);
}
/*
* Start VM. This thread becomes the main thread of the VM, and will
* not return until the VM exits.
*/
char* slashClassName = toSlashClassName(className);
jclass startClass = env->FindClass(slashClassName);
if (startClass == NULL) {
ALOGE("JavaVM unable to locate class '%s'\n", slashClassName);
/* keep going */
} else {
jmethodID startMeth = env->GetStaticMethodID(startClass, "main",
"([Ljava/lang/String;)V");
if (startMeth == NULL) {
ALOGE("JavaVM unable to find main() in '%s'\n", className);
/* keep going */
} else {
env->CallStaticVoidMethod(startClass, startMeth, strArray);
#if 0
if (env->ExceptionCheck())
threadExitUncaughtException(env);
#endif
}
}
free(slashClassName);
ALOGD("Shutting down VM\n");
if (mJavaVM->DetachCurrentThread() != JNI_OK)
ALOGW("Warning: unable to detach main thread\n");
if (mJavaVM->DestroyJavaVM() != 0)
ALOGW("Warning: VM did not shut down cleanly\n");
}
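Two small steps in start are worth isolating: the dotted class name is converted to the slash form that FindClass expects, and the String[] passed to main() is laid out with the class name in slot 0 followed by the options. A minimal standard-library sketch of both follows; `buildMainArgs` is a hypothetical helper name, and `toSlashClassName` here is a simplified stand-in for the AOSP function of the same name.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// "com.android.internal.os.ZygoteInit" -> "com/android/internal/os/ZygoteInit"
std::string toSlashClassName(std::string className) {
    std::replace(className.begin(), className.end(), '.', '/');
    return className;
}

// Mirrors the layout of strArray: element 0 is the class name,
// elements 1..n are the options in their original order.
std::vector<std::string> buildMainArgs(const std::string& className,
                                       const std::vector<std::string>& options) {
    std::vector<std::string> argv;
    argv.reserve(options.size() + 1);
    argv.push_back(className);
    argv.insert(argv.end(), options.begin(), options.end());
    return argv;
}
```

This layout is why the argument loop in ZygoteInit.main starts at index 1: index 0 holds the class name rather than an option.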
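This assembles the arguments and then, through a reflection-style JNI lookup (FindClass, GetStaticMethodID, CallStaticVoidMethod), calls the main method of frameworks\base\core\java\com\android\internal\os\ZygoteInit.java. The code of ZygoteInit.main is as follows: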
public static void main(String argv[]) {
// Mark zygote start. This ensures that thread creation will throw
// an error.
ZygoteHooks.startZygoteNoThreadCreation();
try {
Trace.traceBegin(Trace.TRACE_TAG_DALVIK, "ZygoteInit");
RuntimeInit.enableDdms();
// Start profiling the zygote initialization.
SamplingProfilerIntegration.start();
boolean startSystemServer = false;
String socketName = "zygote";
String abiList = null;
for (int i = 1; i < argv.length; i++) {
if ("start-system-server".equals(argv[i])) {
startSystemServer = true;
} else if (argv[i].startsWith(ABI_LIST_ARG)) {
abiList = argv[i].substring(ABI_LIST_ARG.length());
} else if (argv[i].startsWith(SOCKET_NAME_ARG)) {
socketName = argv[i].substring(SOCKET_NAME_ARG.length());
} else {
throw new RuntimeException("Unknown command line argument: " + argv[i]);
}
}
if (abiList == null) {
throw new RuntimeException("No ABI list supplied.");
}
registerZygoteSocket(socketName);
Trace.traceBegin(Trace.TRACE_TAG_DALVIK, "ZygotePreload");
EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_START,
SystemClock.uptimeMillis());
preload();
EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_END,
SystemClock.uptimeMillis());
Trace.traceEnd(Trace.TRACE_TAG_DALVIK);
// Finish profiling the zygote initialization.
SamplingProfilerIntegration.writeZygoteSnapshot();
// Do an initial gc to clean up after startup
Trace.traceBegin(Trace.TRACE_TAG_DALVIK, "PostZygoteInitGC");
gcAndFinalize();
Trace.traceEnd(Trace.TRACE_TAG_DALVIK);
Trace.traceEnd(Trace.TRACE_TAG_DALVIK);
// Disable tracing so that forked processes do not inherit stale tracing tags from
// Zygote.
Trace.setTracingEnabled(false);
// Zygote process unmounts root storage spaces.
Zygote.nativeUnmountStorageOnInit();
ZygoteHooks.stopZygoteNoThreadCreation();
if (startSystemServer) {
startSystemServer(abiList, socketName);
}
Log.i(TAG, "Accepting command socket connections");
runSelectLoop(abiList);
closeServerSocket();
} catch (MethodAndArgsCaller caller) {
caller.run();
} catch (Throwable ex) {
Log.e(TAG, "Zygote died with exception", ex);
closeServerSocket();
throw ex;
}
}
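What corresponds to the log here is the call to preload(), which loads all the classes, resources, and shared libraries; the log prints each step. After preloading, because app_main.cpp passed the start-system-server argument, startSystemServer is true and the code goes on to call startSystemServer(abiList, socketName) to start system_server. This is where my problem occurred. I added logging here and found that the logic was normal, so I kept digging. The call into the forkSystemServer method of the Zygote class was also normal, but the fork call never returned. The actual fork logic is carried out in the ForkAndSpecializeCommon function in frameworks\base\core\jni\com_android_internal_os_Zygote.cpp, which I have covered in earlier blog posts. Beyond this point I could not add any more logging, because the build failed whenever I did, so I had no choice but to backtrack.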
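For readers less familiar with fork semantics, the sketch below shows the pattern that forkSystemServer relies on: fork returns 0 in the child and the child's pid in the parent, and each side continues from the same point. It is plain POSIX code, not the AOSP implementation, and `runInChild` is a hypothetical helper name.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <functional>

// Runs childWork in a forked child process and returns its exit code
// to the parent, or -1 if forking or waiting fails.
int runInChild(const std::function<int()>& childWork) {
    const pid_t pid = fork();
    if (pid < 0) {
        return -1;  // fork failed, no child was created
    }
    if (pid == 0) {
        // Child: fork returned 0. Do the work and exit without
        // running the parent's cleanup handlers.
        _exit(childWork());
    }
    // Parent: fork returned the child's pid. Wait for it to finish.
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If anything on the way to or inside the fork blocks indefinitely, neither branch is ever reached, which looks exactly like a fork that never returns.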
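I started by deleting all the logic in my own cpp file, but the device still failed to boot after rebuilding. Since this was just a simple demo, there were only two files, one cpp and one mk, so I went on to check the mk, and that is where I found the answer. My mk depended on another .so library, and a cpp file in that library defined a global variable. That variable is constructed as soon as the .so is loaded, and its constructor contained some serial-port communication logic. In the failing scenario the system had not finished booting, and my device has no serial port at all, so the fork never returned. A screenshot of part of the code follows:
After commenting out that line of serial-port logic, the device booted normally. Below you can see that Zygote has started forking other processes, which means the fix worked, and I could finally breathe a sigh of relief.
With the problem solved, what is the lesson? When implementing a feature that has dependencies, start with the simplest possible version: depend on nothing, with as few conditions as possible. Get the simplest version working first, then add logic on top, so that if something breaks it is easy to track down. If we pile on the business logic from the start, with many dependencies, then there are many directions to investigate when something goes wrong and the difficulty rises sharply. In my case, had I kept following the Zygote trail as before, this problem would surely have worn me out.
A very hard-earned lesson; I hope it is useful to others. It is late, time to rest!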
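The root cause can be reproduced in miniature. In the sketch below, a global object's constructor runs when the binary (or shared library) is loaded, before any caller asks for it, while a function-local static is only constructed on first use. All names are hypothetical stand-ins: the original library's class is not shown in this post, and the serial-port I/O is replaced by a log entry so the ordering can be observed.

```cpp
#include <string>
#include <vector>

// Ordered record of what ran. A function-local static is used so it is
// safe to call from other objects' constructors during static init.
std::vector<std::string>& events() {
    static std::vector<std::string> log;
    return log;
}

// Stand-in for the class in the dependent library.
struct EagerSerialLink {
    EagerSerialLink() {
        // In the real library, serial-port I/O happened here. A blocking
        // call at this point stalls whichever process loads the library.
        events().push_back("eager: constructed at load time");
    }
};

// Global object: constructed during load, before main() starts.
EagerSerialLink gEagerLink;

struct LazySerialLink {
    LazySerialLink() {
        events().push_back("lazy: constructed on first use");
    }
};

// Alternative: defer construction until someone actually needs the link.
LazySerialLink& lazyLink() {
    static LazySerialLink link;  // constructed on the first call only
    return link;
}
```

By the time main() starts, `events()` already holds the eager entry, whereas the lazy entry only appears after the first call to `lazyLink()`. Deferring hardware access in this way is one option besides commenting the call out, at the cost of having to decide where the first use happens.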