1. The faulty unit numbered 066E was examined first. The Android logcat output shows that the initial symptom is zygote's AndroidRuntime restarting over and over.
The log looks like this:
07-26 07:48:43.625 2378 2378 D AndroidRuntime: >>>>>> START com.android.internal.os.ZygoteInit uid 0 <<<<<<
07-26 07:48:43.634 2378 2378 D AndroidRuntime: CheckJNI is OFF
07-26 07:48:43.635 2378 2378 I art : option[0]=-Xzygote
07-26 07:48:43.636 2378 2378 I art : option[1]=-Xstacktracefile:/data/anr/traces.txt
07-26 07:48:43.636 2378 2378 I art : option[2]=exit
07-26 07:48:43.636 2378 2378 I art : option[3]=vfprintf
07-26 07:48:43.636 2378 2378 I art : option[4]=sensitiveThread
07-26 07:48:43.636 2378 2378 I art : option[5]=-verbose:gc
07-26 07:48:43.636 2378 2378 I art : option[6]=-Xms16m
...................
07-26 07:48:43.625 2378 2378 D AndroidRuntime: >>>>>> START com.android.internal.os.ZygoteInit uid 0 <<<<<<
07-26 07:48:43.634 2378 2378 D AndroidRuntime: CheckJNI is OFF
07-26 07:48:43.635 2378 2378 I art : option[0]=-Xzygote
07-26 07:48:43.636 2378 2378 I art : option[1]=-Xstacktracefile:/data/anr/traces.txt
07-26 07:48:43.636 2378 2378 I art : option[2]=exit
07-26 07:48:43.636 2378 2378 I art : option[3]=vfprintf
07-26 07:48:43.636 2378 2378 I art : option[4]=sensitiveThread
07-26 07:48:43.636 2378 2378 I art : option[5]=-verbose:gc
07-26 07:48:43.636 2378 2378 I art : option[6]=-Xms16m
..........
07-26 07:48:43.625 2378 2378 D AndroidRuntime: >>>>>> START com.android.internal.os.ZygoteInit uid 0 <<<<<<
07-26 07:48:43.634 2378 2378 D AndroidRuntime: CheckJNI is OFF
07-26 07:48:43.635 2378 2378 I art : option[0]=-Xzygote
07-26 07:48:43.636 2378 2378 I art : option[1]=-Xstacktracefile:/data/anr/traces.txt
07-26 07:48:43.636 2378 2378 I art : option[2]=exit
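One way to confirm a restart loop without reading the whole capture by eye is to count how many times the ZygoteInit start banner appears. The following is only a minimal sketch: the class name ZygoteRestartCounter and its method are hypothetical helpers written for this report, and the marker string is copied from the log excerpt above.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: counts ZygoteInit start banners in captured logcat lines. */
class ZygoteRestartCounter {
    // Marker text copied from the AndroidRuntime line in the log excerpt.
    static final String START_MARKER = ">>>>>> START com.android.internal.os.ZygoteInit";

    /** Returns how many AndroidRuntime lines announce a new ZygoteInit start. */
    static int countStarts(List<String> logcatLines) {
        int count = 0;
        for (String line : logcatLines) {
            if (line.contains("AndroidRuntime") && line.contains(START_MARKER)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "07-26 07:48:43.625 2378 2378 D AndroidRuntime: >>>>>> START com.android.internal.os.ZygoteInit uid 0 <<<<<<",
            "07-26 07:48:43.634 2378 2378 D AndroidRuntime: CheckJNI is OFF",
            "07-26 07:48:43.625 2378 2378 D AndroidRuntime: >>>>>> START com.android.internal.os.ZygoteInit uid 0 <<<<<<");
        System.out.println("zygote starts seen: " + countStarts(sample));
    }
}
```

A count greater than one within a single boot capture is a strong hint that zygote is being restarted rather than started once.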
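2. The AndroidRuntime restarts endlessly because SystemServer fails to start PMS (PackageManagerService). The key error message is "System : java.lang.RuntimeException: There must be exactly one installer; found []", which means that no PackageInstaller apk was found.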
PackageManager: reconcileAppsData finished 112 packages
07-26 07:48:47.591 2471 2471 E PackageManager: There should probably be a verifier, but, none were found
07-26 07:48:47.592 2471 2471 E System : ******************************************
07-26 07:48:47.593 2471 2471 E System : ************ Failure starting system services
07-26 07:48:47.593 2471 2471 E System : java.lang.RuntimeException: There must be exactly one installer; found []
07-26 07:48:47.593 2471 2471 E System : at com.android.server.pm.PackageManagerService.getRequiredInstallerLPr(PackageManagerService.java:2875)
07-26 07:48:47.593 2471 2471 E System : at com.android.server.pm.PackageManagerService.<init>(PackageManagerService.java:2749)
07-26 07:48:47.593 2471 2471 E System : at com.android.server.pm.PackageManagerService.main(PackageManagerService.java:2005)
07-26 07:48:47.593 2471 2471 E System : at com.android.server.SystemServer.startBootstrapServices(SystemServer.java:486)
07-26 07:48:47.593 2471 2471 E System : at com.android.server.SystemServer.run(SystemServer.java:346)
07-26 07:48:47.593 2471 2471 E System : at com.android.server.SystemServer.main(SystemServer.java:230)
07-26 07:48:47.593 2471 2471 E System : at java.lang.reflect.Method.invoke(Native Method)
07-26 07:48:47.593 2471 2471 E System : at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:889)
07-26 07:48:47.593 2471 2471 E System : at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:779)
07-26 07:48:47.593 2471 2471 D AndroidRuntime: Shutting down VM
07-26 07:48:47.593 2471 2471 E AndroidRuntime: *** FATAL EXCEPTION IN SYSTEM PROCESS: main
07-26 07:48:47.593 2471 2471 E AndroidRuntime: java.lang.RuntimeException: There must be exactly one installer; found []
07-26 07:48:47.593 2471 2471 E AndroidRuntime: at com.android.server.pm.PackageManagerService.getRequiredInstallerLPr(PackageManagerService.java:2875)
07-26 07:48:47.593 2471 2471 E AndroidRuntime: at com.android.server.pm.PackageManagerService.<init>(PackageManagerService.java:2749)
07-26 07:48:47.593 2471 2471 E AndroidRuntime: at com.android.server.pm.PackageManagerService.main(PackageManagerService.java:2005)
07-26 07:48:47.593 2471 2471 E AndroidRuntime: at com.android.server.SystemServer.startBootstrapServices(SystemServer.java:486)
07-26 07:48:47.593 2471 2471 E AndroidRuntime: at com.android.server.SystemServer.run(SystemServer.java:346)
07-26 07:48:47.593 2471 2471 E AndroidRuntime: at com.android.server.SystemServer.main(SystemServer.java:230)
07-26 07:48:47.593 2471 2471 E AndroidRuntime: at java.lang.reflect.Method.invoke(Native Method)
07-26 07:48:47.593 2471 2471 E AndroidRuntime: at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:889)
07-26 07:48:47.593 2471 2471 E AndroidRuntime: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:779)
07-26 07:48:47.594 2471 2471 E AndroidRuntime: Error reporting crash
07-26 07:48:47.594 2471 2471 E AndroidRuntime: java.lang.NullPointerException: Attempt to invoke virtual method 'void android.app.ActivityThread$Profiler.stopProfiling()' on a null object reference
07-26 07:48:47.594 2471 2471 E AndroidRuntime: at android.app.ActivityThread.stopProfiling(ActivityThread.java:4868)
07-26 07:48:47.594 2471 2471 E AndroidRuntime: at com.android.internal.os.RuntimeInit$UncaughtHandler.uncaughtException(RuntimeInit.java:93)
07-26 07:48:47.594 2471 2471 E AndroidRuntime: at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:1068)
07-26 07:48:47.594 2471 2471 E AndroidRuntime: at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:1063)
07-26 07:48:47.594 2471 2471 I Process : Sending signal. PID: 2471 SIG: 9
07-26 07:48:47.637 488 488 I ServiceManager: service 'batterystats' died
07-26 07:48:47.637 488 488 I ServiceManager: service 'appops' died
07-26 07:48:47.637 488 488 I ServiceManager: service 'power' died
07-26 07:48:47.637 488 488 I ServiceManager: service 'display' died
07-26 07:48:47.637 488 488 I ServiceManager: service 'regionalization' died
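The PackageInstaller apk handles apk installation, permission checks, and permission granting. If the system has no PackageInstaller apk, PMS throws during startup, and Android keeps restarting and looking for this apk again.
The relevant code fragment is: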
new PackageManagerService()   // PMS starts
  ---> mRequiredInstallerPackage = getRequiredInstallerLPr();   // look up the PackageInstaller on the system
  ---> getRequiredInstallerLPr() {
           final Intent intent = new Intent(Intent.ACTION_INSTALL_PACKAGE);
           intent.addCategory(Intent.CATEGORY_DEFAULT);
           intent.setDataAndType(Uri.fromFile(new File("foo.apk")), PACKAGE_MIME_TYPE);
           // the three lines above define the criteria used to find the apk

           final List<ResolveInfo> matches = queryIntentActivitiesInternal(intent, PACKAGE_MIME_TYPE,
                   MATCH_SYSTEM_ONLY | MATCH_DIRECT_BOOT_AWARE | MATCH_DIRECT_BOOT_UNAWARE,
                   UserHandle.USER_SYSTEM);   // query the apk that meets the criteria, i.e. the PackageInstaller
           if (matches.size() == 1) {
               ResolveInfo resolveInfo = matches.get(0);
               if (!resolveInfo.activityInfo.applicationInfo.isPrivilegedApp()) {
                   throw new RuntimeException("The installer must be a privileged app");
               }
               return matches.get(0).getComponentInfo().packageName;
           } else {
               throw new RuntimeException("There must be exactly one installer; found " + matches);
               // if nothing is found, this exception is thrown and the system cannot boot normally;
               // this is where the reported error comes from
           }
       }
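To see why an empty query result produces exactly the message in the log, the check can be reproduced outside Android. The sketch below is a simplified model, not the framework code: InstallerCheckSketch and its Match class are hypothetical stand-ins for PackageManagerService and ResolveInfo, and only the two exception messages are taken from the fragment above.

```java
import java.util.Collections;
import java.util.List;

/** Simplified, hypothetical model of the "exactly one installer" rule. */
class InstallerCheckSketch {
    /** Stand-in for ResolveInfo: a package name plus a privileged flag. */
    static final class Match {
        final String packageName;
        final boolean privileged;

        Match(String packageName, boolean privileged) {
            this.packageName = packageName;
            this.privileged = privileged;
        }

        @Override
        public String toString() {
            return packageName;
        }
    }

    /** Returns the installer package, or throws like getRequiredInstallerLPr does. */
    static String requireSingleInstaller(List<Match> matches) {
        if (matches.size() == 1) {
            Match match = matches.get(0);
            if (!match.privileged) {
                throw new RuntimeException("The installer must be a privileged app");
            }
            return match.packageName;
        }
        // An empty list is rendered as "[]", which matches the "found []" text in the log.
        throw new RuntimeException("There must be exactly one installer; found " + matches);
    }

    public static void main(String[] args) {
        try {
            requireSingleInstaller(Collections.<Match>emptyList());
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The "found []" suffix therefore says that the query returned no candidates at all, not that a candidate was rejected.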
The function queryIntentActivitiesInternal looks up the apks on the system that match the intent:
private @NonNull List<ResolveInfo> queryIntentActivitiesInternal(Intent intent,
        String resolvedType, int flags, int userId) {
    if (!sUserManager.exists(userId)) return Collections.emptyList();

    // Here userId == UserHandle.USER_SYSTEM == 0, the default user created when the device is first initialized.
    // If USER_SYSTEM does not exist, the function returns immediately. On the faulty unit USER_SYSTEM does not exist.
    …….
    ResolveInfo xpResolveInfo = querySkipCurrentProfileIntents(matchingFilters, intent,
            resolvedType, flags, userId);
    …
    // query the apks that match the criteria
}
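The effect of the early return can be illustrated with a small model. This is a sketch under stated assumptions: UserGuardSketch, its parameters, and the package name in main are invented for illustration, and the set of known user ids stands in for what sUserManager.exists(userId) would consult.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of the user-existence guard at the top of the query. */
class UserGuardSketch {
    static final int USER_SYSTEM = 0;

    /**
     * Returns the candidate packages only when the user is known.
     * With an unknown user the candidates are never consulted.
     */
    static List<String> queryForUser(Set<Integer> knownUsers, int userId, List<String> candidates) {
        if (!knownUsers.contains(userId)) {
            return Collections.emptyList();
        }
        return candidates;
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList("com.example.installer"); // illustrative package name
        Set<Integer> healthy = new HashSet<>(Arrays.asList(USER_SYSTEM));
        Set<Integer> faulty = new HashSet<>(); // user 0 could not be read

        System.out.println("healthy unit: " + queryForUser(healthy, USER_SYSTEM, candidates));
        System.out.println("faulty unit:  " + queryForUser(faulty, USER_SYSTEM, candidates));
    }
}
```

In this model the installer apk can be present on the system partition and still never be returned, because the guard fires before any matching takes place.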
3. Android has supported multiple users since 4.2, and the component that manages them is UserManager. The default user on first boot is the system user, and the Android boot process queries UserManager for information about it. On the faulty unit the system user's configuration file is missing, so UserManager cannot set up the system user, and the system user therefore does not exist in UserManager.
The files UserManager uses for user management are stored on the data partition, in the user_de, user, system, and system_de directories.
UserManager log output when a healthy unit boots:
01-09 07:28:27.863 1302 1302 V UserManagerService: Found /data/user_de/0 with serial number 0
01-09 07:28:27.863 1302 1302 V UserManagerService: Found /data/user/0 with serial number 0
01-09 07:28:27.864 1302 1302 V UserManagerService: Found /data/system_de/0 with serial number 0
01-09 07:28:27.864 1302 1302 V UserManagerService: Found /data/system_ce/0 with serial number 0
UserManager log output on the faulty unit:
07-26 07:49:13.210 3879 3879 E UserManagerService: Unable to read user 0
Comparing the two logs makes it clear that the faulty unit cannot find user 0, i.e. the system user.
The UserManagerService functions that set up the system user at startup:
new UserManagerService()
  ---> readUserListLP()
  ---> readUserLP() {
           AtomicFile userFile =
                   new AtomicFile(new File(mUsersDir, Integer.toString(id) + XML_SUFFIX));
           // here userFile = /data/system/users/0.xml; the faulty unit has no 0.xml
           fis = userFile.openRead();
           XmlPullParser parser = Xml.newPullParser();
           parser.setInput(fis, StandardCharsets.UTF_8.name());
           int type;
           while ((type = parser.next()) != XmlPullParser.START_TAG
                   && type != XmlPullParser.END_DOCUMENT) {
               // Skip
           }

           if (type != XmlPullParser.START_TAG) {
               // if the file does not exist or cannot be parsed, the following error is logged
               Slog.e(LOG_TAG, "Unable to read user " + id);
               return null;
           }
       }
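The "scan forward until the first start tag" test can be tried on a desktop JVM against a copy of the user file pulled from a device. The sketch below is an approximation only: it uses the JDK's StAX reader (javax.xml.stream) as a stand-in for Android's XmlPullParser, whose behavior on empty input may differ, and the class name UserFileProbe and the sample XML string are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical probe: does this stream contain at least one XML start tag? */
class UserFileProbe {
    static boolean hasStartTag(InputStream in) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
            // Skip everything until the first element, as the loop in readUserLP does.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return true;
                }
            }
            return false;
        } catch (XMLStreamException e) {
            // Empty or malformed content is treated here as "unable to read user".
            return false;
        }
    }

    private static InputStream bytes(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Illustrative content only; the real 0.xml has more attributes and children.
        System.out.println("sample user file: " + hasStartTag(bytes("<user id=\"0\"></user>")));
        System.out.println("empty file:       " + hasStartTag(bytes("")));
    }
}
```

To check a real file, the same method can be given a FileInputStream opened on a local copy of /data/system/users/0.xml.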
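4. Everything above is based on the Android log. In short: some files on the /data partition that UserManagerService depends on are missing, so the PMS query for the PackageInstaller apk returns nothing and PMS throws; the PMS exception in turn makes zygote crash and restart.
5. The most important error information in the kernel log is: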
[ 9.411816] EXT4-fs (mmcblk0p24): mounted filesystem with ordered data mode. Opts: barrier=1,discard
[ 9.420235] fs_mgr: __mount(source=/dev/block/bootdevice/by-name/system,target=/system,type=ext4)=0
[ 9.620377] EXT4-fs warning (device mmcblk0p50): ext4_clear_journal_err:4669: Filesystem error recorded from previous mount: IO failure
[ 9.631529] EXT4-fs warning (device mmcblk0p50): ext4_clear_journal_err:4670: Marking fs in need of filesystem check.
[ 9.643172] EXT4-fs (mmcblk0p50): warning: mounting fs with errors, running e2fsck is recommended
[ 9.656426] EXT4-fs (mmcblk0p50): recovery complete
[ 9.660770] EXT4-fs (mmcblk0p50): mounted filesystem with ordered data mode. Opts: barrier=1,noauto_da_alloc,discard
[ 9.671052] fs_mgr: __mount(source=/dev/block/bootdevice/by-name/userdata,target=/data,type=ext4)=0
[ 9.683910] EXT4-fs (mmcblk0p27): warning: maximal mount count reached, running e2fsck is recommended
[ 9.693249] EXT4-fs (mmcblk0p27): mounted filesystem with ordered data mode. Opts: (null)
[ 9.700502] fs_mgr: __mount(source=/dev/block/bootdevice/by-name/factory,target=/factory,type=ext4)=0
[ 9.712334] EXT4-fs (mmcblk0p28): warning: maximal mount count reached, running e2fsck is recommended
[ 9.721616] EXT4-fs (mmcblk0p28): mounted filesystem with ordered data mode. Opts: (null)
[ 9.728915] fs_mgr: __mount(source=/dev/block/bootdevice/by-name/esim,target=/esim,type=ext4)=0
[ 9.741321] EXT4-fs (mmcblk0p29): warning: maximal mount count reached, running e2fsck is recommended
[ 9.751306] EXT4-fs (mmcblk0p29): mounted filesystem with ordered data mode. Opts: (null)
[ 9.758620] fs_mgr: __mount(source=/dev/block/bootdevice/by-name/sdcard,target=/sdcard,type=ext4)=0
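The log above shows that only mmcblk0p24 (system) mounts without any warning. mmcblk0p50 (data) carries a file system error recorded from the previous mount (IO failure) and is mounted with errors, while mmcblk0p27 (factory), mmcblk0p28 (esim), and mmcblk0p29 (sdcard) have reached their maximal mount count. For all four, the kernel recommends running e2fsck.
These file systems need to be repaired with the file system tool e2fsck. However, even after a repair with e2fsck, the same error messages appear again on the next boot.
A possible cause is that the Android file system was damaged during flashing, so that some files on the /data partition were lost.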
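When comparing several units, it can help to reduce a kernel log to one status per block device. The following is a rough sketch: the class Ext4WarningScan, its Status values, and the matching rules are assumptions made for this report, keyed only on phrases that appear in the kernel log above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical scanner: keeps the worst EXT4-fs status seen per block device. */
class Ext4WarningScan {
    /** Ordered from least to most severe. */
    enum Status { CLEAN, MOUNT_COUNT_WARNING, RECORDED_ERROR }

    // Matches both "EXT4-fs (mmcblk0p27): ..." and "EXT4-fs warning (device mmcblk0p50): ...".
    private static final Pattern DEVICE = Pattern.compile("EXT4-fs[^(]*\\((?:device )?(\\w+)\\)");

    static Map<String, Status> classify(List<String> kernelLines) {
        Map<String, Status> worst = new LinkedHashMap<>();
        for (String line : kernelLines) {
            Matcher m = DEVICE.matcher(line);
            if (!m.find()) {
                continue; // not an EXT4-fs line, e.g. an fs_mgr line
            }
            Status status = Status.CLEAN;
            if (line.contains("IO failure") || line.contains("mounting fs with errors")) {
                status = Status.RECORDED_ERROR;
            } else if (line.contains("maximal mount count reached")) {
                status = Status.MOUNT_COUNT_WARNING;
            }
            // Keep whichever status is more severe for this device.
            worst.merge(m.group(1), status, (a, b) -> a.compareTo(b) >= 0 ? a : b);
        }
        return worst;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "[ 9.411816] EXT4-fs (mmcblk0p24): mounted filesystem with ordered data mode. Opts: barrier=1,discard",
            "[ 9.620377] EXT4-fs warning (device mmcblk0p50): ext4_clear_journal_err:4669: Filesystem error recorded from previous mount: IO failure",
            "[ 9.683910] EXT4-fs (mmcblk0p27): warning: maximal mount count reached, running e2fsck is recommended");
        System.out.println(classify(sample));
    }
}
```

Separating a recorded error from a mount-count warning matters here, because only the data partition shows evidence of an actual I/O failure.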
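Summary:
Boot failure analysis results for the Translator 2.0 device:
1. The Android file system was damaged during flashing, and some files on the /data partition were lost.
2. Files such as /data/system/users/0.xml are missing from the /data partition, so UserManagerService cannot set up the system user at startup.
3. With no system user in UserManagerService, the PMS function queryIntentActivitiesInternal fails its lookup for the system user.
4. Because the system user does not exist, queryIntentActivitiesInternal returns an empty list as soon as PMS calls it.
5. PMS fails to obtain the PackageInstaller, so the AndroidRuntime cannot start normally and crashes.
6. The AndroidRuntime crash causes zygote to restart over and over.
7. Tracing the file system damage any further requires more specialized analysis from the ODM company.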