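GSI testing is a test suite that Google added starting with Android 8.0 to check compatibility; I won't go into detail here. Before testing, system.img must be replaced via fastboot with the stock GSI image provided by Google, so that the whole upper layer is Google's.
After we started the test, the phone failed to boot following a reboot that some test cases require. The relevant error log is as follows: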
01-01 08:00:00.276 D/WifiApConfigStore( 1204): 2G band allowed channels are:1,6,11
01-01 08:00:00.335 E/IpManager.wlan0( 1204): ERROR Failed to disable IPv6: java.lang.IllegalStateException: command '1 interface ipv6 wlan0 disable' failed with '400 1 Failed to change IPv6 state (No such file or directory)'
01-01 08:00:00.335 E/HalDeviceManager( 1204): isSupported: called but mServiceManager is null!?
01-01 08:00:00.335 I/WifiNative-wlan0( 1204): Vendor HAL not supported, Ignore stop...
01-01 08:00:00.335 D/WificondControl( 1204): tearing down interfaces in wificond
01-01 08:00:00.337 D/CommandListener( 782): Clearing all IP addresses on wlan0
01-01 08:00:00.342 D/WifiController( 1204): isAirplaneModeOn = false, isWifiEnabled = false, isScanningAvailable = false
01-01 08:00:00.345 I/WifiService( 1204): getVerboseLoggingLevel uid=1000
01-01 08:00:00.345 W/libc ( 1204): Unable to set property "log.tag.WifiHAL" to "D": connection failed; errno=111 (Connection refused)
01-01 08:00:00.345 E/System ( 1204): ******************************************
01-01 08:00:00.346 E/System ( 1204): ************ Failure starting system services
01-01 08:00:00.346 E/System ( 1204): java.lang.RuntimeException: Failed to create service com.android.server.wifi.WifiService: service constructor threw an exception
01-01 08:00:00.346 E/System ( 1204): at com.android.server.SystemServiceManager.startService(SystemServiceManager.java:107)
01-01 08:00:00.346 E/System ( 1204): at com.android.server.SystemServiceManager.startService(SystemServiceManager.java:70)
01-01 08:00:00.346 E/System ( 1204): at com.android.server.SystemServer.startOtherServices(SystemServer.java:1072)
01-01 08:00:00.346 E/System ( 1204): at com.android.server.SystemServer.run(SystemServer.java:391)
01-01 08:00:00.346 E/System ( 1204): at com.android.server.SystemServer.main(SystemServer.java:267)
01-01 08:00:00.346 E/System ( 1204): at java.lang.reflect.Method.invoke(Native Method)
01-01 08:00:00.346 E/System ( 1204): at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:438)
01-01 08:00:00.346 E/System ( 1204): at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:787)
01-01 08:00:00.346 E/System ( 1204): Caused by: java.lang.reflect.InvocationTargetException
01-01 08:00:00.346 E/System ( 1204): at java.lang.reflect.Constructor.newInstance0(Native Method)
01-01 08:00:00.346 E/System ( 1204): at java.lang.reflect.Constructor.newInstance(Constructor.java:334)
01-01 08:00:00.346 E/System ( 1204): at com.android.server.SystemServiceManager.startService(SystemServiceManager.java:96)
01-01 08:00:00.346 E/System ( 1204): ... 7 more
01-01 08:00:00.346 E/System ( 1204): Caused by: java.lang.RuntimeException: failed to set system property
01-01 08:00:00.346 E/System ( 1204): at android.os.SystemProperties.native_set(Native Method)
01-01 08:00:00.346 E/System ( 1204): at android.os.SystemProperties.set(SystemProperties.java:155)
01-01 08:00:00.346 E/System ( 1204): at com.android.server.wifi.SystemPropertyService.set(SystemPropertyService.java:28)
01-01 08:00:00.346 E/System ( 1204): at com.android.server.wifi.WifiStateMachine.configureVerboseHalLogging(WifiStateMachine.java:1254)
01-01 08:00:00.346 E/System ( 1204): at com.android.server.wifi.WifiStateMachine.enableVerboseLogging(WifiStateMachine.java:1236)
01-01 08:00:00.346 E/System ( 1204): at com.android.server.wifi.WifiServiceImpl.enableVerboseLoggingInternal(WifiServiceImpl.java:2395)
01-01 08:00:00.346 E/System ( 1204): at com.android.server.wifi.WifiServiceImpl.<init>(WifiServiceImpl.java:461)
01-01 08:00:00.346 E/System ( 1204): at com.android.server.wifi.WifiService.<init>(WifiService.java:32)
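Looking at the log, the first thing that stands out is this error: Unable to set property "log.tag.WifiHAL" to "D": connection failed;
The property log.tag.WifiHAL could not be set to D because the connection failed (errno=111, Connection refused).
My first suspicion was a permission problem, but I checked the relevant sepolicy rules and found nothing abnormal. Moreover, the problem does not occur at all when Google's system.img is not flashed. The problem seemed to be a dead end, so let's first look at the code that reports the error: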
// source code
/bionic/libc/bionic/system_properties.cpp

  // Use proper protocol
  PropertyServiceConnection connection;
  if (!connection.IsValid()) {
    errno = connection.GetLastError();
    __libc_format_log(ANDROID_LOG_WARN,
                      "libc",
                      "Unable to set property \"%s\" to \"%s\": connection failed; errno=%d (%s)",
                      key,
                      value,
                      errno,
                      strerror(errno));
    return -1;
  }

  bool IsValid() {
    return socket_ != -1;
  }

  PropertyServiceConnection() : last_error_(0) {
    socket_ = ::socket(AF_LOCAL, SOCK_STREAM | SOCK_CLOEXEC, 0);
    if (socket_ == -1) {
      last_error_ = errno;
      return;
    }

    const size_t namelen = strlen(property_service_socket);
    sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    strlcpy(addr.sun_path, property_service_socket, sizeof(addr.sun_path));
    addr.sun_family = AF_LOCAL;
    socklen_t alen = namelen + offsetof(sockaddr_un, sun_path) + 1;

    if (TEMP_FAILURE_RETRY(connect(socket_, reinterpret_cast<sockaddr*>(&addr), alen)) == -1) {
      close(socket_);
      socket_ = -1;
      last_error_ = errno;
    }
  }
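The log reports errno=111, which on Linux is ECONNREFUSED. In the constructor quoted here, that means socket() succeeded and connect() was the call that failed. The following is only a minimal sketch, assuming a Linux host, of when connect() on an AF_LOCAL stream socket returns ECONNREFUSED rather than ENOENT. TryConnect and BindWithoutListening are hypothetical helper names, not bionic functions, and the sketch does not reproduce Android's property service.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <string>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

// Fills a sockaddr_un for `path` and returns the address length to pass on.
static socklen_t FillAddress(const std::string& path, sockaddr_un* addr) {
  memset(addr, 0, sizeof(*addr));
  addr->sun_family = AF_LOCAL;
  strncpy(addr->sun_path, path.c_str(), sizeof(addr->sun_path) - 1);
  return offsetof(sockaddr_un, sun_path) + strlen(addr->sun_path) + 1;
}

// Hypothetical helper: tries to connect to the Unix domain socket at `path`.
// Returns 0 on success, otherwise the errno set by socket() or connect().
int TryConnect(const std::string& path) {
  int fd = ::socket(AF_LOCAL, SOCK_STREAM | SOCK_CLOEXEC, 0);
  if (fd == -1) return errno;

  sockaddr_un addr;
  socklen_t alen = FillAddress(path, &addr);

  int result = 0;
  if (connect(fd, reinterpret_cast<sockaddr*>(&addr), alen) == -1) {
    result = errno;  // save errno before close(), which may change it
  }
  close(fd);
  return result;
}

// Hypothetical helper: creates the socket file by binding but never calls
// listen(), so the path exists while nobody accepts connections.
// Returns the bound fd (the caller closes it), or -1 on failure.
int BindWithoutListening(const std::string& path) {
  int fd = ::socket(AF_LOCAL, SOCK_STREAM | SOCK_CLOEXEC, 0);
  if (fd == -1) return -1;

  sockaddr_un addr;
  socklen_t alen = FillAddress(path, &addr);
  unlink(path.c_str());  // remove a stale socket file, if any
  if (bind(fd, reinterpret_cast<sockaddr*>(&addr), alen) == -1) {
    close(fd);
    return -1;
  }
  return fd;
}
```

With these helpers, connecting to a path that does not exist yields ENOENT, while connecting to a path that exists but has no listener yields ECONNREFUSED. Read this way, the errno in the log suggests the socket file was present but nothing was accepting connections on it at that moment; the sketch alone cannot say why.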
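At this point I brought in colleagues from the driver team, who added logs and call stacks in socket.c to check for anomalies. After a long cycle of building images and analyzing them, the socket layer turned out to be fine as well.
While the driver team kept investigating the socket direction, I went back through the log carefully and compared it with logs from other models that boot normally. The failing device shows these additional odd messages: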
01-01 00:12:55.699 1346 1346 F libc : CANNOT LINK EXECUTABLE "/vendor/bin/hw/[email protected]": library "[email protected]" not found
01-01 00:12:55.706 1346 1346 F libc : Fatal signal 6 (SIGABRT), code -6 in tid 1346 (vendor.qti.hard), pid 1346 (vendor.qti.hard)
01-01 00:12:55.734 1352 1352 I crash_dump32: obtaining output fd from tombstoned, type: kDebuggerdTombstone
01-01 00:12:55.741 701 701 I /system/bin/tombstoned: received crash request for pid 1346
01-01 00:12:55.742 1352 1352 I crash_dump32: performing dump of process 1346 (target tid = 1346)
01-01 00:12:55.742 1352 1352 F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
01-01 00:12:55.742 1352 1352 F DEBUG : Build fingerprint: 'Android/aosp_arm_a/generic_arm_a:8.1.0/OC-MR1/4498750:userdebug/test-keys'
01-01 00:12:55.742 1352 1352 F DEBUG : Revision: '0'
01-01 00:12:55.742 1352 1352 F DEBUG : ABI: 'arm'
01-01 00:12:55.742 1352 1352 F DEBUG : pid: 1346, tid: 1346, name: vendor.qti.hard >>> /vendor/bin/hw/[email protected] <<<
01-01 00:12:55.742 1352 1352 F DEBUG : signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
01-01 00:12:55.743 1352 1352 F DEBUG : Abort message: 'CANNOT LINK EXECUTABLE "/vendor/bin/hw/[email protected]": library "[email protected]" not found'
01-01 00:12:55.743 1352 1352 F DEBUG : r0 00000000 r1 00000542 r2 00000006 r3 00000008
01-01 00:12:55.743 1352 1352 F DEBUG : r4 00000542 r5 00000542 r6 be9246fc r7 0000010c
01-01 00:12:55.743 1352 1352 F DEBUG : r8 00000000 r9 be924720 sl be924990 fp 00000000
01-01 00:12:55.743 1352 1352 F DEBUG : ip be92599c sp be9246e8 lr b0ae9aa7 pc b0ae7fe0 cpsr 20000030
01-01 00:12:55.746 1352 1352 F DEBUG :
01-01 00:12:55.746 1352 1352 F DEBUG : backtrace:
01-01 00:12:55.746 1352 1352 F DEBUG : #00 pc 0005efe0 /system/bin/linker (__dl_abort+63)
01-01 00:12:55.746 1352 1352 F DEBUG : #01 pc 000101cb /system/bin/linker (__dl___linker_init+2794)
01-01 00:12:55.746 1352 1352 F DEBUG : #02 pc 00014f08 /system/bin/linker (_start+4)
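When comparing a failing log with a normal one, it can help to pull out every linker failure mechanically instead of spotting it by eye. The following is a rough sketch under that assumption: ParseLinkFailure and LinkFailure are hypothetical names, and the pattern only covers the "CANNOT LINK EXECUTABLE ... library ... not found" wording seen in this log.

```cpp
#include <regex>
#include <string>

// Hypothetical record for one linker failure found in a logcat line.
struct LinkFailure {
  std::string executable;  // binary the linker refused to start
  std::string library;     // dependency the linker could not find
};

// Returns true and fills `out` if `line` contains a linker failure of the
// form: CANNOT LINK EXECUTABLE "<exe>": library "<lib>" not found
bool ParseLinkFailure(const std::string& line, LinkFailure* out) {
  static const std::regex kPattern(
      R"re(CANNOT LINK EXECUTABLE "([^"]+)": library "([^"]+)" not found)re");
  std::smatch match;
  if (!std::regex_search(line, match, kPattern)) return false;
  out->executable = match[1].str();
  out->library = match[2].str();
  return true;
}
```

Running this over each line of both logs and comparing the results would surface a service that fails to link only on the GSI build. Other linker messages, such as a missing symbol, use different wording and would need their own patterns.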
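Searching the source for this service shows that it is started from vendor/qcom/proprietary/fastmmi/qmmi/hidl/[email protected]. Since VTS (GSI testing runs the CTS cases in the VTS environment against Google's image) exists precisely to test HIDL compatibility, this made the service even more suspect. After removing the service, the problem no longer reproduced.
We asked Qcom whether this service can be removed, and Qcom said removing it has no impact. Still, the root cause needed to be found. Checking the build files showed that the missing [email protected] library is built into system/lib. GSI testing replaces system.img with Google's system.img, which of course does not contain the libraries that Qualcomm builds in, and that is what triggers the failure above.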
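The root cause amounts to a vendor binary depending on a library that lives only on the system partition. The following is a minimal sketch of a check for that condition, assuming the dependency names and the vendor-side search directories are supplied by the caller. FindMissingLibraries is a hypothetical helper. It only tests whether a file exists and does not model the real linker's namespace configuration.

```cpp
#include <string>
#include <unistd.h>
#include <vector>

// Hypothetical helper: returns the entries of `needed` that exist in none of
// `search_dirs`. Existence is checked with access(F_OK) only.
std::vector<std::string> FindMissingLibraries(
    const std::vector<std::string>& needed,
    const std::vector<std::string>& search_dirs) {
  std::vector<std::string> missing;
  for (const std::string& name : needed) {
    bool found = false;
    for (const std::string& dir : search_dirs) {
      const std::string candidate = dir + "/" + name;
      if (access(candidate.c_str(), F_OK) == 0) {
        found = true;
        break;
      }
    }
    if (!found) missing.push_back(name);
  }
  return missing;
}
```

Passing only directories that survive a GSI flash (for example the vendor library directories, an assumption about the device layout) and the dependencies of each vendor service would flag any library that is installed under system/lib alone, which is the situation described here.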