Android的BUG(四) - Android app的卡死问题

做android,免不了要去运行一些跑分程序,常用的跑分程序有quadrant(象限),nbench,安兔兔等。作为系统工程师,对这些跑分程序都非常的不屑,这个只能是一个不客观的参考,但客户都喜欢拿这个比较,于是乎,各家各厂都或多或少会针对此做优化(甚至是或直接的作假),这可不是什么好现象,浮夸的厉害,到处放卫星,亩产万斤的,弄的我们这些老实人都很被动。不过这里就不说这些破事了。国内大家常用的跑分程序,就是安兔兔了,但是不知道大家有没有发现,安兔兔跑起来后,有时会卡住不动,除了返回键和触摸操作都没什么用。

出现这一问题时,home键可以退出,继续运行其他应用,说明系统此时还是正常的。Top,vmstat看一下,也没有高CPU/IO占用率的进程,ps –t看一下,也没发现D状态的线程。不过,<span style="ps –t倒是发现了一个现象:

app_47 9691 8787 610076 28768 ffffffff 2aac4424 S com.antutu.ABenchMark
app_47 9706 9691 609060 24476 80061b00 2aac5434 S com.antutu.ABenchMark

出现了同名的进程!这很奇怪~
看这两个进程的父进程, 一个是zygote, 另外一个,则是com.antutu.ABenchMark自己。由此大约可以推断出来,后一个进程是前一个进程fork出来的,fork后还没来得及exec就卡住了。

接上adb,看下两个进程的状态吧:

Process: 9691
(gdb) bt
#0 read () at bionic/libc/arch-mips/syscalls/read.S:13
#1 0x2ad6d7d0 in executeProcess (env=0x1c7e60, javaCommands=0x2c118ab8, javaEnvironment=0x0, javaWorkingDirectory=0x0, inDescriptor=0x2c118af0, outDescriptor=0x2c118b00, 
 errDescriptor=0x2c118b10, redirectErrorStream=0 '\000') at libcore/luni/src/main/native/java_lang_ProcessManager.cpp:165
#2 ProcessManager_exec (env=0x1c7e60, javaCommands=0x2c118ab8, javaEnvironment=0x0, javaWorkingDirectory=0x0, inDescriptor=0x2c118af0, outDescriptor=0x2c118b00, 
 errDescriptor=0x2c118b10, redirectErrorStream=0 '\000') at libcore/luni/src/main/native/java_lang_ProcessManager.cpp:240
#3 0x2b8cccc4 in call_it () at external/libffi/src/mips/o32.S:145
#4 0x0026eb78 in ?? ()

没什么特别的,确实是卡在process的fork中。

再看看process 9706

(gdb) info thread
* 1 Thread 9706 __futex_syscall4 () at bionic/libc/arch-mips/bionic/atomics_mips.S:218
(gdb) bt
#0 __futex_syscall4 () at bionic/libc/arch-mips/bionic/atomics_mips.S:218
#1 0x2aabc288 in _normal_lock (mutex=0x2ab2142c) at bionic/libc/bionic/pthread.c:951
#2 pthread_mutex_lock (mutex=0x2ab2142c) at bionic/libc/bionic/pthread.c:1041
#3 0x2aabf848 in dlmalloc (bytes=4096) at bionic/libc/bionic/dlmalloc.c:4261
#4 0x2aace004 in __smakebuf (fp=0x2ab21598) at bionic/libc/stdio/makebuf.c:62
#5 0x2aad4658 in __swsetup (fp=0x2ab21598) at bionic/libc/stdio/wsetup.c:73
#6 0x2aace6a0 in putc_unlocked (c=48, fp=<value optimized out>) at bionic/libc/stdio/putc.c:46
#7 0x2aace744 in putc (c=48, fp=0x2ab21598) at bionic/libc/stdio/putc.c:64
#8 0x2aae44c0 in cpuacct_add (uid=<value optimized out>) at bionic/libc/bionic/cpuacct.c:55
#9 0x2aae57b0 in fork () at bionic/libc/bionic/fork.c:57
#10 0x2ad6d764 in executeProcess (env=0x1c7e60, javaCommands=0x2c118ab8, javaEnvironment=0x0, javaWorkingDirectory=0x0, inDescriptor=0x2c118af0, outDescriptor=0x2c118b00, 
 errDescriptor=0x2c118b10, redirectErrorStream=0 '\000') at libcore/luni/src/main/native/java_lang_ProcessManager.cpp:92
#11 ProcessManager_exec (env=0x1c7e60, javaCommands=0x2c118ab8, javaEnvironment=0x0, javaWorkingDirectory=0x0, inDescriptor=0x2c118af0, outDescriptor=0x2c118b00, 
 errDescriptor=0x2c118b10, redirectErrorStream=0 '\000') at libcore/luni/src/main/native/java_lang_ProcessManager.cpp:240
#12 0x2b8cccc4 in call_it () at external/libffi/src/mips/o32.S:145
#13 0x0026eb78 in ?? ()
(gdb)

可以看到停在bionic的fork中了,具体函数是: cpuacct_add(getuid()); 中的fprintf。 错误原因从bt上看得到,又是锁的问题。

这个问题找到原因后,解决方法倒是没有花什么精力,直接google一下,问题和解决方法都出来了:

https://code.google.com/p/android/issues/detail?id=19916
Comment 1 by [email protected], Nov 23, 2011
This issue has also been found on ICS. cpuacct_add should not be doing anything that calls malloc() or free(). Proposed fixes are here:
http://review.omapzoom.org/16579
http://review.omapzoom.org/16573


现在越来越多的apk,会偷偷的fork进程,执行系统中的命令或dump调试信息,甚至如skype,会一下fork很多自己写的native服务,看着总归不是很爽。

你可能感兴趣的:(android)