[Android] Analysis of a reboot caused by uninstalling an app installed on the SD card

Problem description

Install an APK on Android L.

Move it to the SD card.


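Upgrade the system to Android M.

Open the app (this step is required).

Uninstall it.

The device then reboots.

The system process has been killed.

An app we write ourselves reproduces the problem as well.

The move is performed from the app's App info screen.

On Android 5, once an app has been moved to the SD card, its APK is mounted under /mnt/asec.

These reproduction steps are tedious, and as SD cards fade out of use the chance of hitting the problem is small.

Still, for the sake of completeness, the problem deserves an analysis.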

 

Why does the reboot happen?

Check the log:

05-01 09:48:27.582086  1163  1192 I MountService: unmountSecureContainer,id=com.UCMobile-2, force=true
05-01 09:48:27.672052  1163  1192 D VoldConnector: SND -> {42 asec unmount com.UCMobile-2 force}
05-01 09:48:27.672459   438   449 D FrameworkListener: dispatchCommand data = (42 asec unmount com.UCMobile-2 force)
05-01 09:48:27.672553   438   449 D VoldCmdListener: asec unmount com.UCMobile-2 force
05-01 09:48:27.673646   438   449 W Vold   : com.UCMobile-2 unmount attempt 1 failed (Device or resource busy)
05-01 09:48:28.349615   438   449 E ProcessKiller: Process system_server(1163) has open file /mnt/asec/com.UCMobile-2/base.apk
05-01 09:48:29.870884   438   449 W Vold   : com.UCMobile-2 unmount attempt 2 failed (Device or resource busy)
05-01 09:48:30.681231   438   449 E ProcessKiller: Process system_server(1163) has open file /mnt/asec/com.UCMobile-2/base.apk
05-01 09:48:32.243998   438   449 W Vold   : com.UCMobile-2 unmount attempt 3 failed (Device or resource busy)
05-01 09:48:33.041623   438   449 E ProcessKiller: Process system_server(1163) has open file /mnt/asec/com.UCMobile-2/base.apk
05-01 09:48:33.045726   438   449 W ProcessKiller: Sending Terminated to process 1163

 

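Combining the log above with the relevant code, we can work out roughly how the reboot happens. In this run the vold process has pid 438 and system_server has pid 1163. When the app com.UCMobile, installed on the SD card, is uninstalled, system_server sends the command {42 asec unmount com.UCMobile-2 force} to vold over a socket. On receiving it, vold tries to unmount /mnt/asec/com.UCMobile-2 by calling the system umount function, but the call fails with "Device or resource busy" because system_server is still using /mnt/asec/com.UCMobile-2. After each failure vold walks /proc to find the processes that hold resources under /mnt/asec/com.UCMobile-2. Once the unmount has failed three times in a row, vold starts sending those processes SIGTERM (15), and later even SIGKILL (9). Unfortunately, system_server has /mnt/asec/com.UCMobile-2/base.apk open and is therefore on the kill list, so system_server is killed without mercy and the system reboots. It is a sad story.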

 

Following this log, we found

android/system/vold/Process.cpp, which is where the ProcessKiller log lines are printed:

void Process::killProcessesWithOpenFiles(const char *path, int signal)

 

 

Let's see how system_server ends up being killed.

The call path goes through

/system/vold/VolumeManager.cpp

 

#define UNMOUNT_RETRIES 5
#define UNMOUNT_SLEEP_BETWEEN_RETRY_MS (1000 * 1000)
// The id passed in here is com.UCMobile-2
int VolumeManager::unmountAsec(const char *id, bool force) {
    char asecFileName[255];
    char mountPoint[255];
 
    if (!isLegalAsecId(id)) {
        SLOGE("unmountAsec: Invalid asec id \"%s\"", id);
        errno = EINVAL;
        return -1;
    }
 
    if (findAsec(id, asecFileName, sizeof(asecFileName))) {
        SLOGE("Couldn't find ASEC %s", id);
        return -1;
    }
    // Defined elsewhere: const char *VolumeManager::ASECDIR = "/mnt/asec";
    int written = snprintf(mountPoint, sizeof(mountPoint), "%s/%s", VolumeManager::ASECDIR, id);
    if ((written < 0) || (size_t(written) >= sizeof(mountPoint))) {
        SLOGE("ASEC unmount failed for %s: couldn't construct mountpoint", id);
        return -1;
    }
 
    char idHash[33];
    if (!asecHash(id, idHash, sizeof(idHash))) {
        SLOGE("Hash of '%s' failed (%s)", id, strerror(errno));
        return -1;
    }
    // Perform the unmount
    return unmountLoopImage(id, idHash, asecFileName, mountPoint, force);
}

 

// Here id is com.UCMobile-2 and mountPoint is /mnt/asec/com.UCMobile-2

int VolumeManager::unmountLoopImage(const char *id, const char *idHash,
        const char *fileName, const char *mountPoint, bool force) {
    if (!isMountpointMounted(mountPoint)) {
        SLOGE("Unmount request for %s when not mounted", id);
        errno = ENOENT;
        return -1;
    }
 
    int i, rc;
    // Retry loop; leave it as soon as umount succeeds
    for (i = 1; i <= UNMOUNT_RETRIES; i++) {
        // Call umount to unmount the mount point; the error code is stored in errno.
        // umount returns 0 on success.
        rc = umount(mountPoint);
        // rc == 0 means the unmount succeeded, so leave the loop
        if (!rc) {
            break;
        }
        if (rc && (errno == EINVAL || errno == ENOENT)) {
            SLOGI("Container %s unmounted OK", id);
            rc = 0;
            break;
        }
        // umount failed; log the reason
        SLOGW("%s unmount attempt %d failed (%s)",
              id, i, strerror(errno));

        int signal = 0; // default is to just complain

        if (force) {
            // i is 4 or 5: if SIGTERM was not enough to kill the target process,
            // escalate to the heavy weapon, SIGKILL
            if (i > (UNMOUNT_RETRIES - 2))
                signal = SIGKILL;
            // i is 3
            else if (i > (UNMOUNT_RETRIES - 3))
                signal = SIGTERM;
        }
        // Report the processes holding the mount point and, if signal != 0, signal them
        Process::killProcessesWithOpenFiles(mountPoint, signal);
        // Sleep for 1 second
        usleep(UNMOUNT_SLEEP_BETWEEN_RETRY_MS);
    }
    // part of the code omitted
    … …
}
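
The escalation rule in the loop above can be checked in isolation. The following is a minimal Java sketch, not vold code: the class and method names are hypothetical, and signals are represented by their names only. It reproduces just the two comparisons against UNMOUNT_RETRIES.

```java
// Hypothetical sketch of the signal escalation in unmountLoopImage().
// Names are illustrative; only the comparisons mirror the listing above.
final class UnmountRetrySketch {
    static final int UNMOUNT_RETRIES = 5;

    /** Returns the name of the signal used after a failed attempt, or "none". */
    static String signalForAttempt(int attempt, boolean force) {
        if (!force) {
            return "none"; // without force, the offending processes are only logged
        }
        if (attempt > UNMOUNT_RETRIES - 2) {
            return "SIGKILL"; // attempts 4 and 5
        } else if (attempt > UNMOUNT_RETRIES - 3) {
            return "SIGTERM"; // attempt 3
        }
        return "none"; // attempts 1 and 2
    }

    public static void main(String[] args) {
        for (int i = 1; i <= UNMOUNT_RETRIES; i++) {
            System.out.println("attempt " + i + " -> " + signalForAttempt(i, true));
        }
    }
}
```

This agrees with the log: attempts 1 and 2 only report that system_server has the file open, and "Sending Terminated to process 1163" appears after attempt 3.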

 

system/vold/Process.cpp

// Kill the processes that have files open under path.
// In this example path is /mnt/asec/com.UCMobile-2.
void Process::killProcessesWithOpenFiles(const char *path, int signal) {
    DIR*    dir;
    struct dirent* de;

    // Open the /proc directory; return if that fails
    if (!(dir = opendir("/proc"))) {
        SLOGE("opendir failed (%s)", strerror(errno));
        return;
    }

    // Walk the entries of /proc
    while ((de = readdir(dir))) {
        // Derive the pid from the entry name; -1 means the name is not a numeric pid
        int pid = getPid(de->d_name);
        char name[PATH_MAX];

        if (pid == -1)
            continue;
        // Look up the process name for pid and store it in name
        getProcessName(pid, name, sizeof(name));

        char openfile[PATH_MAX];

        // Check whether any symlink in /proc/<pid>/fd points below path,
        // i.e. whether process pid has a file under path open
        if (checkFileDescriptorSymLinks(pid, path, openfile, sizeof(openfile))) {
            SLOGE("Process %s (%d) has open file %s", name, pid, openfile);
        } else if (checkFileMaps(pid, path, openfile, sizeof(openfile))) {
            // Check /proc/<pid>/maps for memory mappings
            SLOGE("Process %s (%d) has open filemap for %s", name, pid, openfile);
        } else if (checkSymLink(pid, path, "cwd")) {
            // Check whether the /proc/<pid>/cwd symlink matches
            SLOGE("Process %s (%d) has cwd within %s", name, pid, path);
        } else if (checkSymLink(pid, path, "root")) {
            // Check whether the /proc/<pid>/root symlink matches
            SLOGE("Process %s (%d) has chroot within %s", name, pid, path);
        } else if (checkSymLink(pid, path, "exe")) {
            // Check whether the /proc/<pid>/exe symlink matches
            SLOGE("Process %s (%d) has executable path within %s", name, pid, path);
        } else {
            // Nothing found; move on to the next entry
            continue;
        }

        if (signal != 0) {
            SLOGW("Sending %s to process %d", strsignal(signal), pid);
            // Send the signal to the process
            kill(pid, signal);
        }
    }
    closedir(dir);
}

 

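killProcessesWithOpenFiles kills every process that is holding /mnt/asec/com.UCMobile-2. The intent is to shut down whatever is keeping the resource busy, but here that includes system_server, and killing it makes the system reboot.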
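
To make the /proc scan more concrete, here is a minimal Java sketch of the file-descriptor check. It is an illustration under assumptions, not vold's implementation: the class and method names are hypothetical, and the matching rule (a link target counts if it equals the mount point or lies below it) is an assumption, since the post does not show the helper that checkFileDescriptorSymLinks uses.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: find files a process has open under a mount point
// by resolving the symlinks in /proc/<pid>/fd.
final class OpenFileScanSketch {

    /** Assumed rule: target is the mount point itself or a path below it. */
    static boolean isUnderMountPoint(String target, String mountPoint) {
        if (mountPoint.length() <= 1 || !target.startsWith(mountPoint)) {
            return false;
        }
        if (mountPoint.endsWith("/")) {
            return true;
        }
        // Avoid partial matches such as ".../com.UCMobile-20" for ".../com.UCMobile-2"
        return target.length() == mountPoint.length()
                || target.charAt(mountPoint.length()) == '/';
    }

    /** Returns the link targets under mountPoint; empty if the pid cannot be inspected. */
    static List<String> openFilesUnder(int pid, String mountPoint) {
        List<String> hits = new ArrayList<>();
        Path fdDir = Paths.get("/proc", String.valueOf(pid), "fd");
        try (DirectoryStream<Path> fds = Files.newDirectoryStream(fdDir)) {
            for (Path fd : fds) {
                try {
                    String target = Files.readSymbolicLink(fd).toString();
                    if (isUnderMountPoint(target, mountPoint)) {
                        hits.add(target);
                    }
                } catch (IOException | RuntimeException e) {
                    // The descriptor was closed meanwhile or is not readable; skip it
                }
            }
        } catch (IOException | RuntimeException e) {
            // The process is gone, /proc is unavailable, or access is denied
        }
        return hits;
    }

    public static void main(String[] args) {
        String mountPoint = "/mnt/asec/com.UCMobile-2";
        System.out.println(isUnderMountPoint(mountPoint + "/base.apk", mountPoint));
        System.out.println(isUnderMountPoint("/mnt/asec/com.UCMobile-20/base.apk", mountPoint));
    }
}
```

In the failing scenario, a scan of system_server's descriptors would report /mnt/asec/com.UCMobile-2/base.apk, which is exactly what the ProcessKiller log line shows.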


Attempted fix

Block the signal to system_server so that vold cannot kill it.

void Process::killProcessesWithOpenFiles(const char *path, int signal) {
    DIR*    dir;
    struct dirent* de;

    if (!(dir = opendir("/proc"))) {
        SLOGE("opendir failed (%s)", strerror(errno));
        return;
    }

    while ((de = readdir(dir))) {
        int pid = getPid(de->d_name);
        char name[PATH_MAX];

        if (pid == -1)
            continue;
        getProcessName(pid, name, sizeof(name));

        // openfile receives the name of the matching file
        char openfile[PATH_MAX];

        if (checkFileDescriptorSymLinks(pid, path, openfile, sizeof(openfile))) {
            SLOGE("Process %s (%d) has open file %s, path=[%s]", name, pid, openfile, path);
        } else if (checkFileMaps(pid, path, openfile, sizeof(openfile))) {
            SLOGE("Process %s (%d) has open filemap for %s", name, pid, openfile);
        } else if (checkSymLink(pid, path, "cwd")) {
            SLOGE("Process %s (%d) has cwd within %s", name, pid, path);
        } else if (checkSymLink(pid, path, "root")) {
            SLOGE("Process %s (%d) has chroot within %s", name, pid, path);
        } else if (checkSymLink(pid, path, "exe")) {
            SLOGE("Process %s (%d) has executable path within %s", name, pid, path);
        } else {
            continue;
        }

        if (signal != 0) {
            // test: never signal system_server; checked before logging so the
            // log does not claim a signal was sent
            if (strcmp(name, "system_server") == 0) {
                SLOGW("do not kill system_server, skip");
                continue;
            }
            SLOGW("Sending %s to process %d", strsignal(signal), pid);
            kill(pid, signal);
        }
    }
    closedir(dir);
}

       


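When I analyzed this problem I was not familiar with how an app gets opened. To find out quickly where base.apk is loaded, I decided to delete base.apk and see what happens.

I installed UC Browser on a phone as a test, rooted the phone, deleted the corresponding base.apk (/data/app/com.UCMobile-1/base.apk), and then tapped the UC icon on the home screen. An error message appears and the app cannot be opened.

The relevant log: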

 

01-02 01:13:58.043: I/ActivityManager(1062): START u0 {act=android.intent.action.MAIN cat=[android.intent.category.LAUNCHER] flg=0x10200000 cmp=com.UCMobile/.main.UCMobile (has extras)} from uid 10009 from pid 1705 on display 0
01-02 01:13:58.044: W/asset(1062): Asset path /data/app/com.UCMobile-1/base.apk is neither a directory nor file (type=1).
01-02 01:13:58.045: V/WindowManager(1062): Looking for focus: 10 = Window{c3f667e u0 StatusBar}, flags=-2122055608, canReceive=false
01-02 01:13:58.045: V/WindowManager(1062): findFocusedWindow: Found new focus @ 1 = Window{9f6bd81 u0 com.android.launcher3/com.android.launcher3.Launcher}
01-02 01:13:58.047: I/BufferQueueProducer(396): [com.android.launcher3/com.android.launcher3.Launcher](this:0x7f8baca000,id:81,api:1,p:1705,c:396) queueBuffer: fps=0.10 dur=10490.40 max=10490.40 min=10490.40
01-02 01:13:58.050: W/asset(1062): Asset path /data/app/com.UCMobile-1/base.apk is neither a directory nor file (type=1).
01-02 01:13:58.052: D/PowerManagerService(1062): acquireWakeLockInternal: lock=170577702, flags=0x1, tag="*launch*", ws=WorkSource{10072}, uid=1000, pid=1062

 

Searching for the keywords in this log leads to the addAssetPath method in frameworks/base/libs/androidfw/AssetManager.cpp. With that, we know where to aim and can start reading AssetManager.

 

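About AssetManager

AssetManager is the resource manager of the Android system. It loads and manages the data resources inside APK files. In application development you can use AssetManager to access the resources in the assets directory:

// Get the AssetManager
AssetManager am = getApplicationContext().getAssets();
// List the files in the assets directory
String[] filePathList = am.list("");
// Open a file in the assets directory, e.g. test.txt
InputStream inStream = am.open("test.txt");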

 

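The log above shows AssetManager trying to read an APK file such as /data/app/com.UCMobile-1/base.apk in order to load resources. base.apk is a copy of the original APK installation package; if you pull it off the phone, it can be installed and used. This is sometimes a handy way to obtain an app's installation package.

The Java layer has an AssetManager.java that corresponds to AssetManager.cpp. The Java AssetManager provides the public access interface, while most of the functionality is implemented by the C++ AssetManager.

Related files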

frameworks/base/libs/androidfw/AssetManager.cpp

frameworks/base/core/java/android/content/res/AssetManager.java

frameworks/base/core/jni/android_util_AssetManager.cpp

frameworks/base/libs/androidfw/ZipFileRO.cpp



Checking for a memory leak in the Java layer

In the unmountSecureContainer method of MountService.java, a GC is performed before the unmount command is sent to vold:

    public int unmountSecureContainer(String id, boolean force) {
        … …
        /*
         * Force a GC to make sure AssetManagers in other threads of the
         * system_server are cleaned up. We have to do this since AssetManager
         * instances are kept as a WeakReference and it's possible we have files
         * open on the external storage.
         */
        Runtime.getRuntime().gc();

        int rc = StorageResultCode.OperationSucceeded;
        try {
            final Command cmd = new Command("asec", "unmount", id);
            if (force) {
                cmd.appendArg("force");
            }
            // Send the command to vold
            mConnector.execute(cmd);
        } catch (NativeDaemonConnectorException e) {
            int code = e.getCode();
            if (code == VoldResponseCode.OpFailedStorageBusy) {
                rc = StorageResultCode.OperationFailedStorageBusy;
            } else {
                rc = StorageResultCode.OperationFailedInternalError;
            }
        }

        … …
        return rc;
    }

 

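Clearly, Google's developers were aware that a GC is needed here so that AssetManager instances get released.

The place where an AssetManager is created is easy to find: it is wherever new AssetManager() appears. But how does Java end up calling finalize(), and from there the destructor of the C++ AssetManager?

We kept trying different angles of analysis.

In the earlier analysis we saw that the C++ AssetManager code has no memory leak: as long as the AssetManager object is released correctly, the open base.apk is released as well. So could a leak of the Java AssetManager be what keeps the C++ AssetManager from being released? And where are Java AssetManager objects created and released?

Staring at the sea of code, one cannot help sighing: it is somewhere on this mountain, but the clouds are too deep to tell where.

Let's start from the creation of the AssetManager object.

Where base.apk gets opened

Tapping an app loads its base.apk. We added a stack-trace print to the addAssetPath method of the Java AssetManager class.

The resulting call stack is: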

android.content.res.AssetManager.addAssetPath(AssetManager.java:653)
android.app.ResourcesManager.getTopLevelResources(ResourcesManager.java:221)
android.app.ActivityThread.getTopLevelResources(ActivityThread.java:1854)
android.app.LoadedApk.getResources(LoadedApk.java:558)
android.app.ContextImpl.<init>(ContextImpl.java:1884)
android.app.ContextImpl.createPackageContextAsUser(ContextImpl.java:1733)
android.app.ContextImpl.createPackageContextAsUser(ContextImpl.java:1718)
com.android.server.AttributeCache.get(AttributeCache.java:114)
com.android.server.am.ActivityRecord.<init>(ActivityRecord.java:564)
com.android.server.am.ActivityStackSupervisor.startActivityLocked(ActivityStackSupervisor.java:1763)
com.android.server.am.ActivityStackSupervisor.startActivityMayWait(ActivityStackSupervisor.java:1153)
com.android.server.am.ActivityManagerService.startActivityAsUser(ActivityManagerService.java:4271)
com.android.server.am.ActivityManagerService.startActivity(ActivityManagerService.java:4258)
android.app.ActivityManagerNative.onTransact(ActivityManagerNative.java:168)
com.android.server.am.ActivityManagerService.onTransact(ActivityManagerService.java:2703)

 

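Working through this stack trace, let's trace the creation of the AssetManager object back to its source.

The first time the app is opened, the call enters ActivityManagerService.startActivityAsUser,

and later reaches com.android.server.am.ActivityRecord.<init>(ActivityRecord.java:564):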

462    ActivityRecord(ActivityManagerService _service, ProcessRecord _caller,
463            int _launchedFromUid, String _launchedFromPackage, Intent _intent, String _resolvedType,
464            ActivityInfo aInfo, Configuration _configuration,
465            ActivityRecord _resultTo, String _resultWho, int _reqCode,
466            boolean _componentSpecified, boolean _rootVoiceInteraction,
467            ActivityStackSupervisor supervisor,
468            ActivityContainer container, Bundle options) {
… …
564        AttributeCache.Entry ent = AttributeCache.instance().get(packageName,
565                realTheme, com.android.internal.R.styleable.Window, userId);

The call continues into

com.android.server.AttributeCache.get(AttributeCache.java:114)

In AttributeCache.java:

98     public Entry get(String packageName, int resId, int[] styleable, int userId) {
… …
112            Context context;
113            try {
114                context = mContext.createPackageContextAsUser(packageName, 0,
115                        new UserHandle(userId));
116                if (context == null) {
117                    return null;
118                }
119            } catch (PackageManager.NameNotFoundException e) {
120                return null;
121            }
122            pkg = new Package(context);
123            mPackages.put(packageName, pkg);

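Here mPackages is a map, defined as

    private final WeakHashMap<String, Package> mPackages =
            new WeakHashMap<String, Package>();

On Android 7 the definition was changed to

    private final ArrayMap<String, WeakReference<Package>> mPackages = new ArrayMap<>();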

 

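At line 114, the variable context is set to the return value of

mContext.createPackageContextAsUser(packageName, 0, new UserHandle(userId));

At line 122 that context is then referenced by a new Package. Here are the definition and constructor of Package, an inner class of AttributeCache: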

    public final static class Package {
        public final Context context;
        private final SparseArray<HashMap<int[], Entry>> mMap
                = new SparseArray<HashMap<int[], Entry>>();

        public Package(Context c) {
            context = c;
        }
    }

 

Line 114 calls

android.app.ContextImpl.createPackageContextAsUser(ContextImpl.java:1718)

which goes on to call

android.app.ContextImpl.createPackageContextAsUser(ContextImpl.java:1733)

In ContextImpl.java:

1733        ContextImpl c = new ContextImpl(this, mMainThread, pi, mActivityToken,
1734                user, restricted, mDisplay, null, Display.INVALID_DISPLAY, themePackageName);

 

Line 1733 in turn calls

android.app.ContextImpl.<init>(ContextImpl.java:1884)

In ContextImpl.java:

1884        Resources resources = packageInfo.getResources(mainThread);
… …
1906        mResources = resources;

The mResources member of the newly created ContextImpl object now references resources.

 

Line 1884 continues into

android.app.LoadedApk.getResources(LoadedApk.java:558)
android.app.ActivityThread.getTopLevelResources(ActivityThread.java:1854)
android.app.ResourcesManager.getTopLevelResources(ResourcesManager.java:221)

In ResourcesManager.java:

181    Resources getTopLevelResources(String resDir, String[] splitResDirs,
182            String[] overlayDirs, String[] libDirs, int displayId, String packageName,
183            Configuration overrideConfiguration, CompatibilityInfo compatInfo, Context context) {
… …
211        AssetManager assets = new AssetManager();
… …
218        // already.
219        if (resDir != null) {
220
221            if (assets.addAssetPath(resDir) == 0) {
222                return null;
223            }
224        }
… …
301        r = new Resources(assets, dm, config, compatInfo);
… …
           // Resources resources at line 1884 of ContextImpl.java refers to this return value r
319        return r;

 

At last we have found where the AssetManager is created. The C++ AssetManager object is then created, and addAssetPath is called to load base.apk.

Resources.java contains:

    public Resources(AssetManager assets, DisplayMetrics metrics, Configuration config,
            CompatibilityInfo compatInfo) {
        mAssets = assets;
        mMetrics.setToDefaults();
        if (compatInfo != null) {
            mCompatibilityInfo = compatInfo;
        }
        updateConfiguration(config, metrics);
        assets.recreateStringBlocks(); // Modified for ThemeManager
    }

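mAssets in Resources references the newly created AssetManager object.

The creation of the AssetManager can be traced back to this point, and the chain of references to the AssetManager object leads to mPackages in the AttributeCache class.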


 

A bold idea

AttributeCache is a singleton, and both PMS and AMS run inside the system_server process. Let's try calling AttributeCache's removePackage(String packageName) method to release the AssetManager object.

public final class AttributeCache {
    private static AttributeCache sInstance = null;

    private final Context mContext;
    private final WeakHashMap<String, Package> mPackages =
            new WeakHashMap<String, Package>();

    // Called from startBootstrapServices() in SystemServer.java to initialize the cache
    public static void init(Context context) {
        if (sInstance == null) {
            // Create the singleton
            sInstance = new AttributeCache(context);
        }
    }

    // Get the singleton
    public static AttributeCache instance() {
        return sInstance;
    }

    public AttributeCache(Context context) {
        mContext = context;
    }

    public void removePackage(String packageName) {
        synchronized (this) {
            mPackages.remove(packageName);
        }
    }
}
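
Why an explicit removePackage could matter even though mPackages is a WeakHashMap can be illustrated with a small stand-alone Java sketch. It rests on an assumption the post does not verify: that the package-name key stays strongly reachable elsewhere in system_server while the package is installed, so the map entry is never dropped on its own. FakePackage and the class name are hypothetical stand-ins for the Package -> Context -> Resources -> AssetManager chain.

```java
import java.lang.ref.WeakReference;
import java.util.WeakHashMap;

// Hypothetical sketch: a WeakHashMap only drops an entry once its KEY is
// unreachable; the value is held strongly for as long as the entry exists.
final class AttributeCacheLeakSketch {

    /** Stand-in for Package -> Context -> Resources -> AssetManager (open base.apk). */
    static final class FakePackage {
        final String openApk;
        FakePackage(String openApk) { this.openApk = openApk; }
    }

    private final WeakHashMap<String, FakePackage> packages = new WeakHashMap<>();

    synchronized void put(String name, FakePackage p) { packages.put(name, p); }
    synchronized boolean contains(String name) { return packages.containsKey(name); }
    synchronized void removePackage(String name) { packages.remove(name); }

    public static void main(String[] args) {
        AttributeCacheLeakSketch cache = new AttributeCacheLeakSketch();
        // Assumption: the key is kept alive by its owner, as package data would be
        String packageName = new String("com.UCMobile");
        FakePackage pkg = new FakePackage("/mnt/asec/com.UCMobile-2/base.apk");
        WeakReference<FakePackage> probe = new WeakReference<>(pkg);

        cache.put(packageName, pkg);
        pkg = null;   // the cache entry is now the only strong path to the value
        System.gc();
        System.out.println("cached:  alive=" + (probe.get() != null));

        cache.removePackage(packageName);
        System.gc();  // only a hint; collection is likely but not guaranteed
        System.out.println("removed: alive=" + (probe.get() != null));
    }
}
```

While the entry is cached, the value cannot be collected no matter how often a GC is requested, which is consistent with the Runtime.getRuntime().gc() call in unmountSecureContainer not being enough on its own.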

 

PackageManagerService.java

The release is done in doPostDeleteLI of the AsecInstallArgs class:

 

boolean doPostDeleteLI(boolean delete) {
    if (DEBUG_SD_INSTALL) Slog.i(TAG, "doPostDeleteLI() del=" + delete);
    final List<String> allCodePaths = getAllCodePaths();

    // Extract the package name com.UCMobile, i.e. the part before the
    // last '-' separator, from the cid com.UCMobile-2
    int pos = cid.lastIndexOf("-");
    if (pos > 0) {
        String packageName = cid.substring(0, pos); // cid: com.UCMobile-2
        AttributeCache ac = AttributeCache.instance();
        if (ac != null) { // packageName: com.UCMobile
            ac.removePackage(packageName);
        }
    }

    if (mounted) {
        // Unmount first
        if (PackageHelper.unMountSdDir(cid)) {
            mounted = false;
        }
    }
    return !mounted;
}

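In addition, AttributeCache needs to be imported:

import com.android.server.AttributeCache;

With this change in place the test no longer reboots: adding a single release call solved the problem. Looking back over the analysis, it was truly a case of the road seeming to end among mountains and rivers, and then another village appearing beyond the willows and flowers.

The AsecInstallArgs class already has a ready-made method that extracts the package name (com.UCMobile) from the cid (com.UCMobile-2):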

String getPackageName() {
    return getAsecPackageName(cid);
}

static String getAsecPackageName(String packageCid) {
    int idx = packageCid.lastIndexOf("-");
    if (idx == -1) {
        return packageCid;
    }
    return packageCid.substring(0, idx);
}

We can use this method directly to make the change shorter and cleaner:

AttributeCache ac = AttributeCache.instance();
if (ac != null) {
    ac.removePackage(getPackageName());
}
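
The cid-to-package-name rule is easy to try outside the framework. The sketch below copies the body of getAsecPackageName into a stand-alone class; the class name and the extra sample inputs are hypothetical and only serve as an illustration.

```java
// Stand-alone copy of the cid -> package name rule shown above.
final class AsecCidSketch {

    /** Strips the trailing "-<n>" suffix from an ASEC container id, if present. */
    static String getAsecPackageName(String packageCid) {
        int idx = packageCid.lastIndexOf("-");
        if (idx == -1) {
            return packageCid; // no suffix: the cid is already the package name
        }
        return packageCid.substring(0, idx);
    }

    public static void main(String[] args) {
        System.out.println(getAsecPackageName("com.UCMobile-2"));
        System.out.println(getAsecPackageName("com.example.app"));
    }
}
```

Only the last '-' is treated as the separator, so the numeric suffix that the installer appends is removed and the rest of the cid is kept as is.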

 

