本文来自于腾讯Bugly公众号(weixinBugly), 作者:chivyzhuang,未经作者同意,请勿转载,原文地址:https://mp.weixin.qq.com/s/qT5waBvKVbqhuqXYHC-3cA
| 导语 外部存储作为开发中经常接触的一个重要系统组成,在Android历代版本中,有过许许多多重要的变更。我也曾疑惑过,为什么一个简简单单外部存储,会存在存在这么多奇奇怪怪的路径
:/sdcard、/mnt/sdacrd、/storage/extSdCard、/mnt/shell/emulated/0、/storage/emulated/0、/mnt/shell/runtime/default/emulated/0…其实,这背后代表了一项项技术的成熟与发布:模拟外部存储、多用户、运行时权限…
Android 4.0
支持模拟外部存储(通过FUSE实现)
出现了主外部存储,以及二级外部存储(没有接口对外暴露)
支持MTP(Media Transfer Protocol)、PTP协议(Picture Transfer Protocol)
Android 4.1
开发者选项出现”强制应用声明读权限才可以进行读操作”的开关
Android 4.2
支持多用户,每个用户拥有独立的外部存储
Android 4.4
Context.getExternalFilesDirs()
接口,可以获取应用在主外部存储和其他二级外部存储下的files路径Android 6.0
外部存储支持动态权限管理
Adoptable Storage特性
Android 7.0
补充一个点:如果应用的minSdkVersion和targetSdkVersion设置成<=3,系统会默认授予READ_EXTERNAL_STORAGE权限
a. 必要性
b. 实现原理
系统/system/bin/sdcard守护进程,使用FUSE实现类FAT格式SD卡文件系统的模拟,也就是我们经常说的内置SD卡。(详细代码可以参考:/xref/system/core/sdcard/sdcard.c)
用户空间文件系统(Filesystem in Userspace,简称FUSE)是一个面向类Unix计算机操作系统的软件接口,它使无特权的用户能够无需编辑内核代码而创建自己的文件系统。目前Linux通过内核模块对此进行支持。
sdcard守护进程模拟外部存储大致流程(Android 4.0为例):
# create virtual SD card at /mnt/sdcard, based on the /data/media directory
# daemon will drop to user/group system/media_rw after initializing
# underlying files in /data/media will be created with user and group media_rw (1023)
service sdcard /system/bin/sdcard /data/media 1023 1023
class late_start
fd = open("/dev/fuse", O_RDWR);
if (fd < 0) {
ERROR("cannot open fuse device (%d)\n", errno);
return -1;
}
#define MOUNT_POINT "/mnt/sdcard"
...
sprintf(opts, "fd=%i,rootmode=40000,default_permissions,allow_other,"
"user_id=%d,group_id=%d", fd, uid, gid);
res = mount("/dev/fuse", MOUNT_POINT, "fuse", MS_NOSUID | MS_NODEV, opts);
if (res < 0) {
ERROR("cannot mount fuse filesystem (%d)\n", errno);
return -1;
}
*开线程,在线程中处理文件系统事件,并将结果写回。
void handle_fuse_requests(struct fuse *fuse)
{
unsigned char req[256 * 1024 + 128];
int len;
for (;;) {
len = read(fuse->fd, req, 8192);
if (len < 0) {
if (errno == EINTR)
continue;
ERROR("handle_fuse_requests: errno=%d\n", errno);
return;
}
handle_fuse_request(fuse, (void*) req, (void*) (req + sizeof(struct fuse_in_header)), len);
}
}
经过上面一系列步骤,sdcard进程在/mnt/sdcard路径上创建了一个FUSE文件系统,所有对/mnt/sdcard将转为事件由sdcard守护进程处理,并对应到/data/media目录。
例如,应用创建/mnt/sdcard/a文件,实际是创建/data/media/a文件。
c. 优点
模拟外部存储容量和/data分区是共享的,用户数据在内外存储的分配更加自由;
模拟外部存储本身不可卸载,不会因为卸载导致应用访问出现问题,也减少了外部因素导致被破坏的情况;
所有的访问都经过sdcard守护进程,Android可以定制访问规则;
d. 劣势
性能上存在一定损失
e. 影响
Android 6.0以后,由于动态权限管理的需要,会存在多个fuse挂载点,这导致inotify/FileObserver对外部存储进行文件事件监控时,会丢失事件。
inotify是Linux核心子系统之一,做为文件系统的附加功能,它可监控文件系统并将异动通知应用程序。 —— 维基百科(https://zh.wikipedia.org/wiki/Inotify)
a. 支持版本
Android 4.2开始支持多用户,但仅限平板;
Android 5.0开始,设备制造商可以在编译时候开启多用户模块;
b. 背景知识
MS_BIND (Linux 2.4 onward)
Perform a bind mount, making a file or a directory subtree visible at another point within a file system. Bind mounts may cross file system boundaries and span chroot(2) jails. The filesystemtype and dataarguments are ignored. Up until Linux 2.6.26, mountflags was also ignored (the bind mount has the same mount options as the underlying mount point). —— mount(2) - Linux man page(https://linux.die.net/man/2/mount)
图例(来自https://xionchen.github.io/2016/08/25/linux-bind-mount)
2) bind完成之后,对/mnt/backup的访问将等同于对/home的访问,原/mnt/backup变为不可见。
Mount namespaces provide isolation of the list of mount points seen by the processes in each namespace instance. Thus, the processes in each of the mount namespace instances will see distinct single-directory hierarchies. —— mount_namespaces(7) - Linux manual page - man7.org(http://man7.org/linux/man-pages/man7/mount_namespaces.7.html)
通俗的讲,挂载命名空间实现了挂载点的隔离,在不同挂载命名空间的进程,看到的目录层次不同。
挂载命名空间实现了完全的隔离,但对于有些情况并不适用。例如在Linux系统上,进程A在命名空间1挂载了一张CD-ROM,这时候命名空间2因为隔离无法看到这张CD-ROM。
为了解决这个问题,引入了挂载传播(mount propagation)。传播挂载定义了挂载点的传播类型:
1.共享挂载:此类型的挂载点会加入一个peer group,并会在group内传播和接收挂载事件;
2.从属挂载:此类型的挂载点会加入一个peer group,并会接收group内的挂载事件,但不传播;
3.共享/从属挂载:上面两种类型的共存体。可以从一个peer group(此时类型为从属挂载)接收挂载事件,再传播到另一个peer group;
4.私有挂载:此类型的挂载点没有peer group,既不传播也不接收挂载事件;
5.不可绑定挂载:不展开讲;
peer group的形成条件为,一个挂载点被设置成共享挂载,并满足以下任意一种情况:
1.挂载点在创建新的命名空间时被复制
2.从该挂载点创建了一个绑定挂载
[1] 如果一个共享挂载是peer group中仅存的挂载点,那么对它应用从属挂载将会导致它变为私有挂载。
[2] 对一个非共享挂载类型的挂载点,应用从属挂载是无效的。
背景知识讲到这里,其中挂载点的传播类型比较不好理解,但很重要,可以参考上面mount namespace的Linux Programmer’s Manual里面的例子(搜索MS_XXX example)进行学习,链接在这里(http://man7.org/linux/man-pages/man7/mount_namespaces.7.html)。
c. 实现原理
概括多用户的外部存储隔离实现:应用进程在创建时,创建了新的挂载命名空间,然后通过绑定挂载对应用暴露当前用户的外部存储空间。
以Android 4.2代码为例【mountEmulatedStorage(dalvik_system_Zygote.cpp)】:
/*
* Create a private mount namespace and bind mount appropriate emulated
* storage for the given user.
*/
static int mountEmulatedStorage(uid_t uid, u4 mountMode) {
// See storage config details at http://source.android.com/tech/storage/
userid_t userid = multiuser_get_user_id(uid);
// Create a second private mount namespace for our process
if (unshare(CLONE_NEWNS) == -1) {
SLOGE("Failed to unshare(): %s", strerror(errno));
return -1;
}
// Create bind mounts to expose external storage
if (mountMode == MOUNT_EXTERNAL_MULTIUSER
|| mountMode == MOUNT_EXTERNAL_MULTIUSER_ALL) {
// These paths must already be created by init.rc
const char* source = getenv("EMULATED_STORAGE_SOURCE");
const char* target = getenv("EMULATED_STORAGE_TARGET");
const char* legacy = getenv("EXTERNAL_STORAGE");
if (source == NULL || target == NULL || legacy == NULL) {
SLOGE("Storage environment undefined; unable to provide external storage");
return -1;
}
// Prepare source paths
char source_user[PATH_MAX];
char source_obb[PATH_MAX];
char target_user[PATH_MAX];
// /mnt/shell/emulated/0
snprintf(source_user, PATH_MAX, "%s/%d", source, userid);
// /mnt/shell/emulated/obb
snprintf(source_obb, PATH_MAX, "%s/obb", source);
// /storage/emulated/0
snprintf(target_user, PATH_MAX, "%s/%d", target, userid);
if (fs_prepare_dir(source_user, 0000, 0, 0) == -1
|| fs_prepare_dir(source_obb, 0000, 0, 0) == -1
|| fs_prepare_dir(target_user, 0000, 0, 0) == -1) {
return -1;
}
if (mountMode == MOUNT_EXTERNAL_MULTIUSER_ALL) {
// Mount entire external storage tree for all users
if (mount(source, target, NULL, MS_BIND, NULL) == -1) {
SLOGE("Failed to mount %s to %s: %s", source, target, strerror(errno));
return -1;
}
} else {
// Only mount user-specific external storage
if (mount(source_user, target_user, NULL, MS_BIND, NULL) == -1) {
SLOGE("Failed to mount %s to %s: %s", source_user, target_user, strerror(errno));
return -1;
}
}
...
// Finally, mount user-specific path into place for legacy users
if (mount(target_user, legacy, NULL, MS_BIND | MS_REC, NULL) == -1) {
SLOGE("Failed to mount %s to %s: %s", target_user, legacy, strerror(errno));
return -1;
}
...
a.背景
Android 6.0引入了运行时权限,允许用户对危险权限进行动态授权,这部分权限包含外部存储访问权限。
b.实现原理
外部存储访问权限的动态授权,是利用FUSE和挂载命名空间这两个技术配合实现。
通过下面这个提交记录
(https://android.googlesource.com/platform/system/core/+/f38f29c87d97cea45d04b783bddbd969234b1030%5E%21/#F1),我们可以很清楚的了解整个实现。
Let's reinvent storage, yet again!
Now that we're treating storage as a runtime permission, we need to
grant read/write access without killing the app. This is really
tricky, since we had been using GIDs for access control, and they're
set in stone once Zygote drops privileges.
The only thing left that can change dynamically is the filesystem
itself, so let's do that. This means changing the FUSE daemon to
present itself as three different views:
/mnt/runtime_default/foo - view for apps with no access
/mnt/runtime_read/foo - view for apps with read access
/mnt/runtime_write/foo - view for apps with write access
There is still a single location for all the backing files, and
filesystem permissions are derived the same way for each view, but
the file modes are masked off differently for each mountpoint.
During Zygote fork, it wires up the appropriate storage access into
an isolated mount namespace based on the current app permissions. When
the app is granted permissions dynamically at runtime, the system
asks vold to jump into the existing mount namespace and bind mount
the newly granted access model into place.
Bug: 21858077
Change-Id: I5a016f0958a92fd390c02b5ae159f8008bd4f4b7
为了达到不杀死进程,就能够赋予进程读/写外置存储的目的,Android利用FUSE对/data/media模拟了三种访问视图,分别是default、read、write。
当应用被授予读/写权限时,vold子进程会切换到应用的挂载命名空间,将对应的视图重新绑定到应用的外部存储路径上。
切换进程的挂载命名空间,需要内核版本在3.8及以上,切换函数为setns,ndk貌似没有对开发者暴露,但可以在源码里找到arm的实现,有需要直接编入就可以了,也就一个sys call。
/* Generated by gensyscalls.py. Do not edit. */
#include <private/bionic_asm.h>
ENTRY(setns)
mov ip, r7
ldr r7, =__NR_setns
swi #0
mov r7, ip
cmn r0, #(MAX_ERRNO + 1)
bxls lr
neg r0, r0
END(setns)
c. 代码分析
static void run(const char* source_path, const char* label, uid_t uid,
gid_t gid, userid_t userid, bool multi_user, bool full_write) {
...
// 分配三个视图路径,分别为default、read和write,label一般用来标示存储,例如模拟的外置存储,这里label为"emulated"
snprintf(fuse_default.dest_path, PATH_MAX, "/mnt/runtime/default/%s", label);
snprintf(fuse_read.dest_path, PATH_MAX, "/mnt/runtime/read/%s", label);
snprintf(fuse_write.dest_path, PATH_MAX, "/mnt/runtime/write/%s", label);
...
// fuse_setup方法挂载fuse文件系统
if (multi_user) {
/* Multi-user storage is fully isolated per user, so "other"
* permissions are completely masked off. */
if (fuse_setup(&fuse_default, AID_SDCARD_RW, 0006)
|| fuse_setup(&fuse_read, AID_EVERYBODY, 0027)
|| fuse_setup(&fuse_write, AID_EVERYBODY, full_write ? 0007 : 0027)) {
ERROR("failed to fuse_setup\n");
exit(1);
}
} else {
/* Physical storage is readable by all users on device, but
* the Android directories are masked off to a single user
* deep inside attr_from_stat(). */
if (fuse_setup(&fuse_default, AID_SDCARD_RW, 0006)
|| fuse_setup(&fuse_read, AID_EVERYBODY, full_write ? 0027 : 0022)
|| fuse_setup(&fuse_write, AID_EVERYBODY, full_write ? 0007 : 0022)) {
ERROR("failed to fuse_setup\n");
exit(1);
}
}
...
// 从原本一个处理线程变为三个,分别处理三个视图的访问请求
if (pthread_create(&thread_default, NULL, start_handler, &handler_default)
|| pthread_create(&thread_read, NULL, start_handler, &handler_read)
|| pthread_create(&thread_write, NULL, start_handler, &handler_write)) {
ERROR("failed to pthread_create\n");
exit(1);
}
...
}
// 挂载fuse文件系统
static int fuse_setup(struct fuse* fuse, gid_t gid, mode_t mask) {
char opts[256];
fuse->fd = open("/dev/fuse", O_RDWR);
if (fuse->fd == -1) {
ERROR("failed to open fuse device: %s\n", strerror(errno));
return -1;
}
umount2(fuse->dest_path, MNT_DETACH);
snprintf(opts, sizeof(opts),
"fd=%i,rootmode=40000,default_permissions,allow_other,user_id=%d,group_id=%d",
fuse->fd, fuse->global->uid, fuse->global->gid);
if (mount("/dev/fuse", fuse->dest_path, "fuse", MS_NOSUID | MS_NODEV | MS_NOEXEC |
MS_NOATIME, opts) != 0) {
ERROR("failed to mount fuse filesystem: %s\n", strerror(errno));
return -1;
}
fuse->gid = gid;
fuse->mask = mask;
return 0;
}
1.创建新的挂载命名空间;
2.将之前的挂载命名空间在/storage下的挂载全部去除,排除影响;
3.根据mount_mode,选择一个路径;
4.将选择的路径绑定到/storage下。
// Create a private mount namespace and bind mount appropriate emulated
// storage for the given user.
static bool MountEmulatedStorage(uid_t uid, jint mount_mode,
bool force_mount_namespace) {
// See storage config details at http://source.android.com/tech/storage/
// Create a second private mount namespace for our process
if (unshare(CLONE_NEWNS) == -1) {
ALOGW("Failed to unshare(): %s", strerror(errno));
return false;
}
// Unmount storage provided by root namespace and mount requested view
UnmountTree("/storage");
String8 storageSource;
if (mount_mode == MOUNT_EXTERNAL_DEFAULT) {
storageSource = "/mnt/runtime/default";
} else if (mount_mode == MOUNT_EXTERNAL_READ) {
storageSource = "/mnt/runtime/read";
} else if (mount_mode == MOUNT_EXTERNAL_WRITE) {
storageSource = "/mnt/runtime/write";
} else {
// Sane default of no storage visible
return true;
}
if (TEMP_FAILURE_RETRY(mount(storageSource.string(), "/storage",
NULL, MS_BIND | MS_REC | MS_SLAVE, NULL)) == -1) {
ALOGW("Failed to mount %s to /storage: %s", storageSource.string(), strerror(errno));
return false;
}
获取init的挂载命名空间,为了对之后进程的挂载命名空间进行对比,如果一致,不重新绑定;
遍历/proc下各个进程目录,根据uid进行筛选;
找到对应的pid后,fork子进程进行重新挂载,这里用到setns进行挂载命名空间的切换;
重新挂载部分的逻辑和应用进程创建时基本一致,不难理解。
int VolumeManager::remountUid(uid_t uid, const std::string& mode) {
LOG(DEBUG) << "Remounting " << uid << " as mode " << mode;
DIR* dir;
struct dirent* de;
char rootName[PATH_MAX];
char pidName[PATH_MAX];
int pidFd;
int nsFd;
struct stat sb;
pid_t child;
if (!(dir = opendir("/proc"))) {
PLOG(ERROR) << "Failed to opendir";
return -1;
}
// Figure out root namespace to compare against below
if (sane_readlinkat(dirfd(dir), "1/ns/mnt", rootName, PATH_MAX) == -1) {
PLOG(ERROR) << "Failed to readlink";
closedir(dir);
return -1;
}
// Poke through all running PIDs look for apps running as UID
while ((de = readdir(dir))) {
pidFd = -1;
nsFd = -1;
pidFd = openat(dirfd(dir), de->d_name, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
if (pidFd < 0) {
goto next;
}
if (fstat(pidFd, &sb) != 0) {
PLOG(WARNING) << "Failed to stat " << de->d_name;
goto next;
}
if (sb.st_uid != uid) {
goto next;
}
// Matches so far, but refuse to touch if in root namespace
LOG(DEBUG) << "Found matching PID " << de->d_name;
if (sane_readlinkat(pidFd, "ns/mnt", pidName, PATH_MAX) == -1) {
PLOG(WARNING) << "Failed to read namespace for " << de->d_name;
goto next;
}
if (!strcmp(rootName, pidName)) {
LOG(WARNING) << "Skipping due to root namespace";
goto next;
}
// We purposefully leave the namespace open across the fork
nsFd = openat(pidFd, "ns/mnt", O_RDONLY);
if (nsFd < 0) {
PLOG(WARNING) << "Failed to open namespace for " << de->d_name;
goto next;
}
if (!(child = fork())) {
if (setns(nsFd, CLONE_NEWNS) != 0) {
PLOG(ERROR) << "Failed to setns for " << de->d_name;
_exit(1);
}
unmount_tree("/storage");
std::string storageSource;
if (mode == "default") {
storageSource = "/mnt/runtime/default";
} else if (mode == "read") {
storageSource = "/mnt/runtime/read";
} else if (mode == "write") {
storageSource = "/mnt/runtime/write";
} else {
// Sane default of no storage visible
_exit(0);
}
if (TEMP_FAILURE_RETRY(mount(storageSource.c_str(), "/storage",
NULL, MS_BIND | MS_REC | MS_SLAVE, NULL)) == -1) {
PLOG(ERROR) << "Failed to mount " << storageSource << " for "
<< de->d_name;
_exit(1);
}
// Mount user-specific symlink helper into place
userid_t user_id = multiuser_get_user_id(uid);
std::string userSource(StringPrintf("/mnt/user/%d", user_id));
if (TEMP_FAILURE_RETRY(mount(userSource.c_str(), "/storage/self",
NULL, MS_BIND, NULL)) == -1) {
PLOG(ERROR) << "Failed to mount " << userSource << " for "
<< de->d_name;
_exit(1);
}
_exit(0);
}
if (child == -1) {
PLOG(ERROR) << "Failed to fork";
goto next;
} else {
TEMP_FAILURE_RETRY(waitpid(child, nullptr, 0));
}
next:
close(nsFd);
close(pidFd);
}
closedir(dir);
return 0;
}