前段时间调试代码查看log的时候,偶然发现有虚拟机的异常报错信息.当时怀着好奇心尝试复现和分析了下.现在看来,这是一个既奇怪又很有意思的问题.(前半部分初步分析和实验由我完成,后半部分深入kernel分析和root cause由leon.wang同学完成, 在此表示感谢)
问题现象
-
手机
- 64bit芯片平台
- Android O
- Linux version 4.4.83-01055-g0f06e44
-
复现步骤
- 恢复初厂设置或烧完版本首次开机
- 低概率出现开机黑屏时间略长,但手机仍能正常使用.
定位分析
初步分析
通过复现的多份logcat, 发现ART虚拟机有类似如下的错误打印信息
12-21 09:03:03.909 3296 3296 E zygote : Could not create image space with image file '/system/framework/boot.art'. Attempting to fall back to imageless running. Error was: Failed to open oat file '/data/dalvik-cache/arm/system@[email protected]' referenced from image /data/dalvik-cache/arm/system@[email protected]: File /data/dalvik-cache/arm/system@[email protected] does not exist.Cannot relocate image /system/framework/boot-conscrypt.art to /data/dalvik-cache/arm/system@[email protected]: Cannot patch a secondary image.
12-21 09:03:03.909 3296 3296 E zygote : Attempted image: /system/framework/boot-conscrypt.art
01-01 04:18:22.538 3034 3034 E zygote64: Could not create image space with image file '/system/framework/boot.art'. Attempting to fall back to imageless running. Error was: Failed to open oat file '/data/dalvik-cache/arm64/system@[email protected]' referenced from image /data/dalvik-cache/arm64/system@[email protected]: File /data/dalvik-cache/arm64/system@[email protected] does not exist.Cannot relocate image /system/framework/boot-org.apache.http.legacy.boot.art to /data/dalvik-cache/arm64/system@[email protected]: Cannot patch a secondary image.
01-01 04:18:22.538 3034 3034 E zygote64: Attempted image: /system/framework/boot-org.apache.http.legacy.boot.art
由上述错误分析可知,问题原因是由于找不到指定的oat/vdex文件造成的.
出现以上类似错误后,adb shell到手机到/data/dalvik-cache/
通过log,再次尝试去复现.zygote64正在操作/data/dalvik-cache/arm64/system@[email protected]
01-01 05:59:38.790 3033 3033 I zygote64: image_location = /system/framework/boot-legacy-test.art, found_image = 1, has_system = 1, has_cache = 1
01-01 05:59:38.791 3033 3033 I zygote64: ImageSpace::Init entering image_filename=/data/dalvik-cache/arm64/system@[email protected]
01-01 05:59:38.791 3033 3033 I zygote64: relocate = 1
但没有找到相应的oat文件(只有art/vdex文件)
-rw-r--r-- 1 root root 24576 2012-01-01 05:59 system@[email protected]
lrwxrwxrwx 1 root root 50 2012-01-01 05:59 system@[email protected] -> /system/framework/arm64/boot-legacy-test.vdex
正常流程应是在/data/dalvik-cache/
-rw-r--r-- 1 root root 20480 2012-01-01 05:59 system@[email protected]
lrwxrwxrwx 1 root root 46 2012-01-01 05:59 system@[email protected] -> /system/framework/arm/boot-legacy-test.oat
lrwxrwxrwx 1 root root 46 2012-01-01 05:59 system@[email protected] -> /system/framework/arm/boot-legacy-test.vdex
由于找不到oat/vdex文件,此后zygote会进入异常分支处理,将/data/dalvik-cache/
通过进一步的流程梳理, 最后定位到patchoat模块的symlink系统调用可能存在问题.
// Create a symlink from the old oat to the new oat
if (symlink(input_oat_filename.c_str(), output_oat_filename.c_str()) < 0) {
int err = errno;
LOG(ERROR) << "Failed to create symlink at " << output_oat_filename
<< " error(" << err << "): " << strerror(err);
return false;
}
从现场来看
- 创建符号链接可能出现了问题,导致用户态程序误以为文件生成成功.从而后续虚拟机在映射oat/vdex文件到内存时出现找不到文件的异常情况.
- 从log输出来看,这里没有打印出报错信息,因此,symlink()应是创建成功的.
- 或者创建符号链接成功,但后续访问符号链接期间出现问题.
实验
由于问题复现概率较低,因此第一要务是先提高复现概率.
把低概率问题巧妙地转化为高概率问题甚至必现问题的技巧如下:
编写测试程序,针对symlink系统调用进行循环压力测试.
思路如下:
- 检测符号链接是否存在.若存在,则unlink删除
- 调用symlink创建符号链接
- 检查符号链接是否存在
- 若不存在,则打印错误log并sleep保留现场
- 若存在,则回到1继续循环.
经测试验证,测试程序在手机上均能稳定复现(5分钟内).
... ...
438: we will unlink /data/dalvik-cache/arm64/tim.target
we will link /data/dalvik-cache/arm64/tim.target
439: we will unlink /data/dalvik-cache/arm64/tim.target
we will link /data/dalvik-cache/arm64/tim.target
symlink didn't succeed. /data/dalvik-cache/arm64/tim.target doesn't exist.
查看一下符号链接
# ls -l /data/dalvik-cache/arm64/tim.target
ls: /data/dalvik-cache/arm64/tim.target: No such file or directory
删除符号链接所在子目录
# rm -rf /data/dalvik-cache/arm64/
rm: /data/dalvik-cache/arm64//tim.target: No such file or directory
rm: /data/dalvik-cache/arm64/: Directory not empty
深入分析
由于工作繁忙,kernel部分的工作转给稳定性同学leon.wang继续跟踪处理.
-
source
- kernel/fs/f2fs/namei.c
创建符号链接
由于配置了"CONFIG_F2FS_FS_ENCRYPTION=y", 创建符号链接时,符号链接名会进行加密处理.
357static int f2fs_symlink(struct inode *dir, struct dentry *dentry,
358 const char *symname)
359{
360 struct f2fs_sb_info *sbi = F2FS_I_SB(dir);
361 struct inode *inode;
362 size_t len = strlen(symname);
363 size_t p_len;
364 char *p_str;
365 struct f2fs_str disk_link = FSTR_INIT(NULL, 0);
366 struct f2fs_encrypted_symlink_data *sd = NULL;
367 int err;
368
369 if (len > dir->i_sb->s_blocksize)
370 return -ENAMETOOLONG;
371
372 inode = f2fs_new_inode(dir, S_IFLNK | S_IRWXUGO);
373 if (IS_ERR(inode))
374 return PTR_ERR(inode);
375
376 if (f2fs_encrypted_inode(inode))
377 inode->i_op = &f2fs_encrypted_symlink_inode_operations;
378 else
379 inode->i_op = &f2fs_symlink_inode_operations;
380 inode->i_mapping->a_ops = &f2fs_dblock_aops;
381
382 f2fs_balance_fs(sbi, true);
383
384 f2fs_lock_op(sbi);
385 err = f2fs_add_link(dentry, inode);
386 if (err)
387 goto out;
388 f2fs_unlock_op(sbi);
389 alloc_nid_done(sbi, inode->i_ino);
390
391 if (f2fs_encrypted_inode(dir)) {
392 struct qstr istr = QSTR_INIT(symname, len);
393
394 err = f2fs_get_encryption_info(inode);
395 if (err)
396 goto err_out;
397
398 err = f2fs_fname_crypto_alloc_buffer(inode, len, &disk_link);
399 if (err)
400 goto err_out;
401
402 err = f2fs_fname_usr_to_disk(inode, &istr, &disk_link);
403 if (err < 0)
404 goto err_out;
405
406 p_len = encrypted_symlink_data_len(disk_link.len) + 1;
407
408 if (p_len > dir->i_sb->s_blocksize) {
409 err = -ENAMETOOLONG;
410 goto err_out;
411 }
412
413 sd = kzalloc(p_len, GFP_NOFS);
414 if (!sd) {
415 err = -ENOMEM;
416 goto err_out;
417 }
418 memcpy(sd->encrypted_path, disk_link.name, disk_link.len);
419 sd->len = cpu_to_le16(disk_link.len);
420 p_str = (char *)sd;
421 } else {
422 p_len = len + 1;
423 p_str = (char *)symname;
424 }
425
426 err = page_symlink(inode, p_str, p_len);
427
428err_out:
429 d_instantiate(dentry, inode);
430 unlock_new_inode(inode);
431
432 /*
433 * Let's flush symlink data in order to avoid broken symlink as much as
434 * possible. Nevertheless, fsyncing is the best way, but there is no
435 * way to get a file descriptor in order to flush that.
436 *
437 * Note that, it needs to do dir->fsync to make this recoverable.
438 * If the symlink path is stored into inline_data, there is no
439 * performance regression.
440 */
441 if (!err) {
442 filemap_write_and_wait_range(inode->i_mapping, 0, p_len - 1);
443
444 if (IS_DIRSYNC(dir))
445 f2fs_sync_fs(sbi->sb, 1);
446 } else {
447 f2fs_unlink(dir, dentry);
448 }
449
450 kfree(sd);
451 f2fs_fname_crypto_free_buffer(&disk_link);
452 return err;
453out:
454 handle_failed_inode(inode);
455 return err;
456}
访问符号链接
当访问符号链接时(access()等), 需要先解密.
975#ifdef CONFIG_F2FS_FS_ENCRYPTION
976static const char *f2fs_encrypted_follow_link(struct dentry *dentry, void **cookie)
977{
978 struct page *cpage = NULL;
979 char *caddr, *paddr = NULL;
980 struct f2fs_str cstr = FSTR_INIT(NULL, 0);
981 struct f2fs_str pstr = FSTR_INIT(NULL, 0);
982 struct inode *inode = d_inode(dentry);
983 struct f2fs_encrypted_symlink_data *sd;
984 loff_t size = min_t(loff_t, i_size_read(inode), PAGE_SIZE - 1);
985 u32 max_size = inode->i_sb->s_blocksize;
986 int res;
987
988 res = f2fs_get_encryption_info(inode);
989 if (res)
990 return ERR_PTR(res);
991
992 cpage = read_mapping_page(inode->i_mapping, 0, NULL);
993 if (IS_ERR(cpage))
994 return ERR_CAST(cpage);
995 caddr = kmap(cpage);
996 caddr[size] = 0;
997
998 /* Symlink is encrypted */
999 sd = (struct f2fs_encrypted_symlink_data *)caddr;
1000 cstr.len = le16_to_cpu(sd->len);
1001
1002 /* this is broken symlink case */
1003 if (unlikely(cstr.len == 0)) {
1004 res = -ENOENT;
1005 goto errout;
1006 }
1007 cstr.name = kmalloc(cstr.len, GFP_NOFS);
1008 if (!cstr.name) {
1009 res = -ENOMEM;
1010 goto errout;
1011 }
1012 memcpy(cstr.name, sd->encrypted_path, cstr.len);
1013
1014 /* this is broken symlink case */
1015 if (unlikely(cstr.name[0] == 0)) {
1016 res = -ENOENT;
1017 goto errout;
1018 }
1019
1020 if ((cstr.len + sizeof(struct f2fs_encrypted_symlink_data) - 1) >
1021 max_size) {
1022 /* Symlink data on the disk is corrupted */
1023 res = -EIO;
1024 goto errout;
1025 }
1026 res = f2fs_fname_crypto_alloc_buffer(inode, cstr.len, &pstr);
1027 if (res)
1028 goto errout;
1029
1030 res = f2fs_fname_disk_to_usr(inode, NULL, &cstr, &pstr);
1031 if (res < 0)
1032 goto errout;
1033
1034 kfree(cstr.name);
1035
1036 paddr = pstr.name;
1037
1038 /* Null-terminate the name */
1039 paddr[res] = '\0';
1040
1041 kunmap(cpage);
1042 page_cache_release(cpage);
1043 return *cookie = paddr;
1044errout:
1045 kfree(cstr.name);
1046 f2fs_fname_crypto_free_buffer(&pstr);
1047 kunmap(cpage);
1048 page_cache_release(cpage);
1049 return ERR_PTR(res);
1050}
review这里的代码,会发现这里逻辑上有误. 加密后的密文的每个byte是可以为'\0'的,但解密后的明文首字节就不可以(真正的目标文件,不可为空).
以复现的加密数据为例
[f2fs_encrypted_follow_link] symlink cstr.name=[0 99 1a 2c a1 6f 72 c5 1c b5 | f7 16 a3 5 7e fb dd 18 5c 46 | 79 33 49 6c c6 9d cd a3 6b]
如果遇到上面这种数据,因为1015行的判断,导致错误认为'\0'开头的加密数据是broken的,就会报错-ENOENT.
Root Cause
此问题是由于文件系统加密,在读取symlink相关的文件时遇到kernel原生bug引起的.问题根因不在symlink()系统调用本身,而是symlink加密后,若密文首字节为'\0',则下次访问符号链接时会触发kernel隐藏的原生bug, 从而造成访问符号链接返回"No such file or directory"错误.
问题低概率的原因在于加密后密文首字节是否为'\0'. 通过循环压力测试,则能大大提高此类问题复现的命中率,从而加速此问题得以圆满解决.
解决方案
patch
-
验证
- 使用以上patch进行验证,测试程序长时间都能正常运行.