I. How to capture a kernel ramdump
1. Preparing the phone
Step 1
From the root of the source tree, run:
python vendor/xiaomi/securebootsigner/Qualcomm/tools/debugpolicy.py
The phone then reboots automatically.
Step 2
After the reboot, root access is required:
adb root
adb shell "echo 1 > /sys/module/msm_poweroff/parameters/download_mode"
To confirm that download mode is enabled:
adb shell "cat /sys/module/msm_poweroff/parameters/download_mode"
A return value of 1 means download mode is enabled.
If the phone is rebooted, step 2 must be run again.
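Because step 2 has to be repeated after every reboot, it can help to script it. The following is a minimal sketch, assuming adb is on the PATH and exactly one device is attached; the function names are hypothetical, and only the sysfs path and the adb commands come from the steps above.

```python
import subprocess

# sysfs node used in step 2 above
DOWNLOAD_MODE_PATH = "/sys/module/msm_poweroff/parameters/download_mode"


def parse_download_mode(output: str) -> bool:
    """Return True when the node's content is exactly "1"."""
    return output.strip() == "1"


def enable_and_verify() -> bool:
    """Repeat step 2 through adb and report whether the setting took effect."""
    subprocess.run(["adb", "root"], check=True)
    # adbd restarts after "adb root", so wait until the device is back.
    subprocess.run(["adb", "wait-for-device"], check=True)
    subprocess.run(
        ["adb", "shell", f"echo 1 > {DOWNLOAD_MODE_PATH}"], check=True
    )
    result = subprocess.run(
        ["adb", "shell", f"cat {DOWNLOAD_MODE_PATH}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_download_mode(result.stdout)
```

Calling enable_and_verify() after each reboot returns True only when the node reads back as 1. The parsing is kept in its own function so it can be checked without a phone attached.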
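After reproducing the problem, if it was a low-level (kernel) reboot, the phone goes to a black screen. Connect it to a Linux machine and run lsusb; a device with ID 900e or 9091 should be listed.
At that point, capture the dump with Qualcomm QPST Configuration. (Install QPST, open QPST Configuration, and connect the phone to the PC; if the device is 900e, the dump is captured automatically.)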
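The lsusb check can also be automated. The sketch below scans lsusb output for the two product IDs mentioned above. It assumes the usual "ID vendor:product" field in each lsusb line and matches on the product ID only; the function name is hypothetical.

```python
import re
from typing import List

# Product IDs named in the text above
DUMP_PRODUCT_IDS = {"900e", "9091"}

# Assumed lsusb field: "ID vvvv:pppp"
_LSUSB_ID = re.compile(r"\bID ([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\b")


def find_dump_devices(lsusb_output: str) -> List[str]:
    """Return "vendor:product" for each listed device whose product ID matches."""
    found = []
    for line in lsusb_output.splitlines():
        match = _LSUSB_ID.search(line)
        if match and match.group(2).lower() in DUMP_PRODUCT_IDS:
            found.append(f"{match.group(1).lower()}:{match.group(2).lower()}")
    return found
```

Feeding it the text printed by lsusb gives an empty list while the phone is in normal mode and a non-empty list once it has entered dump mode.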
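Note: many watchdog problems are caused by threads stuck in D state, so analyzing them requires a ramdump. When testing, it is best to run setprop persist.sys.crashOnWatchdog true. With this set, a watchdog problem automatically puts the phone into ramdump capture mode, which preserves as much of the failure state as possible for later analysis.
2. Setting up the QPST environment
Installer download link: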
http://note.youdao.com/noteshare?id=4b317b88f46638ec8af54953864f7116
Extract and install each of the following:
1.qpst.win.2.7_installer_00472.4.zip
2.qxdm.win.4.0_installer_00210.1.zip
3.QUD.WIN.1.1+Installer-10039.2.rar
II. How to analyze a kernel ramdump
1. Installing the crash tool
First install the crash tool. Releases can be downloaded from:
https://github.com/crash-utility/crash/releases
2. Loading the ramdump
A captured ramdump contains roughly the following files:
pzc@pzc-K56CM:~/log/C8/c8-ramdump$ ls
CODERAM.BIN DDRCS1_0.BIN dump_info.txt IPA_HRAM.BIN IPA_SRAM.BIN logcat.bin PART_BIN.BIN PMON_HIS.BIN vmlinux-ee0535c
DATARAM.BIN DDRCS1_1.BIN IPA_DICT.BIN IPA_IRAM.BIN lastkmsg.txt MSGRAM.BIN PIMEM.BIN RST_STAT.BIN
DDRCS0_0.BIN DDR_DATA.BIN IPA_DRAM.BIN IPA_MBOX.BIN load.cmm OCIMEM.BIN PMIC_PON.BIN tz_log.txt
Step 1:
pzc@pzc-K56CM:~/log/C8/c8-ramdump$ hexdump -e '16/4 "%08x " "\n"' -s 0x03f6d4 -n 8 OCIMEM.BIN
94800000 0000000a
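This gives the value for --kaslr. The two 32-bit words are printed low word first (94800000, then 0000000a), so the 64-bit address is 0xa94800000.
Step 2:
Make sure the address passed to --kaslr is correct: 0xa94800000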
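Joining the two hexdump words by hand is easy to get wrong, so here is a small sketch that reads the same 8 bytes and decodes them as one little-endian 64-bit value. It assumes the dump is little-endian, which matches the hexdump output above; the offset comes from the command in step 1, and the function names are hypothetical.

```python
import struct

# Offset passed to hexdump -s in step 1
KASLR_OFFSET_IN_OCIMEM = 0x03F6D4


def kaslr_from_bytes(raw: bytes) -> int:
    """Decode 8 bytes as a little-endian unsigned 64-bit integer."""
    if len(raw) != 8:
        raise ValueError("expected exactly 8 bytes")
    (value,) = struct.unpack("<Q", raw)
    return value


def kaslr_from_ocimem(path: str, offset: int = KASLR_OFFSET_IN_OCIMEM) -> int:
    """Read the 8 bytes at the given offset of OCIMEM.BIN and decode them."""
    with open(path, "rb") as dump:
        dump.seek(offset)
        return kaslr_from_bytes(dump.read(8))
```

For the dump in this example, hex(kaslr_from_ocimem("OCIMEM.BIN")) should give the same 0xa94800000 that was assembled by hand.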
pzc@pzc-K56CM:~/log/C8/c8-ramdump$ crash64 vmlinux-ee0535c DDRCS0_0.BIN@0x0000000080000000,DDRCS0_1.BIN@0x0000000100000000,DDRCS1_0.BIN@0x0000000140000000,DDRCS1_1.BIN@0x00000001c0000000 --kaslr 0xa94800000
crash64 7.1.9
Copyright (C) 2002-2016 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-unknown-linux-gnu --target=aarch64-elf-linux"...
please wait... (patching 161877 gdb minimal_symbol values)
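The crash command line above is long and easy to mistype. A minimal sketch of assembling it from a list of (file, load address) pairs follows; the addresses are simply the ones used in the command above and depend on the device's memory map, and the helper name is hypothetical.

```python
from typing import List, Sequence, Tuple


def build_crash_command(
    vmlinux: str,
    segments: Sequence[Tuple[str, int]],
    kaslr: int,
    crash_bin: str = "crash64",
) -> List[str]:
    """Build the argument list for crash from DDR segment files and a KASLR value."""
    # Each segment becomes FILE@0x<16 hex digits>, joined with commas.
    mapping = ",".join(f"{name}@0x{addr:016x}" for name, addr in segments)
    return [crash_bin, vmlinux, mapping, "--kaslr", f"0x{kaslr:x}"]


# Segment layout copied from the command shown above
SEGMENTS = [
    ("DDRCS0_0.BIN", 0x80000000),
    ("DDRCS0_1.BIN", 0x100000000),
    ("DDRCS1_0.BIN", 0x140000000),
    ("DDRCS1_1.BIN", 0x1C0000000),
]
```

" ".join(build_crash_command("vmlinux-ee0535c", SEGMENTS, 0xa94800000)) reproduces the command line used in step 2, and the list itself can be passed to subprocess.run.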
3. Analyzing the ramdump
After about two minutes, crash drops into its interactive prompt:
WARNING: cannot determine starting stack frame for task ffffffce2f11cb00
KERNEL: vmlinux-ee0535c
DUMPFILES: /var/tmp/ramdump_elf_uvwal1 [temporary ELF header]
DDRCS0_0.BIN
DDRCS1_0.BIN
DDRCS1_1.BIN
CPUS: 8
DATE: Thu Jan 4 09:26:45 2018
UPTIME: 00:02:09
LOAD AVERAGE: 6.68, 2.96, 1.12
TASKS: 2833
NODENAME: localhost
RELEASE: 4.4.21-perf-g91f9a92-00622-gee0535c
VERSION: #1 SMP PREEMPT Thu Dec 21 03:26:45 CST 2017
MACHINE: aarch64 (unknown Mhz)
MEMORY: 5.7 GB
PANIC: "Unable to handle kernel NULL pointer dereference at virtual address 00000200"
PID: 0
COMMAND: "swapper/0"
TASK: ffffff8a9ec15750 (1 of 8) [THREAD_INFO: ffffff8a9ec00000]
CPU: 0
STATE: TASK_RUNNING
WARNING: panic task not found
crash64>
We can list the processes currently in D (uninterruptible) state:
crash64> ps | grep "UN"
59 2 1 ffffffce34ddbe80 UN 0.0 0 0 [kworker/u16:1]
163 2 0 ffffffcd35344b00 UN 0.0 0 0 [mdss_dsi_event]
326 2 3 ffffffce33031900 UN 0.0 0 0 [irq/265-synapti]
431 2 0 ffffffcd349bf080 UN 0.0 0 0 [mmc-cmdqd/0]
501 2 2 ffffffcd3451e400 UN 0.0 0 0 [msm-core:sampli]
692 1 0 ffffffce2f11d780 UN 0.5 184732 41064 surfaceflinger
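Filtering with grep "UN" matches the string anywhere in the line, so a command name containing "UN" would also be listed. If the ps output is saved to a file, a stricter filter can test the state column itself. This is a sketch only, assuming the column order seen in the lines above (PID, PPID, CPU, TASK, ST, %MEM, VSZ, RSS, COMM); the names are hypothetical.

```python
from typing import List, NamedTuple


class Task(NamedTuple):
    pid: int
    cpu: int
    task: str   # task_struct address as printed by crash
    comm: str


def parse_uninterruptible(ps_output: str) -> List[Task]:
    """Return the tasks whose ST column is exactly UN."""
    tasks = []
    for line in ps_output.splitlines():
        stripped = line.strip()
        # crash marks tasks that are active on a CPU with a leading ">".
        if stripped.startswith(">"):
            stripped = stripped[1:]
        fields = stripped.split(None, 8)
        if len(fields) < 9 or not fields[0].isdigit():
            continue  # header or unrelated line
        if fields[4] == "UN":
            tasks.append(
                Task(int(fields[0]), int(fields[2]), fields[3], fields[8])
            )
    return tasks
```

Each returned entry carries the PID that can be passed to the set command and the task address that appears in the TASK field.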
Switch to process 326:
crash64> set 326
PID: 326
COMMAND: "irq/265-synapti"
TASK: ffffffce33031900 [THREAD_INFO: ffffffcd34fec000]
CPU: 3
STATE: TASK_UNINTERRUPTIBLE
crash64>
View the call stack of the current process:
crash64> bt
PID: 326 TASK: ffffffce33031900 CPU: 3 COMMAND: "irq/265-synapti"
#0 [ffffffcd34fef360] __switch_to at ffffff8a9c885560
#1 [ffffffcd34fef390] __schedule at ffffff8a9d6ecd18
#2 [ffffffcd34fef3f0] schedule at ffffff8a9d6ed07c
#3 [ffffffcd34fef410] do_exit at ffffff8a9c8a3d7c
#4 [ffffffcd34fef480] die at ffffff8a9c88864c
#5 [ffffffcd34fef4d0] __do_kernel_fault at ffffff8a9c8991a0
#6 [ffffffcd34fef500] do_translation_fault at ffffff8a9c8975dc
#7 [ffffffcd34fef540] do_mem_abort at ffffff8a9c880ad8
#8 [ffffffcd34fef720] el1_da at ffffff8a9c883cf8
PC: ffffff8a9c8bc178 [kthread_data+4]
LR: ffffff8a9c8f74a8 [irq_thread_dtor+68]
SP: ffffffcd34fef720 PSTATE: 60000145
X29: ffffffcd34fef720 X28: ffffffcd34fec000 X27: 0000000000000005
X26: 0000000000000001 X25: ffffff8a9ec05000 X24: ffffffcd34fef7d0
X23: ffffff8a9ec17000 X22: 0000000000000000 X21: ffffff8a9ef8f000
X20: ffffffce33031900 X19: ffffffce33031900 X18: 0000000000000010
X17: 000000000000000e X16: 0000000000000007 X15: ffffff8a9d8c0000
X14: 2d6d64742d696164 X13: 00000000001c1f9e X12: 0000000000989680
X11: 0000000041acdf40 X10: ffffffce3d4ffc78 X9: ffffffce3d4ffc88
X8: ffffffcd34f4f320 X7: 0000000000000000 X6: 0000000000000004
X5: 00000000036399ed X4: ffffffce33032018 X3: 0000000000000000
X2: 0000000000000000 X1: ffffff8a9c8f7464 X0: 0000000000000000
#9 [ffffffcd34fef740] task_work_run at ffffff8a9c8ba24c
#10 [ffffffcd34fef770] do_exit at ffffff8a9c8a4074
#11 [ffffffcd34fef7e0] die at ffffff8a9c88864c
#12 [ffffffcd34fef830] __do_kernel_fault at ffffff8a9c8991a0
#13 [ffffffcd34fef860] do_page_fault at ffffff8a9c8974d0
#14 [ffffffcd34fef8d0] do_translation_fault at ffffff8a9c897574
#15 [ffffffcd34fef910] do_mem_abort at ffffff8a9c880ad8
#16 [ffffffcd34fefaf0] el1_da at ffffff8a9c883cf8
PC: ffffff8a9cfca228 [synaptics_rmi4_add_and_update_tp_data+36]
LR: ffffff8a9cfa9ff0 [input_event+524]
SP: ffffffcd34fefaf0 PSTATE: 80000005
X29: ffffffcd34fefaf0 X28: 0000000000000000 X27: 0000000000000005
X26: 0000000000000001 X25: ffffffcd34849000 X24: 0000000000000003
X23: ffffff8a9ec06000 X22: 0000000000000036 X21: ffffffcd34fefbf8
X20: ffffff8a9ec06000 X19: ffffffcd34848800 X18: 0000000000000060
X17: 000000000000000e X16: 0000000000000007 X15: ffffff8a9d8c0000
X14: 0000000000000000 X13: 00000000001b1c92 X12: 0000000000989680
X11: 0000000040ffdb77 X10: 00000000000008c0 X9: ffffffcd34fec000
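The backtrace above contains two exception frames, each introduced by a PC: line. Frames are listed innermost first, so the frame further down (synaptics_rmi4_add_and_update_tp_data+36) appears to be the original fault, while the one in kthread_data was hit later, when the dying thread was exiting. The sketch below pulls these PC symbols out of saved bt output. It assumes the "PC: address [symbol+offset]" layout shown above, and the function name is hypothetical.

```python
import re
from typing import List

# Assumed layout: "PC: <hex address> [<symbol>+<offset>]"
_PC_LINE = re.compile(r"^\s*PC:\s+[0-9a-fA-F]+\s+\[([^\]]+)\]")


def exception_pcs(bt_output: str) -> List[str]:
    """Return the symbol+offset of every exception frame, in listing order."""
    hits = []
    for line in bt_output.splitlines():
        match = _PC_LINE.match(line)
        if match:
            hits.append(match.group(1))
    return hits
```

With the output above, the last element of the returned list is the touch driver function, which is where a closer look would start.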
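That covers the environment setup and a first pass at debugging; each concrete problem has to be analyzed case by case. A worked analysis example will follow later.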