Kernel ramdump analysis: how to start crash

I. How to capture a kernel ramdump

1. Preparing the phone

Step 1:
From the root of the source tree, run:

    python vendor/xiaomi/securebootsigner/Qualcomm/tools/debugpolicy.py

The phone then reboots automatically.
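Step 2:
After the reboot, root access is required:

    adb root
    adb shell "echo 1 > /sys/module/msm_poweroff/parameters/download_mode"

To confirm that download mode is enabled:

    adb shell "cat /sys/module/msm_poweroff/parameters/download_mode"

A return value of 1 means it is enabled.
If the phone reboots, step 2 has to be run again.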
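
Because step 2 has to be repeated after every reboot, it can be convenient to script it. The following is only a minimal sketch: it assumes `adb` is on the PATH and a single device is attached, and the helper names `adb` and `enable_download_mode` are made up for this example.

```python
import subprocess

# sysfs node taken from the commands above
DOWNLOAD_MODE = "/sys/module/msm_poweroff/parameters/download_mode"


def adb(*args, runner=subprocess.run):
    """Run one adb command and return its stdout with whitespace stripped."""
    result = runner(["adb", *args], capture_output=True, text=True, check=True)
    return result.stdout.strip()


def enable_download_mode(runner=subprocess.run):
    """Write 1 to the download_mode node and report whether it reads back as 1."""
    adb("root", runner=runner)
    # adbd restarts after `adb root`, so wait until the device is back
    adb("wait-for-device", runner=runner)
    adb("shell", f"echo 1 > {DOWNLOAD_MODE}", runner=runner)
    return adb("shell", f"cat {DOWNLOAD_MODE}", runner=runner) == "1"
```

Calling `enable_download_mode()` after each reboot returns `True` once the node reads back as 1. The `runner` parameter exists only so the function can be exercised without a phone attached.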
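Once the problem reproduces, a low-level (kernel) reset leaves the phone on a black screen. Connect it to a Linux host and run lsusb; a device with product ID 900e or 9091 should appear.
At that point, capture the dump with Qualcomm's QPST Configuration: install QPST, open QPST Configuration, and connect the phone to the PC. If the device shows up as 900e, the dump is captured automatically.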
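
The lsusb check can also be done programmatically. The sketch below parses lsusb text and looks for the two product IDs mentioned above. It assumes the device enumerates under Qualcomm's USB vendor ID 05c6, which the text does not state, and the function name `find_dump_devices` is made up for this example.

```python
import re

QUALCOMM_VID = "05c6"         # assumption: Qualcomm's USB vendor ID
DUMP_PIDS = {"900e", "9091"}  # product IDs named in the text above


def find_dump_devices(lsusb_output):
    """Return (vid, pid) pairs from lsusb text that look like a phone in dump mode."""
    found = []
    for line in lsusb_output.splitlines():
        match = re.search(r"ID ([0-9a-f]{4}):([0-9a-f]{4})", line, re.IGNORECASE)
        if not match:
            continue
        vid, pid = match.group(1).lower(), match.group(2).lower()
        if vid == QUALCOMM_VID and pid in DUMP_PIDS:
            found.append((vid, pid))
    return found
```

Feeding it the output of `subprocess.run(["lsusb"], capture_output=True, text=True).stdout` gives an empty list while the phone is still running normally.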
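Note: many watchdog problems are caused by threads stuck in the D (uninterruptible sleep) state, so analyzing them requires a ramdump. During testing it is best to run setprop persist.sys.crashOnWatchdog true. With that set, a watchdog problem automatically puts the phone into ramdump capture mode, which preserves as much of the failure state as possible for later analysis.

2. Setting up the QPST environment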

Download the installation packages from:
http://note.youdao.com/noteshare?id=4b317b88f46638ec8af54953864f7116
Extract and install each of the following:
1.qpst.win.2.7_installer_00472.4.zip
2.qxdm.win.4.0_installer_00210.1.zip
3.QUD.WIN.1.1+Installer-10039.2.rar

II. How to analyze a kernel ramdump
1. Installing the crash tool

First, install the crash tool. Releases can be downloaded from:
https://github.com/crash-utility/crash/releases

2. How to load the ramdump

The captured ramdump files look roughly like this:

    pzc@pzc-K56CM:~/log/C8/c8-ramdump$ ls
    CODERAM.BIN   DDRCS1_0.BIN  dump_info.txt  IPA_HRAM.BIN  IPA_SRAM.BIN  logcat.bin  PART_BIN.BIN  PMON_HIS.BIN  vmlinux-ee0535c
    DATARAM.BIN   DDRCS1_1.BIN  IPA_DICT.BIN   IPA_IRAM.BIN  lastkmsg.txt  MSGRAM.BIN  PIMEM.BIN     RST_STAT.BIN
    DDRCS0_0.BIN  DDR_DATA.BIN  IPA_DRAM.BIN   IPA_MBOX.BIN  load.cmm      OCIMEM.BIN  PMIC_PON.BIN  tz_log.txt

Step 1:

    pzc@pzc-K56CM:~/log/C8/c8-ramdump$ hexdump  -e '16/4 "%08x " "\n"' -s 0x03f6d4 -n 8 OCIMEM.BIN
    94800000 0000000a

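This yields the value for --kaslr. hexdump prints the two 32-bit words low word first (94800000 0000000a), so the 64-bit value is 0xa94800000.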
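
The same value can be read without assembling the two words by hand. This is a small sketch that decodes the 8 bytes at the offset used in the hexdump command as one little-endian 64-bit integer; the function name `read_kaslr_offset` is made up for this example, and the field offset 0x03f6d4 is simply the one used above, so it may differ on other platforms.

```python
import struct

KASLR_FIELD_OFFSET = 0x03F6D4  # same offset as `hexdump -s 0x03f6d4` above


def read_kaslr_offset(ocimem, field_offset=KASLR_FIELD_OFFSET):
    """Decode the 8-byte little-endian value stored at field_offset in OCIMEM.BIN."""
    (value,) = struct.unpack_from("<Q", ocimem, field_offset)
    return value
```

For example, `hex(read_kaslr_offset(open("OCIMEM.BIN", "rb").read()))` should give the string to pass after `--kaslr`.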

Step 2:

Make sure the address passed to --kaslr is correct (0xa94800000):

    pzc@pzc-K56CM:~/log/C8/c8-ramdump$ crash64 vmlinux-ee0535c DDRCS0_0.BIN@0x0000000080000000,DDRCS0_1.BIN@0x0000000100000000,DDRCS1_0.BIN@0x0000000140000000,DDRCS1_1.BIN@0x00000001c0000000 --kaslr 0xa94800000
     
    crash64 7.1.9
    Copyright (C) 2002-2016  Red Hat, Inc.
    Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
    Copyright (C) 1999-2006  Hewlett-Packard Co
    Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
    Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
    Copyright (C) 2005, 2011  NEC Corporation
    Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
    Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
    This program is free software, covered by the GNU General Public License,
    and you are welcome to change it and/or distribute copies of it under
    certain conditions.  Enter "help copying" to see the conditions.
    This program has absolutely no warranty.  Enter "help warranty" for details.
     
    GNU gdb (GDB) 7.6
    Copyright (C) 2013 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "--host=x86_64-unknown-linux-gnu --target=aarch64-elf-linux"...
    please wait... (patching 161877 gdb minimal_symbol values)
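
The crash command line is long and easy to mistype, so it can help to generate it. The sketch below builds the argument list from (file, base address) pairs. The helper name `build_crash_command` is made up for this example, and the base addresses shown in the usage note are just the ones from the command above; for another device they would have to come from that dump's own metadata.

```python
def build_crash_command(vmlinux, segments, kaslr_offset, crash_bin="crash64"):
    """Build the crash argument list from (dump file, physical base address) pairs."""
    # Each DDR segment is passed as FILE@0x<16 hex digits>, comma separated
    dumps = ",".join(f"{name}@{base:#018x}" for name, base in segments)
    return [crash_bin, vmlinux, dumps, "--kaslr", f"{kaslr_offset:#x}"]
```

For instance, `build_crash_command("vmlinux-ee0535c", [("DDRCS0_0.BIN", 0x80000000), ("DDRCS1_0.BIN", 0x140000000)], 0xa94800000)` returns a list that can be handed to `subprocess.run` from inside the dump directory.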

3. Analyzing the ramdump

After about two minutes, crash drops into its interactive prompt:

    WARNING: cannot determine starting stack frame for task ffffffce2f11cb00
          KERNEL: vmlinux-ee0535c           
       DUMPFILES: /var/tmp/ramdump_elf_uvwal1 [temporary ELF header]
                  DDRCS0_0.BIN
                  DDRCS1_0.BIN
                  DDRCS1_1.BIN
            CPUS: 8
            DATE: Thu Jan  4 09:26:45 2018
          UPTIME: 00:02:09
    LOAD AVERAGE: 6.68, 2.96, 1.12
           TASKS: 2833
        NODENAME: localhost
         RELEASE: 4.4.21-perf-g91f9a92-00622-gee0535c
         VERSION: #1 SMP PREEMPT Thu Dec 21 03:26:45 CST 2017
         MACHINE: aarch64  (unknown Mhz)
          MEMORY: 5.7 GB
           PANIC: "Unable to handle kernel NULL pointer dereference at virtual address 00000200"
             PID: 0
         COMMAND: "swapper/0"
            TASK: ffffff8a9ec15750  (1 of 8)  [THREAD_INFO: ffffff8a9ec00000]
             CPU: 0
           STATE: TASK_RUNNING
         WARNING: panic task not found
     
     
    crash64>

List the processes that are currently in the D state:

    crash64> ps | grep "UN"
         59      2   1  ffffffce34ddbe80  UN   0.0       0      0  [kworker/u16:1]
        163      2   0  ffffffcd35344b00  UN   0.0       0      0  [mdss_dsi_event]
        326      2   3  ffffffce33031900  UN   0.0       0      0  [irq/265-synapti]
        431      2   0  ffffffcd349bf080  UN   0.0       0      0  [mmc-cmdqd/0]
        501      2   2  ffffffcd3451e400  UN   0.0       0      0  [msm-core:sampli]
        692      1   0  ffffffce2f11d780  UN   0.5  184732  41064  surfaceflinger
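
When the ps output has been saved to a file, the D-state tasks can also be filtered with a short script. This sketch assumes the column order shown above (PID, PPID, CPU, TASK, ST, %MEM, VSZ, RSS, COMM); the function names are made up for this example.

```python
def parse_ps_line(line):
    """Split one line of crash `ps` output into a dict, or return None if it is not a task row."""
    # crash marks tasks that are active on a CPU with a leading ">"
    fields = line.strip().lstrip(">").split()
    if len(fields) < 9 or not fields[0].isdigit():
        return None
    return {
        "pid": int(fields[0]),
        "ppid": int(fields[1]),
        "cpu": int(fields[2]),
        "task": fields[3],
        "state": fields[4],
        "comm": " ".join(fields[8:]),
    }


def uninterruptible_tasks(ps_output):
    """Return the parsed rows whose state column is UN (the D state)."""
    rows = (parse_ps_line(line) for line in ps_output.splitlines())
    return [row for row in rows if row and row["state"] == "UN"]
```

Matching on the state column rather than grepping the whole line avoids false hits on command names that happen to contain "UN".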

Switch to process 326:

    crash64> set 326
        PID: 326
    COMMAND: "irq/265-synapti"
       TASK: ffffffce33031900  [THREAD_INFO: ffffffcd34fec000]
        CPU: 3
      STATE: TASK_UNINTERRUPTIBLE
    crash64>

View the call stack of the current process:

    crash64> bt
    PID: 326    TASK: ffffffce33031900  CPU: 3   COMMAND: "irq/265-synapti"
     #0 [ffffffcd34fef360] __switch_to at ffffff8a9c885560
     #1 [ffffffcd34fef390] __schedule at ffffff8a9d6ecd18
     #2 [ffffffcd34fef3f0] schedule at ffffff8a9d6ed07c
     #3 [ffffffcd34fef410] do_exit at ffffff8a9c8a3d7c
     #4 [ffffffcd34fef480] die at ffffff8a9c88864c
     #5 [ffffffcd34fef4d0] __do_kernel_fault at ffffff8a9c8991a0
     #6 [ffffffcd34fef500] do_translation_fault at ffffff8a9c8975dc
     #7 [ffffffcd34fef540] do_mem_abort at ffffff8a9c880ad8
     #8 [ffffffcd34fef720] el1_da at ffffff8a9c883cf8
         PC: ffffff8a9c8bc178  [kthread_data+4]
         LR: ffffff8a9c8f74a8  [irq_thread_dtor+68]
         SP: ffffffcd34fef720  PSTATE: 60000145
        X29: ffffffcd34fef720  X28: ffffffcd34fec000  X27: 0000000000000005
        X26: 0000000000000001  X25: ffffff8a9ec05000  X24: ffffffcd34fef7d0
        X23: ffffff8a9ec17000  X22: 0000000000000000  X21: ffffff8a9ef8f000
        X20: ffffffce33031900  X19: ffffffce33031900  X18: 0000000000000010
        X17: 000000000000000e  X16: 0000000000000007  X15: ffffff8a9d8c0000
        X14: 2d6d64742d696164  X13: 00000000001c1f9e  X12: 0000000000989680
        X11: 0000000041acdf40  X10: ffffffce3d4ffc78   X9: ffffffce3d4ffc88
         X8: ffffffcd34f4f320   X7: 0000000000000000   X6: 0000000000000004
         X5: 00000000036399ed   X4: ffffffce33032018   X3: 0000000000000000
         X2: 0000000000000000   X1: ffffff8a9c8f7464   X0: 0000000000000000
     #9 [ffffffcd34fef740] task_work_run at ffffff8a9c8ba24c
    #10 [ffffffcd34fef770] do_exit at ffffff8a9c8a4074
    #11 [ffffffcd34fef7e0] die at ffffff8a9c88864c
    #12 [ffffffcd34fef830] __do_kernel_fault at ffffff8a9c8991a0
    #13 [ffffffcd34fef860] do_page_fault at ffffff8a9c8974d0
    #14 [ffffffcd34fef8d0] do_translation_fault at ffffff8a9c897574
    #15 [ffffffcd34fef910] do_mem_abort at ffffff8a9c880ad8
    #16 [ffffffcd34fefaf0] el1_da at ffffff8a9c883cf8
         PC: ffffff8a9cfca228  [synaptics_rmi4_add_and_update_tp_data+36]
         LR: ffffff8a9cfa9ff0  [input_event+524]
         SP: ffffffcd34fefaf0  PSTATE: 80000005
        X29: ffffffcd34fefaf0  X28: 0000000000000000  X27: 0000000000000005
        X26: 0000000000000001  X25: ffffffcd34849000  X24: 0000000000000003
        X23: ffffff8a9ec06000  X22: 0000000000000036  X21: ffffffcd34fefbf8
        X20: ffffff8a9ec06000  X19: ffffffcd34848800  X18: 0000000000000060
        X17: 000000000000000e  X16: 0000000000000007  X15: ffffff8a9d8c0000
        X14: 0000000000000000  X13: 00000000001b1c92  X12: 0000000000989680
        X11: 0000000040ffdb77  X10: 00000000000008c0   X9: ffffffcd34fec000
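
The addresses in this backtrace are runtime addresses, shifted by the KASLR offset found earlier. To look one up in the unrelocated vmlinux with an external tool such as addr2line, the offset has to be subtracted first. The sketch below extracts the PC lines from saved bt output and does that subtraction. It assumes the offset is 0xa94800000 as above and that crash prints the `+offset` part in decimal (its default radix); the function names are made up for this example.

```python
import re

KASLR_OFFSET = 0xA94800000  # value passed to --kaslr above

PC_LINE = re.compile(r"PC:\s+([0-9a-f]{16})\s+\[(\S+?)\+(\d+)\]")


def to_link_address(runtime_addr, kaslr_offset=KASLR_OFFSET):
    """Convert a runtime kernel address to the address used in the unrelocated vmlinux."""
    return runtime_addr - kaslr_offset


def exception_pcs(bt_output):
    """Return (symbol, offset, link-time address) for every PC line in bt output."""
    results = []
    for match in PC_LINE.finditer(bt_output):
        addr, symbol, offset = match.groups()
        results.append((symbol, int(offset), to_link_address(int(addr, 16))))
    return results
```

For the second exception frame above, this maps ffffff8a9cfca228 in synaptics_rmi4_add_and_update_tp_data to the link-time address ffffff80087ca228.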


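That covers the environment setup and initial debugging. Each problem still needs its own analysis; a later post will walk through a worked example.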
 
