ARM Linux Crash Analysis (3) - A Worked Example of a Kernel Crash

The test code is as follows:

#include <linux/init.h>		/* __init, __exit */
#include <linux/kernel.h>	/* printk() */
#include <linux/module.h>	/* module_init(), module_exit(), MODULE_* */

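/* Deliberately write through a NULL pointer to trigger an Oops. */
static void test_crash(void)
{
	char *pstr = NULL;

	printk("drivr crash \n");
	*pstr = 12;		/* faulting store: writes to address 0 */
	printk("%s \n", pstr);	/* never reached */
}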

static int __init test_init(void)
{
	printk("drivr test \n");
	
	test_crash();
	return  0;
}


static void __exit test_exit(void)
{
	
	printk("drivr exit   \n");
	return ;
}


module_init(test_init);
module_exit(test_exit);

MODULE_AUTHOR("Alex");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("test driver");
MODULE_VERSION("v0.0.1");

Switch the driver to built-in mode so that it is compiled into the kernel image:

obj-y += driver_test.o

Reflash the kernel.
Booting then fails with the following error:

……
drivr test 
drivr crash 
Unable to handle kernel NULL pointer dereference at virtual address 00000000
Mem abort info:
  ESR = 0x96000045
  Exception class = DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x00000069
  CM = 0, WnR = 1
pgd = ffffff8008979000
[00000000] *pgd=0000000063bfe003, *pud=0000000063bfe003
, *pmd=0000000000000000
Internal error: Oops: 96000045 [#1] SMP
Modules linked in:
CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.9.37 #4
Hardware name: Hisilicon HI3559AV100 DEMO Board (DT)
task: ffffffc022458000 task.stack: ffffffc022460000
PC is at test_init+0x34/0x48
LR is at test_init+0x20/0x48
pc : [<ffffff8008898cbc>] lr : [<ffffff8008898ca8>] pstate: 60000005
sp : ffffffc022463dc0
x29: ffffffc022463dc0 x28: 0000000000000000 
x27: ffffff80088ae9e0 x26: ffffff80088a5268 
x25: ffffff80088731d8 x24: ffffff8008880458 
x23: ffffff800892a000 x22: 0000000000000006 
x21: 0000000000000000 x20: ffffff8008898c88 
x19: ffffffc022460000 x18: ffffffc021f39085 
x17: 000000000000000b x16: 0000000000000001 
x15: 0000000000008000 x14: 0000000000000095 
x13: 0000000000000000 x12: 0000000000000007 
x11: 0000000000000002 x10: 0000000000000096 
x9 : 000000000000000d x8 : ffffffc023b79f54 
x7 : 0000000000000000 x6 : ffffff8008930d84 
x5 : 000000000000000a x4 : 0000000000000000 
x3 : 000000000000000c x2 : 0000000000000000 
x1 : 0000000000000000 x0 : ffffff80087e94b8 

Process swapper/0 (pid: 1, stack limit = 0xffffffc022460020)
Stack: (0xffffffc022463dc0 to 0xffffffc022464000)
3dc0: ffffffc022463dd0 ffffff8008083108 ffffffc022463e40 ffffff8008880bf8
3de0: 00000000000000a6 ffffff800892a000 ffffff80088a5278 0000000000000006
3e00: ffffff80088ae600 0000000000000000 ffffff800892a000 ffffff80087afce8
3e20: 0000000600000006 ffffff8008880458 0000000000000000 ffffff80088731d8
3e40: ffffffc022463ea0 ffffff800869bc28 ffffff800869bc18 0000000000000000
3e60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3e80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3ea0: 0000000000000000 ffffff8008082ee0 ffffff800869bc18 0000000000000000
3ec0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3ee0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3f00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3f20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3f40: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3f60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3f80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3fa0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
3fc0: 0000000000000000 0000000000000005 0000000000000000 0000000000000000
3fe0: 0000000000000000 0000000000000000 add5d5aae5136836 8710e15983171f1a
Call trace:
Exception stack(0xffffffc022463bf0 to 0xffffffc022463d20)
3be0:                                   ffffffc022460000 0000007fffffffff
3c00: ffffffc022463dc0 ffffff8008898cbc ffffff8008933198 000000000000000c
3c20: ffffffc022463c40 ffffff80080db544 ffffff80087b21f0 00000000088e6310
3c40: ffffffc022463ce0 ffffff80080db828 ffffffc022460000 ffffff8008898c88
3c60: 0000000000000000 0000000000000006 ffffff800892a000 ffffff8008880458
3c80: ffffff80088731d8 ffffff80088a5268 ffffff80087e94b8 0000000000000000
3ca0: 0000000000000000 000000000000000c 0000000000000000 000000000000000a
3cc0: ffffff8008930d84 0000000000000000 ffffffc023b79f54 000000000000000d
3ce0: 0000000000000096 0000000000000002 0000000000000007 0000000000000000
3d00: 0000000000000095 0000000000008000 0000000000000001 000000000000000b
[<ffffff8008898cbc>] test_init+0x34/0x48
[<ffffff8008083108>] do_one_initcall+0x38/0x128
[<ffffff8008880bf8>] kernel_init_freeable+0x144/0x1e0
[<ffffff800869bc28>] kernel_init+0x10/0x100
[<ffffff8008082ee0>] ret_from_fork+0x10/0x30
Code: 52800183 b0fffa80 d2800001 9112e000 (39000043) 
---[ end trace 81c2d58de8ff7b35 ]---
Kernel panic - not syncing: Fatal exception
SMP: stopping secondary CPUs
Kernel Offset: disabled
Memory Limit: 512 MB
---[ end Kernel panic - not syncing: Fatal exception

Identifying the faulting function directly
Look at this line:

PC is at test_init+0x34/0x48

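When an error occurs, the value we care about most is the PC, because it is the address of the instruction that faulted. Here test_init+0x34/0x48 means the faulting instruction sits 0x34 bytes into test_init, a function that is 0x48 bytes long.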
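The symbol+offset/size notation can be cross-checked with a little arithmetic. The following is a minimal user-space sketch, not kernel code; the names sym_ref, parse_sym_ref and func_start are hypothetical helpers introduced only for illustration. It parses the string from the Oops and subtracts the offset from the pc value to recover the start address of the function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper type: one "symbol+offset/size" reference from an Oops. */
struct sym_ref {
    char name[64];      /* symbol name, e.g. "test_init" */
    uint64_t offset;    /* byte offset of the address inside the symbol */
    uint64_t size;      /* total size of the symbol in bytes */
};

/* Parse e.g. "test_init+0x34/0x48". Returns 0 on success, -1 on bad input. */
int parse_sym_ref(const char *s, struct sym_ref *r)
{
    unsigned long long off, size;

    assert(s != NULL && r != NULL);
    if (sscanf(s, "%63[^+]+%llx/%llx", r->name, &off, &size) != 3)
        return -1;
    if (off >= size)    /* the offset must lie inside the function */
        return -1;
    r->offset = off;
    r->size = size;
    return 0;
}

/* Start address of the function that contains addr. */
uint64_t func_start(uint64_t addr, const struct sym_ref *r)
{
    assert(r != NULL);
    return addr - r->offset;
}
```

With the values from this Oops, pc = ffffff8008898cbc and offset 0x34 give a function start of ffffff8008898c88. The same start address results from lr = ffffff8008898ca8 and its offset 0x20, so both addresses lie inside test_init.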

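(1) Identify the cause of the error.

The message "Unable to handle kernel NULL pointer dereference at virtual address 00000000"
shows that the kernel faulted on an illegal address access: it dereferenced a null pointer.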

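(2) Work out the call chain from the stack backtrace.

When the kernel crashes, the pc register tells you which function and which instruction were executing. In many cases, however, the error was introduced by a caller, so recovering the call chain is just as important.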

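Manually unwinding the stack from the Oops stack dump

As noted above, the pc value in the Oops identifies the function and instruction where the crash occurred, but the error may have been introduced by a caller, so the call chain has to be recovered as well.
Because this kernel is built with CONFIG_FRAME_POINTER, the Oops includes a stack backtrace. If the kernel is built without CONFIG_FRAME_POINTER, you can analyse the stack dump yourself to recover the call chain.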

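1. The role of the stack

A program consists of a code segment, a data segment, a BSS segment, a heap, and a stack. The data segment holds global data whose initial value is non-zero, the BSS segment holds global data whose initial value is 0, the heap serves dynamic memory allocation, and the stack implements function calls and holds local variables.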
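As a small user-space illustration of that layout (a sketch with hypothetical names, not part of the test driver), the comments below mark where a typical toolchain places each object:

```c
#include <assert.h>
#include <stdlib.h>

int initialized_global = 42;    /* non-zero initial value: data segment */
int zeroed_global;              /* initial value 0: BSS segment */

/* The function body itself lives in the code segment. */
int sum_from_all_regions(int n)
{
    int local = n;                      /* local variable: stack */
    int *heap = malloc(sizeof *heap);   /* dynamic allocation: heap */
    int result;

    assert(heap != NULL);
    *heap = initialized_global + zeroed_global;
    result = local + *heap;
    free(heap);
    return result;
}
```

Running a tool such as nm or objdump on the compiled object would typically list initialized_global in .data and zeroed_global in .bss, although the exact placement is up to the compiler and linker.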
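Before a called function runs its body, it saves some registers on the stack, including the return address register lr (x30 on arm64, saved together with the frame pointer x29). If you know the saved lr value, you know who the caller is. Walking up the stack dump one function at a time and collecting each saved lr value reveals every calling function. This is the principle behind stack unwinding.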
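The principle can be tried out on the stack dump from the Oops above. The sketch below is a user-space illustration under stated assumptions: it assumes the AArch64 frame record layout (the caller's x29 stored at [fp] and the saved x30 at [fp + 8]), and the array holds the words copied from the "Stack:" lines 3dc0 to 3ea0 of the log. The names stack_words and walk_frames are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_BASE 0xffffffc022463dc0ULL    /* sp at the time of the Oops */

/* Words copied from the "Stack:" dump, rows 3dc0..3ea0. */
static const uint64_t stack_words[] = {
    /* 3dc0 */ 0xffffffc022463dd0ULL, 0xffffff8008083108ULL,
               0xffffffc022463e40ULL, 0xffffff8008880bf8ULL,
    /* 3de0 */ 0x00000000000000a6ULL, 0xffffff800892a000ULL,
               0xffffff80088a5278ULL, 0x0000000000000006ULL,
    /* 3e00 */ 0xffffff80088ae600ULL, 0x0000000000000000ULL,
               0xffffff800892a000ULL, 0xffffff80087afce8ULL,
    /* 3e20 */ 0x0000000600000006ULL, 0xffffff8008880458ULL,
               0x0000000000000000ULL, 0xffffff80088731d8ULL,
    /* 3e40 */ 0xffffffc022463ea0ULL, 0xffffff800869bc28ULL,
               0xffffff800869bc18ULL, 0x0000000000000000ULL,
    /* 3e60 */ 0, 0, 0, 0,
    /* 3e80 */ 0, 0, 0, 0,
    /* 3ea0 */ 0x0000000000000000ULL, 0xffffff8008082ee0ULL,
               0xffffff800869bc18ULL, 0x0000000000000000ULL,
};

/*
 * Follow the chain of frame records starting at fp and store each saved
 * return address in out[]. Returns the number of addresses found.
 */
size_t walk_frames(uint64_t fp, uint64_t *out, size_t max)
{
    const size_t nwords = sizeof(stack_words) / sizeof(stack_words[0]);
    size_t n = 0;

    assert(out != NULL);
    while (fp >= STACK_BASE && (fp & 7) == 0 && n < max) {
        size_t idx = (size_t)((fp - STACK_BASE) / 8);
        uint64_t next;

        if (idx + 1 >= nwords)
            break;                      /* frame record outside the dump */
        out[n++] = stack_words[idx + 1];    /* saved lr (x30) */
        next = stack_words[idx];            /* caller's fp (x29) */
        if (next != 0 && next <= fp)
            break;                      /* frames must move up the stack */
        fp = next;                      /* fp == 0 ends the chain */
    }
    return n;
}
```

Starting from x29 = ffffffc022463dc0, the walk yields ffffff8008083108, ffffff8008880bf8, ffffff800869bc28 and ffffff8008082ee0. These are the return addresses that the Oops lists under "Call trace" for do_one_initcall, kernel_init_freeable, kernel_init and ret_from_fork.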

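Note

If the kernel is built without CONFIG_FRAME_POINTER (or without CONFIG_KALLSYMS, which supplies the symbol names), the Oops prints only: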

pc : [<ffffff8008898cbc>] lr : [<ffffff8008898ca8>]

That is, raw addresses with no symbol names.

In that case, disassemble vmlinux:

aarch64-himix100-linux-objdump -D vmlinux > vmlinux.dis

Search vmlinux.dis for the pc address
ffffff8008898cbc
which falls inside the following function:

ffffff8008898c88 <test_init>:
ffffff8008898c88:	a9bf7bfd 	stp	x29, x30, [sp, #-16]!
ffffff8008898c8c:	b0fffa80 	adrp	x0, ffffff80087e9000 
ffffff8008898c90:	91126000 	add	x0, x0, #0x498
ffffff8008898c94:	910003fd 	mov	x29, sp
ffffff8008898c98:	97e204c2 	bl	ffffff8008119fa0 
ffffff8008898c9c:	b0fffa80 	adrp	x0, ffffff80087e9000 
ffffff8008898ca0:	9112a000 	add	x0, x0, #0x4a8
ffffff8008898ca4:	97e204bf 	bl	ffffff8008119fa0 
ffffff8008898ca8:	d2800002 	mov	x2, #0x0                   	// #0
ffffff8008898cac:	52800183 	mov	w3, #0xc                   	// #12
ffffff8008898cb0:	b0fffa80 	adrp	x0, ffffff80087e9000 
ffffff8008898cb4:	d2800001 	mov	x1, #0x0                   	// #0
ffffff8008898cb8:	9112e000 	add	x0, x0, #0x4b8
ffffff8008898cbc:	39000043 	strb	w3, [x2]
ffffff8008898cc0:	97e204b8 	bl	ffffff8008119fa0 
ffffff8008898cc4:	52800000 	mov	w0, #0x0                   	// #0
ffffff8008898cc8:	a8c17bfd 	ldp	x29, x30, [sp], #16
ffffff8008898ccc:	d65f03c0 	ret

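This also locates the problem quickly. The instruction at ffffff8008898cbc is strb w3, [x2]: x2 was loaded with 0 at ffffff8008898ca8 and w3 with 12 at ffffff8008898cac, so it stores the byte 12 to address 0. That is the statement *pstr = 12 from test_crash(), which the compiler inlined into test_init.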

That's all!
