Kernel debugging case study (oops errors)

Original article: http://blog.csdn.net/willand1981/article/details/5715492. Thanks to the original author.

Drawing on my own practice and on articles found online, this post describes a general method for debugging kernel bugs by hand.


1. Steps
1) Collect the oops output, System.map, /proc/ksyms, vmlinux, and /proc/modules.
2) Use ksymoops to interpret the oops.
   Instructions are in /usr/src/linux/Documentation/oops-tracing.txt
   ksymoops(8) man page (http://www.die.net/doc/linux/man/man8/ksymoops.8.html)
   
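2. Quick analysis
1) ksymoops disassembles the Code section.
2) The EIP points to the failing instruction.
3) The Call Trace section shows how execution got there.
   Caution: the trace may include stale addresses left on the stack, so not every entry is a real caller.
3. Locate the faulty code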


Oops example
C source code

int test_read_proc(char *buf, char **start, off_t offset, int count, int *eof, void *data)
{
    int *ptr = 0;           /* NULL pointer */
    printk("%d\n", *ptr);   /* dereferencing NULL triggers the oops */
    return 0;
}

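Write a kernel module, test.c, that creates a file under /proc and uses the function above as its read handler.
Reading that file calls the function and produces the following oops: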

Unable to handle kernel NULL pointer dereference at virtual address 00000000
c2483069 <---EIP (Instruction Pointer or Program Counter)
*pde = 00000000
Oops: 0000
CPU:    0
EIP: 0010:[ipv6:__insmod_ipv6_O/lib/modules/2.4.10-4GB/kernel/net/ipv6
           ipv6+-472895383/96]
EFLAGS: 00010283
eax: db591f98 ebx: de2aeb60 ecx: de2aeb80 edx: c2483060
esi: 00000c00 edi: d41d0000 ebp: db591f5c esp: db591f4c
ds: 0018 es: 0018 ss: 0018
Process cat (pid: 1986, stackpage=db591000)
Stack: c012ca65 000001f0 ffffffea 00000000 00001000 c014e878 d41d0000 db591f98
       00000000 00000c00 db591f94 00000000 de2aeb60 ffffffea 00000000 00001000
       deae6f40 00000000 00000000 00000000 c01324d6 de2aeb60 0804db50 00001000
Call Trace: [__alloc_pages+65/452] [proc_file_read+204/420] [sys_read+146/200]
[system_call+51/64]
Code:a1 00 00 00 00 50 68 10 31 48 c2 e8 67 38 c9 fd 31 c0 89 ec


Use ksymoops to resolve kernel function addresses. (You can also read /proc/kallsyms or /proc/ksyms to see the addresses of exported functions.)
Using defaults from ksymoops -t elf32-i386 -a i386 

Code; 00000000 Before first symbol
00000000 <_EIP>:
Code; 00000000 Before first symbol
0: a1 00 00 00 00 mov 0x0,%eax
Code; 00000004 Before first symbol
5: 50 push %eax
Code; 00000006 Before first symbol
6: 68 10 31 48 c2 push $0xc2483110
Code; 0000000a Before first symbol
b: e8 67 38 c9 fd call fdc93877 <_EIP+0xfdc93877> fdc93876 <END_OF_CODE+1e1fa3d8/????>
Code; 00000010 Before first symbol
10: 31 c0 xor %eax,%eax
Code; 00000012 Before first symbol
12: 89 ec mov %ebp,%esp
c2483060 test_read_proc [test]
c2483000 __insmod_test_O/home/ross/prog/test.o_M3 [test]
c2483110 __insmod_test_S.rodata_L68 [test]
c2483060 __insmod_test_S.text_L176 [test]
c2483080 foo [test]
de79c340 ip6_frag_mem [ipv6]
de783d00 addrconf_del_ifaddr [ipv6]
de78a5bc ipv6_packet_init [ipv6]
de78fd70 ipv6_sock_mc_drop [ipv6]
de781ee4 ip6_call_ra_chain [ipv6]
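
A note on reading the disassembly above: ksymoops decoded the Code bytes as if they started at address 0, so the target it prints for the call (fdc93877) is not a real kernel address. A near call on x86 stores a displacement relative to the next instruction, so the real target can be recomputed from the EIP. The sketch below does that arithmetic; the function name call_target is made up for this illustration, and the code assumes the Code bytes begin exactly at the EIP, as they appear to in this oops.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Target of an x86 near call (opcode e8). The 32-bit displacement is
 * stored little-endian and is relative to the address of the NEXT
 * instruction, i.e. the call's own address plus its 5-byte length.
 * The sum wraps modulo 2^32, which uint32_t arithmetic provides.
 */
uint32_t call_target(uint32_t insn_addr, const uint8_t insn[5])
{
    assert(insn[0] == 0xe8); /* only the rel32 form of call is handled */
    uint32_t rel = (uint32_t)insn[1]
                 | (uint32_t)insn[2] << 8
                 | (uint32_t)insn[3] << 16
                 | (uint32_t)insn[4] << 24;
    return insn_addr + 5u + rel;
}
```

With the bytes e8 67 38 c9 fd placed at offset 0xb, this gives fdc93877, the value ksymoops printed. Placed at c2483069 + 0xb instead, it gives c01168e0, which sits in the same range as the kernel-text addresses on the stack (c012ca65, c014e878, c01324d6). That is where one would expect printk to live, although nothing in the listing confirms the symbol name. Likewise, the pushed constant c2483110 equals the address of __insmod_test_S.rodata_L68, which is presumably where the format string is stored.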

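The symbol list shows that the EIP, c2483069, lies inside the test_read_proc function of the [test] module.
(EIP) - (base address of routine)
c2483069 - c2483060 = 9
The next step is to disassemble test.o; the instruction at offset 9 in test_read_proc is the one that faulted.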
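
The lookup just described (find the symbol with the highest address not above the EIP, then subtract) can be sketched in a few lines of C. This is only an illustration under stated assumptions: struct ksym, test_syms and resolve are names invented here, the table holds only the [test] entries from the listing above, and the duplicate entry at c2483060 (__insmod_test_S.text_L176) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One line of /proc/ksyms: load address and symbol name. */
struct ksym {
    uint32_t addr;
    const char *name;
};

/* [test] module entries copied from the symbol listing above. */
const struct ksym test_syms[] = {
    { 0xc2483000u, "__insmod_test_O/home/ross/prog/test.o_M3" },
    { 0xc2483060u, "test_read_proc" },
    { 0xc2483080u, "foo" },
    { 0xc2483110u, "__insmod_test_S.rodata_L68" },
};

/*
 * Return the symbol with the highest address that is still <= eip,
 * or NULL if eip lies below every symbol in the table.
 * On success *offset (if non-NULL) receives eip - symbol address.
 */
const struct ksym *resolve(const struct ksym *syms, size_t n,
                           uint32_t eip, uint32_t *offset)
{
    const struct ksym *best = NULL;

    assert(syms != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (syms[i].addr <= eip &&
            (best == NULL || syms[i].addr > best->addr))
            best = &syms[i];
    }
    if (best != NULL && offset != NULL)
        *offset = eip - best->addr;
    return best;
}
```

Calling resolve(test_syms, 4, 0xc2483069, &off) selects test_read_proc and sets off to 9, the same result as the subtraction done by hand. A linear scan is used because the table is tiny; a sorted table with binary search would be the natural choice for a full System.map.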

Locate the faulty code with objdump:
Excerpt from "objdump -D test.o"
test.o:     file format elf32-i386
Disassembly of section .text:
00000000 <test_read_proc>:
0: 55 push %ebp 
1: 89 e5 mov %esp,%ebp 
3: 83 ec 08 sub $0x8,%esp 
6: 83 c4 f8 add $0xfffffff8,%esp 
9: a1 00 00 00 00 mov 0x0,%eax 
e: 50 push %eax 
f: 68 00 00 00 00 push $0x0 
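
As a cross-check, the bytes at offset 9 of the object file should equal the Code bytes from the oops, except where the module loader patched an address at load time. The sketch below compares the two byte sequences under a mask. The names code_matches, oops_code, obj_code and reloc_mask are made up for this illustration, and the mask assumes that only the 4-byte operand of the push instruction carries a relocation ("objdump -r test.o" would confirm that).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compare the "Code:" bytes from an oops with bytes taken from the
 * unlinked object file. Positions where skip[i] is non-zero are
 * ignored, because relocated fields differ between test.o and the
 * loaded module. Returns 1 if all remaining bytes agree, else 0.
 */
int code_matches(const uint8_t *oops, const uint8_t *obj,
                 const uint8_t *skip, size_t n)
{
    assert(oops != NULL && obj != NULL && skip != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!skip[i] && oops[i] != obj[i])
            return 0;
    }
    return 1;
}

/* First 11 "Code:" bytes from the oops. */
const uint8_t oops_code[11] =
    { 0xa1, 0, 0, 0, 0, 0x50, 0x68, 0x10, 0x31, 0x48, 0xc2 };
/* Bytes at offsets 0x9..0x13 of test_read_proc in the objdump excerpt. */
const uint8_t obj_code[11] =
    { 0xa1, 0, 0, 0, 0, 0x50, 0x68, 0, 0, 0, 0 };
/* Assumption: only the imm32 operand of "push" is relocated. */
const uint8_t reloc_mask[11] =
    { 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1 };
```

code_matches(oops_code, obj_code, reloc_mask, 11) returns 1, which supports the conclusion that offset 9 of test_read_proc is where the oops occurred. Without the mask the comparison fails at the push operand, where test.o holds 00 00 00 00 and the loaded module holds the address c2483110.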

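C source code

The instruction at offset 9, mov 0x0,%eax, loads from address 0. In the source it corresponds to the *ptr dereference passed to printk:

int test_read_proc(char *buf, char **start, off_t offset, int count, int *eof, void *data)
{
    int *ptr = 0;
    printk("%d\n", *ptr);   /* faulting line: reads address 0 */
    return 0;
}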
