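To diagnose a problem we often ask the customer to send back a core dump. Customers usually run a release build, and GDB on such a build gives a call stack but no argument values, which makes the problem much harder to analyze. This section aims to recover those argument values.
We would like to see something like: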
#0 call2 (arg1=21, arg2=22) at release_core.c:28
#1 0x0000000000400820 in call1 (arg1=11, arg2=12) at release_core.c:37
#2 0x0000000000400863 in call0 (arg1=1, arg2=2) at release_core.c:43
#3 0x00000000004008ec in main () at release_core.c:60
rather than:
#0 0x000000000040079d in call2 ()
#1 0x0000000000400820 in call1 ()
#2 0x0000000000400863 in call0 ()
#3 0x00000000004008ec in main ()
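#include <stdio.h>    /* printf, perror */
#include <stdlib.h>   /* exit, EXIT_SUCCESS */
#include <string.h>   /* memset */
#include <unistd.h>   /* write, STDOUT_FILENO */
#include <signal.h>   /* sigaction, siginfo_t */
#define __USE_GNU
#include <ucontext.h>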
void action(int sig, siginfo_t* siginfo, void* context)
{
    char func_name[]="action_handler";
    char desc[]="I am in handler - action\n";
    /* write() takes a file descriptor, not a FILE*; skip the trailing NUL */
    write(STDOUT_FILENO, desc, sizeof(desc) - 1);
    exit(-1);
}
void call2(int arg1, int arg2){
    char eye_catcher[]="call2";
    printf("I am call2, arg1=%d, arg2=%d\n", arg1, arg2);
    int *p=0;
    *p=5; //cause SIGSEGV
}
void call1(int arg1, int arg2){
    char eye_catcher[]="call1";
    printf("I am call1, arg1=%d, arg2=%d\n", arg1, arg2);
    call2(21,22);
}
void call0(int arg1, int arg2){
    char eye_catcher[]="call0";
    printf("I am call0, arg1=%d, arg2=%d\n", arg1, arg2);
    call1(11,12);
}
int main(void)
{
    char eye_catcher[]="main_func";
    struct sigaction act;
    memset(&act, 0, sizeof(act));
    act.sa_sigaction = action;
    act.sa_flags = SA_SIGINFO;
    if (sigaction(SIGSEGV, &act, NULL) < 0) {
        perror("sigaction");
        return 1;
    }
    call0(1,2);
    printf("end...\n");
    return EXIT_SUCCESS;
}
It is nothing more than a few nested function calls.
Now let's simulate the customer's program failing in call2:
[mzhai]$ gcc release_core.c
[mzhai]$ gdb ./a.out
(gdb) b main
(gdb) r
(gdb) b call2
(gdb) c
(gdb) where
#0 0x000000000040079d in call2 ()
#1 0x0000000000400820 in call1 ()
#2 0x0000000000400863 in call0 ()
#3 0x00000000004008ec in main ()
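The call stack is printed, but the argument values are not, which is frustrating.
Under the x86-64 calling convention used on Linux (System V AMD64), the first six integer or pointer arguments are passed in registers, from left to right: rdi, rsi, rdx, rcx, r8, r9.
Let's look at the values of rdi and rsi: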
(gdb) info reg edi esi
edi 0x15 21
esi 0x16 22
Sure enough, these are 21 and 22, the values passed to call2 in the source.
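The register order can be written down as a small lookup, which is handy when a function has more than two parameters. This is only a sketch: the function name int_arg_register is made up for illustration, and it covers integer and pointer arguments only (floating-point arguments use a different set of registers).

```c
#include <stddef.h>

/* Registers that carry the first six integer/pointer arguments
 * under the System V AMD64 calling convention, in order. */
static const char *const INT_ARG_REGS[] = {
    "rdi", "rsi", "rdx", "rcx", "r8", "r9"
};

/* Hypothetical helper: return the register holding integer argument
 * `index` (0-based), or NULL when that argument is passed on the
 * stack instead of in a register. */
const char *int_arg_register(size_t index)
{
    size_t count = sizeof(INT_ARG_REGS) / sizeof(INT_ARG_REGS[0]);
    return index < count ? INT_ARG_REGS[index] : NULL;
}
```

For call2(21, 22), index 0 maps to rdi and index 1 to rsi, which is why the session above inspects edi and esi, their low 32 bits.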
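However, this is the state right at the start of call2, when rdi and rsi have not yet been modified. After the function has run for a while, there is no guarantee that no instruction has overwritten them. What then? Let's study call1, the frame one level above call2.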
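Calling convention: when frame pointers are in use, each function's prologue pushes the caller's rbp and then copies rsp into rbp, so the 8 bytes at the address in rbp hold the caller's rbp, and the 8 bytes just above them hold the return address.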
First of all, let's find call1's frame base:
(gdb) p $rbp
$1 = (void *) 0x7fffffffe080
(gdb) x /xg $rbp
0x7fffffffe080: 0x00007fffffffe0b0
(gdb) x /xg 0x00007fffffffe0b0
0x7fffffffe0b0: 0x00007fffffffe0e0
0x00007fffffffe0b0 is call1's frame base (likewise, 0x00007fffffffe0e0 is call0's frame base; more on that later). Next, look at the disassembly of call1:
(gdb) disass call1
Dump of assembler code for function call1:
0x00000000004007df <+0>: push %rbp
0x00000000004007e0 <+1>: mov %rsp,%rbp
0x00000000004007e3 <+4>: sub $0x20,%rsp
0x00000000004007e7 <+8>: mov %edi,-0x14(%rbp)
0x00000000004007ea <+11>: mov %esi,-0x18(%rbp)
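edi (rdi is the 64-bit register; edi is its low 32 bits) and esi are saved at rbp-0x14 and rbp-0x18 respectively. These are arg1 and arg2 of call1(arg1, arg2), which according to the source are 11 and 12. Let's verify: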
(gdb) set $call1_rbp=0x00007fffffffe0b0
(gdb) x /xw $call1_rbp-0x14
0x7fffffffe09c: 0x0000000b
(gdb) x /xw $call1_rbp-0x18
0x7fffffffe098: 0x0000000c
Exactly right!
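The address arithmetic in this step can be replayed outside GDB. The sketch below is a simulation, not a real stack walker: the IMAGE table is a hand-copied subset of the memory words shown in the GDB session in this post, and the names mem_word, read_u32, caller_rbp and spilled_int are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* One 4-byte word of a simulated memory image (hypothetical type). */
struct mem_word {
    uint64_t addr;
    uint32_t value;
};

/* Words copied from the GDB session. x86-64 is little-endian, so an
 * 8-byte saved rbp is stored as two 4-byte words, low half first. */
static const struct mem_word IMAGE[] = {
    {0x7fffffffe080ULL, 0xffffe0b0u}, {0x7fffffffe084ULL, 0x00007fffu}, /* saved rbp in call2's frame */
    {0x7fffffffe098ULL, 12u},         {0x7fffffffe09cULL, 11u},         /* call1: arg2, arg1 */
    {0x7fffffffe0b0ULL, 0xffffe0e0u}, {0x7fffffffe0b4ULL, 0x00007fffu}, /* saved rbp in call1's frame */
    {0x7fffffffe0c8ULL, 2u},          {0x7fffffffe0ccULL, 1u},          /* call0: arg2, arg1 */
};

/* Read a 32-bit word from the image; addresses not in the table read as 0. */
uint32_t read_u32(uint64_t addr)
{
    size_t i;
    for (i = 0; i < sizeof(IMAGE) / sizeof(IMAGE[0]); i++)
        if (IMAGE[i].addr == addr)
            return IMAGE[i].value;
    return 0;
}

/* The 8 bytes at [rbp] hold the caller's rbp (same as x /xg $rbp). */
uint64_t caller_rbp(uint64_t rbp)
{
    return (uint64_t)read_u32(rbp) | ((uint64_t)read_u32(rbp + 4) << 32);
}

/* Read a 32-bit int stored below_rbp bytes under rbp (same as x /xw $rbp-N). */
int32_t spilled_int(uint64_t rbp, uint64_t below_rbp)
{
    return (int32_t)read_u32(rbp - below_rbp);
}
```

Starting from call2's rbp 0x7fffffffe080, caller_rbp yields 0x7fffffffe0b0 for call1, and spilled_int(0x7fffffffe0b0, 0x14) reads the word at 0x7fffffffe09c, which is 11.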
The same approach recovers call0's arguments:
(gdb) set $call0_rbp=0x00007fffffffe0e0
(gdb) disass call0
Dump of assembler code for function call0:
0x0000000000400822 <+0>: push %rbp
0x0000000000400823 <+1>: mov %rsp,%rbp
0x0000000000400826 <+4>: sub $0x20,%rsp
0x000000000040082a <+8>: mov %edi,-0x14(%rbp)
0x000000000040082d <+11>: mov %esi,-0x18(%rbp)
...
(gdb) x /xw $call0_rbp-0x14
0x7fffffffe0cc: 0x00000001
(gdb) x /xw $call0_rbp-0x18
0x7fffffffe0c8: 0x00000002
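At this point it is clear that the argument values can indeed be recovered.
Above we located each frame's rbp by hand, which is error-prone when the call stack is deep. GDB can do this for us: switch to the frame in question with the frame command, and $rbp is that frame's base (the values of other registers are not necessarily correct, though). For example: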
(gdb) f 1
#1 0x0000000000400820 in call1 ()
(gdb) p $rbp
$2 = (void *) 0x7fffffffe0b0
(gdb)
(gdb) f 2
#2 0x0000000000400863 in call0 ()
(gdb) p $rbp
$3 = (void *) 0x7fffffffe0e0
One thing to watch out for: when a signal handler is on the call stack, there is an extra "signal handler called" frame:
Breakpoint 3, 0x0000000000400711 in action ()
(gdb) where
#0 0x0000000000400711 in action ()
#1 <signal handler called>
#2 0x00000000004007d7 in call2 ()
#3 0x0000000000400824 in call1 ()
#4 0x0000000000400862 in call0 ()
#5 0x00000000004008eb in main ()
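This frame is added by the operating system. It mainly holds the register values saved before action was invoked (if you are curious about its contents, see the previous post 《为SIGSEGV设置handler有用吗?》, "Is it useful to set a handler for SIGSEGV?"). The rbp that GDB reports for it is actually the rbp of #2 call2, so #1 can be ignored when analyzing function arguments.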
(gdb) p $rbp
$1 = (void *) 0x7fffffffd370
(gdb) f 1
#1 <signal handler called>
(gdb) p $rbp
$2 = (void *) 0x7fffffffe080
(gdb) f 2
#2 0x00000000004007d7 in call2 ()
(gdb) p $rbp
$3 = (void *) 0x7fffffffe080
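With the rbp chain and each function's disassembly (usually the first 10 lines or so), the value of every argument can be determined. This relies on the prologue saving the argument registers to the stack, as the unoptimized build here does; an optimized build may keep them only in registers.
This process could probably be automated with a GDB script. I will implement one when I have time.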
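One building block of such automation is recognizing the prologue lines that save an argument register below rbp. The following is a minimal sketch in C rather than a GDB script, assuming AT&T-syntax lines exactly as GDB printed them in this post; the function name parse_rbp_spill is hypothetical, and only plain "mov" stores with a negative hex offset from rbp are recognized.

```c
#include <stdio.h>
#include <string.h>

/* Parse one line of GDB disassembly of the form
 *     ... mov %edi,-0x14(%rbp)
 * On success, store the source register name (without '%') in reg,
 * the distance below rbp in *below_rbp, and return 1.
 * Return 0 for any line that is not such a store. */
int parse_rbp_spill(const char *line, char reg[8], unsigned long *below_rbp)
{
    const char *p = strstr(line, "mov");
    int consumed = 0;

    if (p == NULL)
        return 0;
    /* %n is only reached if the trailing "(%rbp)" literal matched too. */
    if (sscanf(p, "mov %%%7[a-z0-9],-0x%lx(%%rbp)%n",
               reg, below_rbp, &consumed) != 2)
        return 0;
    return consumed > 0;
}
```

Fed the line "mov %edi,-0x14(%rbp)" from call1, it reports register edi at distance 0x14, which is the offset used in the x /xw $call1_rbp-0x14 command earlier. Lines such as "push %rbp" or "mov %rsp,%rbp" are rejected.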