A stack corruption bug can be hard to fix when we can't find the steps to reproduce it, and its cause is often far from obvious. The best approach is to make the culprit reveal itself as early as possible, ideally before the corrupted stack causes a crash. In this post, I'll introduce a tool that helps discover stack corruption.
Rationale:
The figure below shows the structure of a stack frame. It's important to know that the stack grows downwards: the callee's frame sits at a lower address than the caller's frame, and the callee's local variables sit at lower addresses than the return address. So if our code carelessly writes past the end of a local variable, the saved %ebp and the return address may be corrupted. The application may keep running until the callee returns, or even later, and then crash.
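Here is some sample code that demonstrates this:
#include <string.h>

int foo(int a)
{
    char var[4];
    /* Copies 14 bytes (13 characters plus the terminating NUL) into a
       4-byte buffer, overwriting whatever lies above var on the stack. */
    strcpy(var, "corrupt me!!!");
    int b = 1;
    a = a + b;
    return 0;
}

int bar()
{
    return foo(0);
}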
It's not hard to see that the saved %ebp should always stay unchanged during the execution of the callee since it will be used on return to restore the caller's %ebp.
So we can:
- Save the saved %ebp value at the beginning of the callee;
- Get the saved %ebp value before the callee returns;
- Compare these two values to see if they are the same.
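The three steps above can be tried out without touching the real stack. The following is only a minimal sketch: `fake_frame` and `frame_survives_copy` are hypothetical names, and the struct merely imitates the layout described earlier (a local buffer with the saved %ebp slot directly above it).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a real stack frame: the local buffer sits
   directly below the saved frame pointer, as in the layout described above. */
struct fake_frame {
    char     var[4];       /* the callee's local buffer */
    uint32_t saved_ebp;    /* stands in for the saved %ebp slot */
};

/* Copies len bytes of src to the start of the frame, the way an unchecked
   strcpy into var would. Returns 1 if the saved value survived, 0 if not. */
static int frame_survives_copy(const char *src, size_t len)
{
    struct fake_frame frame;
    memset(&frame, 0, sizeof frame);
    frame.saved_ebp = 0xBFFFF7A8u;            /* arbitrary example value */

    uint32_t at_entry = frame.saved_ebp;      /* step 1: save on entry */

    if (len > sizeof frame)                   /* stay inside the simulation */
        len = sizeof frame;
    memcpy(&frame, src, len);                 /* spills past var when len > 4 */

    uint32_t at_return = frame.saved_ebp;     /* step 2: read before return */
    return at_entry == at_return;             /* step 3: compare */
}
```

Calling `frame_survives_copy("abc", 4)` stays inside the buffer, while `frame_survives_copy("corrupt me!!!", 14)` spills into the simulated saved %ebp slot and the comparison fails.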
Implementation:
The saved %ebp is the value stored at the memory location that the %ebp register points to. To read it we need a little assembly language, but it isn't difficult: one instruction is usually enough. Here is the one for GCC on the x86 platform.
asm("mov (%%ebp),%0": "=r" (variable for storing ebp's value));
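With GCC or Clang, the same slot can also be read without inline assembly through the `__builtin_frame_address` builtin. This is a swapped-in alternative, not what the macro in this post uses, and it is only a sketch: it assumes an x86-style frame layout in which the slot at the frame address holds the caller's saved frame pointer, and `saved_fp_unchanged` is a hypothetical name.

```c
#include <stdint.h>

/* Hypothetical helper. Reads the slot that the current frame pointer points
   to twice, once at entry and once before returning, and reports whether it
   stayed the same. __builtin_frame_address(0) is a GCC/Clang builtin that
   returns this function's frame address; on x86 the slot at that address
   holds the caller's saved frame pointer. */
__attribute__((noinline)) static int saved_fp_unchanged(void)
{
    volatile uintptr_t *slot = (volatile uintptr_t *)__builtin_frame_address(0);
    uintptr_t at_entry = *slot;

    volatile char work[16];                  /* stands in for the function body */
    for (unsigned i = 0; i < sizeof work; i++)
        work[i] = (char)i;                   /* stays within bounds */

    uintptr_t at_return = *slot;
    return at_entry == at_return;
}
```

Because the loop never writes outside `work`, the two reads should agree; an out-of-bounds write that reached the slot would make them differ.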
Armed with this knowledge, we can write the macros below to help detect stack corruption.
#ifndef _h_DBGHELPER
#define _h_DBGHELPER

#include <stdio.h>
#include <assert.h>

#define STACKCHECK
#ifdef STACKCHECK // stack check enabled

#define STACK_CHECK_RAND 0xCD000000
#define STACK_CHECK_MASK 0x00FFFFFF

// the internal logic of checking stack state
#define STACK_CHECK_END_INTERNAL() \
    u_STACK_CHECK_EBP_VALUE_RETURN = ((u_STACK_CHECK_EBP_VALUE_RETURN & STACK_CHECK_MASK) \
                                      | STACK_CHECK_RAND); \
    if((u_STACK_CHECK_EBP_VALUE_ENTER & ~STACK_CHECK_MASK) != STACK_CHECK_RAND) \
    { \
        fprintf(stderr, \
                "Corrupted u_STACK_CHECK_EBP_VALUE_ENTER!! It's %x\n", u_STACK_CHECK_EBP_VALUE_ENTER); \
        assert((u_STACK_CHECK_EBP_VALUE_ENTER & ~STACK_CHECK_MASK) == STACK_CHECK_RAND); \
    } \
    if((u_STACK_CHECK_EBP_VALUE_RETURN & ~STACK_CHECK_MASK) != STACK_CHECK_RAND) \
    { \
        fprintf(stderr, \
                "Corrupted u_STACK_CHECK_EBP_VALUE_RETURN!! It's %x\n", u_STACK_CHECK_EBP_VALUE_RETURN); \
        assert((u_STACK_CHECK_EBP_VALUE_RETURN & ~STACK_CHECK_MASK) == STACK_CHECK_RAND); \
    } \
    if(u_STACK_CHECK_EBP_VALUE_ENTER != u_STACK_CHECK_EBP_VALUE_RETURN) \
    { \
        fprintf(stderr, "Stack overflow!!!\nThe EBP should be %x, but it's %x( %.4s )\n\n", \
                u_STACK_CHECK_EBP_VALUE_ENTER, u_STACK_CHECK_EBP_VALUE_RETURN, \
                (char*)&u_STACK_CHECK_EBP_VALUE_RETURN); \
        assert(u_STACK_CHECK_EBP_VALUE_RETURN == u_STACK_CHECK_EBP_VALUE_ENTER); \
    }
// end

#ifndef ARM_9260EK // x86
#define STACK_CHECK_BEGIN() unsigned int u_STACK_CHECK_EBP_VALUE_ENTER = 0; \
    asm("mov (%%ebp),%0" \
        : "=r" (u_STACK_CHECK_EBP_VALUE_ENTER)); \
    u_STACK_CHECK_EBP_VALUE_ENTER = (u_STACK_CHECK_EBP_VALUE_ENTER & STACK_CHECK_MASK) | STACK_CHECK_RAND

#define STACK_CHECK_END() do{unsigned int u_STACK_CHECK_EBP_VALUE_RETURN = 0; \
    asm("mov (%%ebp),%0" \
        : "=r" (u_STACK_CHECK_EBP_VALUE_RETURN)); \
    STACK_CHECK_END_INTERNAL();}while(0)

#else // arm
#define STACK_CHECK_BEGIN() unsigned int u_STACK_CHECK_EBP_VALUE_ENTER = 0; \
    asm("str fp, %0 \n" \
        : "=m" (u_STACK_CHECK_EBP_VALUE_ENTER)); \
    u_STACK_CHECK_EBP_VALUE_ENTER = (u_STACK_CHECK_EBP_VALUE_ENTER & STACK_CHECK_MASK) | STACK_CHECK_RAND

#define STACK_CHECK_END() do{unsigned int u_STACK_CHECK_EBP_VALUE_RETURN = 0; \
    asm("str fp, %0 \n" \
        : "=m" (u_STACK_CHECK_EBP_VALUE_RETURN)); \
    STACK_CHECK_END_INTERNAL();}while(0)

#endif

#else // STACK Check disabled

#define STACK_CHECK_BEGIN() do{}while(0)
#define STACK_CHECK_END() do{}while(0)

#endif

#endif // _h_DBGHELPER
The basic idea of the macros is the same as described above. One thing to note is that the variables holding the saved %ebp value are defined on the stack too. They live in the current frame, so they may be corrupted as well. To guard against this, we have several options. First, we can declare them static so that they live in the global data region rather than on the stack, but that makes them unusable in a multi-threaded environment. Second, we can allocate them on the heap. Third, we can tag these variables with a predefined guard value (STACK_CHECK_RAND) and check that it has not been overwritten.
The third option is the one used here.
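The guard-value scheme can be looked at in isolation. Here is a small sketch that reuses the two constants from the header; the helper names `tag_value` and `tag_intact` are hypothetical and exist only for illustration.

```c
#include <stdint.h>

#define STACK_CHECK_RAND 0xCD000000u   /* guard tag kept in the top byte */
#define STACK_CHECK_MASK 0x00FFFFFFu   /* low 24 bits of the saved value */

/* Replace the top byte of the saved value with the guard tag. */
static uint32_t tag_value(uint32_t saved)
{
    return (saved & STACK_CHECK_MASK) | STACK_CHECK_RAND;
}

/* Returns 1 if the top byte still carries the guard tag, 0 otherwise. */
static int tag_intact(uint32_t tagged)
{
    return (tagged & ~STACK_CHECK_MASK) == STACK_CHECK_RAND;
}
```

For example, `tag_value(0xBFFFF7A8u)` yields `0xCDFFF7A8`, and a variable overwritten with the bytes of "!!!!" (`0x21212121`) no longer passes `tag_intact`. The price of the scheme is that the top byte of the saved value is thrown away, so a change confined to those 8 bits would go unnoticed, as would an overwrite that happens to put `0xCD` in the top byte.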
Usage:
We can update the previous code to take advantage of the macros as follows:
#include <string.h>
/* The header shown above must be included here as well. */

int foo(int a)
{
    STACK_CHECK_BEGIN();
    char var[4];
    strcpy(var, "corrupt me!!!");
    int b = 1;
    a = a + b;
    STACK_CHECK_END();
    return 0;
}

int bar()
{
    return foo(0);
}
The application will fail an assertion, reporting the stack corruption, just before the foo() function returns.
Microsoft's C++ compiler and GCC already provide stack-checking features. But I still find the macros convenient, and the effort has greatly consolidated my understanding of the stack's structure.
References:
http://blogs.msdn.com/vcblog/archive/2009/03/19/gs.aspx