Reversing MS VC++ Part I: Exception Handling


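Abstract

MS VC++ is the most widely used compiler on the Win32 platform, so familiarity with its inner workings matters to anyone reversing Win32 code. Recognizing the compiler-generated glue code lets you focus quickly on the code the programmer actually wrote, and it also helps in recovering the high-level structure of the program.

In Part I of this two-part series I concentrate on the stack layout, exception handling, and the related structures in MSVC-compiled programs. Some familiarity with assembler, registers, and calling conventions is assumed.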

 

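Terms:

·        Stack frame: A fragment of the stack used by one function. It usually contains the function arguments, the return-to-caller address, saved registers, local variables, and other data specific to this function. On x86 (and most other architectures) the caller and callee stack frames are contiguous.

·        Frame pointer: A register or other variable that points to a fixed location inside the stack frame. Usually all data inside the stack frame is addressed relative to the frame pointer. On x86 it is usually ebp, and it usually points to the saved ebp slot, at the address immediately below the return address.

·        Object: An instance of a C++ class.

·        Unwindable object: A local object with the auto storage-class specifier that is allocated on the stack and must be destructed when it goes out of scope.

·        Stack unwinding: The automatic destruction of such objects when control leaves their scope because of an exception.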

 

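There are two types of exceptions that can be used in a C or C++ program.

·        SEH exceptions (Structured Exception Handling). Also known as Win32 or system exceptions. They are covered exhaustively in the famous Matt Pietrek article [1]. They are the only exceptions available to C programs. Compiler-level support includes the keywords __try, __except, __finally and a few others.

·        C++ exceptions (sometimes referred to as "EH"). Implemented on top of SEH, C++ exceptions allow throwing and catching of arbitrary types. A very important feature of C++ is automatic stack unwinding during exception processing, and MSVC uses a fairly complex underlying framework to make sure it works properly in all cases.

In the following diagrams memory addresses increase from top to bottom, so the stack grows "up". This is how IDA presents the stack, and it is the opposite of most other publications.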

Basic Frame Layout

The most basic stack frame looks like this:

 

   ...

   Local variables

   Other saved registers

   Saved ebp

   Return address

   Function arguments

...

 

Note: If frame pointer omission is enabled, the saved ebp might be absent.

SEH

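When compiler-level SEH (__try/__except/__finally) is used, the stack layout gets a little more complicated.

SEH3 Stack Layout

When a function has no __except blocks (only __finally), Saved ESP is not used. The scopetable is an array of records, each describing one __try block and its relationship to the others: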

   

struct _SCOPETABLE_ENTRY {

     DWORD EnclosingLevel;

     void* FilterFunc;

     void* HandlerFunc;

};

 

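For more details on the SEH implementation see [1]. To recover try blocks, watch how the try level variable is updated. Each try block is assigned a unique number, and nesting is described by the relationships between scopetable entries: if scopetable entry i has EnclosingLevel = j, then try block j encloses try block i. The function body is considered to have try level -1. See Appendix I for an example.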
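The nesting rule can be sketched in a few lines of C++. This is only an illustration under simplifying assumptions: `ScopeEntry` and `enclosingHandlers` are hypothetical names, and handlers are stored as strings instead of code addresses so that the sketch runs outside a debugger. Starting from a try level, it follows EnclosingLevel outward until it reaches the function body (-1).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for _SCOPETABLE_ENTRY: names replace code addresses.
struct ScopeEntry {
    std::int32_t enclosingLevel;  // index of the enclosing __try, -1 = function body
    bool hasFilter;               // false: __finally block, true: __except block
    std::string handlerName;
};

// Lists the handlers met when walking outward from tryLevel, innermost first.
std::vector<std::string> enclosingHandlers(const std::vector<ScopeEntry>& table,
                                           std::int32_t tryLevel) {
    assert(tryLevel >= -1);  // SEH3 convention: -1 is the outermost level
    std::vector<std::string> chain;
    while (tryLevel != -1) {
        const ScopeEntry& e = table.at(static_cast<std::size_t>(tryLevel));
        chain.push_back((e.hasFilter ? "__except: " : "__finally: ") + e.handlerName);
        tryLevel = e.enclosingLevel;
    }
    return chain;
}
```

Fed with the scopetable from Appendix I and try level 1, the walk visits the __except handler of block 1 first and the __finally handler of block 0 second, which is the nesting recovered there by hand.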

Buffer Overrun Protection

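The Whidbey (MSVC 2005) compiler adds some buffer overrun protection to SEH frames. The full stack frame layout looks like this:

SEH4 Stack Layout

The GS cookie is present only if the function was compiled with the /GS switch. The EH cookie is always present. The SEH4 scopetable is basically the same as the SEH3 one, with an added header: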

 

   struct _EH4_SCOPETABLE {

       DWORD GSCookieOffset;

       DWORD GSCookieXOROffset;

       DWORD EHCookieOffset;

       DWORD EHCookieXOROffset;

       _EH4_SCOPETABLE_RECORD ScopeRecord[1];

   };

 

   struct _EH4_SCOPETABLE_RECORD {

       DWORD EnclosingLevel;

       long (*FilterFunc)();

           union {

           void (*HandlerAddress)();

           void (*FinallyFunc)();

       };

   };

 

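GSCookieOffset = -2 means that the GS cookie is not used. The EH cookie is always present. Offsets are ebp-relative. The check is performed as follows: (ebp+CookieXOROffset) ^ [ebp+CookieOffset] == __security_cookie. The pointer to the scopetable stored on the stack is XORed with __security_cookie as well. Also, in SEH4 the outermost scope level is -2, not -1 as in SEH3.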
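To make the check concrete, here is a small simulation. It is a sketch, not CRT code: the stack is modeled as a map from address to DWORD, and `FakeStack`, `encodeCookie` and `cookieIntact` are hypothetical names. Only the expression itself, with the XOR evaluated before the comparison, comes from the text above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

using FakeStack = std::map<std::uint32_t, std::uint32_t>;  // address -> DWORD

constexpr std::int32_t kCookieAbsent = -2;  // GSCookieOffset value for "no GS cookie"

// Value that has to sit in the cookie slot for the check below to pass.
std::uint32_t encodeCookie(std::uint32_t ebp, std::int32_t xorOffset,
                           std::uint32_t securityCookie) {
    return securityCookie ^ (ebp + static_cast<std::uint32_t>(xorOffset));
}

// ((ebp + CookieXOROffset) ^ [ebp + CookieOffset]) == __security_cookie
bool cookieIntact(const FakeStack& stack, std::uint32_t ebp,
                  std::int32_t cookieOffset, std::int32_t xorOffset,
                  std::uint32_t securityCookie) {
    assert(cookieOffset != kCookieAbsent);  // caller must skip absent cookies
    const std::uint32_t stored =
        stack.at(ebp + static_cast<std::uint32_t>(cookieOffset));
    const std::uint32_t key = ebp + static_cast<std::uint32_t>(xorOffset);
    return (key ^ stored) == securityCookie;
}
```

Because the stored value depends on the frame address, a value copied from another frame, or overwritten by an overflow, fails the comparison.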

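C++ Exception Model Implementation

When a function uses C++ exception handling (try/catch) or contains unwindable objects, things get more complex.

C++ EH Stack Layout

The EH handler is different for each function (unlike the SEH case) and usually looks like this: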

   

(VC7+)

   mov eax, OFFSET __ehfuncinfo

   jmp ___CxxFrameHandler

 

__ehfuncinfo is a structure of type FuncInfo which fully describes all try/catch blocks and unwindable objects in the function.

 

   struct FuncInfo {

     // compiler version.

     // 0x19930520: up to VC6, 0x19930521: VC7.x(2002-2003), 0x19930522: VC8(2005)

     DWORD magicNumber;

 

     // number of entries in unwind table

     int maxState;

 

     // table of unwind destructors

     UnwindMapEntry* pUnwindMap;

 

     // number of try blocks in the function

     DWORD nTryBlocks;

 

     // mapping of catch blocks to try blocks

     TryBlockMapEntry* pTryBlockMap;

 

     // not used on x86

     DWORD nIPMapEntries;

 

     // not used on x86

     void* pIPtoStateMap;

 

     // VC7+ only, expected exceptions list (function "throw"specifier)

     ESTypeList* pESTypeList;

 

     // VC8+ only, bit 0 set if function was compiled with /EHs

      int EHFlags;

};
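When scanning a binary, the magicNumber field is a convenient first sanity check on a candidate FuncInfo. The helper below is a minimal sketch that only mirrors the comments in the structure above; `describeMagic` and `compiledWithEHs` are hypothetical names, and the version strings are exactly the ones listed in this article.

```cpp
#include <cstdint>
#include <string>

// Maps FuncInfo::magicNumber to the compiler range named in the struct comment.
std::string describeMagic(std::uint32_t magicNumber) {
    switch (magicNumber) {
        case 0x19930520: return "up to VC6";
        case 0x19930521: return "VC7.x (2002-2003)";
        case 0x19930522: return "VC8 (2005)";
        default:         return "unknown";  // probably not a FuncInfo
    }
}

// Bit 0 of EHFlags (VC8+ only) is set if the function was compiled with /EHs.
bool compiledWithEHs(std::uint32_t ehFlags) {
    return (ehFlags & 1u) != 0;
}
```

An "unknown" result is a hint that the pointer loaded into eax before the jump to the frame handler was misidentified.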

 

The unwind map is similar to the SEH scopetable, only without filter functions:

struct UnwindMapEntry {

     int toState;        // target state

     void (*action)();   // action to perform (unwind funclet address)

};

 

Try block descriptor. Describes a try{} block with its associated catches.

struct TryBlockMapEntry {

     int tryLow;

     int tryHigh;    // this try {} covers states ranging from tryLow to tryHigh

     int catchHigh;  // highest state inside catch handlers of this try

     int nCatches;   // number of catch handlers

     HandlerType* pHandlerArray; // catch handlers table

};

 

Catch block descriptor. Describes a single catch() of a try block (one try can have several catch blocks).

struct HandlerType {

  // 0x01: const, 0x02: volatile, 0x08: reference

  DWORD adjectives;

  // RTTI descriptor of the exception type. 0 = any (ellipsis)

  TypeDescriptor* pType;

  // ebp-based offset of the exception object in the function stack.

  // 0 = no object (catch by type)

  int dispCatchObj;

  // address of the catch handler code.

  // returns address where to continue execution (i.e. code after the try block)

  void* addressOfHandler;

};

 

List of expected exceptions (implemented but not enabled in MSVC by default; use /d1ESrt to enable it).

 

   struct ESTypeList {

     // number of entries in the list

     int nCount;

 

     // list of exceptions; it seems only pType field in HandlerType is used

     HandlerType* pTypeArray;

};

 

RTTI type descriptor. Describes a single C++ type. Used here to match the thrown exception type against the catch type.

 

struct TypeDescriptor {

  // vtable of type_info class

  const void * pVFTable;

 

  // used to keep the demangled name returned by type_info::name()

  void* spare;

  // mangled type name, e.g. ".H" = "int", ".?AUA@@" = "struct A", ".?AVA@@" = "class A"

  char name[0];

};

 

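Unlike SEH, each try block does not have a single associated state value. The compiler changes the state value not only on entering and leaving a try block, but also each time an object is constructed or destroyed. That way it is possible to know which objects need unwinding when an exception happens. You can still recover try block boundaries by inspecting the associated state range and the addresses returned by the catch handlers (see Appendix II).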
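The way the state value drives unwinding can be sketched as a walk over the unwind map. This is an illustration, not the CRT routine: `UnwindStep` and `unwindTo` are hypothetical names, and actions are strings standing in for unwind funclet addresses. From the current state, the walk runs the action of each entry and moves to its toState until the target state is reached.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for UnwindMapEntry; an empty action means "no funclet".
struct UnwindStep {
    int toState;
    std::string action;
};

// Returns the actions executed while unwinding from `state` down to `target`.
std::vector<std::string> unwindTo(const std::vector<UnwindStep>& map,
                                  int state, int target) {
    assert(target >= -1);  // -1 is the state of the bare function body
    std::vector<std::string> actions;
    while (state != target && state != -1) {
        const UnwindStep& step = map.at(static_cast<std::size_t>(state));
        if (!step.action.empty()) actions.push_back(step.action);
        state = step.toState;
    }
    return actions;
}
```

With the unwind map from Appendix II, unwinding from state 2 to state 1 destroys only a2, while unwinding from state 2 to -1 destroys a2 and then a1.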

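Throwing C++ Exceptions

throw statements are converted into calls to _CxxThrowException(), which actually raises a Win32 (SEH) exception with the code 0xE06D7363 ('msc'|0xE0000000). The custom parameters of the Win32 exception include pointers to the exception object and to its ThrowInfo structure, which the exception handler uses to match the thrown exception type against the types expected by the catch handlers.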
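The exception code is easy to verify by hand: the ASCII codes of 'm', 's' and 'c' are 0x6D, 0x73 and 0x63. The sketch below packs them and adds the 0xE0000000 prefix; `packChars`, `kCxxExceptionCode` and `isCxxException` are hypothetical names used only for this illustration.

```cpp
#include <cstdint>

// Packs three characters into the low 24 bits, first character highest.
constexpr std::uint32_t packChars(char a, char b, char c) {
    return (static_cast<std::uint32_t>(static_cast<unsigned char>(a)) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 8) |
            static_cast<std::uint32_t>(static_cast<unsigned char>(c));
}

constexpr std::uint32_t kCxxExceptionCode = 0xE0000000u | packChars('m', 's', 'c');
static_assert(kCxxExceptionCode == 0xE06D7363u, "matches the code quoted in the text");

// True if an SEH exception code identifies a C++ exception raised by MSVC.
bool isCxxException(std::uint32_t exceptionCode) {
    return exceptionCode == kCxxExceptionCode;
}
```

A filter or debugger script can use such a comparison to tell C++ exceptions apart from system exceptions such as EXCEPTION_ACCESS_VIOLATION (0xC0000005).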

 

   struct ThrowInfo {

     // 0x01: const, 0x02: volatile

     DWORD attributes;

 

     // exception destructor

     void (*pmfnUnwind)();

 

     // forward compatibility handler

     int (*pForwardCompat)();

 

     // list of types that can catch this exception.

     // i.e. the actual type and all its ancestors.

     CatchableTypeArray* pCatchableTypeArray;

   };

 

   struct CatchableTypeArray {

     // number of entries in the following array

     int nCatchableTypes;

     CatchableType* arrayOfCatchableTypes[0];

};

 

Describes a type that can catch this exception.

   struct CatchableType {

     // 0x01: simple type (can be copied by memmove), 0x02: can be caught by reference only, 0x04: has virtual bases

     DWORD properties;

 

     // see above

     TypeDescriptor* pType;

 

     // how to cast the thrown object to this type

     PMD thisDisplacement;

 

     // object size

     int sizeOrOffset;

 

     // copy constructor address

     void (*copyFunction)();

   };

 

   // Pointer-to-member descriptor.

   struct PMD {

     // member offset

     int mdisp;

 

     // offset of the vbtable (-1 if not a virtual base)

     int pdisp;

 

     // offset to the displacement value inside the vbtable

     int vdisp;

};
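How these structures fit together when a handler is selected can be sketched as follows. This is a deliberate simplification and not the CRT algorithm: it compares mangled names only and ignores adjectives, properties and the PMD adjustment. `TypeDesc`, `CatchableEntry`, `CatchClause` and `clauseCatches` are hypothetical names.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

struct TypeDesc { const char* name; };            // mangled name only
struct CatchableEntry { const TypeDesc* type; };  // one CatchableType of the thrown object
struct CatchClause { const TypeDesc* type; };     // HandlerType::pType, nullptr = catch (...)

// True if the catch clause accepts an exception with the given catchable types.
bool clauseCatches(const CatchClause& clause,
                   const std::vector<CatchableEntry>& thrownAs) {
    if (clause.type == nullptr) return true;  // ellipsis catches everything
    for (const CatchableEntry& c : thrownAs) {
        assert(c.type != nullptr);  // catchable entries always name a type
        if (std::strcmp(c.type->name, clause.type->name) == 0) return true;
    }
    return false;
}
```

Because the catchable type array lists the actual type and all its ancestors, a handler for a base class matches a thrown derived object without any knowledge of the class hierarchy at the catch site.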

 

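We'll delve more into this in the next article.

Prologs and Epilogs

Instead of emitting the code that sets up the stack frame in the function body, the compiler might choose to call specific prolog and epilog functions. There are several variants, each used for a specific function type: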

 

 

Name                                        Type     EH Cookie  GS Cookie  Catch Handlers
_SEH_prolog/_SEH_epilog                     SEH3     -          -
_SEH_prolog4/_SEH_epilog4                   SEH4     +          -
_SEH_prolog4_GS/_SEH_epilog4_GS             SEH4     +          +
_EH_prolog                                  C++ EH   -          -          +/-
_EH_prolog3/_EH_epilog3                     C++ EH   +          -          -
_EH_prolog3_catch/_EH_epilog3               C++ EH   +          -          +
_EH_prolog3_GS/_EH_epilog3_GS               C++ EH   +          +          -
_EH_prolog3_catch_GS/_EH_epilog3_catch_GS   C++ EH   +          +          +

 

SEH2

Apparently this was used by MSVC 1.XX (exported by crtdll.dll). It can be encountered in some old NT programs.

   ...

   Saved edi

   Saved esi

   Saved ebx

   Next SEH frame

   Current SEH handler (__except_handler2)

   Pointer to the scopetable

   Try level

   Saved ebp (of this function)

   Exception pointers

   Local variables

   Saved ESP

   Local variables

   Callee EBP

   Return address

   Function arguments

...

Appendix I: Sample SEH Program

Let's consider the following sample disassembly.

 

func1           proc near

 

_excCode        = dword ptr -28h

buf             = byte ptr -24h

_saved_esp     = dword ptr -18h

_exception_info = dword ptr -14h

_next           = dword ptr -10h

_handler        = dword ptr -0Ch

_scopetable     = dword ptr -8

_trylevel       = dword ptr -4

str             = dword ptr  8

 

 push    ebp

 mov     ebp, esp

 push    -1

 push    offset _func1_scopetable

 push    offset _except_handler3

 mov     eax, large fs:0

 push    eax

 mov     large fs:0, esp

 add     esp, -18h

 push    ebx

 push    esi

 push    edi

 

  ;--- end of prolog ---

 

 mov     [ebp+_trylevel], 0 ; trylevel -1 -> 0: beginning of try block 0
 mov     [ebp+_trylevel], 1 ; trylevel 0 -> 1: beginning of try block 1
 mov     large dword ptr ds:123, 456
 mov     [ebp+_trylevel], 0 ; trylevel 1 -> 0: end of try block 1
 jmp     short _endoftry1

_func1_filter1:                         ; __except() filter of try block 1
 mov     ecx, [ebp+_exception_info]
 mov     edx, [ecx+EXCEPTION_POINTERS.ExceptionRecord]
 mov     eax, [edx+EXCEPTION_RECORD.ExceptionCode]
 mov     [ebp+_excCode], eax
 mov     ecx, [ebp+_excCode]
 xor     eax, eax
 cmp     ecx, EXCEPTION_ACCESS_VIOLATION
 setz    al
 retn

_func1_handler1:                        ; beginning of handler for try block 1
 mov     esp, [ebp+_saved_esp]
 push    offset aAccessViolatio ; "Access violation"
 call    _printf
 add     esp, 4
 mov     [ebp+_trylevel], 0 ; trylevel 1 -> 0: end of try block 1

 

_endoftry1:

 mov     edx, [ebp+str]

 push    edx

 lea     eax, [ebp+buf]

 push    eax

 call    _strcpy

 add     esp, 8

 mov     [ebp+_trylevel], -1 ;trylevel 0 -> -1: end of try block 0

 call    _func1_handler0     ; execute __finally of try block 0

 jmp     short _endoftry0

 

_func1_handler0:                        ; __finally handler of try block 0

 push    offset aInFinally ; "in finally"

 call    _puts

 add     esp, 4

 retn

 

_endoftry0:

  ;--- epilog ---

 mov     ecx, [ebp+_next]

 mov     large fs:0, ecx

 pop     edi

 pop     esi

 pop     ebx

 mov     esp, ebp

 pop     ebp

 retn

func1           endp

 

_func1_scopetable
  ;try block 0
  dd -1                      ;EnclosingLevel
  dd 0                       ;FilterFunc
  dd offset _func1_handler0  ;HandlerFunc

  ;try block 1
  dd 0                       ;EnclosingLevel
  dd offset _func1_filter1   ;FilterFunc
  dd offset _func1_handler1  ;HandlerFunc

 

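Try block 0 has no filter, therefore its handler is a __finally{} block. The EnclosingLevel of try block 1 is 0, so it is placed inside try block 0. Considering this, we can try to reconstruct the function structure: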

 

    void func1 (char* str)
    {
      char buf[12];
      __try // try block 0
      {
         __try // try block 1
         {
           *(int*)123=456;
         }
         __except(GetExceptionCode() == EXCEPTION_ACCESS_VIOLATION)
         {
            printf("Access violation");
         }
         strcpy(buf,str);
      }
      __finally
      {
         puts("in finally");
      }
    }

Appendix II: Sample Program with C++ Exceptions

func1           proc near

 

_a1             = dword ptr -24h

_exc            = dword ptr -20h

e               = dword ptr -1Ch

a2              = dword ptr -18h

a1              = dword ptr -14h

_saved_esp      = dword ptr -10h

_next          = dword ptr -0Ch

_handler        = dword ptr -8

_state          = dword ptr -4

 

 push    ebp

 mov     ebp, esp

 push    0FFFFFFFFh

 push    offset func1_ehhandler

 mov     eax, large fs:0

 push    eax

 mov     large fs:0, esp

 push    ecx

  sub     esp, 14h

 push    ebx

 push    esi

 push    edi

 mov     [ebp+_saved_esp], esp

 

  ;--- end of prolog ---

 

 lea     ecx, [ebp+a1]

 call    A::A(void)

 mov     [ebp+_state], 0          ; state -1 -> 0: a1 constructed
 mov     [ebp+a1], 1              ; a1.m1 = 1
 mov     byte ptr [ebp+_state], 1 ; state 0 -> 1: try {
 lea     ecx, [ebp+a2]
 call    A::A(void)
 mov     [ebp+_a1], eax
 mov     byte ptr [ebp+_state], 2 ; state 2: a2 constructed
 mov     [ebp+a2], 2              ; a2.m1 = 2
 mov     eax, [ebp+a1]
 cmp     eax, [ebp+a2]            ; a1.m1 == a2.m1?
 jnz     short loc_40109F
 mov     [ebp+_exc], offset aAbc  ; _exc = "abc"
 push    offset __TI1?PAD         ; char *
 lea     ecx, [ebp+_exc]
 push    ecx
 call    _CxxThrowException       ; throw "abc";

loc_40109F:
 mov     byte ptr [ebp+_state], 1 ; state 2 -> 1: destruct a2

 lea     ecx, [ebp+a2]

 call    A::~A(void)

 jmp     short func1_try0end

 

; catch (char * e)

func1_try0handler_pchar:

 mov     edx, [ebp+e]

  push    edx

 push    offset aCaughtS ;"Caught %s\n"

 call    ds:printf       ;

 add     esp, 8

 mov     eax, offset func1_try0end

 retn

 

; catch (...)

func1_try0handler_ellipsis:

 push    offset aCaught___ ;"Caught ...\n"

 call    ds:printf

 add     esp, 4

 mov     eax, offset func1_try0end

 retn

 

func1_try0end:

 mov     [ebp+_state], 0          ; state 1 -> 0: }//try

 push    offset aAfterTry ;"after try\n"

 call    ds:printf

 add     esp, 4

 mov     [ebp+_state], -1         ; state 0 -> -1: destruct a1

 lea     ecx, [ebp+a1]

 call    A::~A(void)

  ;--- epilog ---

 mov     ecx, [ebp+_next]

 mov     large fs:0, ecx

 pop     edi

 pop     esi

 pop     ebx

 mov     esp, ebp

 pop     ebp

 retn

func1           endp

 

func1_ehhandler proc near

 mov     eax, offset func1_funcinfo

 jmp     __CxxFrameHandler

func1_ehhandler endp

 

func1_funcinfo
  dd 19930520h            ; magicNumber
  dd 4                    ; maxState
  dd offset func1_unwindmap ; pUnwindMap
  dd 1                    ; nTryBlocks
  dd offset func1_trymap  ; pTryBlockMap
  dd 0                    ; nIPMapEntries
  dd 0                    ; pIPtoStateMap
  dd 0                    ; pESTypeList

func1_unwindmap
  dd -1                   ; toState
  dd offset func1_unwind_1tobase ; action
  dd 0                    ; toState
  dd 0                    ; action
  dd 1                    ; toState
  dd offset func1_unwind_2to1 ; action
  dd 0                    ; toState
  dd 0                    ; action

func1_trymap
  dd 1                    ; tryLow
  dd 2                    ; tryHigh
  dd 3                    ; catchHigh
  dd 2                    ; nCatches
  dd offset func1_tryhandlers_0 ; pHandlerArray
  dd 0

func1_tryhandlers_0
  dd 0                    ; adjectives
  dd offset char * `RTTI Type Descriptor' ; pType
  dd -1Ch                 ; dispCatchObj
  dd offset func1_try0handler_pchar ; addressOfHandler
  dd 0                    ; adjectives
  dd 0                    ; pType
  dd 0                    ; dispCatchObj
  dd offset func1_try0handler_ellipsis ; addressOfHandler

 

func1_unwind_1tobase proc near

a1 = byte ptr -14h

 lea     ecx, [ebp+a1]

 call    A::~A(void)

 retn

func1_unwind_1tobase endp

 

func1_unwind_2to1 proc near

a2 = byte ptr -18h

 lea     ecx, [ebp+a2]

 call    A::~A(void)

 retn

func1_unwind_2to1 endp

 

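Let's see what we can find out here. The maxState field in the FuncInfo structure is 4, which means the unwind map has four entries, numbered 0 to 3. Examining the map, we see that the following actions are executed during unwinding:

·        state 3 -> state 0 (no action)

·        state 2 -> state 1 (destruct a2)

·        state 1 -> state 0 (no action)

·        state 0 -> state -1 (destruct a1)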

 

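Looking at the try map, we can infer that states 1 and 2 correspond to the body of the try block and state 3 to the catch blocks. Thus the change from state 0 to state 1 marks the beginning of the try block, and the change from 1 to 0 marks its end. From the function code we can also see that -1 -> 0 is the construction of a1 and 1 -> 2 is the construction of a2. So the state diagram contains the following transitions:

   -1 -> 0 -> 1 -> 2 -> 1 -> 0 -> -1    (normal execution)
    1 -> 3                              (a catch handler is entered)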

 

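Where does the arrow from 1 to 3 come from? We cannot see it in the function code or in the FuncInfo structure, because it is performed by the exception handler. If an exception occurs inside the try block, the exception handler first unwinds the stack to the tryLow state (state 1 here) and then sets the state value to tryHigh+1 (2+1=3) before calling the catch handler.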
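The two steps the handler performs with the try map can be sketched like this. It is an illustration only: `TryBlock`, `findTryBlock` and `catchEntryState` are hypothetical names, the lookup returns the first entry covering the state, and the ordering of entries for nested try blocks is not addressed here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the state fields of TryBlockMapEntry.
struct TryBlock {
    int tryLow;
    int tryHigh;
    int catchHigh;
};

// Returns the first try block whose [tryLow, tryHigh] range covers `state`.
const TryBlock* findTryBlock(const std::vector<TryBlock>& map, int state) {
    for (const TryBlock& t : map) {
        assert(t.tryLow <= t.tryHigh);  // sanity check on the table itself
        if (state >= t.tryLow && state <= t.tryHigh) return &t;
    }
    return nullptr;  // the exception was not raised inside a try block
}

// State value set by the exception handler before calling a catch handler.
int catchEntryState(const TryBlock& t) {
    return t.tryHigh + 1;
}
```

For the sample function, an exception raised in state 2 finds the block {1, 2, 3}; the stack is unwound to state 1 and the catch handler then runs in state 3.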

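The try block has two catch handlers. The first one specifies a catch type (char*) and receives the exception object on the stack (-1Ch = e). The second one has no type (i.e. it is the ellipsis catch). Both handlers return the address at which execution resumes, i.e. the position just after the try block. Now we can recover the function code: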

   
     void func1 ()
    {
      A a1;
      a1.m1 = 1;
      try {
        A a2;
        a2.m1 = 2;
        if (a1.m1 == a2.m1) throw "abc";
      }
      catch(char* e)
      {
        printf("Caught %s\n",e);
      }
      catch(...)
      {
        printf("Caught ...\n");
      }
      printf("after try\n");
    }

Appendix III: IDC Helper Script

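I wrote an IDC script to help with reversing MSVC programs. It scans the whole program for typical SEH/EH code sequences and comments all related structures and fields. Stack variables, exception handlers, exception types and similar items are commented. It also tries to fix function boundaries that are sometimes determined incorrectly by IDA. The download link is in the original article.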

Links and References

[1] Matt Pietrek. A Crash Course on the Depths of Win32 Structured Exception Handling.
http://www.microsoft.com/msj/0197/exception/exception.aspx
Still THE definitive guide on the implementation of SEH in Win32.

[2] Brandon Bray. Security Improvements to the Whidbey Compiler.
http://blogs.msdn.com/branbray/archive/2003/11/11/51012.aspx
Short description on changes in the stack layout for cookie checks.

[3] Chris Brumme. The Exception Model.
http://blogs.msdn.com/cbrumme/archive/2003/10/01/51524.aspx
Mostly about .NET exceptions, but still contains a good deal of information about SEH and C++ exceptions.

[4] Vishal Kochhar. How a C++ compiler implements exception handling.
http://www.codeproject.com/cpp/exceptionhandler.asp
An overview of C++ exceptions implementation.

[5] Calling Standard for Alpha Systems. Chapter 5. Event Processing.
http://www.cs.arizona.edu/computer.help/policy/DIGITAL_unix/AA-PY8AC-TET1_html/callCH5.html
Win32 takes a lot from the way Alpha handles exceptions and this manual has a very detailed description on how it happens.

Structure definitions and flag values were also recovered from the following sources:

  • VC8 CRT debug information (many structure definitions)
  • VC8 assembly output (/FAs)
  • VC8 WinCE CRT source


//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

Reversing Microsoft Visual C++ Part I: Exception Handling

Original article: http://www.openrce.org/articles/full_view/21


Abstract

Microsoft Visual C++ is the most widely used compiler for Win32 so it is important for the Win32 reverser to be familiar with its inner working. Being able to recognize the compiler-generated glue code helps to quickly concentrate on the actual code written by the programmer. It also helps in recovering the high-level structure of the program. 

In part I of this 2-part article (see also: Part II: Classes, Methods and RTTI), I will concentrate on the stack layout, exception handling and related structures in MSVC-compiled programs. Some familiarity with assembler, registers, calling conventions etc. is assumed. 

Terms:
  • Stack frame: A fragment of the stack segment used by a function. Usually contains function arguments, return-to-caller address, saved registers, local variables and other data specific to this function. On x86 (and most other architectures) caller and callee stack frames are contiguous.
  • Frame pointer: A register or other variable that points to a fixed location inside the stack frame. Usually all data inside the stack frame is addressed relative to the frame pointer. On x86 it's usually ebp and it usually points just below the return address.
  • Object: An instance of a (C++) class.
  • Unwindable Object: A local object with auto storage-class specifier that is allocated on the stack and needs to be destructed when it goes out of scope.
  • Stack Unwinding: Automatic destruction of such objects that happens when control leaves the scope due to an exception.
There are two types of exceptions that can be used in a C or C++ program.
  • SEH exceptions (from "Structured Exception Handling"). Also known as Win32 or system exceptions. These are exhaustively covered in the famous Matt Pietrek article[1]. They are the only exceptions available to C programs. The compiler-level support includes keywords __try, __except, __finally and a few others.
  • C++ exceptions (sometimes referred to as "EH"). Implemented on top of SEH, C++ exceptions allow throwing and catching of arbitrary types. A very important feature of C++ is automatic stack unwinding during exception processing, and MSVC uses a pretty complex underlying framework to ensure that it works properly in all cases.
In the following diagrams memory addresses increase from top to bottom, so the stack grows "up". This is the way the stack is represented in IDA, and it is the opposite of most other publications.

Basic Frame Layout

The most basic stack frame looks like this:
    ...
    Local variables
    Other saved registers
    Saved ebp
    Return address
    Function arguments
    ...

Note: If frame pointer omission is enabled, saved ebp might be absent. 

SEH

In cases where the compiler-level SEH (__try/__except/__finally) is used, the stack layout gets a little more complicated. 

SEH3 Stack Layout


When there are no __except blocks in a function (only __finally), Saved ESP is not used. Scopetable is an array of records which describe each __try block and the relationships between them:
    struct _SCOPETABLE_ENTRY {
      DWORD EnclosingLevel;
      void* FilterFunc;
      void* HandlerFunc;
    };

For more details on SEH implementation see[1]. To recover try blocks watch how the try level variable is updated. It's assigned a unique number per try block, and nesting is described by relationship between scopetable entries. E.g. if scopetable entry i has EnclosingLevel=j, then try block j encloses try block i. The function body is considered to have try level -1. See Appendix 1 for an example. 


Buffer Overrun Protection

The Whidbey (MSVC 2005) compiler adds some buffer overrun protection for the SEH frames. The full stack frame layout looks like this:

SEH4 Stack Layout


The GS cookie is present only if the function was compiled with the /GS switch. The EH cookie is always present. The SEH4 scopetable is basically the same as the SEH3 one, only with an added header:
    struct _EH4_SCOPETABLE {
        DWORD GSCookieOffset;
        DWORD GSCookieXOROffset;
        DWORD EHCookieOffset;
        DWORD EHCookieXOROffset;
        _EH4_SCOPETABLE_RECORD ScopeRecord[1];
    };

    struct _EH4_SCOPETABLE_RECORD {
        DWORD EnclosingLevel;
        long (*FilterFunc)();
        union {
            void (*HandlerAddress)();
            void (*FinallyFunc)();
        };
    };

GSCookieOffset = -2 means that the GS cookie is not used. The EH cookie is always present. Offsets are ebp-relative. The check is done the following way: (ebp+CookieXOROffset) ^ [ebp+CookieOffset] == __security_cookie. The pointer to the scopetable on the stack is XORed with __security_cookie too. Also, in SEH4 the outermost scope level is -2, not -1 as in SEH3.

C++ Exception Model Implementation

When C++ exception handling (try/catch) or unwindable objects are present in the function, things get pretty complex.

C++ EH Stack Layout


The EH handler is different for each function (unlike the SEH case) and usually looks like this:
    ; (VC7+)
    mov eax, OFFSET __ehfuncinfo
    jmp ___CxxFrameHandler

__ehfuncinfo is a structure of type FuncInfo which fully describes all try/catch blocks and unwindable objects in the function. 

    struct FuncInfo {
      // compiler version.
      // 0x19930520: up to VC6, 0x19930521: VC7.x (2002-2003), 0x19930522: VC8 (2005)
      DWORD magicNumber;

      // number of entries in unwind table
      int maxState;

      // table of unwind destructors
      UnwindMapEntry* pUnwindMap;

      // number of try blocks in the function
      DWORD nTryBlocks;

      // mapping of catch blocks to try blocks
      TryBlockMapEntry* pTryBlockMap;

      // not used on x86
      DWORD nIPMapEntries;

      // not used on x86
      void* pIPtoStateMap;

      // VC7+ only, expected exceptions list (function "throw" specifier)
      ESTypeList* pESTypeList;

      // VC8+ only, bit 0 set if function was compiled with /EHs
      int EHFlags;
    };

Unwind map is similar to the SEH scopetable, only without filter functions: 

    struct UnwindMapEntry {
      int toState;        // target state
      void (*action)();   // action to perform (unwind funclet address)
    };

Try block descriptor. Describes a try{} block with associated catches. 

    struct TryBlockMapEntry {
      int tryLow;
      int tryHigh;    // this try {} covers states ranging from tryLow to tryHigh
      int catchHigh;  // highest state inside catch handlers of this try
      int nCatches;   // number of catch handlers
      HandlerType* pHandlerArray; // catch handlers table
    };

Catch block descriptor. Describes a single catch() of a try block. 

    struct HandlerType {
      // 0x01: const, 0x02: volatile, 0x08: reference
      DWORD adjectives;

      // RTTI descriptor of the exception type. 0 = any (ellipsis)
      TypeDescriptor* pType;

      // ebp-based offset of the exception object in the function stack.
      // 0 = no object (catch by type)
      int dispCatchObj;

      // address of the catch handler code.
      // returns address where to continue execution (i.e. code after the try block)
      void* addressOfHandler;
    };

List of expected exceptions (implemented but not enabled in MSVC by default, use /d1ESrt to enable). 

    struct ESTypeList {
      // number of entries in the list
      int nCount;

      // list of exceptions; it seems only pType field in HandlerType is used
      HandlerType* pTypeArray;
    };

RTTI type descriptor. Describes a single C++ type. Used here to match the thrown exception type with catch type. 

    struct TypeDescriptor {
      // vtable of type_info class
      const void * pVFTable;

      // used to keep the demangled name returned by type_info::name()
      void* spare;

      // mangled type name, e.g. ".H" = "int", ".?AUA@@" = "struct A", ".?AVA@@" = "class A"
      char name[0];
    };

Unlike SEH, each try block doesn't have a single associated state value. The compiler changes the state value not only on entering/leaving a try block, but also for each constructed/destroyed object. That way it's possible to know which objects need unwinding when an exception happens. You can still recover try blocks boundaries by inspecting the associated state range and the addresses returned by catch handlers (see Appendix 2). 

Throwing C++ Exceptions

throw statements are converted into calls of _CxxThrowException(), which actually raises a Win32 (SEH) exception with the code 0xE06D7363 ('msc'|0xE0000000). The custom parameters of the Win32 exception include pointers to the exception object and its ThrowInfo structure, using which the exception handler can match the thrown exception type against the types expected by catch handlers. 
    struct ThrowInfo {
      // 0x01: const, 0x02: volatile
      DWORD attributes;

      // exception destructor
      void (*pmfnUnwind)();

      // forward compatibility handler
      int (*pForwardCompat)();

      // list of types that can catch this exception.
      // i.e. the actual type and all its ancestors.
      CatchableTypeArray* pCatchableTypeArray;
    };

    struct CatchableTypeArray {
      // number of entries in the following array
      int nCatchableTypes;
      CatchableType* arrayOfCatchableTypes[0];
    };

Describes a type that can catch this exception. 
    struct CatchableType {
      // 0x01: simple type (can be copied by memmove), 0x02: can be caught by reference only, 0x04: has virtual bases
      DWORD properties;

      // see above
      TypeDescriptor* pType;

      // how to cast the thrown object to this type
      PMD thisDisplacement;

      // object size
      int sizeOrOffset;

      // copy constructor address
      void (*copyFunction)();
    };

    // Pointer-to-member descriptor.
    struct PMD {
      // member offset
      int mdisp;

      // offset of the vbtable (-1 if not a virtual base)
      int pdisp;

      // offset to the displacement value inside the vbtable
      int vdisp;
    };

We'll delve more into this in the next article. 

Prologs and Epilogs

Instead of emitting the code for setting up the stack frame in the function body, the compiler might choose to call specific prolog and epilog functions instead. There are several variants, each used for specific function type: 

Name                                        Type     EH Cookie  GS Cookie  Catch Handlers
_SEH_prolog/_SEH_epilog                     SEH3     -          -
_SEH_prolog4/_SEH_epilog4                   SEH4     +          -
_SEH_prolog4_GS/_SEH_epilog4_GS             SEH4     +          +
_EH_prolog                                  C++ EH   -          -          +/-
_EH_prolog3/_EH_epilog3                     C++ EH   +          -          -
_EH_prolog3_catch/_EH_epilog3               C++ EH   +          -          +
_EH_prolog3_GS/_EH_epilog3_GS               C++ EH   +          +          -
_EH_prolog3_catch_GS/_EH_epilog3_catch_GS   C++ EH   +          +          +


SEH2

Apparently this was used by MSVC 1.XX (exported by crtdll.dll). It is encountered in some old NT programs.
    ...
    Saved edi
    Saved esi
    Saved ebx
    Next SEH frame
    Current SEH handler (__except_handler2)
    Pointer to the scopetable
    Try level
    Saved ebp (of this function)
    Exception pointers
    Local variables
    Saved ESP
    Local variables
    Callee EBP
    Return address
    Function arguments
    ...

Appendix I: Sample SEH Program

Let's consider the following sample disassembly. 
func1           proc near

_excCode        = dword ptr -28h
buf             = byte ptr -24h
_saved_esp      = dword ptr -18h
_exception_info = dword ptr -14h
_next           = dword ptr -10h
_handler        = dword ptr -0Ch
_scopetable     = dword ptr -8
_trylevel       = dword ptr -4
str             = dword ptr  8

  push    ebp
  mov     ebp, esp
  push    -1
  push    offset _func1_scopetable
  push    offset _except_handler3
  mov     eax, large fs:0
  push    eax
  mov     large fs:0, esp
  add     esp, -18h
  push    ebx
  push    esi
  push    edi

  ; --- end of prolog ---

  mov     [ebp+_trylevel], 0 ; trylevel -1 -> 0: beginning of try block 0
  mov     [ebp+_trylevel], 1 ; trylevel 0 -> 1: beginning of try block 1
  mov     large dword ptr ds:123, 456
  mov     [ebp+_trylevel], 0 ; trylevel 1 -> 0: end of try block 1
  jmp     short _endoftry1

_func1_filter1:                         ; __except() filter of try block 1
  mov     ecx, [ebp+_exception_info]
  mov     edx, [ecx+EXCEPTION_POINTERS.ExceptionRecord]
  mov     eax, [edx+EXCEPTION_RECORD.ExceptionCode]
  mov     [ebp+_excCode], eax
  mov     ecx, [ebp+_excCode]
  xor     eax, eax
  cmp     ecx, EXCEPTION_ACCESS_VIOLATION
  setz    al
  retn

_func1_handler1:                        ; beginning of handler for try block 1
  mov     esp, [ebp+_saved_esp]
  push    offset aAccessViolatio ; "Access violation"
  call    _printf
  add     esp, 4
  mov     [ebp+_trylevel], 0 ; trylevel 1 -> 0: end of try block 1

_endoftry1:
  mov     edx, [ebp+str]
  push    edx
  lea     eax, [ebp+buf]
  push    eax
  call    _strcpy
  add     esp, 8
  mov     [ebp+_trylevel], -1 ; trylevel 0 -> -1: end of try block 0
  call    _func1_handler0     ; execute __finally of try block 0
  jmp     short _endoftry0

_func1_handler0:                        ; __finally handler of try block 0
  push    offset aInFinally ; "in finally"
  call    _puts
  add     esp, 4
  retn

_endoftry0:
  ; --- epilog ---
  mov     ecx, [ebp+_next]
  mov     large fs:0, ecx
  pop     edi
  pop     esi
  pop     ebx
  mov     esp, ebp
  pop     ebp
  retn
func1           endp

_func1_scopetable
  ;try block 0
  dd -1                      ;EnclosingLevel
  dd 0                       ;FilterFunc
  dd offset _func1_handler0  ;HandlerFunc

  ;try block 1
  dd 0                       ;EnclosingLevel
  dd offset _func1_filter1   ;FilterFunc
  dd offset _func1_handler1  ;HandlerFunc

Try block 0 has no filter, so its handler is a __finally{} block. The EnclosingLevel of try block 1 is 0, so it is nested inside try block 0. With this in mind, we can reconstruct the structure of the function:
void func1 (char* str)
{
  char buf[12];
  __try // try block 0
  {
    __try // try block 1
    {
      *(int*)123=456;
    }
    __except(GetExceptionCode() == EXCEPTION_ACCESS_VIOLATION)
    {
      printf("Access violation");
    }
    strcpy(buf,str);
  }
  __finally
  {
    puts("in finally");
  }
}
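
The nesting rule used above (an entry with no filter is a __finally block, and EnclosingLevel names the enclosing try block) can be sketched in portable C++. This is only an illustration: ScopeEntry, nestingDepth, describe and kFunc1Scopetable are hypothetical names, and a bool stands in for the real FilterFunc pointer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified model of one SEH3 scopetable record (hypothetical type).
// Real entries hold code pointers; a bool stands in for "FilterFunc != NULL".
struct ScopeEntry {
    int  enclosingLevel;  // index of the enclosing try block, -1 = function body
    bool hasFilter;       // true: __except block, false: __finally block
};

// Number of try blocks that enclose try block `index`.
int nestingDepth(const std::vector<ScopeEntry>& table, int index) {
    assert(index >= 0 && index < static_cast<int>(table.size()));
    int depth = 0;
    for (int level = table[index].enclosingLevel; level != -1;
         level = table[level].enclosingLevel) {
        ++depth;
    }
    return depth;
}

// One-line description such as "try block 1: __except, inside try block 0".
std::string describe(const std::vector<ScopeEntry>& table, int index) {
    assert(index >= 0 && index < static_cast<int>(table.size()));
    const ScopeEntry& e = table[index];
    std::string text = "try block " + std::to_string(index) + ": ";
    text += e.hasFilter ? "__except" : "__finally";
    if (e.enclosingLevel == -1) {
        text += ", top level";
    } else {
        text += ", inside try block " + std::to_string(e.enclosingLevel);
    }
    return text;
}

// The scopetable of func1 from the listing above, in the simplified model.
const std::vector<ScopeEntry> kFunc1Scopetable = {
    {-1, false},  // try block 0: no filter, HandlerFunc is the __finally body
    {0,  true},   // try block 1: has a filter, nested in try block 0
};
```

Calling describe(kFunc1Scopetable, i) for each entry gives the skeleton of the __try nesting before any of the block bodies have been read.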

Appendix II: Sample Program with C++ Exceptions
func1           proc near

_a1             = dword ptr -24h
_exc            = dword ptr -20h
e               = dword ptr -1Ch
a2              = dword ptr -18h
a1              = dword ptr -14h
_saved_esp      = dword ptr -10h
_next           = dword ptr -0Ch
_handler        = dword ptr -8
_state          = dword ptr -4

  push    ebp
  mov     ebp, esp
  push    0FFFFFFFFh
  push    offset func1_ehhandler
  mov     eax, large fs:0
  push    eax
  mov     large fs:0, esp
  push    ecx
  sub     esp, 14h
  push    ebx
  push    esi
  push    edi
  mov     [ebp+_saved_esp], esp

  ; --- end of prolog ---

  lea     ecx, [ebp+a1]
  call    A::A(void)
  mov     [ebp+_state], 0          ; state -1 -> 0: a1 constructed
  mov     [ebp+a1], 1              ; a1.m1 = 1
  mov     byte ptr [ebp+_state], 1 ; state 0 -> 1: try {
  lea     ecx, [ebp+a2]
  call    A::A(void)
  mov     [ebp+_a1], eax
  mov     byte ptr [ebp+_state], 2 ; state 1 -> 2: a2 constructed
  mov     [ebp+a2], 2              ; a2.m1 = 2
  mov     eax, [ebp+a1]
  cmp     eax, [ebp+a2]            ; a1.m1 == a2.m1?
  jnz     short loc_40109F
  mov     [ebp+_exc], offset aAbc  ; _exc = "abc"
  push    offset __TI1?PAD         ; char *
  lea     ecx, [ebp+_exc]
  push    ecx
  call    _CxxThrowException       ; throw "abc";

loc_40109F:
  mov     byte ptr [ebp+_state], 1 ; state 2 -> 1: destruct a2
  lea     ecx, [ebp+a2]
  call    A::~A(void)
  jmp     short func1_try0end

; catch (char * e)
func1_try0handler_pchar:
  mov     edx, [ebp+e]
  push    edx
  push    offset aCaughtS ; "Caught %s\n"
  call    ds:printf
  add     esp, 8
  mov     eax, offset func1_try0end
  retn

; catch (...)
func1_try0handler_ellipsis:
  push    offset aCaught___ ; "Caught ...\n"
  call    ds:printf
  add     esp, 4
  mov     eax, offset func1_try0end
  retn

func1_try0end:
  mov     [ebp+_state], 0          ; state 1 -> 0: }//try
  push    offset aAfterTry ; "after try\n"
  call    ds:printf
  add     esp, 4
  mov     [ebp+_state], -1         ; state 0 -> -1: destruct a1
  lea     ecx, [ebp+a1]
  call    A::~A(void)
  ; --- epilog ---
  mov     ecx, [ebp+_next]
  mov     large fs:0, ecx
  pop     edi
  pop     esi
  pop     ebx
  mov     esp, ebp
  pop     ebp
  retn
func1           endp

func1_ehhandler proc near
  mov     eax, offset func1_funcinfo
  jmp     __CxxFrameHandler
func1_ehhandler endp

func1_funcinfo
  dd 19930520h            ; magicNumber
  dd 4                    ; maxState
  dd offset func1_unwindmap ; pUnwindMap
  dd 1                    ; nTryBlocks
  dd offset func1_trymap  ; pTryBlockMap
  dd 0                    ; nIPMapEntries
  dd 0                    ; pIPtoStateMap
  dd 0                    ; pESTypeList

func1_unwindmap
  dd -1                   ; toState
  dd offset func1_unwind_1tobase ; action
  dd 0                    ; toState
  dd 0                    ; action
  dd 1                    ; toState
  dd offset func1_unwind_2to1 ; action
  dd 0                    ; toState
  dd 0                    ; action

func1_trymap
  dd 1                    ; tryLow
  dd 2                    ; tryHigh
  dd 3                    ; catchHigh
  dd 2                    ; nCatches
  dd offset func1_tryhandlers_0 ; pHandlerArray
  dd 0

func1_tryhandlers_0
  dd 0                    ; adjectives
  dd offset char * `RTTI Type Descriptor' ; pType
  dd -1Ch                 ; dispCatchObj
  dd offset func1_try0handler_pchar ; addressOfHandler
  dd 0                    ; adjectives
  dd 0                    ; pType
  dd 0                    ; dispCatchObj
  dd offset func1_try0handler_ellipsis ; addressOfHandler

func1_unwind_1tobase proc near
a1 = byte ptr -14h
  lea     ecx, [ebp+a1]
  call    A::~A(void)
  retn
func1_unwind_1tobase endp

func1_unwind_2to1 proc near
a2 = byte ptr -18h
  lea     ecx, [ebp+a2]
  call    A::~A(void)
  retn
func1_unwind_2to1 endp

Let's see what we can find out here. The maxState field in the FuncInfo structure is 4, which means the unwind map has four entries, one for each state from 0 to 3. Examining the map, we see that the following actions are executed during unwinding:

  • state 3 -> state 0 (no action)
  • state 2 -> state 1 (destruct a2)
  • state 1 -> state 0 (no action)
  • state 0 -> state -1 (destruct a1)

Checking the try map, we can infer that states 1 and 2 correspond to the try block body and state 3 to the catch block bodies. Thus, a change from state 0 to state 1 marks the beginning of the try block, and a change from 1 to 0 marks its end. From the function code we can also see that -1 -> 0 is the construction of a1, and 1 -> 2 is the construction of a2. So the state diagram looks like this:

[Figure 4: state diagram of func1]

Where did the arrow 1->3 come from? We cannot see it in the function code or in the FuncInfo structure, since it is made by the exception handler. If an exception happens inside the try block, the exception handler first unwinds the stack to the tryLow value (1 in our case) and then sets the state value to tryHigh+1 (2+1=3) before calling the catch handler.
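
The unwinding described above can be simulated with a small table walk. The sketch below is a simplified model, not the CRT implementation: UnwindEntry, unwindTo, catchState and kFunc1UnwindMap are hypothetical names, and each unwind funclet is replaced by a label string.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified model of an unwind map entry (hypothetical type). In the real
// structure `action` is a pointer to an unwind funclet; here a label stands
// in for it, and an empty string means "no action".
struct UnwindEntry {
    int         toState;
    std::string action;
};

// Unwind map of func1, indexed by the current state (0..3).
const std::vector<UnwindEntry> kFunc1UnwindMap = {
    {-1, "destruct a1"},  // state 0 -> -1
    {0,  ""},             // state 1 -> 0
    {1,  "destruct a2"},  // state 2 -> 1
    {0,  ""},             // state 3 -> 0
};

// Follows the map from `state` until `target` is reached and returns the
// actions that would run, in order. target = -1 unwinds the whole frame.
// The assert fires if `target` is not reachable from `state`.
std::vector<std::string> unwindTo(const std::vector<UnwindEntry>& map,
                                  int state, int target) {
    std::vector<std::string> actions;
    while (state != target) {
        assert(state >= 0 && state < static_cast<int>(map.size()));
        const UnwindEntry& e = map[state];
        if (!e.action.empty()) actions.push_back(e.action);
        state = e.toState;
    }
    return actions;
}

// State set by the handler before it calls a catch block of a try block
// whose try map entry has the given tryHigh.
int catchState(int tryHigh) { return tryHigh + 1; }
```

For the throw in func1 the state is 2, so unwindTo(kFunc1UnwindMap, 2, 1) yields only the destruction of a2, and catchState(2) gives the state 3 in which the catch block runs.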

The try block has two catch handlers. The first one has a catch type (char*) and receives the exception object on the stack (-1Ch = e). The second one has no type (i.e. it is an ellipsis catch). Both handlers return the address at which to resume execution, i.e. the position just after the try block. Now we can recover the function code:
void func1 ()
{
  A a1;
  a1.m1 = 1;
  try {
    A a2;
    a2.m1 = 2;
    if (a1.m1 == a2.m1) throw "abc";
  }
  catch(char* e)
  {
    printf("Caught %s\n",e);
  }
  catch(...)
  {
    printf("Caught ...\n");
  }
  printf("after try\n");
}
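
How the handler array maps a thrown type to a catch block can be sketched as a first-match search in which an entry with a null pType accepts anything. This is a deliberately reduced model: HandlerEntry, findHandler and kFunc1Handlers are hypothetical names, types are compared as plain strings, and the adjectives field is ignored, whereas the real runtime also takes adjectives and the thrown object's type information into account.

```cpp
#include <cstring>
#include <vector>

// Simplified model of one handler array entry (hypothetical type).
struct HandlerEntry {
    unsigned    adjectives;
    const char* typeName;      // stands in for pType; nullptr = catch (...)
    int         dispCatchObj;  // ebp-relative offset of the catch variable
    const char* handler;       // stands in for addressOfHandler
};

// Handler array of func1's only try block, taken from the listing above.
const std::vector<HandlerEntry> kFunc1Handlers = {
    {0, "char *", -0x1C, "func1_try0handler_pchar"},
    {0, nullptr,  0,     "func1_try0handler_ellipsis"},
};

// Returns the first entry whose type matches `thrownType`, or the first
// ellipsis entry reached before such a match; nullptr if nothing matches.
const HandlerEntry* findHandler(const std::vector<HandlerEntry>& handlers,
                                const char* thrownType) {
    for (const HandlerEntry& h : handlers) {
        if (h.typeName == nullptr ||
            std::strcmp(h.typeName, thrownType) == 0) {
            return &h;
        }
    }
    return nullptr;
}
```

With this model, findHandler(kFunc1Handlers, "char *") selects func1_try0handler_pchar, while any other type name falls through to the ellipsis handler, which matches the order of the catch clauses in the recovered source.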

Appendix III: IDC Helper Scripts

I wrote an IDC script to help with the reversing of MSVC programs. It scans the whole program for typical SEH/EH code sequences and comments all related structures and fields, including stack variables, exception handlers and exception types. It also tries to fix function boundaries that IDA sometimes determines incorrectly. You can download it from MS SEH/EH Helper.

Links and References

[1] Matt Pietrek. A Crash Course on the Depths of Win32 Structured Exception Handling.
http://www.microsoft.com/msj/0197/exception/exception.aspx
Still THE definitive guide on the implementation of SEH in Win32.

[2] Brandon Bray. Security Improvements to the Whidbey Compiler.
http://blogs.msdn.com/branbray/archive/2003/11/11/51012.aspx
Short description on changes in the stack layout for cookie checks.

[3] Chris Brumme. The Exception Model.
http://blogs.msdn.com/cbrumme/archive/2003/10/01/51524.aspx
Mostly about .NET exceptions, but still contains a good deal of information about SEH and C++ exceptions.

[4] Vishal Kochhar. How a C++ compiler implements exception handling.
http://www.codeproject.com/cpp/exceptionhandler.asp
An overview of C++ exceptions implementation.

[5] Calling Standard for Alpha Systems. Chapter 5. Event Processing.
http://www.cs.arizona.edu/computer.help/policy/DIGITAL_unix/AA-PY8AC-TET1_html/callCH5.html
Win32 takes a lot from the way Alpha handles exceptions and this manual has a very detailed description on how it happens. 

Structure definitions and flag values were also recovered from the following sources:
  • VC8 CRT debug information (many structure definitions)
  • VC8 assembly output (/FAs)
  • VC8 WinCE CRT source

