III. Object File Analysis

1. Object File Format

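On Linux, executables, object files (.o), and shared libraries (.so) are all stored in ELF format. A static library (.a) is an ar archive whose members are ELF object files.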

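  • Types of ELF (Executable and Linkable Format) files
    • Relocatable file: contains code and data whose symbol addresses can be fixed up during linking. Object files are of this type, as are the members of a static library.
    • Executable file: contains a program that can be run directly.
    • Shared object file: mainly dynamic libraries, which can be linked against or mapped into a process at run time.
    • Core dump file: records the process address space and some other information when a process or the system crashes.
  • $ file <filename> reports which type of ELF file a given file is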
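
The type in the list above is recorded in the e_type field of the ELF header. As a minimal sketch, the helper below maps the four values to readable names. The function elf_type_name and the SK_ constants are made up for this example; the numeric values follow the ELF specification, where <elf.h> names them ET_REL, ET_EXEC, ET_DYN and ET_CORE.

```c
#include <assert.h>
#include <stdint.h>

/* Values of e_type per the ELF specification. <elf.h> calls them
 * ET_REL, ET_EXEC, ET_DYN and ET_CORE; the SK_ names are local to this sketch. */
enum { SK_ET_REL = 1, SK_ET_EXEC = 2, SK_ET_DYN = 3, SK_ET_CORE = 4 };

/* Hypothetical helper: describe an e_type value. */
const char *elf_type_name(uint16_t e_type)
{
    switch (e_type) {
    case SK_ET_REL:  return "relocatable";
    case SK_ET_EXEC: return "executable";
    case SK_ET_DYN:  return "shared object";
    case SK_ET_CORE: return "core dump";
    default:         return "unknown";
    }
}
```

An object file such as SimpleSection.o carries e_type 1, so elf_type_name(1) yields "relocatable".
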
2. Contents of an Object File

Source code of SimpleSection.c:

int printf( const char *format, ... );

int global_init_var = 84;
int global_uninit_var;

void func( int i )
{
        printf( "%d\n", i );
}

int main(void)
{
        static int static_var = 85;
        static int static_var2;

        int a = 1;
        int b;

        func(static_var + static_var2 + a + b);

        return 0;
}

An object file contains a file header, a number of sections, a symbol table, debugging information, strings, and so on.

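  • Compiled machine instructions are stored in the code section (.text)
  • Compiled data is stored in the data sections
    • Initialized global variables and static local variables are stored in .data
    • Uninitialized global variables and static local variables are stored in .bss. The .bss section only reserves space and occupies no room in the file. (Some compilers do not place uninitialized global variables in .bss at all and only reserve a symbol for them.)
  • Benefits of keeping instructions and data separate
    • After loading, instructions and data are mapped to two separate virtual memory regions. The data region is readable and writable while the instruction region is read-only, which protects instructions from being modified, whether deliberately or by accident
    • When several copies of a program run at once, only one copy of its instructions needs to be kept in memory, because the instructions can be shared

3. Analysis of Key Sections in the Object File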

$ gcc -c SimpleSection.c -o SimpleSection.o -m32


Section Header
$ objdump -h SimpleSection.o    # print basic information about the key sections

SimpleSection.o:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000050  00000000  00000000  00000034  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data         00000008  00000000  00000000  00000084  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000004  00000000  00000000  0000008c  2**2
                  ALLOC
  3 .rodata       00000004  00000000  00000000  0000008c  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .comment      0000002b  00000000  00000000  00000090  2**0
                  CONTENTS, READONLY
  5 .note.GNU-stack 00000000  00000000  00000000  000000bb  2**0
                  CONTENTS, READONLY
  6 .eh_frame     00000058  00000000  00000000  000000bc  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA

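objdump lists 7 sections for this object file. Besides the code and data sections there are a read-only data section (.rodata), a comment section (.comment), a stack note section (.note.GNU-stack), and an exception-handling frame section (.eh_frame)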


$ objdump -s -d SimpleSection.o    # -s prints section contents in hexadecimal, -d disassembles

SimpleSection.o:     file format elf32-i386

Contents of section .text:
 0000 5589e583 ec188b45 08894424 04c70424  U......E..D$...$
 0010 00000000 e8fcffff ffc9c355 89e583e4  ...........U....
 0020 f083ec20 c7442418 01000000 8b150400  ... .D$.........
 0030 0000a100 00000001 d0034424 18034424  ..........D$..D$
 0040 1c890424 e8fcffff ffb80000 0000c9c3  ...$............
Contents of section .data:
 0000 54000000 55000000                    T...U...        
Contents of section .rodata:
 0000 25640a00                             %d..            
Contents of section .comment:
 0000 00474343 3a202855 62756e74 752f4c69  .GCC: (Ubuntu/Li
 0010 6e61726f 20342e36 2e332d31 7562756e  naro 4.6.3-1ubun
 0020 74753529 20342e36 2e3300             tu5) 4.6.3.     
Contents of section .eh_frame:
 0000 14000000 00000000 017a5200 017c0801  .........zR..|..
 0010 1b0c0404 88010000 1c000000 1c000000  ................
 0020 00000000 1b000000 00410e08 8502420d  .........A....B.
 0030 0557c50c 04040000 1c000000 3c000000  .W..........<...
 0040 1b000000 35000000 00410e08 8502420d  ....5....A....B.
 0050 0571c50c 04040000                    .q......        

Disassembly of section .text:

00000000 <func>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 ec 18                sub    $0x18,%esp
   6:   8b 45 08                mov    0x8(%ebp),%eax
   9:   89 44 24 04             mov    %eax,0x4(%esp)
   d:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
  14:   e8 fc ff ff ff          call   15 <func+0x15>
  19:   c9                      leave  
  1a:   c3                      ret    

0000001b <main>:
  1b:   55                      push   %ebp
  1c:   89 e5                   mov    %esp,%ebp
  1e:   83 e4 f0                and    $0xfffffff0,%esp
  21:   83 ec 20                sub    $0x20,%esp
  24:   c7 44 24 18 01 00 00    movl   $0x1,0x18(%esp)
  2b:   00
  2c:   8b 15 04 00 00 00       mov    0x4,%edx
  32:   a1 00 00 00 00          mov    0x0,%eax
  37:   01 d0                   add    %edx,%eax
  39:   03 44 24 18             add    0x18(%esp),%eax
  3d:   03 44 24 1c             add    0x1c(%esp),%eax
  41:   89 04 24                mov    %eax,(%esp)
  44:   e8 fc ff ff ff          call   45 <main+0x2a>
  49:   b8 00 00 00 00          mov    $0x0,%eax
  4e:   c9                      leave
  4f:   c3                      ret

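.text section
The output above shows the contents of the .text section together with the assembly they decode to

  • The section header gives the code section a size of 0x50 bytes, which matches the length of "Contents of section .text"
  • The first byte, 0x55, is the instruction push %ebp
  • The last byte, 0xc3, is the instruction ret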

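.data section
The source code has two initialized int variables, global_init_var (84) and static_var (85)

  • They are stored starting at file offset 0x00000084
  • Offset 0x84 holds global_init_var as the bytes 54 00 00 00, which is the 4-byte value 0x00000054 in little-endian order
  • Offset 0x88 holds static_var as the bytes 55 00 00 00, which is the 4-byte value 0x00000055 in little-endian order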
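
To see how the stored bytes map back to 84 and 85, they can be assembled by hand with the lowest byte first. The sketch below does this for the 8 bytes of .data shown by objdump -s. The names read_le32 and data_section are illustrative and not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: build a 32-bit value from 4 bytes stored
 * least-significant byte first (little-endian), as on i386. */
uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* The 8 bytes of .data, copied from the objdump -s listing. */
const unsigned char data_section[8] = {
    0x54, 0x00, 0x00, 0x00,   /* global_init_var at section offset 0 */
    0x55, 0x00, 0x00, 0x00    /* static_var at section offset 4 */
};
```

Calling read_le32(data_section) gives 84 and read_le32(data_section + 4) gives 85. Read as big-endian, the same bytes would mean 0x54000000, which is not the value in the source.
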

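.rodata section
The source code contains one string literal, "%d\n", which is stored in the read-only data section and takes 4 bytes (25 64 0a plus the terminating 00)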

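$ hexdump -C SimpleSection.o       # dump the contents of the whole binary file
$ objdump -h SimpleSection.o       # basic information about the key sections, most importantly each section's file offset and size
$ objdump -s -d SimpleSection.o    # contents of the key sections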

4. Structure of the Object File

Overall structure of an ELF object file

[Figure: ELF.png, overall structure of an ELF object file]

An ELF file has many parts; this section covers only the file header, the tables, and symbols
/usr/include/elf.h defines the data types and structures that ELF uses


File Header

  • Structure (Elf32_Ehdr)
/* The ELF file header.  This appears at the start of every ELF file.  */

#define EI_NIDENT (16)

typedef struct
{
  unsigned char e_ident[EI_NIDENT];     /* Magic number and other info */
  Elf32_Half    e_type;                 /* Object file type */
  Elf32_Half    e_machine;              /* Architecture */
  Elf32_Word    e_version;              /* Object file version */
  Elf32_Addr    e_entry;                /* Entry point virtual address */
  Elf32_Off     e_phoff;                /* Program header table file offset */
  Elf32_Off     e_shoff;                /* Section header table file offset */
  Elf32_Word    e_flags;                /* Processor-specific flags */
  Elf32_Half    e_ehsize;               /* ELF header size in bytes */
  Elf32_Half    e_phentsize;            /* Program header table entry size */
  Elf32_Half    e_phnum;                /* Program header table entry count */
  Elf32_Half    e_shentsize;            /* Section header table entry size */
  Elf32_Half    e_shnum;                /* Section header table entry count */
  Elf32_Half    e_shstrndx;             /* Section header string table index */
} Elf32_Ehdr;
  • $ hexdump -C SimpleSection.o    # header contents in hexadecimal
00000000  7f 45 4c 46 01 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00000010  01 00 03 00 01 00 00 00  00 00 00 00 00 00 00 00  |................|
00000020  74 01 00 00 00 00 00 00  34 00 00 00 00 00 28 00  |t.......4.....(.|
00000030  0d 00 0a 00
  • $ readelf -h SimpleSection.o    # header contents, formatted
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          372 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           40 (bytes)
  Number of section headers:         13
  Section header string table index: 10

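7f 45 4c 46 is the file's magic number (0x7f 'E' 'L' 'F')
The next three bytes, 01 01 01, are the file class (0 = invalid, 1 = 32-bit, 2 = 64-bit), the byte order (0 = invalid, 1 = little-endian, 2 = big-endian), and the ELF version (always 1)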
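
The byte-by-byte reading of e_ident above can be written as a small decoder. This is only a sketch: the helper names and the IDX_ constants are invented here (the positions 4, 5 and 6 correspond to EI_CLASS, EI_DATA and EI_VERSION in <elf.h>), and sample_ident is copied from the first 16 bytes of the hexdump.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Positions inside e_ident, per the ELF specification. */
enum { IDX_CLASS = 4, IDX_DATA = 5, IDX_VERSION = 6 };

/* Returns 1 if the buffer starts with the ELF magic number, else 0. */
int has_elf_magic(const unsigned char *ident)
{
    assert(ident != NULL);
    /* Two adjacent literals, so that 'E' is not parsed as a hex digit. */
    return memcmp(ident, "\x7f" "ELF", 4) == 0;
}

/* Describe the file class byte. */
const char *ident_class(const unsigned char *ident)
{
    assert(ident != NULL);
    switch (ident[IDX_CLASS]) {
    case 1:  return "ELF32";
    case 2:  return "ELF64";
    default: return "invalid";
    }
}

/* Describe the byte-order byte. */
const char *ident_data(const unsigned char *ident)
{
    assert(ident != NULL);
    switch (ident[IDX_DATA]) {
    case 1:  return "little endian";
    case 2:  return "big endian";
    default: return "invalid";
    }
}

/* First 16 bytes of SimpleSection.o, copied from the hexdump. */
const unsigned char sample_ident[16] = {
    0x7f, 0x45, 0x4c, 0x46, 0x01, 0x01, 0x01, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};
```

For sample_ident these helpers report a valid magic number, class "ELF32" and byte order "little endian", which agrees with the Class and Data lines printed by readelf -h.
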


Section Header Table

  • Section Header, the section descriptor structure (Elf32_Shdr)
/* Section header.  */

typedef struct
{
  Elf32_Word    sh_name;                /* Section name (string tbl index) */
  Elf32_Word    sh_type;                /* Section type */
  Elf32_Word    sh_flags;               /* Section flags */
  Elf32_Addr    sh_addr;                /* Section virtual addr at execution */
  Elf32_Off     sh_offset;              /* Section file offset */
  Elf32_Word    sh_size;                /* Section size in bytes */
  Elf32_Word    sh_link;                /* Link to another section */
  Elf32_Word    sh_info;                /* Additional section information */
  Elf32_Word    sh_addralign;           /* Section alignment */
  Elf32_Word    sh_entsize;             /* Entry size if section holds table */
} Elf32_Shdr;
  • $ hexdump -C SimpleSection.o    # Section Header Table contents in hexadecimal

The ELF header provides the following

  • Start of section headers: 372 (bytes into file), the file offset of the Section Header Table
  • Size of section headers: 40 (bytes), the size of each Section Header (sizeof(Elf32_Shdr))
  • Number of section headers: 13, the number of Section Headers
  • The table therefore occupies 13 × 40 = 520 bytes, from file offset 372 (0x174) up to but not including 892 (0x37C)
00000170  6d 65 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |me..............|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000190  00 00 00 00 00 00 00 00  00 00 00 00 1f 00 00 00  |................|
000001a0  01 00 00 00 06 00 00 00  00 00 00 00 34 00 00 00  |............4...|
000001b0  50 00 00 00 00 00 00 00  00 00 00 00 04 00 00 00  |P...............|
000001c0  00 00 00 00 1b 00 00 00  09 00 00 00 00 00 00 00  |................|
000001d0  00 00 00 00 e4 04 00 00  28 00 00 00 0b 00 00 00  |........(.......|
000001e0  01 00 00 00 04 00 00 00  08 00 00 00 25 00 00 00  |............%...|
000001f0  01 00 00 00 03 00 00 00  00 00 00 00 84 00 00 00  |................|
00000200  08 00 00 00 00 00 00 00  00 00 00 00 04 00 00 00  |................|
00000210  00 00 00 00 2b 00 00 00  08 00 00 00 03 00 00 00  |....+...........|
00000220  00 00 00 00 8c 00 00 00  04 00 00 00 00 00 00 00  |................|
00000230  00 00 00 00 04 00 00 00  00 00 00 00 30 00 00 00  |............0...|
00000240  01 00 00 00 02 00 00 00  00 00 00 00 8c 00 00 00  |................|
00000250  04 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000260  00 00 00 00 38 00 00 00  01 00 00 00 30 00 00 00  |....8.......0...|
00000270  00 00 00 00 90 00 00 00  2b 00 00 00 00 00 00 00  |........+.......|
00000280  00 00 00 00 01 00 00 00  01 00 00 00 41 00 00 00  |............A...|
00000290  01 00 00 00 00 00 00 00  00 00 00 00 bb 00 00 00  |................|
000002a0  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
000002b0  00 00 00 00 55 00 00 00  01 00 00 00 02 00 00 00  |....U...........|
000002c0  00 00 00 00 bc 00 00 00  58 00 00 00 00 00 00 00  |........X.......|
000002d0  00 00 00 00 04 00 00 00  00 00 00 00 51 00 00 00  |............Q...|
000002e0  09 00 00 00 00 00 00 00  00 00 00 00 0c 05 00 00  |................|
000002f0  10 00 00 00 0b 00 00 00  08 00 00 00 04 00 00 00  |................|
00000300  08 00 00 00 11 00 00 00  03 00 00 00 00 00 00 00  |................|
00000310  00 00 00 00 14 01 00 00  5f 00 00 00 00 00 00 00  |........_.......|
00000320  00 00 00 00 01 00 00 00  00 00 00 00 01 00 00 00  |................|
00000330  02 00 00 00 00 00 00 00  00 00 00 00 7c 03 00 00  |............|...|
00000340  00 01 00 00 0c 00 00 00  0b 00 00 00 04 00 00 00  |................|
00000350  10 00 00 00 09 00 00 00  03 00 00 00 00 00 00 00  |................|
00000360  00 00 00 00 7c 04 00 00  65 00 00 00 00 00 00 00  |....|...e.......|
00000370  00 00 00 00 01 00 00 00  00 00 00 00 00 00 00 00  |................|
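
The offsets in this dump can be checked with a little arithmetic, and one entry can be decoded by hand. The sketch below is illustrative only: struct shdr32 mirrors the field order of Elf32_Shdr but is a local type, the function names are invented, and text_shdr_raw is the 40-byte entry [1] copied from file offset 0x19c (372 + 1 × 40) of the dump, assuming little-endian storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field order as Elf32_Shdr; a local type for this sketch. */
struct shdr32 {
    uint32_t name, type, flags, addr, offset;
    uint32_t size, link, info, addralign, entsize;
};

static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* File offset of entry `index` in the section header table. */
uint32_t shdr_entry_offset(uint32_t shoff, uint16_t shentsize, uint16_t index)
{
    return shoff + (uint32_t)shentsize * index;
}

/* Offset just past the last entry of the table. */
uint32_t shdr_table_end(uint32_t shoff, uint16_t shentsize, uint16_t shnum)
{
    return shdr_entry_offset(shoff, shentsize, shnum);
}

/* Decode one 40-byte little-endian section header. */
struct shdr32 decode_shdr(const unsigned char *raw)
{
    struct shdr32 h;
    assert(raw != NULL);
    h.name      = read_le32(raw + 0);
    h.type      = read_le32(raw + 4);
    h.flags     = read_le32(raw + 8);
    h.addr      = read_le32(raw + 12);
    h.offset    = read_le32(raw + 16);
    h.size      = read_le32(raw + 20);
    h.link      = read_le32(raw + 24);
    h.info      = read_le32(raw + 28);
    h.addralign = read_le32(raw + 32);
    h.entsize   = read_le32(raw + 36);
    return h;
}

/* Entry [1], bytes 0x19c..0x1c3 of the hexdump. */
const unsigned char text_shdr_raw[40] = {
    0x1f, 0, 0, 0,  0x01, 0, 0, 0,  0x06, 0, 0, 0,  0x00, 0, 0, 0,
    0x34, 0, 0, 0,  0x50, 0, 0, 0,  0x00, 0, 0, 0,  0x00, 0, 0, 0,
    0x04, 0, 0, 0,  0x00, 0, 0, 0
};
```

Decoding text_shdr_raw gives offset 0x34, size 0x50 and alignment 4, the same values objdump -h prints for .text, and shdr_table_end(372, 40, 13) gives 892 (0x37C).
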
  • $ readelf -S SimpleSection.o    # section header table, formatted
There are 13 section headers, starting at offset 0x174:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000000 000034 000050 00  AX  0   0  4
  [ 2] .rel.text         REL             00000000 0004e4 000028 08     11   1  4
  [ 3] .data             PROGBITS        00000000 000084 000008 00  WA  0   0  4
  [ 4] .bss              NOBITS          00000000 00008c 000004 00  WA  0   0  4
  [ 5] .rodata           PROGBITS        00000000 00008c 000004 00   A  0   0  1
  [ 6] .comment          PROGBITS        00000000 000090 00002b 01  MS  0   0  1
  [ 7] .note.GNU-stack   PROGBITS        00000000 0000bb 000000 00      0   0  1
  [ 8] .eh_frame         PROGBITS        00000000 0000bc 000058 00   A  0   0  4
  [ 9] .rel.eh_frame     REL             00000000 00050c 000010 08     11   8  4
  [10] .shstrtab         STRTAB          00000000 000114 00005f 00      0   0  1
  [11] .symtab           SYMTAB          00000000 00037c 000100 10     12  11  4
  [12] .strtab           STRTAB          00000000 00047c 000065 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)
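
The Flg column is a rendering of the sh_flags bit mask. As a rough sketch, the function below turns the five flag bits that occur in this file into the same letters. The F_ names and flags_to_letters are invented for the example; the bit values follow the ELF specification (SHF_WRITE, SHF_ALLOC, SHF_EXECINSTR, SHF_MERGE and SHF_STRINGS in <elf.h>), and the remaining flags from the key are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits per the ELF specification; names are local to this sketch. */
enum {
    F_WRITE   = 0x01,   /* W */
    F_ALLOC   = 0x02,   /* A */
    F_EXEC    = 0x04,   /* X */
    F_MERGE   = 0x10,   /* M */
    F_STRINGS = 0x20    /* S */
};

/* Write the flag letters for sh_flags into out, in the order W A X M S.
 * out must have room for at least 6 bytes (5 letters plus the NUL). */
void flags_to_letters(uint32_t flags, char *out, size_t len)
{
    size_t n = 0;
    assert(out != NULL && len >= 6);
    if (flags & F_WRITE)   out[n++] = 'W';
    if (flags & F_ALLOC)   out[n++] = 'A';
    if (flags & F_EXEC)    out[n++] = 'X';
    if (flags & F_MERGE)   out[n++] = 'M';
    if (flags & F_STRINGS) out[n++] = 'S';
    out[n] = '\0';
}
```

With the sh_flags values visible in the hexdump, 0x6 for .text becomes "AX", 0x3 for .data becomes "WA", and 0x30 for .comment becomes "MS".
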

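.symtab, the symbol table section
The symbols of an ELF file are kept in a symbol table, which is itself stored as a section of the file: .symtab

  • Symbol structure (Elf32_Sym)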
typedef struct
{
  Elf32_Word    st_name;                /* Symbol name (string tbl index) */
  Elf32_Addr    st_value;               /* Symbol value */
  Elf32_Word    st_size;                /* Symbol size */
  unsigned char st_info;                /* Symbol type and binding */
  unsigned char st_other;               /* Symbol visibility */
  Elf32_Section st_shndx;               /* Section index */
} Elf32_Sym;
  • $ readelf -s SimpleSection.o
Symbol table '.symtab' contains 16 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 00000000     0 FILE    LOCAL  DEFAULT  ABS SimpleSection.c
     2: 00000000     0 SECTION LOCAL  DEFAULT    1 
     3: 00000000     0 SECTION LOCAL  DEFAULT    3 
     4: 00000000     0 SECTION LOCAL  DEFAULT    4 
     5: 00000000     0 SECTION LOCAL  DEFAULT    5 
     6: 00000004     4 OBJECT  LOCAL  DEFAULT    3 static_var.1236
     7: 00000000     4 OBJECT  LOCAL  DEFAULT    4 static_var2.1237
     8: 00000000     0 SECTION LOCAL  DEFAULT    7 
     9: 00000000     0 SECTION LOCAL  DEFAULT    8 
    10: 00000000     0 SECTION LOCAL  DEFAULT    6 
    11: 00000000     4 OBJECT  GLOBAL DEFAULT    3 global_init_var
    12: 00000004     4 OBJECT  GLOBAL DEFAULT  COM global_uninit_var
    13: 00000000    27 FUNC    GLOBAL DEFAULT    1 func
    14: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND printf
    15: 0000001b    53 FUNC    GLOBAL DEFAULT    1 main
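
The Type and Bind columns both come from the single st_info byte of Elf32_Sym: the binding sits in the upper 4 bits and the type in the lower 4 bits. The sketch below uses the same arithmetic as the ELF32_ST_BIND, ELF32_ST_TYPE and ELF32_ST_INFO macros in <elf.h>; the function names are invented here, and the example byte 0x12 is computed from the readelf columns for main (GLOBAL, FUNC) rather than read from the file.

```c
#include <assert.h>

/* Binding: upper 4 bits of st_info. */
unsigned sym_bind(unsigned char st_info) { return st_info >> 4; }

/* Type: lower 4 bits of st_info. */
unsigned sym_type(unsigned char st_info) { return st_info & 0xf; }

/* Pack a binding and a type back into one st_info byte. */
unsigned char make_info(unsigned bind, unsigned type)
{
    assert(bind < 16 && type < 16);
    return (unsigned char)((bind << 4) | type);
}

/* Names for the binding values that appear in this article. */
const char *bind_name(unsigned bind)
{
    switch (bind) {
    case 0:  return "LOCAL";
    case 1:  return "GLOBAL";
    case 2:  return "WEAK";
    default: return "OTHER";
    }
}

/* Names for the type values that appear in the listing above. */
const char *type_name(unsigned type)
{
    switch (type) {
    case 0:  return "NOTYPE";
    case 1:  return "OBJECT";
    case 2:  return "FUNC";
    case 3:  return "SECTION";
    case 4:  return "FILE";
    default: return "OTHER";
    }
}
```

For main, make_info(1, 2) is 0x12, and decoding 0x12 gives back GLOBAL and FUNC. A LOCAL OBJECT such as static_var.1236 corresponds to 0x01.
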

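Bind gives the binding of the symbol (variable or function)

  • LOCAL: a local symbol, not visible outside this object file
  • GLOBAL: a global symbol, visible to other object files
  • WEAK: a weak symbol; a reference to it is a weak reference
    • If a definition of the symbol exists, the linker resolves references to it as usual
    • If the symbol is undefined, the linker does not report an error and resolves it to 0
    • The compiler's __attribute__ ((weak)) declares a symbol as weak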
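
As a small illustration of __attribute__ ((weak)), assuming GCC or Clang, the sketch below gives a function a weak default definition. The names log_level and effective_log_level are hypothetical. If another object file in the link supplied an ordinary (strong) definition of log_level, the linker would pick that one without a duplicate-symbol error; with no other definition, the weak one is used.

```c
#include <assert.h>

/* Weak default definition. readelf -s would show this symbol with
 * Bind WEAK instead of GLOBAL. A strong definition of log_level() in
 * another object file would replace it at link time. */
__attribute__((weak)) int log_level(void)
{
    return 1;
}

/* Callers use the symbol normally; which definition they reach is
 * decided by the linker. */
int effective_log_level(void)
{
    return log_level();
}
```

Built on its own, effective_log_level() returns 1, the weak default.
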

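Ndx gives the section that the symbol belongs to

  • If the symbol is defined in a section of this file, the value is the index of that section in the section header table
  • Otherwise it is one of several special values
    • ABS: the symbol holds an absolute value, such as the symbol for the file name
    • COM (COMMON): the symbol is a "common block" symbol; uninitialized global variables are of this kind
    • UND (undefined): the symbol is not defined in this file and refers to a definition in another file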
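
The Ndx column is a rendering of the st_shndx field. The following sketch formats an index the way the listing shows it. The function ndx_label and the IDX_ names are invented; the reserved values 0, 0xfff1 and 0xfff2 follow the ELF specification (SHN_UNDEF, SHN_ABS and SHN_COMMON in <elf.h>). Other reserved ranges exist and are simply printed as numbers here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Reserved section indices per the ELF specification. */
enum { IDX_UNDEF = 0, IDX_ABS = 0xfff1, IDX_COMMON = 0xfff2 };

/* Format st_shndx like the Ndx column. out needs at least 6 bytes:
 * up to 5 decimal digits for a 16-bit index plus the NUL. */
void ndx_label(uint16_t shndx, char *out, size_t len)
{
    assert(out != NULL && len >= 6);
    switch (shndx) {
    case IDX_UNDEF:  snprintf(out, len, "UND"); break;
    case IDX_ABS:    snprintf(out, len, "ABS"); break;
    case IDX_COMMON: snprintf(out, len, "COM"); break;
    default:         snprintf(out, len, "%u", (unsigned)shndx); break;
    }
}
```

For the symbols above, printf (index 0) is labelled UND, SimpleSection.c (0xfff1) is ABS, global_uninit_var (0xfff2) is COM, and global_init_var (3, the .data section) is shown as 3.
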

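Value means different things for different symbols

  • In an object file, the value of a variable or function defined in a section is its offset from the start of that section
  • For a COMMON symbol such as global_uninit_var, the value is the required alignment (4 here)
  • In an executable, the value is the symbol's virtual address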
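
Because the value of a symbol in an object file is an offset into its section, adding the section's file offset locates the symbol's bytes in the file. The helper below is a minimal sketch with an invented name; it assumes a section that actually occupies file space (so not .bss) and takes the offsets and sizes from the readelf -S and readelf -s listings.

```c
#include <assert.h>
#include <stdint.h>

/* File position of a symbol defined in a section of a relocatable file:
 * section file offset (sh_offset) plus offset within the section (st_value).
 * sh_size is only used to check that the symbol lies inside the section. */
uint32_t symbol_file_offset(uint32_t sh_offset, uint32_t sh_size,
                            uint32_t st_value)
{
    assert(st_value < sh_size);
    return sh_offset + st_value;
}
```

For static_var (.data at 0x84, size 0x8, value 4) this gives 0x88, the offset where the bytes 55 00 00 00 were found earlier. For main (.text at 0x34, size 0x50, value 0x1b) it gives 0x4f.
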
