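During development we usually want the compiled executable to be as small as possible, since a smaller file takes less disk space. What can be done to shrink it?
The usual first steps are to remove the debug information and the symbol table. At the same time, if something goes wrong, for example a core dump, we still want as much diagnostic information as possible.
How is this conflict resolved on Linux?
Start with the first question: what is the effect of removing the debug information?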
The test code below has main call foo, and foo call bar. bar deliberately writes to an invalid address in order to trigger a core dump.
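#include <stdio.h>

/* bar() writes through a NULL pointer on purpose to force a core dump. */
static int bar(void)
{
    char *p = NULL;

    printf("I am bar,I will core dump\n");
    printf("%s", p);
    *p = 0x0;   /* invalid write: SIGSEGV is raised here */
    return 0;
}

static int foo(void)
{
    int i;

    printf("I am foo,I will call bar\n");
    bar();
    return 0;
}

int main(void)
{
    /* call chain: main -> foo -> bar */
    printf("I am main, I wll can foo\n");
    foo();
    return 0;
}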
First build a debug version. The resulting executable is 17464 bytes:
gcc -g test.c -o test
ls -rtl test
-rwxrwxr-x 1 codingcos codingcos 17464 8月 14 09:43 test
Now look at its section information:
readelf -S test
There are 37 section headers, starting at offset 0x3af8:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .interp PROGBITS 0000000000000318 00000318
000000000000001c 0000000000000000 A 0 0 1
[ 2] .note.gnu.pr[...] NOTE 0000000000000338 00000338
0000000000000030 0000000000000000 A 0 0 8
[ 3] .note.gnu.bu[...] NOTE 0000000000000368 00000368
0000000000000024 0000000000000000 A 0 0 4
[ 4] .note.ABI-tag NOTE 000000000000038c 0000038c
0000000000000020 0000000000000000 A 0 0 4
[ 5] .gnu.hash GNU_HASH 00000000000003b0 000003b0
0000000000000024 0000000000000000 A 6 0 8
[ 6] .dynsym DYNSYM 00000000000003d8 000003d8
00000000000000c0 0000000000000018 A 7 1 8
[ 7] .dynstr STRTAB 0000000000000498 00000498
0000000000000094 0000000000000000 A 0 0 1
[ 8] .gnu.version VERSYM 000000000000052c 0000052c
0000000000000010 0000000000000002 A 6 0 2
[ 9] .gnu.version_r VERNEED 0000000000000540 00000540
0000000000000030 0000000000000000 A 7 1 8
[10] .rela.dyn RELA 0000000000000570 00000570
00000000000000c0 0000000000000018 A 6 0 8
[11] .rela.plt RELA 0000000000000630 00000630
0000000000000030 0000000000000018 AI 6 24 8
[12] .init PROGBITS 0000000000001000 00001000
000000000000001b 0000000000000000 AX 0 0 4
[13] .plt PROGBITS 0000000000001020 00001020
0000000000000030 0000000000000010 AX 0 0 16
[14] .plt.got PROGBITS 0000000000001050 00001050
0000000000000010 0000000000000010 AX 0 0 16
[15] .plt.sec PROGBITS 0000000000001060 00001060
0000000000000020 0000000000000010 AX 0 0 16
[16] .text PROGBITS 0000000000001080 00001080
0000000000000174 0000000000000000 AX 0 0 16
[17] .fini PROGBITS 00000000000011f4 000011f4
000000000000000d 0000000000000000 AX 0 0 4
[18] .rodata PROGBITS 0000000000002000 00002000
0000000000000055 0000000000000000 A 0 0 4
[19] .eh_frame_hdr PROGBITS 0000000000002058 00002058
0000000000000044 0000000000000000 A 0 0 4
[20] .eh_frame PROGBITS 00000000000020a0 000020a0
00000000000000ec 0000000000000000 A 0 0 8
[21] .init_array INIT_ARRAY 0000000000003db0 00002db0
0000000000000008 0000000000000008 WA 0 0 8
[22] .fini_array FINI_ARRAY 0000000000003db8 00002db8
0000000000000008 0000000000000008 WA 0 0 8
[23] .dynamic DYNAMIC 0000000000003dc0 00002dc0
00000000000001f0 0000000000000010 WA 7 0 8
[24] .got PROGBITS 0000000000003fb0 00002fb0
0000000000000050 0000000000000008 WA 0 0 8
[25] .data PROGBITS 0000000000004000 00003000
0000000000000010 0000000000000000 WA 0 0 8
[26] .bss NOBITS 0000000000004010 00003010
0000000000000008 0000000000000000 WA 0 0 1
[27] .comment PROGBITS 0000000000000000 00003010
000000000000002b 0000000000000001 MS 0 0 1
[28] .debug_aranges PROGBITS 0000000000000000 0000303b
0000000000000030 0000000000000000 0 0 1
[29] .debug_info PROGBITS 0000000000000000 0000306b
000000000000011a 0000000000000000 0 0 1
[30] .debug_abbrev PROGBITS 0000000000000000 00003185
00000000000000cd 0000000000000000 0 0 1
[31] .debug_line PROGBITS 0000000000000000 00003252
0000000000000076 0000000000000000 0 0 1
[32] .debug_str PROGBITS 0000000000000000 000032c8
00000000000000ea 0000000000000001 MS 0 0 1
[33] .debug_line_str PROGBITS 0000000000000000 000033b2
000000000000003d 0000000000000001 MS 0 0 1
[34] .symtab SYMTAB 0000000000000000 000033f0
00000000000003a8 0000000000000018 35 20 8
[35] .strtab STRTAB 0000000000000000 00003798
00000000000001f5 0000000000000000 0 0 1
[36] .shstrtab STRTAB 0000000000000000 0000398d
000000000000016a 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
D (mbind), l (large), p (processor specific)
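The first line of the readelf output can be cross-checked against the file size reported by ls. The sketch below assumes, as is the case for this binary, that the section header table is the last thing in the file and that each ELF64 section header is 64 bytes. The function name elf_size_from_shdr is made up for illustration.

```shell
# File size implied by the section header table.
# Assumption: the table sits at the very end of the file (true for this
# binary, not guaranteed in general) and each ELF64 header is 64 bytes.
elf_size_from_shdr() {
    offset=$1   # "starting at offset" value from readelf -S
    count=$2    # number of section headers
    echo $(( $offset + $count * 64 ))
}

# 37 headers starting at 0x3af8, as printed above
elf_size_from_shdr 0x3af8 37   # prints 17464
```

The result matches the 17464 bytes shown by ls, which suggests nothing follows the section header table in this file.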
Next, remove the debug info with the strip command:
strip --strip-debug test
ls -rtl test
-rwxrwxr-x 1 codingcos codingcos 15912 8月 14 09:43 test
The executable shrank from 17464 bytes to 15912 bytes.
How does the stripped test differ from the original? Look at the section information after stripping:
readelf -S test
There are 31 section headers, starting at offset 0x3668:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .interp PROGBITS 0000000000000318 00000318
000000000000001c 0000000000000000 A 0 0 1
[ 2] .note.gnu.pr[...] NOTE 0000000000000338 00000338
0000000000000030 0000000000000000 A 0 0 8
[ 3] .note.gnu.bu[...] NOTE 0000000000000368 00000368
0000000000000024 0000000000000000 A 0 0 4
[ 4] .note.ABI-tag NOTE 000000000000038c 0000038c
0000000000000020 0000000000000000 A 0 0 4
[ 5] .gnu.hash GNU_HASH 00000000000003b0 000003b0
0000000000000024 0000000000000000 A 6 0 8
[ 6] .dynsym DYNSYM 00000000000003d8 000003d8
00000000000000c0 0000000000000018 A 7 1 8
[ 7] .dynstr STRTAB 0000000000000498 00000498
0000000000000094 0000000000000000 A 0 0 1
[ 8] .gnu.version VERSYM 000000000000052c 0000052c
0000000000000010 0000000000000002 A 6 0 2
[ 9] .gnu.version_r VERNEED 0000000000000540 00000540
0000000000000030 0000000000000000 A 7 1 8
[10] .rela.dyn RELA 0000000000000570 00000570
00000000000000c0 0000000000000018 A 6 0 8
[11] .rela.plt RELA 0000000000000630 00000630
0000000000000030 0000000000000018 AI 6 24 8
[12] .init PROGBITS 0000000000001000 00001000
000000000000001b 0000000000000000 AX 0 0 4
[13] .plt PROGBITS 0000000000001020 00001020
0000000000000030 0000000000000010 AX 0 0 16
[14] .plt.got PROGBITS 0000000000001050 00001050
0000000000000010 0000000000000010 AX 0 0 16
[15] .plt.sec PROGBITS 0000000000001060 00001060
0000000000000020 0000000000000010 AX 0 0 16
[16] .text PROGBITS 0000000000001080 00001080
0000000000000174 0000000000000000 AX 0 0 16
[17] .fini PROGBITS 00000000000011f4 000011f4
000000000000000d 0000000000000000 AX 0 0 4
[18] .rodata PROGBITS 0000000000002000 00002000
0000000000000055 0000000000000000 A 0 0 4
[19] .eh_frame_hdr PROGBITS 0000000000002058 00002058
0000000000000044 0000000000000000 A 0 0 4
[20] .eh_frame PROGBITS 00000000000020a0 000020a0
00000000000000ec 0000000000000000 A 0 0 8
[21] .init_array INIT_ARRAY 0000000000003db0 00002db0
0000000000000008 0000000000000008 WA 0 0 8
[22] .fini_array FINI_ARRAY 0000000000003db8 00002db8
0000000000000008 0000000000000008 WA 0 0 8
[23] .dynamic DYNAMIC 0000000000003dc0 00002dc0
00000000000001f0 0000000000000010 WA 7 0 8
[24] .got PROGBITS 0000000000003fb0 00002fb0
0000000000000050 0000000000000008 WA 0 0 8
[25] .data PROGBITS 0000000000004000 00003000
0000000000000010 0000000000000000 WA 0 0 8
[26] .bss NOBITS 0000000000004010 00003010
0000000000000008 0000000000000000 WA 0 0 1
[27] .comment PROGBITS 0000000000000000 00003010
000000000000002b 0000000000000001 MS 0 0 1
[28] .symtab SYMTAB 0000000000000000 00003040
0000000000000330 0000000000000018 29 15 8
[29] .strtab STRTAB 0000000000000000 00003370
00000000000001db 0000000000000000 0 0 1
[30] .shstrtab STRTAB 0000000000000000 0000354b
000000000000011a 0000000000000000 0 0 1
The six debug sections .debug_aranges, .debug_info, .debug_abbrev, .debug_line, .debug_str and .debug_line_str are gone, and the section count has dropped from 37 to 31.
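As a rough accounting of where the 1552 bytes (17464 - 15912) went, the sizes printed in the two readelf listings can be added up. This is only a sketch: sum_hex is a helper name invented here, and the 64-byte header size is the standard ELF64 value rather than something the listings state.

```shell
# Sum hexadecimal section sizes (as printed by readelf -S) and print decimal.
sum_hex() {
    total=0
    for size in "$@"; do
        total=$(( $total + 0x$size ))
    done
    echo "$total"
}

# Sizes of the six .debug_* sections in the unstripped binary
debug_bytes=$(sum_hex 30 11a cd 76 ea 3d)
# Six fewer section headers, 64 bytes each on ELF64
header_bytes=$(( 6 * 64 ))
# .symtab, .strtab and .shstrtab each became smaller as well
table_bytes=$(( (0x3a8 - 0x330) + (0x1f5 - 0x1db) + (0x16a - 0x11a) ))

echo "debug sections:  $debug_bytes"
echo "section headers: $header_bytes"
echo "smaller tables:  $table_bytes"
echo "total:           $(( debug_bytes + header_bytes + table_bytes ))"
```

This adds up to 1558 bytes, 6 more than the observed 1552. The gap is explained by alignment padding: judging from the offsets in the listings, the original layout has 2 padding bytes after .comment and the stripped layout has 8.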
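Note, however, that .symtab, .strtab and .shstrtab, which hold the symbol table and the string tables, are still present. The nm command shows what the symbol table contains: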
nm test
000000000000038c r __abi_tag
0000000000001169 t bar
0000000000004010 B __bss_start
0000000000004010 b completed.0
w __cxa_finalize@GLIBC_2.2.5
0000000000004000 D __data_start
0000000000004000 W data_start
00000000000010b0 t deregister_tm_clones
0000000000001120 t __do_global_dtors_aux
0000000000003db8 d __do_global_dtors_aux_fini_array_entry
0000000000004008 D __dso_handle
0000000000003dc0 d _DYNAMIC
0000000000004010 D _edata
0000000000004018 B _end
00000000000011f4 T _fini
00000000000011ae t foo
0000000000001160 t frame_dummy
0000000000003db0 d __frame_dummy_init_array_entry
0000000000002188 r __FRAME_END__
0000000000003fb0 d _GLOBAL_OFFSET_TABLE_
w __gmon_start__
0000000000002058 r __GNU_EH_FRAME_HDR
0000000000001000 T _init
0000000000002000 R _IO_stdin_used
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
U __libc_start_main@GLIBC_2.34
00000000000011d1 T main
U printf@GLIBC_2.2.5
U puts@GLIBC_2.2.5
00000000000010e0 t register_tm_clones
0000000000001080 T _start
0000000000004010 D __TMC_END__
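In the listing above, the static functions bar and foo carry a lowercase t (local symbol in the text section) while main carries an uppercase T (global). A small filter can pull out just the code symbols. This is a sketch: text_symbols is a name chosen for illustration, and it relies only on the three-column address/type/name layout that nm prints by default.

```shell
# Print the names of symbols that nm marks as code: "t" (local) or "T" (global).
# Reads nm-style output on stdin. Undefined symbols have no address column,
# so they have only two fields and are skipped.
text_symbols() {
    awk 'NF == 3 && ($2 == "t" || $2 == "T") { print $3 }'
}

# Typical use (assumes a binary named "test" in the current directory):
#   nm test | text_symbols
# Demonstration on three lines copied from the listing above:
printf '%s\n' \
    '0000000000001169 t bar' \
    '                 U puts@GLIBC_2.2.5' \
    '00000000000011d1 T main' | text_symbols
```

For the three sample lines this prints bar and main, one per line. These function names are what gdb needs from .symtab to label stack frames.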
If we run this test executable now, it produces a core dump file. When gdb loads that core file it can still print the call stack, because the symbol table is still there.
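Before going further, a quick look at one command: ulimit.
ulimit is the Linux shell facility for controlling the system resources available to the shell and to the processes it creates.
The ulimit -c option sets the maximum size of a core file (core dump). When a program crashes, the operating system can save the program's memory contents and some debugging information to a core file so that developers can inspect it afterwards. This file is usually called a core dump. ulimit -c can both query and set the limit, for example:
ulimit -c: query the current maximum core file size. A result of 0 means no core file will be generated.
ulimit -c unlimited: remove the limit on core file size.
Note: a limit set with ulimit -c applies only to the current shell and its child processes; it does not affect other shells or any global setting. On some systems core dumps are disabled by default for security reasons, and ulimit -c unlimited alone may still not produce a core file. In that case the kernel's core settings or other configuration need to be changed:
1) Run ulimit -c to check whether core dumps are enabled. If it prints 0, the feature is off and no core file will be generated; run ulimit -c 1024 to turn it on.
2) Edit /etc/sysctl.conf (sudo vi /etc/sysctl.conf) and add the pattern for where core files should be saved: kernel.core_pattern = /tmp/corefile/core.%e.%t
3) Run sudo sysctl -p /etc/sysctl.conf to apply the change immediately.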
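The steps above can be sanity-checked with a short sketch like the following. Both function names are invented for illustration. The pattern expansion only imitates the two specifiers used in this article (%e for the executable name, %t for the dump time in seconds since the epoch); the kernel does the real expansion. It is also assumed that the /tmp/corefile directory already exists, since the kernel is not expected to create it.

```shell
# Translate the value printed by `ulimit -c` into a short description.
describe_core_limit() {
    case "$1" in
        0)         echo "disabled" ;;
        unlimited) echo "unlimited" ;;
        *)         echo "limited to $1 blocks" ;;
    esac
}

# Imitate the kernel's expansion of %e and %t in a core_pattern.
# Illustration only: the executable name must not contain "/" here,
# because sed uses it as the delimiter.
expand_core_pattern() {
    printf '%s\n' "$1" | sed -e "s/%e/$2/g" -e "s/%t/$3/g"
}

describe_core_limit "$(ulimit -c)"
expand_core_pattern '/tmp/corefile/core.%e.%t' test 1691982639
```

The second call prints /tmp/corefile/core.test.1691982639. A core file name with one more numeric suffix after the timestamp would be consistent with the kernel appending the PID, which it does when kernel.core_uses_pid is set and the pattern contains no %p; that is an inference about the setup, not something stated in the configuration above.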
Because .symtab, .strtab and .shstrtab, the symbol table and string tables, are still present, gdb can still be used for debugging:
ulimit -c unlimited    # or: ulimit -c 1024
./test                 # crashes and writes the core file
gdb -c /tmp/corefile/core.test.1691982639.1098584 test
Core was generated by `./test'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000560eb28091ab in bar () at test.c:10
10 *p =0x0;
(gdb) bt
#0 0x0000560eb28091ab in bar () at test.c:10
#1 0x0000560eb28091d1 in foo () at test.c:19
#2 0x0000560eb28091f4 in main () at test.c:27
(gdb)
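Although the debug sections have been removed, the symbol table is still there, so a core dump can still be debugged. One caveat: file and line annotations such as test.c:10 in a gdb backtrace come from the DWARF line information (.debug_line), so they appear only when gdb can read the debug info, for example from the unstripped build. With a binary processed by strip --strip-debug, gdb would normally resolve only the function names bar, foo and main from .symtab. Most distributions ship programs with the symbol table removed as well. What happens when the symbol information is kept completely separate from the executable? See the next article.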
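To tell these cases apart quickly, the section list can be classified mechanically. The sketch below reads readelf -S output on stdin; strip_level and its three labels are made up for this example, and it only looks for the .debug_info and .symtab section names.

```shell
# Classify a binary from `readelf -S` output on stdin.
strip_level() {
    awk '
        /\.debug_info/ { debug = 1 }
        /\.symtab/     { symtab = 1 }
        END {
            if (debug)       print "has debug info"
            else if (symtab) print "symbol table only"
            else             print "no .symtab"
        }'
}

# Typical use (assumes a binary named "test" in the current directory):
#   readelf -S test | strip_level
# Demonstration on two lines in the style of the stripped listing:
printf '%s\n' \
    '[16] .text PROGBITS' \
    '[28] .symtab SYMTAB' | strip_level
```

For the binary produced by strip --strip-debug in this article the expected answer is "symbol table only". A binary processed by strip with no options, which by default removes .symtab as well, should report "no .symtab".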
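Previous article: ARM Embedded Compilation Series 9 -- A Detailed Introduction to the GCC Symbol Table
Next article: ARM Embedded Compilation Series 10.1 -- Reducing the Size of ELF Executables with GCC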