brk() and sbrk() change the location of the program break, which defines the end of the process's data segment (i.e., the program break is the first location after the end of the uninitialized data segment).
brk() and sbrk() are declared as follows:
#include <unistd.h>
int brk(void *addr);
void *sbrk(intptr_t increment);
As the man page puts it:
brk() and sbrk() change the location of the program break, which defines the end of the process's data segment (i.e., the program break is the first location after the end of the uninitialized data segment).
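As a minimal sketch of the two behaviours described in the man page text, the helper below reads the current program break with sbrk(0), moves it, and reports how far it moved. The name grow_break is hypothetical and only used for illustration; the sketch assumes Linux with glibc and that nothing else moves the break between the calls.

```c
#define _DEFAULT_SOURCE   /* makes glibc declare sbrk() under strict -std modes */
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: move the program break by `bytes` and return the
 * distance it actually moved, or -1 if sbrk() fails. */
long grow_break(intptr_t bytes)
{
    void *before = sbrk(0);            /* an increment of 0 only reports the break */
    if (before == (void *)-1)
        return -1;

    if (sbrk(bytes) == (void *)-1)     /* on success sbrk() returns the OLD break */
        return -1;

    void *after = sbrk(0);             /* read the break again after the change */
    if (after == (void *)-1)
        return -1;

    return (long)((char *)after - (char *)before);
}
```

Called as grow_break(64) from main(), this should report 64. Note that sbrk() returns the previous break, not the new one, which is why the position is read again with sbrk(0).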
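One thing to note first: once a program has been compiled and linked, its text, data and bss segments are fixed, which can be confirmed with objdump. The program below tests whether the program break really sits at the end of the bss segment: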
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int bssvar; /* declared but not initialized, so it is placed in the bss segment */

int main(void)
{
    char *pmem;
    long heap_gap_bss;
    /* address just past bssvar, used as the end of the bss segment */
    char *bss_end = (char *)&bssvar + sizeof bssvar;

    printf("end of bss section:%p\n", (void *)bss_end);

    pmem = malloc(32); /* allocate a block from the heap, normally near its start */
    if (pmem == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    printf("pmem:%p\n", (void *)pmem);

    /* Gap between the start of the heap and the end of bss. It changes on
       every load of the program but stays constant within a single run. */
    heap_gap_bss = (long)pmem - (long)bss_end;
    printf("1-gap between heap and bss:%ld\n", heap_gap_bss);

    free(pmem); /* give the block back to the heap */
    sbrk(32);   /* move the program break (pretend we do not yet know whether
                   it is at the head or the tail of the heap) */

    pmem = malloc(32); /* allocate again */
    if (pmem == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    printf("pmem:%p\n", (void *)pmem); /* same start address as the first block? */

    heap_gap_bss = (long)pmem - (long)bss_end; /* gap after moving the break */
    printf("2-gap between heap and bss:%ld\n", heap_gap_bss);

    free(pmem);
    return 0;
}
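Running it twice produced:
[beyes@localhost C]$ ./sbrk
end of bss section:0x8049938
pmem:0x82ec008
1-gap between heap and bss: 2762448
pmem:0x82ec008
2-gap between heap and bss: 2762448
[beyes@localhost C]$ ./sbrk
end of bss section:0x8049938
pmem:0x8dbc008
1-gap between heap and bss: 14100176
pmem:0x8dbc008
2-gap between heap and bss: 14100176
A few things can be seen from this output. The end of the bss segment is at the same address (0x8049938) on both runs. The address returned by malloc(), and with it the gap between bss and the heap, changes from one load to the next. Within a single run, the second malloc() returns the same address as the first even though sbrk(32) was called in between, which suggests that the program break sits at the tail of the heap rather than at its head.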
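The test program estimates the end of bss from the address of a single variable. As an alternative sketch, assuming Linux and the linker-provided symbols etext, edata and end documented in end(3), the segment boundaries can be read directly and compared with the program break. The function names gap_after_bss and print_segment_ends are hypothetical.

```c
#define _DEFAULT_SOURCE   /* makes glibc declare sbrk() under strict -std modes */
#include <stdio.h>
#include <unistd.h>

/* Defined by the linker; only the ADDRESSES of these symbols are meaningful. */
extern char etext, edata, end;

/* Bytes between the end of bss and the current program break, or -1 on error. */
long gap_after_bss(void)
{
    void *brk_now = sbrk(0);
    if (brk_now == (void *)-1)
        return -1;
    return (long)((char *)brk_now - &end);
}

/* Print the end of each segment next to the current program break. */
void print_segment_ends(void)
{
    printf("end of text:   %p\n", (void *)&etext);
    printf("end of data:   %p\n", (void *)&edata);
    printf("end of bss:    %p\n", (void *)&end);
    printf("program break: %p\n", sbrk(0));
}
```

With address randomization enabled, gap_after_bss() is expected to vary between runs, in the same way as the gap printed by the program above.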
The next program uses sbrk(0) to measure how far the program break lies beyond the first allocated block:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    void *tret;
    char *pmem;

    pmem = malloc(32);
    if (pmem == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    printf("pmem:%p\n", (void *)pmem);

    tret = sbrk(0); /* current program break, i.e. the end of the heap */
    if (tret != (void *)-1)
        printf("heap size on each load: %ld\n", (long)tret - (long)pmem);

    return 0;
}
[beyes@localhost C]$ ./sbrk
pmem:0x80c9008
heap size on each load: 135160
[beyes@localhost C]$ ./sbrk
pmem:0x9682008
heap size on each load: 135160
[beyes@localhost C]$ ./sbrk
pmem:0x9a7d008
heap size on each load: 135160
[beyes@localhost C]$ ./sbrk
pmem:0x8d92008
heap size on each load: 135160
As the output shows, the start address of the heap is different on every load, yet the heap size allocated by default is the same each time. The varying start address is not fixed behavior, though; it can be changed by adjusting a kernel parameter with the sysctl command:
# sysctl -w kernel/randomize_va_space=0
After doing this, run the program three more times:
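[beyes@localhost C]$ ./sbrk
pmem:0x804a008
heap size on each load: 135160
[beyes@localhost C]$ ./sbrk
pmem:0x804a008
heap size on each load: 135160
[beyes@localhost C]$ ./sbrk
pmem:0x804a008
heap size on each load: 135160
The output shows that the start address of the heap is now identical on every load. This is not a setting to keep, however: a heap that always sits at the same address makes buffer overflow attacks easier (older Linux kernels loaded everything at fixed addresses), so the kernel variable randomize_va_space should be left at a non-zero value (1 randomizes the stack, the mmap base and the VDSO; 2 additionally randomizes the heap placed by brk).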
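To check which randomization mode is active without changing it, the value can be read from /proc. This is a small sketch assuming a Linux system with /proc mounted; the function name read_randomize_va_space is hypothetical.

```c
#include <stdio.h>

/* Return the current kernel.randomize_va_space value (0, 1 or 2),
 * or -1 if the /proc file cannot be opened or parsed. */
int read_randomize_va_space(void)
{
    FILE *fp = fopen("/proc/sys/kernel/randomize_va_space", "r");
    int value = -1;

    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%d", &value) != 1)   /* the file holds a single integer */
        value = -1;
    fclose(fp);
    return value;
}
```

Reading the file needs no special privileges, unlike sysctl -w, which must be run as root.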
The next program moves the program break one byte at a time with sbrk(1) and prints how far it has moved:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    char *pmem;
    int i;

    pmem = malloc(32);
    if (pmem == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    printf("pmem:%p\n", (void *)pmem);

    for (i = 0; i < 65; i++) {
        sbrk(1); /* raise the program break by one byte */
        /* 0x20ff8 (135160) is the constant distance from pmem to the initial
           program break measured by the previous program. After each change,
           sbrk(0) must be called again to read the updated break. */
        printf("%ld\n", (long)sbrk(0) - (long)pmem - 0x20ff8);
    }

    free(pmem);
    return 0;
}
Each call to sbrk(1) raises the program break by exactly one byte:
[beyes@localhost C]$ ./sbrk
pmem:0x804a008
1
2
3
4
5
... ...
61
62
63
64
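brk(), by contrast, takes an address as its argument. If you already know the start address of the heap and its size, you can compute the address to pass to brk() and resize the heap that way.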
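A minimal sketch of the absolute-address form: the helper below takes the current break from sbrk(0), sets a new break with brk(), touches the new memory, and then restores the old break. The name brk_round_trip is hypothetical, and the sketch assumes Linux with glibc and that nothing else (such as malloc() in another thread) moves the break in the meantime.

```c
#define _DEFAULT_SOURCE   /* makes glibc declare brk()/sbrk() under strict -std modes */
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: raise the program break by `bytes` using brk(),
 * use the new region, then put the break back. Returns 0 on success. */
int brk_round_trip(size_t bytes)
{
    char *old_break = sbrk(0);              /* current end of the heap */
    if (old_break == (char *)-1)
        return -1;

    if (brk(old_break + bytes) != 0)        /* brk() takes an ABSOLUTE address */
        return -1;
    if ((char *)sbrk(0) != old_break + bytes)
        return -1;                          /* the break did not land where asked */

    memset(old_break, 0, bytes);            /* the new region is now usable */

    return brk(old_break);                  /* shrink back to the original break */
}
```

The same effect could be had with sbrk(bytes) followed by sbrk(-bytes); brk() is the natural choice when the target address is already known.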
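In practice, applications almost never call these two functions directly. They use malloc() and its relatives instead, which are more efficient because the library manages the heap in user space and does not need a system call for every allocation. One more point: when malloc() is asked for a very large block, for example one larger than the 0x20ff8 constant seen above (that is the value on my system, Fedora 15; other systems may differ), it no longer takes the memory from the heap but uses the mmap() system call to find free space in the memory-mapping region.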
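One way to observe this is to check whether a pointer returned by malloc() lies below the current program break. The sketch below assumes Linux with glibc, whose documented default mmap threshold (M_MMAP_THRESHOLD) is 128 KiB; other allocators may place blocks differently, so the result is only an indication. The names in_brk_heap and report_allocation are hypothetical.

```c
#define _DEFAULT_SOURCE   /* makes glibc declare sbrk() under strict -std modes */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

extern char end;   /* linker-provided end of the bss segment, see end(3) */

/* Return 1 if p lies between the end of bss and the current program break. */
int in_brk_heap(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    return addr >= (uintptr_t)&end && addr < (uintptr_t)sbrk(0);
}

/* Allocate `size` bytes, report where the block landed, and free it. */
void report_allocation(size_t size)
{
    void *p = malloc(size);
    if (p == NULL) {
        perror("malloc");
        return;
    }
    printf("malloc(%zu) -> %p (%s)\n", size, p,
           in_brk_heap(p) ? "inside the brk heap"
                          : "outside the brk heap, probably mmap()");
    free(p);
}
```

Calling report_allocation(32) and then report_allocation(1 << 20) shows whether the small and the large request end up in different regions on a given system.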