Reference: https://ctf-wiki.github.io/ctf-wiki/pwn/linux/glibc-heap/heap_structure-zh/
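Both the main thread and newly created threads try to obtain an arena of their own the first time they request memory.
The number of arenas is capped, and the cap depends on the system: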
For 32 bit systems:
Number of arena = 2 * number of cores.
For 64 bit systems:
Number of arena = 8 * number of cores.
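Clearly, not every thread gets an arena of its own. Once the number of threads exceeds twice the number of cores, some threads are necessarily waiting, so there is no need to allocate one arena per thread.
Unlike a thread arena, main_arena does not live in a heap obtained at run time; it is a global variable in the data segment of libc.so.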
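The rule quoted above can be written as a one-line calculation. This is a minimal sketch, not glibc's code: arena_limit is a hypothetical helper name, and the size of long is passed in explicitly (an assumption made here so that both the 32-bit and the 64-bit case can be tried on one machine).

```c
#include <assert.h>
#include <stddef.h>

/* Upper bound on the number of arenas under the rule quoted above:
   2 per core when long is 4 bytes (32-bit), 8 per core when it is
   8 bytes (64-bit). */
size_t arena_limit(size_t ncores, size_t long_size)
{
    assert(long_size == 4 || long_size == 8);
    return ncores * (long_size == 4 ? 2 : 8);
}
```

On the machine the code is compiled for, arena_limit(ncores, sizeof(long)) picks the matching case.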
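heap_info records information produced when memory is requested. When a program starts, no thread has a heap region yet (a heap appears only after the first malloc). In addition, the heaps requested later are generally not contiguous, so the links between the different heaps have to be recorded.
This data structure exists specifically for memory obtained from the memory mapping segment, that is, for non-main threads.
The main thread obtains its memory by extending the program break with sbrk() (until the break reaches the memory mapping segment).
The main layout of heap_info is as follows: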
#define HEAP_MIN_SIZE (32 * 1024)
#ifndef HEAP_MAX_SIZE
# ifdef DEFAULT_MMAP_THRESHOLD_MAX
# define HEAP_MAX_SIZE (2 * DEFAULT_MMAP_THRESHOLD_MAX)
# else
# define HEAP_MAX_SIZE (1024 * 1024) /* must be a power of two */
# endif
#endif
/* HEAP_MIN_SIZE and HEAP_MAX_SIZE limit the size of mmap()ed heaps
that are dynamically created for multi-threaded programs. The
maximum size must be a power of two, for fast determination of
which heap belongs to a chunk. It should be much larger than the
mmap threshold, so that requests with a size just below that
threshold can be fulfilled without creating too many heaps. */
/***************************************************************************/
/* A heap is a single contiguous memory region holding (coalesceable)
malloc_chunks. It is allocated with mmap() and always starts at an
address aligned to HEAP_MAX_SIZE. */
typedef struct _heap_info
{
  mstate ar_ptr;            /* Arena for this heap. */
  struct _heap_info *prev;  /* Previous heap. */
  size_t size;              /* Current size in bytes. */
  size_t mprotect_size;     /* Size in bytes that has been mprotected
                               PROT_READ|PROT_WRITE. */
  /* Make sure the following data is properly aligned, particularly
     that sizeof (heap_info) + 2 * SIZE_SZ is a multiple of
     MALLOC_ALIGNMENT. */
  char pad[-6 * SIZE_SZ & MALLOC_ALIGN_MASK];
} heap_info;
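The comments above say that HEAP_MAX_SIZE must be a power of two "for fast determination of which heap belongs to a chunk" and that every heap starts at an address aligned to HEAP_MAX_SIZE. Taken together, this means the heap_info header for a chunk can be found by masking off the low bits of the chunk's address. The following is a minimal sketch of that idea using plain integers instead of real pointers; heap_base_for is a hypothetical helper name, not a glibc function.

```c
#include <assert.h>
#include <stdint.h>

/* Round an address down to the start of the heap that contains it.
   heap_max_size must be a power of two, as the comment on
   HEAP_MAX_SIZE requires. */
uint64_t heap_base_for(uint64_t addr, uint64_t heap_max_size)
{
    assert(heap_max_size != 0 &&
           (heap_max_size & (heap_max_size - 1)) == 0);
    return addr & ~(heap_max_size - 1);
}
```

For instance, with an assumed heap size of 0x4000000 (an example value only), heap_base_for(0x7f0004001230, 0x4000000) yields 0x7f0004000000.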
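The heap_info structure mainly describes basic information about a heap, including:
(1) ar_ptr: the address of the arena the heap belongs to.
(2) prev: a thread may own several heaps, so prev records the address of the previous heap_info; the heap_info structures form a singly linked list.
(3) size: the current size of the heap.
(4) pad: the last member is padding; it makes sizeof (heap_info) + 2 * SIZE_SZ a multiple of MALLOC_ALIGNMENT, as the comment inside the structure says.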
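The padding expression -6 * SIZE_SZ & MALLOC_ALIGN_MASK is easier to follow with numbers. The four fields before pad take 4 * SIZE_SZ bytes (assuming pointers and size_t are both SIZE_SZ bytes wide), and the comment asks that this plus 2 * SIZE_SZ be aligned, so 6 * SIZE_SZ has to be rounded up to a multiple of MALLOC_ALIGNMENT; the negated value masked with the alignment mask is exactly the number of bytes missing. A minimal sketch, with the hypothetical helper names heap_info_pad and heap_info_size and the sizes passed in as parameters:

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors -6 * SIZE_SZ & MALLOC_ALIGN_MASK using unsigned arithmetic.
   malloc_alignment must be a power of two. */
size_t heap_info_pad(size_t size_sz, size_t malloc_alignment)
{
    assert(malloc_alignment != 0 &&
           (malloc_alignment & (malloc_alignment - 1)) == 0);
    return ((size_t)0 - 6 * size_sz) & (malloc_alignment - 1);
}

/* Four SIZE_SZ-wide fields followed by the padding (assumption:
   mstate, the prev pointer and size_t are all size_sz bytes). */
size_t heap_info_size(size_t size_sz, size_t malloc_alignment)
{
    return 4 * size_sz + heap_info_pad(size_sz, malloc_alignment);
}
```

With SIZE_SZ = 8 and an alignment of 16 no padding is needed, because 48 is already a multiple of 16; with SIZE_SZ = 4 and an alignment of 16 the padding is 8 bytes, because 24 has to grow to 32.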
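The malloc_state structure is used to manage the heap. It records the current state of the memory each arena has requested, such as whether there are free chunks and what sizes those free chunks have. Every arena, whether a thread arena or the main arena, has exactly one malloc_state structure.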
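A thread arena may span several heaps, but it still has a single malloc_state, which glibc places in the arena's first heap, right after that heap's heap_info header; later heaps only point back to it through ar_ptr.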
The malloc_state of the main arena is not part of the heap segment; it is a global variable stored in the data segment of libc.so.
The layout of malloc_state is as follows:
struct malloc_state {
  /* Serialize access. */
  __libc_lock_define(, mutex);

  /* Flags (formerly in max_fast). */
  int flags;

  /* Fastbins */
  mfastbinptr fastbinsY[ NFASTBINS ];

  /* Base of the topmost chunk -- not otherwise kept in a bin */
  mchunkptr top;

  /* The remainder from the most recent split of a small request */
  mchunkptr last_remainder;

  /* Normal bins packed as described above */
  mchunkptr bins[ NBINS * 2 - 2 ];

  /* Bitmap of bins; helps to speed up determining whether a given bin
     is definitely empty. */
  unsigned int binmap[ BINMAPSIZE ];

  /* Linked list, points to the next arena */
  struct malloc_state *next;

  /* Linked list for free arenas. Access to this field is serialized
     by free_list_lock in arena.c. */
  struct malloc_state *next_free;

  /* Number of threads attached to this arena. 0 if the arena is on
     the free list. Access to this field is serialized by
     free_list_lock in arena.c. */
  INTERNAL_SIZE_T attached_threads;

  /* Memory allocated from the system in this arena. */
  INTERNAL_SIZE_T system_mem;
  INTERNAL_SIZE_T max_system_mem;
};
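The declaration bins[NBINS * 2 - 2] looks odd at first. As far as I understand glibc's bin_at macro, each bin uses two consecutive entries of the array as the fd and bk pointers of its list head, and bin 0 does not exist, so bins 1 to NBINS - 1 need (NBINS - 1) * 2 entries. The sketch below only computes the array slots; bins_fd_slot and bins_bk_slot are hypothetical helper names, and NBINS = 128 is assumed here because the value is not shown in these notes.

```c
#include <assert.h>
#include <stddef.h>

#define NBINS 128   /* assumed value, not defined in these notes */

/* Index in bins[] of the entry that serves as fd of bin i's list head. */
size_t bins_fd_slot(size_t i)
{
    assert(i >= 1 && i < NBINS);   /* bin 0 does not exist */
    return (i - 1) * 2;
}

/* Index in bins[] of the entry that serves as bk of bin i's list head. */
size_t bins_bk_slot(size_t i)
{
    return bins_fd_slot(i) + 1;
}
```

Under these assumptions the last bin, NBINS - 1, ends at index NBINS * 2 - 3, which is the last valid index of the array.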
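(1) __libc_lock_define (, mutex):
This lock serializes access to an arena. Once a thread has acquired the arena, any other thread that wants to use it must wait until the first thread has finished its allocation.
(2) flags:
Records flags for the arena. For example, bit 0 records whether the arena has any fast bin chunks, and bit 1 indicates whether the arena can return contiguous virtual address space. The definitions are as follows: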
/*
FASTCHUNKS_BIT held in max_fast indicates that there are probably
some fastbin chunks. It is set true on entering a chunk into any
fastbin, and cleared only in malloc_consolidate.
The truth value is inverted so that have_fastchunks will be true
upon startup (since statics are zero-filled), simplifying
initialization checks.
*/
#define FASTCHUNKS_BIT (1U)
#define have_fastchunks(M) (((M)->flags & FASTCHUNKS_BIT) == 0)
#define clear_fastchunks(M) catomic_or(&(M)->flags, FASTCHUNKS_BIT)
#define set_fastchunks(M) catomic_and(&(M)->flags, ~FASTCHUNKS_BIT)
/*
NONCONTIGUOUS_BIT indicates that MORECORE does not return contiguous
regions. Otherwise, contiguity is exploited in merging together,
when possible, results from consecutive MORECORE calls.
The initial value comes from MORECORE_CONTIGUOUS, but is
changed dynamically if mmap is ever used as an sbrk substitute.
*/
#define NONCONTIGUOUS_BIT (2U)
#define contiguous(M) (((M)->flags & NONCONTIGUOUS_BIT) == 0)
#define noncontiguous(M) (((M)->flags & NONCONTIGUOUS_BIT) != 0)
#define set_noncontiguous(M) ((M)->flags |= NONCONTIGUOUS_BIT)
#define set_contiguous(M) ((M)->flags &= ~NONCONTIGUOUS_BIT)
/* ARENA_CORRUPTION_BIT is set if a memory corruption was detected on the
arena. Such an arena is no longer used to allocate chunks. Chunks
allocated in that arena before detecting corruption are not freed. */
#define ARENA_CORRUPTION_BIT (4U)
#define arena_is_corrupt(A) (((A)->flags & ARENA_CORRUPTION_BIT))
#define set_arena_corrupt(A) ((A)->flags |= ARENA_CORRUPTION_BIT)
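The inverted meaning of FASTCHUNKS_BIT is easy to misread: a cleared bit means "there may be fastbin chunks", so a zero-filled flags word reports fast chunks and a contiguous arena. The sketch below restates the two tests from the macros above on a plain unsigned value. It is not glibc's code: the function names are hypothetical, and the catomic_* operations are replaced by ordinary, non-atomic bit operations.

```c
#include <assert.h>

#define FASTCHUNKS_BIT    (1U)
#define NONCONTIGUOUS_BIT (2U)

/* True when bit 0 is CLEAR (the truth value is inverted). */
int flag_have_fastchunks(unsigned flags)
{
    return (flags & FASTCHUNKS_BIT) == 0;
}

/* True when bit 1 is clear. */
int flag_contiguous(unsigned flags)
{
    return (flags & NONCONTIGUOUS_BIT) == 0;
}

/* Non-atomic stand-ins for set_fastchunks / clear_fastchunks. */
unsigned flag_set_fastchunks(unsigned flags)   { return flags & ~FASTCHUNKS_BIT; }
unsigned flag_clear_fastchunks(unsigned flags) { return flags | FASTCHUNKS_BIT; }

/* A zero-filled flags word, as in a static malloc_state at startup. */
void check_startup_flags(void)
{
    unsigned flags = 0;
    assert(flag_have_fastchunks(flags));
    assert(flag_contiguous(flags));
}
```

Calling flag_clear_fastchunks sets bit 0, after which flag_have_fastchunks returns 0; flag_set_fastchunks clears the bit again.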
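(3) fastbinsY[NFASTBINS]
Holds the head pointer of each fast bin's chunk list.
(4) top
Points to the arena's top chunk.
(5) last_remainder
The part left over from the most recent chunk split.
(6) bins
Stores the chunk lists of the unsorted bin, the small bins, and the large bins.
(7) binmap
ptmalloc uses one bit per bin to indicate whether that bin contains free chunks.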
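The one-bit-per-bin idea behind binmap can be sketched as follows. The constants and the index arithmetic follow glibc's binmap macros as far as I recall them (BINMAPSHIFT of 5, so 32 bits per word); NBINS = 128 is an assumption, and toy_arena is a hypothetical stand-in that contains only the binmap field, not the real malloc_state.

```c
#include <assert.h>

#define NBINS       128                    /* assumed value */
#define BINMAPSHIFT 5
#define BITSPERMAP  (1U << BINMAPSHIFT)    /* 32 bits per map word */
#define BINMAPSIZE  (NBINS / BITSPERMAP)   /* 4 words */

/* Hypothetical stand-in for the binmap part of malloc_state. */
typedef struct {
    unsigned int binmap[BINMAPSIZE];
} toy_arena;

/* Which word of the map holds bin i, and which bit inside that word. */
unsigned idx2block(unsigned i) { return i >> BINMAPSHIFT; }
unsigned idx2bit(unsigned i)   { return 1U << (i & (BITSPERMAP - 1)); }

void mark_bin(toy_arena *a, unsigned i)
{
    assert(i < NBINS);
    a->binmap[idx2block(i)] |= idx2bit(i);
}

void unmark_bin(toy_arena *a, unsigned i)
{
    assert(i < NBINS);
    a->binmap[idx2block(i)] &= ~idx2bit(i);
}

int bin_is_marked(const toy_arena *a, unsigned i)
{
    assert(i < NBINS);
    return (a->binmap[idx2block(i)] & idx2bit(i)) != 0;
}
```

Bin 70, for example, maps to word 2 (70 >> 5) and bit 6 (70 & 31). As the comment in malloc_state puts it, the map is there to tell quickly whether a bin is definitely empty, so a clear bit is the reliable answer, while a set bit only means the bin is worth inspecting.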