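# Preface
This series records my progress through the CSAPP labs, and I hope it also helps those who come after me. Corrections are welcome!
Tip: the point of this lab is to experience how an application uses and manages virtual memory by writing a dynamic storage allocator.
Goal: try different designs, trade off memory utilization against throughput, and optimize step by step.
The work is done in `mm.c`, which contains four functions:
- `int mm_init(void);` initializes the heap. It returns -1 on error and 0 otherwise.
- `void *mm_malloc(size_t size);` returns a pointer to a payload of at least `size` bytes, with blocks kept 8-byte aligned.
- `void mm_free(void *ptr);` frees the block that `ptr` points to.
- `void *mm_realloc(void *ptr, size_t size);` resizes the block that `ptr` points to, which must have been returned by an earlier call to `mm_malloc` or `mm_realloc`; `size` is the new size.
  - If `ptr` is NULL, a new block is allocated, which is equivalent to `mm_malloc(size)`.
  - If `size` is 0, the block is freed, which is equivalent to `mm_free(ptr)`.
A heap consistency checker is also worth writing as a debugging tool.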
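`memlib.c` provides the following helper functions:
- `void *mem_sbrk(int incr);` extends the heap by `incr` bytes, where `incr` is a positive integer, and returns the address of the first byte of the newly allocated area.
- `void *mem_heap_lo(void);` returns a generic (`void *`) pointer to the first byte of the heap.
- `void *mem_heap_hi(void);` returns a generic (`void *`) pointer to the last byte of the heap.
- `size_t mem_heapsize(void);` returns the current size of the heap in bytes.
- `size_t mem_pagesize(void);` returns the system page size in bytes (4K on Linux).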
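Programming rules:
- Do not change the function interfaces in `mm.c`.
- Do not call any memory-management library functions or system calls (`malloc`, `calloc`, `free`, `realloc`, `sbrk`, `brk`, and so on).
- Do not define any global or static aggregate data types (arrays, structs, trees, lists, and so on) in `mm.c`.
Hints:
- Start by testing with the small trace files (e.g. `short1-bal.rep` and `short2-bal.rep`):
unix> mdriver -V -f short1-bal.rep
- Understand every line of the implicit free list implementation in the textbook.
- Encapsulate pointer arithmetic in C preprocessor macros, which keeps the pointer manipulation manageable.
- Use a profiler (e.g. `gprof`).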
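While reading the official handout I noticed that the provided files include only a few traces, so I went looking for the rest. You can download them from the link below and put them into the lab directory.
Lab 6 Malloc Lab: GitHub download link for the complete 12 traces
Then change the following line in `config.h` to your own path:
#define TRACEDIR "/traces/"
Running the test program with `./mdriver -V` prints: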
Team Name:ateam
Member 1 :Harry Bovik:[email protected]
Using default tracefiles in /mnt/hgfs/CMU15-213/lab/6.malloclab-handout/traces/
Measuring performance with gettimeofday().
Testing mm malloc
Reading tracefile: amptjp-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: cccp-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: cp-decl-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: expr-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: coalescing-bal.rep
ERROR: mem_sbrk failed. Ran out of memory...
Checking mm_malloc for correctness, ERROR [trace 4, line 7673]: mm_malloc failed.
Reading tracefile: random-bal.rep
ERROR: mem_sbrk failed. Ran out of memory...
Checking mm_malloc for correctness, ERROR [trace 5, line 1662]: mm_malloc failed.
Reading tracefile: random2-bal.rep
ERROR: mem_sbrk failed. Ran out of memory...
Checking mm_malloc for correctness, ERROR [trace 6, line 1780]: mm_malloc failed.
Reading tracefile: binary-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: binary2-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: realloc-bal.rep
ERROR: mem_sbrk failed. Ran out of memory...
Checking mm_malloc for correctness, ERROR [trace 9, line 1705]: mm_realloc failed.
Reading tracefile: realloc2-bal.rep
ERROR: mem_sbrk failed. Ran out of memory...
Checking mm_malloc for correctness, ERROR [trace 10, line 6562]: mm_realloc failed.
Results for mm malloc:
trace  valid  util    ops      secs    Kops
0      yes    23%    5694  0.000023  246494
1      yes    19%    5848  0.000020  293869
2      yes    30%    6648  0.000024  272459
3      yes    40%    5380  0.000031  176393
4      no       -       -         -       -
5      no       -       -         -       -
6      no       -       -         -       -
7      yes    55%   12000  0.000041  290557
8      yes    51%   24000  0.000079  303413
9      no       -       -         -       -
10     no       -       -         -       -
Total           -       -         -       -
Terminated with 5 errors
With that, all the preparation is done.
In the `Makefile`, change
CC = gcc
CFLAGS = -Wall -O2 -m32
to
CC = gcc -g # add debug information
CFLAGS = -Wall -O0 -m32 # lower the optimization level to make debugging easier
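The simplest allocator organizes the heap as one large array of bytes plus a pointer p that initially points to the first byte of the array. To allocate size bytes, malloc saves the current value of p in a local variable, increments p by size, and returns the old value of p to the caller. free simply returns to the caller without doing anything.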
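Pros and cons:
- Pro: throughput is very high (`malloc` and `free` execute only a handful of instructions).
- Con: memory utilization is very poor, because blocks are never reused.
Functions:
- `int mm_init(void);`
- `void *mm_malloc(size_t size);` Each block holds a `size`-byte payload plus a header that stores the size (mind the 8-byte alignment).
- `void mm_free(void *ptr);`
- `void *mm_realloc(void *ptr, size_t size);`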
/*
* mm-naive.c - The fastest, least memory-efficient malloc package.
*
* In this naive approach, a block is allocated by simply incrementing
* the brk pointer. A block is pure payload. There are no headers or
* footers. Blocks are never coalesced or reused. Realloc is
* implemented directly using mm_malloc and mm_free.
*
* NOTE TO STUDENTS: Replace this header comment with your own header
* comment that gives a high level description of your solution.
*/
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <unistd.h>
#include <string.h>
#include "mm.h"
#include "memlib.h"
/*********************************************************
* NOTE TO STUDENTS: Before you do anything else, please
* provide your team information in the following struct.
********************************************************/
team_t team = {
/* Team name */
"ateam",
/* First member's full name */
"Harry Bovik",
/* First member's email address */
"[email protected]",
/* Second member's full name (leave blank if none) */
"",
/* Second member's email address (leave blank if none) */
""
};
/* single word (4) or double word (8) alignment */
#define ALIGNMENT 8
/* rounds up to the nearest multiple of ALIGNMENT */
#define ALIGN(size) (((size) + (ALIGNMENT-1)) & ~0x7)
/* The size field is padded to the alignment as well, so that the payload address (not the block start) is aligned; the start of the block stores the payload size. */
#define SIZE_T_SIZE (ALIGN(sizeof(size_t)))
/*
* mm_init - initialize the malloc package.
*/
int mm_init(void)
{
return 0;
}
/*
* mm_malloc - Allocate a block by incrementing the brk pointer.
* Always allocate a block whose size is a multiple of the alignment.
*/
void *mm_malloc(size_t size)
{
int newsize = ALIGN(size + SIZE_T_SIZE);
void *p = mem_sbrk(newsize); // p points to the first byte of the new block
if (p == (void *)-1) // the heap could not be extended
return NULL;
else {
*(size_t *)p = size; // record the payload size at the start of the block
return (void *)((char *)p + SIZE_T_SIZE);
}
}
/*
* mm_free - Freeing a block does nothing.
*/
void mm_free(void *ptr)
{
}
/*
* mm_realloc - Implemented simply in terms of mm_malloc and mm_free
*/
// Brute force: always create a new block at the end of the heap
void *mm_realloc(void *ptr, size_t size)
{
void *oldptr = ptr;
void *newptr;
size_t copySize; // payload size of the old block
newptr = mm_malloc(size);
if (newptr == NULL)
return NULL;
copySize = *(size_t *)((char *)oldptr - SIZE_T_SIZE);
if (size < copySize)
copySize = size;
memcpy(newptr, oldptr, copySize); // copy copySize bytes into newptr
mm_free(oldptr);
return newptr;
}
Pros and cons:
int mm_init(void);
void *mm_malloc(size_t size);
void mm_free(void *ptr);
void *mm_realloc(void *ptr, size_t size);
static void *extend_heap(size_t size);
static void *find_fit(size_t size);
static void place(char *bp, size_t size);
static void *coalesce(void *bp);
static void mm_printblock(int verbose, const char* func);
/*
* mm-naive.c - The fastest, least memory-efficient malloc package.
*
* In this naive approach, a block is allocated by simply incrementing
* the brk pointer. A block is pure payload. There are no headers or
* footers. Blocks are never coalesced or reused. Realloc is
* implemented directly using mm_malloc and mm_free.
*
* NOTE TO STUDENTS: Replace this header comment with your own header
* comment that gives a high level description of your solution.
*/
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <unistd.h>
#include <string.h>
#include "mm.h"
#include "memlib.h"
/*********************************************************
* NOTE TO STUDENTS: Before you do anything else, please
* provide your team information in the following struct.
********************************************************/
team_t team = {
/* Team name */
"ateam",
/* First member's full name */
"Harry Bovik",
/* First member's email address */
"[email protected]",
/* Second member's full name (leave blank if none) */
"",
/* Second member's email address (leave blank if none) */
""
};
#ifdef DEBUG
#define VERBOSE 1
#else
#define VERBOSE 0
#endif
/* single word (4) or double word (8) alignment */
#define ALIGNMENT 8
/* rounds up to the nearest multiple of ALIGNMENT */
#define ALIGN(size) (((size) + (ALIGNMENT-1)) & ~0x7)
#define SIZE_T_SIZE (ALIGN(sizeof(size_t)))
/* Custom macros for constants and pointer arithmetic */
#define WSIZE 4 // size of a word, header, or footer (bytes)
#define DSIZE 8 // size of a double word (bytes)
#define CHUNKSIZE (1<<12) // default amount by which the heap is extended
#define MINBLOCK (DSIZE + 2*WSIZE)
#define MAX(x, y) ((x) > (y) ? (x) : (y))
#define PACK(size, alloc) ((size) | (alloc)) // pack a size and an allocated bit into one word
#define GET(p) (*(unsigned int *)(p)) // read the word at address p
#define PUT(p, val) (*(unsigned int *)(p) = (val)) // write a word at address p
#define GET_SIZE(p) (GET(p) & ~0x07) // size field of the word at address p
#define GET_ALLOC(p) (GET(p) & 0x1) // allocated bit of the word at address p
// bp (block pointer) points to the first byte of the payload
#define HDRP(bp) ((char*)(bp) - WSIZE) // address of the header
#define FTRP(bp) ((char*)(bp) + GET_SIZE(HDRP(bp)) - DSIZE) // address of the footer; depends on the size stored in the header
#define NEXT_BLKP(bp) ((char*)(bp) + GET_SIZE((char*)(bp) - WSIZE)) // address of the next block
#define PREV_BLKP(bp) ((char*)(bp) - GET_SIZE((char*)(bp) - DSIZE)) // address of the previous block
static void* heap_listp; // points to the prologue block
/* private functions */
static void *extend_heap(size_t size); // extend the heap
static void *find_fit(size_t size); // find a free block
static void place(char *bp, size_t size); // place the request and split the free block
static void *coalesce(void *bp); // coalesce free blocks
//check
/*
static void mm_check(int verbose, const char* func); //heap consistency checker
static void mm_checkblock(int verbose, const char* func, void *bp);
static int mm_checkheap(int verbose, const char* func);
*/
static void mm_printblock(int verbose, const char* func);
/*
* mm_init - initialize the malloc package.
*/
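// Set up the prologue block, the epilogue block, and the 4-byte padding word before the prologue: four 4-byte words in total
int mm_init(void)
{
if ((heap_listp = mem_sbrk(4*WSIZE)) == (void*)-1)
return -1;
PUT(heap_listp, 0); // padding word at the start of the heap so that bp is 8-byte aligned
PUT(heap_listp + 1*WSIZE, PACK(8, 1)); // prologue header
PUT(heap_listp + 2*WSIZE, PACK(8, 1)); // prologue footer
PUT(heap_listp + 3*WSIZE, PACK(0, 1)); // epilogue header
heap_listp += (2*WSIZE); // trick: point heap_listp between the prologue header and footer
if (extend_heap(CHUNKSIZE) == NULL) // extend the heap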
return -1;
mm_printblock(VERBOSE, __func__);
return 0;
}
static void *extend_heap(size_t size) {
size_t asize;
void *bp;
asize = ALIGN(size);
//printf("extend %d\n", asize);
if ((long)(bp = mem_sbrk(asize)) == -1)
return NULL;
PUT(HDRP(bp), PACK(asize, 0)); // HDRP(bp) is the old epilogue header
PUT(FTRP(bp), PACK(asize, 0));
PUT(HDRP(NEXT_BLKP(bp)), PACK(0, 1)); // new epilogue header
return coalesce(bp);
}
/*
* mm_malloc - Allocate a block from a free block that fits, extending the heap if none does.
* Always allocate a block whose size is a multiple of the alignment.
*/
void *mm_malloc(size_t size)
{
size_t asize; // adjusted size
size_t extendsize; // amount to extend the heap by if no block fits
void *bp = NULL;
if (size == 0) // ignore spurious requests
return NULL;
asize = ALIGN(size + 2*WSIZE);
if ((bp = find_fit(asize)) != NULL) {
place((char *)bp, asize);
mm_printblock(VERBOSE, __func__);
return bp;
}
// no free block is large enough
extendsize = MAX(asize, CHUNKSIZE);
if ((bp = extend_heap(extendsize)) == NULL) {
return NULL;
}
place(bp, asize);
mm_printblock(VERBOSE, __func__);
return bp;
}
// Placement policy: first-fit search
static void *find_fit(size_t size) {
void *curbp;
for (curbp = heap_listp; GET_SIZE(HDRP(curbp)) > 0; curbp = NEXT_BLKP(curbp)) {
if (!GET_ALLOC(HDRP(curbp)) && (GET_SIZE(HDRP(curbp)) >= size)) return curbp;
}
return NULL; // no fit
}
// Place the request and split the free block
static void place(char *bp, size_t asize) { // mind the minimum block size (16B == DSIZE + 2*WSIZE == MINBLOCK)
size_t total_size = GET_SIZE(HDRP(bp));
size_t remainder_size = total_size - asize;
if (remainder_size >= MINBLOCK) {
PUT(HDRP(bp), PACK(asize, 1));
PUT(FTRP(bp), PACK(asize, 1));
bp = NEXT_BLKP(bp);
PUT(HDRP(bp), PACK(remainder_size, 0));
PUT(FTRP(bp), PACK(remainder_size, 0));
} else { // the remainder would be smaller than the minimum block, so do not split
PUT(HDRP(bp), PACK(total_size, 1));
PUT(FTRP(bp), PACK(total_size, 1));
}
}
/*
* mm_free - Mark the block as free and coalesce it with free neighbors.
*/
void mm_free(void *ptr)
{
size_t size = GET_SIZE(HDRP(ptr));
PUT(HDRP(ptr), PACK(size, 0));
PUT(FTRP(ptr), PACK(size, 0));
coalesce(ptr);
mm_printblock(VERBOSE, __func__);
}
/*
* coalesce - merge a free block with its free neighbors
*/
static void *coalesce(void *bp) {
int pre_alloc = GET_ALLOC(HDRP(PREV_BLKP(bp)));
int post_alloc = GET_ALLOC(HDRP(NEXT_BLKP(bp)));
size_t size = GET_SIZE(HDRP(bp));
if (pre_alloc && post_alloc) {
return bp;
} else if (pre_alloc && !post_alloc) { // merge with the next block
size += GET_SIZE(HDRP(NEXT_BLKP(bp)));
} else if (!pre_alloc && post_alloc) { // merge with the previous block
size += GET_SIZE(HDRP(PREV_BLKP(bp)));
bp = PREV_BLKP(bp);
} else { // merge with both neighbors
size += GET_SIZE(HDRP(PREV_BLKP(bp))) + GET_SIZE(FTRP(NEXT_BLKP(bp)));
bp = PREV_BLKP(bp);
}
PUT(HDRP(bp), PACK(size, 0));
PUT(FTRP(bp), PACK(size, 0)); // FTRP() reads the size from the header, so it already uses the new size
return bp;
}
/*
* mm_realloc - Shrink in place when possible; otherwise move the data to a new block.
*/
void *mm_realloc(void *ptr, size_t size)
{
size_t old_size, new_size, extendsize;
void *old_ptr, *new_ptr;
if (ptr == NULL) {
return mm_malloc(size);
}
if (size == 0) {
mm_free(ptr);
return NULL;
}
new_size = ALIGN(size + 2*WSIZE);
old_size = GET_SIZE(HDRP(ptr));
old_ptr = ptr;
if (old_size >= new_size) {
if (old_size - new_size >= MINBLOCK) { // split the block
place(old_ptr, new_size);
mm_printblock(VERBOSE, __func__);
return old_ptr;
} else { // the remainder is smaller than the minimum block, so do not split
mm_printblock(VERBOSE, __func__);
return old_ptr;
}
} else { // look for a new block, copy the data, then free the old block
if ((new_ptr = find_fit(new_size)) == NULL) { // no suitable free block
extendsize = MAX(new_size, CHUNKSIZE);
if ((new_ptr = extend_heap(extendsize)) == NULL) // extend the heap
return NULL;
}
place(new_ptr, new_size);
memcpy(new_ptr, old_ptr, old_size - 2*WSIZE);
mm_free(old_ptr);
mm_printblock(VERBOSE, __func__);
return new_ptr;
}
}
static void mm_printblock(int verbose, const char* func) {
if (!verbose) return;
char *curbp;
printf("\n=========================== %s ===========================\n" ,func);
for (curbp = heap_listp; GET_SIZE(HDRP(curbp)) > 0; curbp = NEXT_BLKP(curbp)) {
printf("address = %p\n", curbp);
printf("hsize = %d, fsize = %d\n", GET_SIZE(HDRP(curbp)), GET_SIZE(FTRP(curbp)));
printf("halloc = %d, falloc = %d\n", GET_ALLOC(HDRP(curbp)), GET_ALLOC(FTRP(curbp)));
printf("\n");
}
//epilogue blocks
printf("address = %p\n", curbp);
printf("hsize = %d\n", GET_SIZE(HDRP(curbp)));
printf("halloc = %d\n", GET_ALLOC(HDRP(curbp)));
printf("=========================== %s ===========================\n" ,func);
}
/*
static void mm_check(int verbose, const char* func) {
if (!verbose) return;
if (mm_checkheap(verbose, func)) {
void *curbp;
for (curbp = heap_listp; GET_SIZE(HDRP(curbp)) > 0; curbp = NEXT_BLKP(curbp)) {
mm_checkblock(verbose, func, curbp);
}
}
}
static void mm_checkblock(int verbose, const char* func, void* bp) {
if (!verbose) return;
if (GET(HDRP(bp)) != GET(FTRP(bp))) {
printf("\n=========================== %s ===========================\n" ,func);
printf("Error: %p's Header and footer are not match.\n", bp);
printf("hsize = %d, fsize = %d\n", GET_SIZE(HDRP(bp)), GET_SIZE(FTRP(bp)));
printf("halloc = %d, falloc = %d\n", GET_ALLOC(HDRP(bp)), GET_ALLOC(FTRP(bp)));
printf("next_head_alloc = %d, next_footer_alloc = %d\n", GET_ALLOC(HDRP(NEXT_BLKP(bp))), GET_ALLOC(FTRP(NEXT_BLKP(bp))));
printf("=========================== %s ===========================\n" ,func);
}
if ((int)bp % ALIGNMENT != 0)
printf("Error: %p's Payload area is not aligned.\n", bp);
if (GET_SIZE(HDRP(bp)) % ALIGNMENT != 0)
printf("Error: %p payload size is not doubleword aligned.\n", bp);
}
static int mm_checkheap(int verbose, const char* func) {
char *endp = (char *)mem_heap_hi()+1;
char *curbp;
//check prologue blocks
if (GET(HDRP(heap_listp)) != GET(FTRP(heap_listp))) {
printf("Error: Prologue block header and footer do not match.\n");
return 0;
}
if (GET_ALLOC(HDRP(heap_listp)) != 1 || GET_SIZE(HDRP(heap_listp)) != 8) {
printf("Error: Prologue block does not have the expected size/alloc fields.\n");
return 0;
}
//check epilogue block
for (curbp = heap_listp; GET_SIZE(HDRP(curbp)) > 0; curbp = NEXT_BLKP(curbp)) {}
if (curbp != endp) {
printf("Error: A block with size 0 isn't endp\n");
printf("Its size is %d, address is %p and alloc is %d\n", GET_SIZE(HDRP(curbp)), curbp, GET_ALLOC(HDRP(curbp)));
return 0;
}
if (GET_ALLOC(HDRP(endp)) != 1 || GET_SIZE(HDRP(endp)) != 0) {
printf("Error: Epilogue blocks are not at specific locations.\n");
return 0;
}
return 1;
}
*/
#### Debug output
Part of the debug output, which shows the heap layout of this design clearly:
// ./mdriver -V -f short1-bal.rep > out.txt
=========================== mm_init ===========================
address = 0xf698f018
hsize = 8, fsize = 8
halloc = 1, falloc = 1
address = 0xf698f020
hsize = 4096, fsize = 4096
halloc = 0, falloc = 0
address = 0xf6990020
hsize = 0
halloc = 1
=========================== mm_init ===========================
=========================== mm_malloc ===========================
address = 0xf698f018
hsize = 8, fsize = 8
halloc = 1, falloc = 1
address = 0xf698f020
hsize = 2048, fsize = 2048
halloc = 1, falloc = 1
address = 0xf698f820
hsize = 2048, fsize = 2048
halloc = 0, falloc = 0
address = 0xf6990020
hsize = 0
halloc = 1
=========================== mm_malloc ===========================
Results for mm malloc:
trace valid util ops secs Kops
0 yes 99% 5694 0.003636 1566
1 yes 99% 5848 0.003695 1583
2 yes 99% 6648 0.005280 1259
3 yes 100% 5380 0.003648 1475
4 yes 66% 14400 0.000089 161254
5 yes 92% 4800 0.004965 967
6 yes 92% 4800 0.004495 1068
7 yes 55% 12000 0.086989 138
8 yes 51% 24000 0.126514 190
9 yes 27% 14401 0.028150 512
10 yes 30% 14401 0.000831 17330
Total 74% 112372 0.268292 419
Perf index = 44 (util) + 28 (thru) = 72/100
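Note: next fit has a pitfall. The pointer saved by the previous search may no longer point to the start of a block when it is used again. This shows up in coalescing: when the block the pointer refers to is merged into the block before it, the pointer ends up in the middle of the merged block.
static void* next_fitp; // where the next search starts (next fit)
// this pointer must be initialized in mm_init
// find_fit, version 1: a do-while loop
static void *find_fit(size_t size) {
char *endp, *lastp;
next_fitp = NEXT_BLKP(next_fitp); // where this search starts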
endp = (char *)mem_heap_hi() + 1;
lastp = next_fitp;
do {
if (next_fitp == endp) {
next_fitp = heap_listp;
continue;
}
if (!GET_ALLOC(HDRP(next_fitp)) && (GET_SIZE(HDRP(next_fitp)) >= size))
return next_fitp;
next_fitp = NEXT_BLKP(next_fitp);
} while(next_fitp != lastp);
return NULL;
}
// find_fit, version 2: simply write two nearly identical for loops
static void *find_fit(size_t size) {
char *lastp;
next_fitp = NEXT_BLKP(next_fitp); // where this search starts
lastp = next_fitp;
for (;GET_SIZE(HDRP(next_fitp)) > 0; next_fitp = NEXT_BLKP(next_fitp)) {
if (!GET_ALLOC(HDRP(next_fitp)) && (GET_SIZE(HDRP(next_fitp)) >= size)) {
return next_fitp;
}
}
next_fitp = NEXT_BLKP(heap_listp);
for (;next_fitp != lastp; next_fitp = NEXT_BLKP(next_fitp)) {
if (!GET_ALLOC(HDRP(next_fitp)) && (GET_SIZE(HDRP(next_fitp)) >= size)) {
return next_fitp;
}
}
return NULL;
}
// Only coalesce changes: next_fitp must stay valid after a merge
static void *coalesce(void *bp) {
int pre_alloc = GET_ALLOC(HDRP(PREV_BLKP(bp)));
int post_alloc = GET_ALLOC(HDRP(NEXT_BLKP(bp)));
size_t size = GET_SIZE(HDRP(bp));
if (pre_alloc && post_alloc) {
return bp;
} else if (pre_alloc && !post_alloc) { // merge with the next block
size += GET_SIZE(HDRP(NEXT_BLKP(bp)));
} else if (!pre_alloc && post_alloc) { // merge with the previous block
size += GET_SIZE(HDRP(PREV_BLKP(bp)));
bp = PREV_BLKP(bp);
} else { // merge with both neighbors
size += GET_SIZE(HDRP(PREV_BLKP(bp))) + GET_SIZE(FTRP(NEXT_BLKP(bp)));
bp = PREV_BLKP(bp);
}
PUT(HDRP(bp), PACK(size, 0));
PUT(FTRP(bp), PACK(size, 0)); // FTRP() reads the size from the header, so it already uses the new size
// If next_fitp pointed at a block that was just absorbed, move it to the start of the merged block.
// One range check covers merging with the previous block and with the next block.
if ((char *)next_fitp > (char *)bp && (char *)next_fitp < NEXT_BLKP(bp))
next_fitp = bp;
return bp;
}
Results for mm malloc:
trace valid util ops secs Kops
0 yes 90% 5694 0.001078 5280
1 yes 91% 5848 0.000686 8526
2 yes 95% 6648 0.001864 3567
3 yes 96% 5380 0.001754 3067
4 yes 66% 14400 0.000094 153518
5 yes 91% 4800 0.002902 1654
6 yes 89% 4800 0.002691 1783
7 yes 55% 12000 0.009888 1214
8 yes 51% 24000 0.003605 6658
9 yes 27% 14401 0.028398 507
10 yes 45% 14401 0.000729 19754
Total 72% 112372 0.053688 2093
Perf index = 43 (util) + 40 (thru) = 83/100
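Score comparison: first fit scored 44 (util) + 28 (thru) = 72/100, and next fit scores 43 (util) + 40 (thru) = 83/100.
The two scores show that next fit has a much higher throughput than first fit (2093 vs. 419 Kops in total), at the cost of slightly lower memory utilization (72% vs. 74%).
Of the three placement policies (first fit, next fit, and best fit), studies suggest that best fit generally achieves better memory utilization than the other two. With an implicit (simple) free list, however, best fit requires an exhaustive search of the heap. The segregated free list organization covered later approximates best fit without an exhaustive search.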
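The explicit free list keeps the basic block layout and adds predecessor and successor pointers inside each free block. Searching for a fit then visits only free blocks, which is faster. The drawback is a larger minimum block size, because every block must be able to hold the two pointers. Together the pointers form a free list that can be traversed.
Pros and cons:
Splitting policy:
Coalescing policy:
Compared with the implicit free list, the only additions are the two pointers in each free block and the operations on the free list.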
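#define MINBLOCK (DSIZE + 2*WSIZE + 2*WSIZE) // header, footer, two pointers, and 8 bytes of data
#define GETADDR(p) (*(unsigned int **)(p)) // read the pointer stored at address p
#define PUTADDR(p, addr) (*(unsigned int **)(p) = (unsigned int *)(addr)) // write a pointer at address p
#define PRED_POINT(bp) (bp) // address of the predecessor pointer
#define SUCC_POINT(bp) ((char*)(bp) + WSIZE) // address of the successor pointer (pointers are 4 bytes because the lab builds with -m32)
static void* head_free; // head of the free list, stored in the padding word at the start of the heap
New and changed functions:
- `static void insert_freelist(void *bp);` inserts a free block at the head of the free list.
- `static void remove_freelist(void *bp);` removes a free block from the free list; used when coalescing (coalescing policy).
- `static void place_freelist(void *bp);` splits off the front part of a block in the free list (splitting policy).
- `static void *find_fit(size_t size);` is changed to search the free list instead of the whole heap.
/*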
* mm-naive.c - The fastest, least memory-efficient malloc package.
*
* In this naive approach, a block is allocated by simply incrementing
* the brk pointer. A block is pure payload. There are no headers or
* footers. Blocks are never coalesced or reused. Realloc is
* implemented directly using mm_malloc and mm_free.
*
* NOTE TO STUDENTS: Replace this header comment with your own header
* comment that gives a high level description of your solution.
*/
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <unistd.h>
#include <string.h>
#include "mm.h"
#include "memlib.h"
/*********************************************************
* NOTE TO STUDENTS: Before you do anything else, please
* provide your team information in the following struct.
********************************************************/
team_t team = {
/* Team name */
"ateam",
/* First member's full name */
"Harry Bovik",
/* First member's email address */
"[email protected]",
/* Second member's full name (leave blank if none) */
"",
/* Second member's email address (leave blank if none) */
""
};
#ifdef DEBUG
#define VERBOSE 1
#else
#define VERBOSE 0
#endif
/* single word (4) or double word (8) alignment */
#define ALIGNMENT 8
/* rounds up to the nearest multiple of ALIGNMENT */
#define ALIGN(size) (((size) + (ALIGNMENT-1)) & ~0x7)
#define SIZE_T_SIZE (ALIGN(sizeof(size_t)))
/* Custom macros for constants and pointer arithmetic */
#define WSIZE 4 // size of a word, header, or footer (bytes)
#define DSIZE 8 // size of a double word (bytes)
#define CHUNKSIZE (1<<12) // default amount by which the heap is extended
#define MINBLOCK (DSIZE + 2*WSIZE + 2*WSIZE) // header, footer, two pointers, and 8 bytes of data
#define MAX(x, y) ((x) > (y) ? (x) : (y))
#define PACK(size, alloc) ((size) | (alloc)) // pack a size and an allocated bit into one word
#define GET(p) (*(unsigned int *)(p)) // read the word at address p
#define PUT(p, val) (*(unsigned int *)(p) = (val)) // write a word at address p
#define GETADDR(p) (*(unsigned int **)(p)) // read the pointer stored at address p
#define PUTADDR(p, addr) (*(unsigned int **)(p) = (unsigned int *)(addr)) // write a pointer at address p
#define GET_SIZE(p) (GET(p) & ~0x07) // size field of the word at address p
#define GET_ALLOC(p) (GET(p) & 0x1) // allocated bit of the word at address p
// bp (block pointer) points to the first byte of the payload
#define HDRP(bp) ((char*)(bp) - WSIZE) // address of the header
#define FTRP(bp) ((char*)(bp) + GET_SIZE(HDRP(bp)) - DSIZE) // address of the footer; depends on the size stored in the header
#define NEXT_BLKP(bp) ((char*)(bp) + GET_SIZE((char*)(bp) - WSIZE)) // address of the next block
#define PREV_BLKP(bp) ((char*)(bp) - GET_SIZE((char*)(bp) - DSIZE)) // address of the previous block
#define PRED_POINT(bp) (bp) // address of the predecessor pointer
#define SUCC_POINT(bp) ((char*)(bp) + WSIZE) // address of the successor pointer
static void* heap_listp; // points to the prologue block
static void* head_free; // head of the free list, stored in the padding word at the start of the heap
/* private functions */
static void *extend_heap(size_t size); // extend the heap
static void *find_fit(size_t size); // find a free block (first fit)
static void place(void *bp, size_t size); // place the request and split the free block
static void *coalesce(void *bp); // coalesce free blocks
/* free list operations */
static void insert_freelist(void *bp);
static void remove_freelist(void *bp);
static void place_freelist(void *bp);
//check
static void mm_printblock(int verbose, const char* func);
/*
* mm_init - initialize the malloc package.
*/
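// Set up the prologue block, the epilogue block, and the 4-byte padding word before the prologue: four 4-byte words in total
int mm_init(void)
{
if ((heap_listp = mem_sbrk(4*WSIZE)) == (void*)-1)
return -1;
PUTADDR(heap_listp, NULL); // padding word at the start of the heap so that bp is 8-byte aligned
PUT(heap_listp + 1*WSIZE, PACK(8, 1)); // prologue header
PUT(heap_listp + 2*WSIZE, PACK(8, 1)); // prologue footer
PUT(heap_listp + 3*WSIZE, PACK(0, 1)); // epilogue header
head_free = heap_listp; // reuse the padding word to hold the head of the free list
PUTADDR(head_free, NULL);
heap_listp += (2*WSIZE); // trick: point heap_listp between the prologue header and footer
if (extend_heap(CHUNKSIZE) == NULL) // extend the heap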
return -1;
mm_printblock(VERBOSE, __func__);
return 0;
}
// Insert a free block at the head of the free list
static void insert_freelist(void *bp) { // LIFO (last in, first out): insert at the head
if (GETADDR(head_free) == NULL) {
PUTADDR(SUCC_POINT(bp), NULL);
PUTADDR(PRED_POINT(bp), head_free);
PUTADDR(head_free, bp);
} else {
void *tmp;
tmp = GETADDR(head_free);
PUTADDR(SUCC_POINT(bp), tmp);
PUTADDR(PRED_POINT(bp), head_free);
PUTADDR(head_free, bp);
PUTADDR(PRED_POINT(tmp), bp);
tmp = NULL;
}
}
// Remove the free block bp from the free list (used when coalescing)
static void remove_freelist(void *bp) {
void *pre_block, *post_block;
pre_block = GETADDR(PRED_POINT(bp));
post_block = GETADDR(SUCC_POINT(bp));
// fix up the predecessor
if (pre_block == head_free) {
PUTADDR(head_free, post_block);
} else {
PUTADDR(SUCC_POINT(pre_block), post_block);
}
// fix up the successor
if (post_block != NULL) {
PUTADDR(PRED_POINT(post_block), pre_block);
}
}
// Split off the front part of a free block: the remainder takes over bp's position in the free list
static void place_freelist(void *bp) {
void *pre_block, *post_block, *next_bp;
// remember bp's neighbors in the list
pre_block = GETADDR(PRED_POINT(bp));
post_block = GETADDR(SUCC_POINT(bp));
next_bp = NEXT_BLKP(bp);
// link the remainder block to both neighbors
PUTADDR(PRED_POINT(next_bp), pre_block);
PUTADDR(SUCC_POINT(next_bp), post_block);
// fix up the predecessor; head_free as predecessor is a special case
if (pre_block == head_free) {
PUTADDR(head_free, next_bp);
} else {
PUTADDR(SUCC_POINT(pre_block), next_bp);
}
// fix up the successor
if (post_block != NULL) {
PUTADDR(PRED_POINT(post_block), next_bp);
}
}
/*
* mm_malloc - Allocate a block from the free list, extending the heap if no block fits.
* Always allocate a block whose size is a multiple of the alignment.
*/
void *mm_malloc(size_t size)
{
size_t asize; // adjusted size
size_t extendsize; // amount to extend the heap by if no block fits
void *bp = NULL;
if (size == 0) // ignore spurious requests
return NULL;
asize = ALIGN(size + 2*WSIZE);
if ((bp = find_fit(asize)) != NULL) {
place(bp, asize);
return bp;
}
// no free block is large enough
extendsize = MAX(asize, CHUNKSIZE);
if ((bp = extend_heap(extendsize)) == NULL) {
return NULL;
}
place(bp, asize);
mm_printblock(VERBOSE, __func__);
return bp;
}
static void *extend_heap(size_t size) {
size_t asize;
void *bp;
asize = ALIGN(size);
if ((long)(bp = mem_sbrk(asize)) == -1)
return NULL;
PUT(HDRP(bp), PACK(asize, 0)); // HDRP(bp) is the old epilogue header
PUT(FTRP(bp), PACK(asize, 0));
PUT(HDRP(NEXT_BLKP(bp)), PACK(0, 1)); // new epilogue header
return coalesce(bp);
}
// Placement policy: first-fit search over the free list
static void *find_fit(size_t size) {
void *curbp;
for (curbp = GETADDR(head_free); curbp != NULL; curbp = GETADDR(SUCC_POINT(curbp))) {
if (GET_SIZE(HDRP(curbp)) >= size)
return curbp;
}
return NULL; // no fit
}
// Place the request and split the free block
static void place(void *bp, size_t asize) { // mind the minimum block size (24B == MINBLOCK)
size_t total_size = GET_SIZE(HDRP(bp));
size_t remainder_size = total_size - asize;
if (remainder_size >= MINBLOCK) {
PUT(HDRP(bp), PACK(asize, 1));
PUT(FTRP(bp), PACK(asize, 1));
void *next_bp = NEXT_BLKP(bp);
PUT(HDRP(next_bp), PACK(remainder_size, 0));
PUT(FTRP(next_bp), PACK(remainder_size, 0));
place_freelist(bp);
} else { // the remainder would be smaller than the minimum block, so do not split
PUT(HDRP(bp), PACK(total_size, 1));
PUT(FTRP(bp), PACK(total_size, 1));
remove_freelist(bp);
}
}
/*
* mm_free - Mark the block as free, coalesce it, and put it on the free list.
*/
void mm_free(void *ptr)
{
size_t size = GET_SIZE(HDRP(ptr));
PUT(HDRP(ptr), PACK(size, 0));
PUT(FTRP(ptr), PACK(size, 0));
coalesce(ptr);
mm_printblock(VERBOSE, __func__);
}
/*
* coalesce - merge a free block with its free neighbors and insert the result into the free list
*/
static void *coalesce(void *bp) {
char *pre_block, *post_block;
int pre_alloc = GET_ALLOC(HDRP(PREV_BLKP(bp)));
int post_alloc = GET_ALLOC(HDRP(NEXT_BLKP(bp)));
size_t size = GET_SIZE(HDRP(bp));
if (pre_alloc && post_alloc) {
insert_freelist(bp);
return bp;
} else if (pre_alloc && !post_alloc) { // merge with the next block
size += GET_SIZE(HDRP(NEXT_BLKP(bp)));
post_block = NEXT_BLKP(bp); // remember the next block
remove_freelist(post_block);
insert_freelist(bp);
} else if (!pre_alloc && post_alloc) { // merge with the previous block
size += GET_SIZE(HDRP(PREV_BLKP(bp)));
bp = PREV_BLKP(bp);
remove_freelist(bp);
insert_freelist(bp);
} else { // merge with both neighbors
size += GET_SIZE(HDRP(PREV_BLKP(bp))) + GET_SIZE(FTRP(NEXT_BLKP(bp)));
pre_block = PREV_BLKP(bp);
post_block = NEXT_BLKP(bp);
bp = PREV_BLKP(bp);
remove_freelist(pre_block);
remove_freelist(post_block);
insert_freelist(bp);
}
PUT(HDRP(bp), PACK(size, 0));
PUT(FTRP(bp), PACK(size, 0));
return bp;
}
/*
* mm_realloc - Shrink in place when possible; otherwise move the data to a new block.
*/
void *mm_realloc(void *ptr, size_t size)
{
size_t old_size, new_size, extendsize;
void *old_ptr, *new_ptr;
if (ptr == NULL) {
return mm_malloc(size);
}
if (size == 0) {
mm_free(ptr);
return NULL;
}
new_size = ALIGN(size + 2*WSIZE);
old_size = GET_SIZE(HDRP(ptr));
old_ptr = ptr;
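if (old_size >= new_size) {
if (old_size - new_size >= MINBLOCK) { // split the block in place
// place() expects a block that is on the free list, so split by hand here
PUT(HDRP(old_ptr), PACK(new_size, 1));
PUT(FTRP(old_ptr), PACK(new_size, 1));
void *rest = NEXT_BLKP(old_ptr);
PUT(HDRP(rest), PACK(old_size - new_size, 0));
PUT(FTRP(rest), PACK(old_size - new_size, 0));
coalesce(rest); // also inserts the remainder into the free list
}
// otherwise the remainder is smaller than the minimum block, so do not split
mm_printblock(VERBOSE, __func__);
return old_ptr;
} else { // look for a new block, copy the data, then free the old block
if ((new_ptr = find_fit(new_size)) == NULL) { // no suitable free block
extendsize = MAX(new_size, CHUNKSIZE);
if ((new_ptr = extend_heap(extendsize)) == NULL) // extend the heap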
return NULL;
}
place(new_ptr, new_size);
memcpy(new_ptr, old_ptr, old_size - 2*WSIZE);
mm_free(old_ptr);
mm_printblock(VERBOSE, __func__);
return new_ptr;
}
}
static void mm_printblock(int verbose, const char* func) {
if (!verbose) return;
void *curbp;
printf("\n=========================== $%s$ ===========================\n" ,func);
printf("================ block ================\n");
for (curbp = heap_listp; GET_SIZE(HDRP(curbp)) > 0; curbp = NEXT_BLKP(curbp)) {
printf("address = %p\n", curbp);
printf("hsize = %d, fsize = %d\n", GET_SIZE(HDRP(curbp)), GET_SIZE(FTRP(curbp)));
printf("halloc = %d, falloc = %d\n", GET_ALLOC(HDRP(curbp)), GET_ALLOC(FTRP(curbp)));
printf("\n");
}
//epilogue blocks
printf("address = %p\n", curbp);
printf("hsize = %d\n", GET_SIZE(HDRP(curbp)));
printf("halloc = %d\n", GET_ALLOC(HDRP(curbp)));
printf("================ block ================\n");
printf("\n");
printf("=============== freelist ===============\n");
for (curbp = GETADDR(head_free); curbp != NULL; curbp = GETADDR(SUCC_POINT(curbp))) {
printf("address = %p, size = %d,%d, alloc = %d,%d\n",
curbp, GET_SIZE(HDRP(curbp)), GET_SIZE(FTRP(curbp)), GET_ALLOC(HDRP(curbp)), GET_ALLOC(FTRP(curbp)));
}
printf("address = %p\n", curbp);
printf("=============== freelist ===============\n");
printf("=========================== $%s$ ===========================\n" ,func);
}
Part of the debug output, which shows the heap layout of this design clearly:
# blocks and free list right after the heap has been initialized
=========================== $mm_init$ ===========================
================ block ================
address = 0xf69aa018
hsize = 8, fsize = 8
halloc = 1, falloc = 1
address = 0xf69aa020
hsize = 4096, fsize = 4096
halloc = 0, falloc = 0
address = 0xf69ab020
hsize = 0
halloc = 1
================ block ================
=============== freelist ===============
address = 0xf69aa020, size = 4096,4096, alloc = 0,0
address = (nil)
=============== freelist ===============
=========================== $mm_init$ ===========================
# Explicit free list + LIFO
Results for mm malloc:
trace valid util ops secs Kops
0 yes 93% 5694 0.000093 60964
1 yes 94% 5848 0.000088 66455
2 yes 96% 6648 0.000150 44290
3 yes 97% 5380 0.000158 34115
4 yes 66% 14400 0.000131 109756
5 yes 89% 4800 0.000306 15686
6 yes 85% 4800 0.000360 13315
7 yes 55% 12000 0.001054 11385
8 yes 51% 24000 0.001897 12652
9 yes 26% 14401 0.032686 441
10 yes 30% 14401 0.000994 14486
Total 71% 112372 0.037918 2964
Perf index = 43 (util) + 40 (thru) = 83/100
static void insert_freelist(void *bp) { // keep the list in address order (linear time)
void *pre_block, *post_block, *tmp;
tmp = head_free;
for (post_block = GETADDR(head_free); post_block != NULL; post_block = GETADDR(SUCC_POINT(post_block))) {
if (post_block > bp) {
pre_block = GETADDR(PRED_POINT(post_block));
// link bp to its predecessor and successor
PUTADDR(PRED_POINT(bp), pre_block);
PUTADDR(SUCC_POINT(bp), post_block);
// fix up the predecessor
if (pre_block == head_free) {
PUTADDR(head_free, bp);
} else {
PUTADDR(SUCC_POINT(pre_block), bp);
}
// fix up the successor
PUTADDR(PRED_POINT(post_block), bp);
return;
}
tmp = post_block; // remember the last node in case bp has to be appended at the end
}
// post_block == NULL: bp has the highest address, so append it
// the predecessor is the last node visited
pre_block = tmp;
// link bp to its predecessor; it has no successor
PUTADDR(PRED_POINT(bp), pre_block);
PUTADDR(SUCC_POINT(bp), NULL);
// fix up the predecessor
if (pre_block == head_free) {
PUTADDR(head_free, bp);
} else {
PUTADDR(SUCC_POINT(pre_block), bp);
}
}
// For debugging, the free list part of the print function was also changed slightly
static void mm_printblock(int verbose, const char* func) {
    if (!verbose) return;
    void *curbp;
    printf("\n=========================== $%s$ ===========================\n" ,func);
    printf("================ block ================\n");
    for (curbp = heap_listp; GET_SIZE(HDRP(curbp)) > 0; curbp = NEXT_BLKP(curbp)) {
        printf("address = %p\n", curbp);
        printf("hsize = %d, fsize = %d\n", GET_SIZE(HDRP(curbp)), GET_SIZE(FTRP(curbp)));
        printf("halloc = %d, falloc = %d\n", GET_ALLOC(HDRP(curbp)), GET_ALLOC(FTRP(curbp)));
        printf("\n");
    }
    // epilogue block
    printf("address = %p\n", curbp);
    printf("hsize = %d\n", GET_SIZE(HDRP(curbp)));
    printf("halloc = %d\n", GET_ALLOC(HDRP(curbp)));
    printf("================ block ================\n");
    printf("\n");
    printf("=============== freelist ===============\n");
    printf("head_address = %p, next_address = %p\n", head_free, GETADDR(head_free));
    for (curbp = GETADDR(head_free); curbp != NULL; curbp = GETADDR(SUCC_POINT(curbp))) {
        printf("address = %p, next_address = %p, size = %d\n",curbp,
               GETADDR(SUCC_POINT(curbp)), GET_SIZE(HDRP(curbp)));
    }
    printf("address = %p\n", curbp);
    printf("=============== freelist ===============\n");
    printf("=========================== $%s$ ===========================\n" ,func);
}
# ./mdriver -V -f short1-bal.rep > out.txt
=========================== $mm_init$ ===========================
================ block ================
address = 0xf697d018
hsize = 8, fsize = 8
halloc = 1, falloc = 1
address = 0xf697d020
hsize = 4096, fsize = 4096
halloc = 0, falloc = 0
address = 0xf697e020
hsize = 0
halloc = 1
================ block ================
=============== freelist ===============
head_address = 0xf697d010, next_address = 0xf697d020
address = 0xf697d020, next_address = (nil), size = 4096
address = (nil)
=============== freelist ===============
=========================== $mm_init$ ===========================
# Explicit free list + address-ordered list
Results for mm malloc:
trace valid util ops secs Kops
0 yes 99% 5694 0.000137 41471
1 yes 99% 5848 0.000123 47506
2 yes 99% 6648 0.000168 39501
3 yes 99% 5380 0.000136 39646
4 yes 66% 14400 0.000129 111369
5 yes 92% 4800 0.001686 2846
6 yes 92% 4800 0.001677 2863
7 yes 55% 12000 0.012042 996
8 yes 51% 24000 0.071326 336
9 yes 27% 14401 0.029718 485
10 yes 30% 14401 0.000880 16361
Total 74% 112372 0.118024 952
Perf index = 44 (util) + 40 (thru) = 84/100
The improvement is small: the score was 43 (util) + 40 (thru) = 83/100 and is now 44 (util) + 40 (thru) = 84/100.
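Memory utilization goes up a little because first fit over an address-ordered free list behaves more like best fit.
Throughput is a different story. Insertion is now linear in the number of free blocks, on top of the search for a fit, which was already linear, so each operation is still linear, only with a larger constant factor. Total throughput drops from 2964 to 952 Kops, but the throughput score is already capped at 40 points, so the score itself does not change.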
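As we saw earlier, an allocator that uses a single linked list of free blocks needs time linear in the number of free blocks to allocate a block. To approximate best fit and to find a fitting block faster, we can maintain several free lists, one per size class. In this code every size class is a power of 2.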
static void *segList[25]; // class i holds sizes in [1<<i, 1<<(i+1)); MAX_HEAP is 20*(1<<20), so sizes stay below 1<<25
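To make the size-class mapping concrete, here is a minimal sketch of how a block size could be mapped to a list index. It assumes the same conventions as the declaration above: class i covers sizes in [1<<i, 1<<(i+1)), the smallest usable class is 4 (16-byte blocks) and the largest is 24. The helper name `size_class` and the two constants are my own for illustration and do not appear in mm.c.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_CLASS 4   /* assumed: smallest block is 16 bytes = 1 << 4 */
#define MAX_CLASS 24  /* assumed: the heap stays below 1 << 25 bytes  */

/* Return i such that (1 << i) <= size < (1 << (i + 1)),
 * clamped to [MIN_CLASS, MAX_CLASS]. Hypothetical helper. */
static int size_class(size_t size)
{
    assert(size > 0);
    int index = MIN_CLASS;
    /* Move up one class for every extra bit the size needs. */
    while (index < MAX_CLASS && (size >> (index + 1)) != 0)
        index++;
    return index;
}
```

For example, a 2104-byte free block falls into class 11 ([2048, 4096)) and a 2024-byte block into class 10 ([1024, 2048)).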
The main changes are to the list operations, which work much like those for a single list.
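The realloc path changes as well. Previously, when the original block was too small, we searched directly for a fitting free block and, if none was found, extended the heap. What we overlooked is that if a block adjacent to the original one is free, the two can be merged to hold the new size, which improves memory utilization. Hence the improved realloc.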
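The decision at the heart of that idea can be sketched on its own, without touching the heap. The function below only answers "which free neighbors would have to be absorbed so the block reaches the requested size"; it tries the previous block, then the next block, then both, which is the order used by `mm_realloc_coalesce` in the listing. The enum, the function name `plan_merge`, and the convention that a neighbor size of 0 means "allocated" are assumptions made for this illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; they do not appear in mm.c. */
enum merge_plan { MERGE_NONE, MERGE_PREV, MERGE_NEXT, MERGE_BOTH };

/* Decide which free neighbors to absorb so the block grows to `need` bytes.
 * prev_free / next_free are the sizes of the neighbors if they are free,
 * or 0 if the neighbor is allocated. */
static enum merge_plan plan_merge(size_t old_size, size_t prev_free,
                                  size_t next_free, size_t need)
{
    assert(need > old_size); /* only called when the block must grow */
    if (prev_free && old_size + prev_free >= need)
        return MERGE_PREV;
    if (next_free && old_size + next_free >= need)
        return MERGE_NEXT;
    if (prev_free && next_free && old_size + prev_free + next_free >= need)
        return MERGE_BOTH;
    return MERGE_NONE; /* fall back to find_fit or extending the heap */
}
```

One possible variation is to try the next block first, since growing into the following block leaves the payload where it is and avoids a copy.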
/*
* mm-naive.c - The fastest, least memory-efficient malloc package.
*
* In this naive approach, a block is allocated by simply incrementing
* the brk pointer. A block is pure payload. There are no headers or
* footers. Blocks are never coalesced or reused. Realloc is
* implemented directly using mm_malloc and mm_free.
*
* NOTE TO STUDENTS: Replace this header comment with your own header
* comment that gives a high level description of your solution.
*/
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <unistd.h>
#include <string.h>
#include "mm.h"
#include "memlib.h"
/*********************************************************
* NOTE TO STUDENTS: Before you do anything else, please
* provide your team information in the following struct.
********************************************************/
team_t team = {
/* Team name */
"ateam",
/* First member's full name */
"Harry Bovik",
/* First member's email address */
"[email protected]",
/* Second member's full name (leave blank if none) */
"",
/* Second member's email address (leave blank if none) */
""
};
#ifdef DEBUG
#define VERBOSE 1
#else
#define VERBOSE 0
#endif
/* single word (4) or double word (8) alignment */
#define ALIGNMENT 8
/* rounds up to the nearest multiple of ALIGNMENT */
#define ALIGN(size) (((size) + (ALIGNMENT-1)) & ~0x7)
#define SIZE_T_SIZE (ALIGN(sizeof(size_t)))
/* custom macros for constants and pointer arithmetic */
#define WSIZE 4 // size of a word, header, or footer (bytes)
#define DSIZE 8 // size of a double word (bytes)
#define CHUNKSIZE (1<<12) // default amount by which the heap is extended
#define MINBLOCK (DSIZE + 2*WSIZE + 2*WSIZE) // header, footer, two pointers, 8 bytes of data
#define MAX(x, y) ((x) > (y) ? (x) : (y))
#define PACK(size, alloc) ((size) | (alloc)) // pack the size and the allocated bit into one word
#define GET(p) (*(unsigned int *)(p)) // read a word at address p
#define PUT(p, val) (*(unsigned int *)(p) = (val)) // write a word at address p
#define GETADDR(p) (*(unsigned int **)(p)) // read a pointer at address p
#define PUTADDR(p, addr) (*(unsigned int **)(p) = (unsigned int *)(addr)) // write a pointer at address p
#define GET_SIZE(p) (GET(p) & ~0x07) // size field of the word at address p
#define GET_ALLOC(p) (GET(p) & 0x1) // allocated bit of the word at address p
// bp (block pointer) points to the first byte of the payload
#define HDRP(bp) ((char*)(bp) - WSIZE) // address of the header
#define FTRP(bp) ((char*)(bp) + GET_SIZE(HDRP(bp)) - DSIZE) // address of the footer; depends on HDRP
#define NEXT_BLKP(bp) ((char*)(bp) + GET_SIZE((char*)(bp) - WSIZE)) // address of the next block
#define PREV_BLKP(bp) ((char*)(bp) - GET_SIZE((char*)(bp) - DSIZE)) // address of the previous block
// the two slots below are WSIZE apart, so this layout assumes 4-byte pointers (a 32-bit build)
#define PRED_POINT(bp) (bp) // address of the slot holding the predecessor pointer
#define SUCC_POINT(bp) ((char*)(bp) + WSIZE) // address of the slot holding the successor pointer
static void *heap_listp; // points to the prologue block
static void *segList[25]; // class i holds sizes in [1<<i, 1<<(i+1)); MAX_HEAP is 20*(1<<20), so sizes stay below 1<<25
/* private functions */
static void *extend_heap(size_t size); // extend the heap
static void *find_fit(size_t size); // find a free block (first fit)
static void place(void *bp, size_t size); // place the request in a free block, splitting if possible
static void *coalesce(void *bp); // merge adjacent free blocks
static void *mm_realloc_coalesce(void *old_ptr, size_t new_size); // coalescing specialized for realloc
/* list operations */
static void insert_freelist(void *bp);
static void remove_freelist(void *bp);
static int isSegList(void *bp); // is bp the address of a segList head slot?
// check
static void mm_printblock(int verbose, const char* func);
/*
* mm_init - initialize the malloc package.
*/
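// Set up the prologue block, the epilogue block, and a 4-byte alignment pad before the prologue: four 4-byte words in total
int mm_init(void)
{
    for (int index = 0; index < 25; index++) {
        segList[index] = NULL;
    }
    if ((heap_listp = mem_sbrk(4*WSIZE)) == (void*)-1)
        return -1;
    PUT(heap_listp, 0); // alignment pad at the start of the heap, so that bp is 8-byte aligned
    PUT((char *)heap_listp + 1*WSIZE, PACK(8, 1)); // prologue header
    PUT((char *)heap_listp + 2*WSIZE, PACK(8, 1)); // prologue footer
    PUT((char *)heap_listp + 3*WSIZE, PACK(0, 1)); // epilogue header
    heap_listp = (char *)heap_listp + 2*WSIZE; // trick: heap_listp now points between the prologue header and footer
    if (extend_heap(CHUNKSIZE) == NULL) // extend the heap
        return -1;
    mm_printblock(VERBOSE, __func__);
    return 0;
}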
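// returns 1 if bp is the address of one of the segList head slots
static int isSegList(void *bp) {
    // segList has 25 entries, so the valid slots are segList[0] .. segList[24]
    if ((void **)bp >= segList && (void **)bp < segList + 25)
        return 1;
    return 0;
}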
// find the size class for bp and insert it at the head of that list
static void insert_freelist(void *bp) {
    size_t size;
    int index;
    size = GET_SIZE(HDRP(bp));
    for (index = 4; index < 25; index++) { // the smallest block is 16 B, so valid indices start at 4
        if ((1 << index) <= size && (1 << (index+1)) > size)
            break;
    }
    if (segList[index] == NULL) {
        PUTADDR(SUCC_POINT(bp), NULL);
        PUTADDR(PRED_POINT(bp), &segList[index]); // the predecessor of the first node is the head slot itself, so isSegList() can recognize it
        segList[index] = bp;
    } else {
        void *tmp;
        tmp = segList[index];
        PUTADDR(SUCC_POINT(bp), tmp);
        PUTADDR(PRED_POINT(bp), &segList[index]);
        segList[index] = bp;
        PUTADDR(PRED_POINT(tmp), bp);
    }
}
// remove the free block bp from its free list (used when coalescing and placing)
static void remove_freelist(void *bp) {
    void *pre_block, *post_block;
    pre_block = GETADDR(PRED_POINT(bp));
    post_block = GETADDR(SUCC_POINT(bp));
    // fix up the predecessor
    if (isSegList(pre_block)) { // the predecessor is a head slot
        PUTADDR(pre_block, post_block);
    } else {
        PUTADDR(SUCC_POINT(pre_block), post_block);
    }
    // fix up the successor
    if (post_block != NULL) {
        PUTADDR(PRED_POINT(post_block), pre_block);
    }
}
/*
 * mm_malloc - Allocate a block from the segregated free lists, extending
 * the heap when no free block fits.
 * Always allocate a block whose size is a multiple of the alignment.
 */
void *mm_malloc(size_t size)
{
    size_t asize; // adjusted size
    size_t extendsize; // amount to extend the heap by if nothing fits
    void *bp = NULL;
    if (size == 0) // invalid request
        return NULL;
    asize = ALIGN(size + 2*WSIZE);
    if ((bp = find_fit(asize)) != NULL) {
        place(bp, asize);
        mm_printblock(VERBOSE, __func__);
        return bp;
    }
    // no free block is large enough
    extendsize = MAX(asize, CHUNKSIZE);
    if ((bp = extend_heap(extendsize)) == NULL) {
        return NULL;
    }
    place(bp, asize);
    mm_printblock(VERBOSE, __func__);
    return bp;
}
static void *extend_heap(size_t size) {
    size_t asize;
    void *bp;
    asize = ALIGN(size);
    if ((long)(bp = mem_sbrk(asize)) == -1)
        return NULL;
    PUT(HDRP(bp), PACK(asize, 0)); // HDRP(bp) is the old epilogue header
    PUT(FTRP(bp), PACK(asize, 0));
    PUT(HDRP(NEXT_BLKP(bp)), PACK(0, 1)); // new epilogue header
    return coalesce(bp);
}
// placement policy: first fit over the segregated lists
static void *find_fit(size_t size) {
    for (int index = 4; index < 25; index++) {
        if (size < (1 << (index+1))) {
            unsigned int *curbp;
            for (curbp = segList[index]; curbp != NULL; curbp = GETADDR(SUCC_POINT(curbp))) {
                if (size <= GET_SIZE(HDRP(curbp))) {
                    return curbp;
                }
            }
        }
    }
    return NULL; // no fit
}
// place the request in the free block bp, splitting it if possible; bp must currently be on a free list
static void place(void *bp, size_t asize) { // mind the minimum-block limit (MINBLOCK == 24 B)
    size_t total_size = GET_SIZE(HDRP(bp));
    size_t remainder_size = total_size - asize;
    if (remainder_size >= MINBLOCK) {
        PUT(HDRP(bp), PACK(asize, 1));
        PUT(FTRP(bp), PACK(asize, 1));
        remove_freelist(bp);
        void *next_bp = NEXT_BLKP(bp);
        PUT(HDRP(next_bp), PACK(remainder_size, 0));
        PUT(FTRP(next_bp), PACK(remainder_size, 0));
        insert_freelist(next_bp);
    } else { // no block, allocated or free, may be smaller than the minimum block, so do not split
        PUT(HDRP(bp), PACK(total_size, 1));
        PUT(FTRP(bp), PACK(total_size, 1));
        remove_freelist(bp);
    }
}
/*
 * mm_free - Mark the block free and coalesce it with any free neighbors.
 */
void mm_free(void *ptr)
{
    size_t size = GET_SIZE(HDRP(ptr));
    PUT(HDRP(ptr), PACK(size, 0));
    PUT(FTRP(ptr), PACK(size, 0));
    coalesce(ptr);
    mm_printblock(VERBOSE, __func__);
}
/*
 * coalesce - merge bp with any free neighbors and put the result on a free list
 */
static void *coalesce(void *bp) {
    char *pre_block, *post_block;
    int pre_alloc = GET_ALLOC(HDRP(PREV_BLKP(bp)));
    int post_alloc = GET_ALLOC(HDRP(NEXT_BLKP(bp)));
    size_t size = GET_SIZE(HDRP(bp));
    if (pre_alloc && post_alloc) {
        insert_freelist(bp);
        return bp;
    } else if (!pre_alloc && post_alloc) { // merge with the previous block
        size += GET_SIZE(HDRP(PREV_BLKP(bp)));
        bp = PREV_BLKP(bp);
        remove_freelist(bp);
    } else if (pre_alloc && !post_alloc) { // merge with the next block
        size += GET_SIZE(HDRP(NEXT_BLKP(bp)));
        post_block = NEXT_BLKP(bp); // remember the next block
        remove_freelist(post_block);
    } else { // merge with both neighbors
        size += GET_SIZE(HDRP(PREV_BLKP(bp))) + GET_SIZE(FTRP(NEXT_BLKP(bp)));
        pre_block = PREV_BLKP(bp);
        post_block = NEXT_BLKP(bp);
        bp = PREV_BLKP(bp);
        remove_freelist(pre_block);
        remove_freelist(post_block);
    }
    PUT(HDRP(bp), PACK(size, 0));
    PUT(FTRP(bp), PACK(size, 0));
    insert_freelist(bp);
    return bp;
}
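/*
 * mm_realloc - Resize in place when possible (shrink, or grow into free
 * neighbors); otherwise allocate a new block and copy the payload.
 */
void *mm_realloc(void *ptr, size_t size)
{
    size_t old_size, new_size, extendsize;
    void *old_ptr, *new_ptr;
    if (ptr == NULL) {
        return mm_malloc(size);
    }
    if (size == 0) {
        mm_free(ptr);
        return NULL;
    }
    new_size = ALIGN(size + 2*WSIZE);
    old_size = GET_SIZE(HDRP(ptr));
    old_ptr = ptr;
    if (old_size >= new_size) {
        if (old_size - new_size >= MINBLOCK) { // split the block
            // old_ptr is allocated and on no free list, so split here instead of calling place(),
            // which would try to unlink it from a free list
            PUT(HDRP(old_ptr), PACK(new_size, 1));
            PUT(FTRP(old_ptr), PACK(new_size, 1));
            void *next_bp = NEXT_BLKP(old_ptr);
            PUT(HDRP(next_bp), PACK(old_size - new_size, 0));
            PUT(FTRP(next_bp), PACK(old_size - new_size, 0));
            coalesce(next_bp); // puts the remainder on a free list
        }
        // otherwise the remainder is smaller than the minimum block: do not split
        mm_printblock(VERBOSE, __func__);
        return old_ptr;
    } else { // grow into a neighbor, or move to a new block
        if ((new_ptr = mm_realloc_coalesce(old_ptr, new_size)) != NULL) { // merged with a neighbor and moved the data
            mm_printblock(VERBOSE, __func__);
            return new_ptr;
        }
        if ((new_ptr = find_fit(new_size)) == NULL) { // no fitting free block
            extendsize = MAX(new_size, CHUNKSIZE);
            if ((new_ptr = extend_heap(extendsize)) == NULL) // extend the heap
                return NULL;
        }
        // move the data to the non-adjacent block
        place(new_ptr, new_size);
        memcpy(new_ptr, old_ptr, old_size - 2*WSIZE);
        mm_free(old_ptr);
        mm_printblock(VERBOSE, __func__);
        return new_ptr;
    }
}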
// coalescing with the neighbors, specialized for realloc
static void *mm_realloc_coalesce(void *old_ptr, size_t new_size) {
    void *pre_block, *post_block, *new_ptr;
    int pre_alloc, post_alloc;
    size_t pre_size, post_size, old_size, total_size;
    pre_block = PREV_BLKP(old_ptr);
    post_block = NEXT_BLKP(old_ptr);
    pre_alloc = GET_ALLOC(HDRP(pre_block));
    post_alloc = GET_ALLOC(HDRP(post_block));
    pre_size = GET_SIZE(HDRP(pre_block));
    post_size = GET_SIZE(HDRP(post_block));
    old_size = GET_SIZE(HDRP(old_ptr));
    if (!pre_alloc && ((total_size = old_size + pre_size) >= new_size)) { // merge with the previous block
        new_ptr = pre_block;
        remove_freelist(pre_block);
    } else if (!post_alloc && ((total_size = old_size + post_size) >= new_size)){ // merge with the next block
        new_ptr = old_ptr;
        remove_freelist(post_block);
    } else if (!pre_alloc && !post_alloc && ((total_size = old_size + pre_size + post_size) >= new_size)){ // merge with both neighbors
        new_ptr = pre_block;
        remove_freelist(pre_block);
        remove_freelist(post_block);
    } else { // no merge can satisfy the request
        return NULL;
    }
    // when merging backwards the source and destination can overlap, so use memmove rather than memcpy
    memmove(new_ptr, old_ptr, old_size - 2*WSIZE);
    if (total_size - new_size >= MINBLOCK) {
        PUT(HDRP(new_ptr), PACK(new_size, 1));
        PUT(FTRP(new_ptr), PACK(new_size, 1));
        void *next_bp = NEXT_BLKP(new_ptr);
        PUT(HDRP(next_bp), PACK(total_size - new_size, 0));
        PUT(FTRP(next_bp), PACK(total_size - new_size, 0));
        coalesce(next_bp);
    } else {
        PUT(HDRP(new_ptr), PACK(total_size, 1));
        PUT(FTRP(new_ptr), PACK(total_size, 1));
    }
    return new_ptr;
}
static void mm_printblock(int verbose, const char* func) {
    if (!verbose) return;
    void *curbp;
    printf("\n=========================== $%s$ ===========================\n" ,func);
    printf("================ block ================\n");
    for (curbp = heap_listp; GET_SIZE(HDRP(curbp)) > 0; curbp = NEXT_BLKP(curbp)) {
        printf("address = %p\n", curbp);
        printf("size = %d, %d, alloc = %d, %d\n", GET_SIZE(HDRP(curbp)), GET_SIZE(FTRP(curbp)),
               GET_ALLOC(HDRP(curbp)), GET_ALLOC(FTRP(curbp)));
        printf("\n");
    }
    // epilogue block
    printf("address = %p\n", curbp);
    printf("size = %d, alloc = %d\n", GET_SIZE(HDRP(curbp)), GET_ALLOC(HDRP(curbp)));
    printf("================ block ================\n");
    printf("\n");
    printf("=============== freelist ===============\n");
    for (int index = 4; index < 25; index++) {
        if (segList[index] == NULL) continue;
        printf("segList[%d]: [%d,%d)\n", index, (1 << index), (1 << (index+1)));
        for (curbp = segList[index]; curbp != NULL; curbp = GETADDR(SUCC_POINT(curbp))) {
            printf("address = %p, size = %d,%d, alloc = %d,%d\n",
                   curbp, GET_SIZE(HDRP(curbp)), GET_SIZE(FTRP(curbp)), GET_ALLOC(HDRP(curbp)), GET_ALLOC(FTRP(curbp)));
        }
        printf("address = %p\n", curbp);
    }
    printf("=============== freelist ===============\n");
    printf("=========================== $%s$ ===========================\n" ,func);
}
=========================== $mm_malloc$ ===========================
================ block ================
address = 0xf69df018
size = 8, 8, alloc = 1, 1
address = 0xf69df020
size = 2104, 2104, alloc = 0, 0
address = 0xf69df858
size = 4080, 4080, alloc = 1, 1
address = 0xf69e0848
size = 4080, 4080, alloc = 1, 1
address = 0xf69e1838
size = 2024, 2024, alloc = 0, 0
address = 0xf69e2020
size = 0, alloc = 1
================ block ================
=============== freelist ===============
segList[10]: [1024,2048)
address = 0xf69e1838, size = 2024,2024, alloc = 0,0
address = (nil)
segList[11]: [2048,4096)
address = 0xf69df020, size = 2104,2104, alloc = 0,0
address = (nil)
=============== freelist ===============
=========================== $mm_malloc$ ===========================
# Segregated free lists + first fit
Results for mm malloc:
trace valid util ops secs Kops
0 yes 98% 5694 0.000287 19833
1 yes 97% 5848 0.000311 18780
2 yes 99% 6648 0.000322 20652
3 yes 99% 5380 0.000278 19339
4 yes 66% 14400 0.000528 27283
5 yes 93% 4800 0.000459 10462
6 yes 90% 4800 0.000377 12719
7 yes 55% 12000 0.000447 26864
8 yes 51% 24000 0.001032 23254
9 yes 45% 14401 0.023348 617
10 yes 45% 14401 0.001142 12609
Total 76% 112372 0.028531 3939
Perf index = 46 (util) + 40 (thru) = 86/100
// find the size class for bp and insert it in ascending size order, so that first fit over the segregated lists acts as best fit without searching the whole heap
static void insert_freelist(void *bp) {
    size_t size;
    int index;
    size = GET_SIZE(HDRP(bp));
    for (index = 4; index < 25; index++) { // the smallest block is 16 B, so valid indices start at 4
        if ((1 << index) <= size && (1 << (index+1)) > size)
            break;
    }
    void *pre_block, *post_block, *tmp;
    tmp = segList + index;
    for (post_block = segList[index]; post_block != NULL; post_block = GETADDR(SUCC_POINT(post_block))) {
        if (GET_SIZE(HDRP(post_block)) >= size) {
            pre_block = GETADDR(PRED_POINT(post_block));
            // link bp to its predecessor and successor
            PUTADDR(PRED_POINT(bp), pre_block);
            PUTADDR(SUCC_POINT(bp), post_block);
            // predecessor
            if (isSegList(pre_block)) {
                PUTADDR(pre_block, bp);
            } else {
                PUTADDR(SUCC_POINT(pre_block), bp);
            }
            // successor
            PUTADDR(PRED_POINT(post_block), bp);
            return;
        }
        tmp = post_block; // remember the last node in case bp has to go at the tail
    }
    // the predecessor is the last node visited (or the head slot if the list is empty)
    pre_block = tmp;
    // link bp to its predecessor; it has no successor
    PUTADDR(PRED_POINT(bp), pre_block);
    PUTADDR(SUCC_POINT(bp), NULL);
    // predecessor
    if (isSegList(pre_block)) {
        PUTADDR(pre_block, bp);
    } else {
        PUTADDR(SUCC_POINT(pre_block), bp);
    }
}
# Segregated free lists + size-ordered insertion
Results for mm malloc:
trace valid util ops secs Kops
0 yes 99% 5694 0.000171 33240
1 yes 99% 5848 0.000147 39674
2 yes 99% 6648 0.000192 34661
3 yes 99% 5380 0.000138 38986
4 yes 66% 14400 0.000226 63717
5 yes 96% 4800 0.000343 13978
6 yes 95% 4800 0.000341 14068
7 yes 55% 12000 0.000345 34732
8 yes 51% 24000 0.000608 39487
9 yes 40% 14401 0.022937 628
10 yes 45% 14401 0.000986 14611
Total 77% 112372 0.026435 4251
Perf index = 46 (util) + 40 (thru) = 86/100
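That is the same score as segregated lists + first fit. Total utilization only moved from 76% to 77%, which is not enough to change the 46-point utilization score...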
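The claim behind size-ordered insertion can be checked in isolation: if the blocks of a list are sorted by size in ascending order, the first block that fits is also the smallest block that fits. The sketch below models a free list as a plain array of block sizes, which is a simplification for illustration only; `first_fit` and `best_fit` are hypothetical helpers, not functions from mm.c.

```c
#include <assert.h>
#include <stddef.h>

/* First fit: index of the first block that can hold `need`, or -1. */
static int first_fit(const size_t *sizes, int n, size_t need)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        if (sizes[i] >= need)
            return i;
    return -1;
}

/* Best fit: index of the smallest block that can hold `need`, or -1.
 * Ties go to the earliest such block. */
static int best_fit(const size_t *sizes, int n, size_t need)
{
    assert(n >= 0);
    int best = -1;
    for (int i = 0; i < n; i++)
        if (sizes[i] >= need && (best < 0 || sizes[i] < sizes[best]))
            best = i;
    return best;
}
```

On a sorted list such as {24, 40, 40, 64, 128} both functions pick the same block for any request, while on an unsorted list such as {128, 24, 64, 40} a 33-byte request gets the 128-byte block from first fit and the 40-byte block from best fit.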
Results for mm malloc:
trace valid util ops secs Kops
0 yes 99% 5694 0.003636 1566
1 yes 99% 5848 0.003695 1583
2 yes 99% 6648 0.005280 1259
3 yes 100% 5380 0.003648 1475
4 yes 66% 14400 0.000089 161254
5 yes 92% 4800 0.004965 967
6 yes 92% 4800 0.004495 1068
7 yes 55% 12000 0.086989 138
8 yes 51% 24000 0.126514 190
9 yes 27% 14401 0.028150 512
10 yes 30% 14401 0.000831 17330
Total 74% 112372 0.268292 419
Perf index = 44 (util) + 28 (thru) = 72/100
Results for mm malloc:
trace valid util ops secs Kops
0 yes 90% 5694 0.001078 5280
1 yes 91% 5848 0.000686 8526
2 yes 95% 6648 0.001864 3567
3 yes 96% 5380 0.001754 3067
4 yes 66% 14400 0.000094 153518
5 yes 91% 4800 0.002902 1654
6 yes 89% 4800 0.002691 1783
7 yes 55% 12000 0.009888 1214
8 yes 51% 24000 0.003605 6658
9 yes 27% 14401 0.028398 507
10 yes 45% 14401 0.000729 19754
Total 72% 112372 0.053688 2093
Perf index = 43 (util) + 40 (thru) = 83/100
# Explicit free list + LIFO
Results for mm malloc:
trace valid util ops secs Kops
0 yes 93% 5694 0.000093 60964
1 yes 94% 5848 0.000088 66455
2 yes 96% 6648 0.000150 44290
3 yes 97% 5380 0.000158 34115
4 yes 66% 14400 0.000131 109756
5 yes 89% 4800 0.000306 15686
6 yes 85% 4800 0.000360 13315
7 yes 55% 12000 0.001054 11385
8 yes 51% 24000 0.001897 12652
9 yes 26% 14401 0.032686 441
10 yes 30% 14401 0.000994 14486
Total 71% 112372 0.037918 2964
Perf index = 43 (util) + 40 (thru) = 83/100
# Explicit free list + address-ordered list
Results for mm malloc:
trace valid util ops secs Kops
0 yes 99% 5694 0.000137 41471
1 yes 99% 5848 0.000123 47506
2 yes 99% 6648 0.000168 39501
3 yes 99% 5380 0.000136 39646
4 yes 66% 14400 0.000129 111369
5 yes 92% 4800 0.001686 2846
6 yes 92% 4800 0.001677 2863
7 yes 55% 12000 0.012042 996
8 yes 51% 24000 0.071326 336
9 yes 27% 14401 0.029718 485
10 yes 30% 14401 0.000880 16361
Total 74% 112372 0.118024 952
Perf index = 44 (util) + 40 (thru) = 84/100
# Segregated free lists + first fit
Results for mm malloc:
trace valid util ops secs Kops
0 yes 98% 5694 0.000287 19833
1 yes 97% 5848 0.000311 18780
2 yes 99% 6648 0.000322 20652
3 yes 99% 5380 0.000278 19339
4 yes 66% 14400 0.000528 27283
5 yes 93% 4800 0.000459 10462
6 yes 90% 4800 0.000377 12719
7 yes 55% 12000 0.000447 26864
8 yes 51% 24000 0.001032 23254
9 yes 45% 14401 0.023348 617
10 yes 45% 14401 0.001142 12609
Total 76% 112372 0.028531 3939
Perf index = 46 (util) + 40 (thru) = 86/100
# Segregated free lists + size-ordered insertion
Results for mm malloc:
trace valid util ops secs Kops
0 yes 99% 5694 0.000171 33240
1 yes 99% 5848 0.000147 39674
2 yes 99% 6648 0.000192 34661
3 yes 99% 5380 0.000138 38986
4 yes 66% 14400 0.000226 63717
5 yes 96% 4800 0.000343 13978
6 yes 95% 4800 0.000341 14068
7 yes 55% 12000 0.000345 34732
8 yes 51% 24000 0.000608 39487
9 yes 40% 14401 0.022937 628
10 yes 45% 14401 0.000986 14611
Total 77% 112372 0.026435 4251
Perf index = 46 (util) + 40 (thru) = 86/100
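As the results show, full marks for throughput are easy to get; memory utilization is what really decides the score.
But even with all of these techniques applied, I still cannot figure out why memory utilization did not improve.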
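Part of the answer may simply be how the score is computed. As a rough sketch, assume the driver weights utilization at 60% and throughput at 40%, with throughput capped at a reference value of 600 Kops. Both constants are assumptions on my part, although they reproduce every "Perf index" line above, including the 28 throughput points at 419 Kops. The function names are hypothetical, and the real driver works from unrounded averages, so borderline cases can differ.

```c
#include <assert.h>

#define UTIL_WEIGHT 0.60  /* assumed weight of memory utilization     */
#define REF_KOPS    600.0 /* assumed reference throughput (Kops/sec)  */

/* Utilization points: util is a fraction in [0, 1]. */
static int util_points(double util)
{
    assert(util >= 0.0 && util <= 1.0);
    return (int)(UTIL_WEIGHT * util * 100.0 + 0.5); /* round to nearest */
}

/* Throughput points: anything at or above the reference earns the cap. */
static int thru_points(double kops)
{
    assert(kops >= 0.0);
    double ratio = kops >= REF_KOPS ? 1.0 : kops / REF_KOPS;
    return (int)((1.0 - UTIL_WEIGHT) * ratio * 100.0 + 0.5);
}
```

Under these assumptions one percentage point of utilization is worth only 0.6 points, so 76% and 77% both round to 46, and any throughput above the reference value earns the same 40 points no matter how much faster the allocator gets.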