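CSAPP (CMU 15-213): Lab 6 Malloc Lab Explained

# Preface

This series records how I worked through the CSAPP labs, in the hope that it helps those who come after. Corrections are welcome!


Tip: the point of this lab is to experience how an application uses and manages virtual memory, by writing a dynamic storage allocator.
Goal: try different approaches to strike a trade-off between memory utilization and throughput, optimizing step by step.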

Handout

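Requirements

Modify mm.c, which contains four functions:

  • int mm_init(void); initializes the heap. Returns -1 on error and 0 otherwise.
  • void *mm_malloc(size_t size); returns a pointer to a payload of at least size bytes; blocks are kept 8-byte aligned.
  • void mm_free(void *ptr); frees the block that ptr points to.
  • void *mm_realloc(void *ptr, size_t size); tries to resize the block that ptr points to, where ptr was returned by an earlier call to mm_malloc or mm_realloc; size is the new size of the block.
    • Details:
    • If ptr is NULL, a new block is allocated; the call is equivalent to mm_malloc(size).
    • If size is 0, the block is freed; the call is equivalent to mm_free(ptr).
    • The new block may or may not have the same address as the old one, depending on the splitting policy, the requested size, and the internal fragmentation of the old block.
    • The contents of the new block match those of the old block up to the smaller of the old and new sizes.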

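Heap Consistency Checker

A heap consistency checker helps verify the heap's invariants and serves as a debugging tool.

What it can check

  • Is every block in the free list marked as free?
  • Are there any contiguous free blocks that escaped coalescing?
  • Is every free block actually in the free list?
  • Do the pointers in the free list point to valid free blocks?
  • Do any allocated blocks overlap?
  • Do the pointers in a heap block point to valid heap addresses?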
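Two of the questions above can be answered by a single walk over the heap. The following is only a minimal sketch, assuming the 4-byte header/footer layout from the textbook; check_heap is a name I made up, and it covers just "header matches footer" and "no two adjacent free blocks".

```c
#include <assert.h>

#define WSIZE 4
#define DSIZE 8
#define GET(p)         (*(unsigned int *)(p))
#define GET_SIZE(p)    (GET(p) & ~0x7u)
#define GET_ALLOC(p)   (GET(p) & 0x1u)
#define HDRP(bp)       ((char *)(bp) - WSIZE)
#define FTRP(bp)       ((char *)(bp) + GET_SIZE(HDRP(bp)) - DSIZE)
#define NEXT_BLKP(bp)  ((char *)(bp) + GET_SIZE(HDRP(bp)))

/* Hypothetical checker: walk from first_bp (a payload pointer) until a
 * size-0 header (the epilogue) and count invariant violations. */
static int check_heap(char *first_bp)
{
    int errors = 0;
    int prev_free = 0;
    for (char *bp = first_bp; GET_SIZE(HDRP(bp)) > 0; bp = NEXT_BLKP(bp)) {
        if (GET(HDRP(bp)) != GET(FTRP(bp)))
            errors++;                    /* header and footer disagree */
        int is_free = !GET_ALLOC(HDRP(bp));
        if (prev_free && is_free)
            errors++;                    /* two free neighbours escaped coalescing */
        prev_free = is_free;
    }
    return errors;
}
```

Calling such a function at the end of mm_malloc and mm_free while debugging tends to pinpoint the first operation that corrupts the heap.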

Support Routines

memlib.c

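  • void *mem_sbrk(int incr); expands the heap by incr bytes, where incr is a positive integer, and returns the address of the first byte of the newly allocated area.

  • void *mem_heap_lo(void); returns a generic pointer (void *) to the first byte of the heap.

  • void *mem_heap_hi(void); returns a generic pointer (void *) to the last byte of the heap.

  • size_t mem_heapsize(void); returns the current size of the heap in bytes.

  • size_t mem_pagesize(void); returns the system's page size in bytes (4 KB on Linux).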
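To get a feel for what these routines do, here is a toy model of a heap that only ever grows. It is a sketch under my own assumptions: the toy_ names and the 64 KB cap are invented for illustration and are not the lab's memlib.c.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_HEAP (1 << 16)        /* hypothetical cap, not the lab's value */

static char  toy_heap[TOY_MAX_HEAP];  /* backing store for the model heap */
static char *toy_brk = toy_heap;      /* one past the last byte in use */

/* Grow the model heap by incr bytes; return the old break, or (void *)-1
 * when the request is negative or does not fit. */
static void *toy_sbrk(int incr)
{
    char *old_brk = toy_brk;
    if (incr < 0 || incr > TOY_MAX_HEAP - (toy_brk - toy_heap))
        return (void *)-1;
    toy_brk += incr;
    return old_brk;
}

static void  *toy_heap_lo(void)  { return toy_heap; }
/* Only meaningful once the heap is non-empty. */
static void  *toy_heap_hi(void)  { return toy_brk - 1; }
static size_t toy_heapsize(void) { return (size_t)(toy_brk - toy_heap); }
```

The important detail is that the returned pointer is the old break, which is exactly where the allocator's previous epilogue header ends.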

Programming Rules

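  • Do not change any of the function interfaces in mm.c.
  • Do not call any memory-management library functions or system calls (malloc, calloc, free, realloc, sbrk, brk, and so on).
  • Do not define any global or static compound data structures in mm.c (arrays, structs, trees, lists, and so on).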

Hints

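  • Start by testing with the small trace files (e.g. short1-bal.rep and short2-bal.rep).

  • unix> mdriver -V -f short1-bal.rep

  • Understand every line of the implicit-free-list implementation in the textbook.

  • Encapsulate pointer arithmetic in C preprocessor macros; this cuts down the complexity of the pointer manipulation.

  • Use a profiler (e.g. gprof).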

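Preparation

Trace files

While reading the official handout I found that the provided files contain hardly any traces, so I went looking for the rest. You can download them from the link below and put them in the lab directory.

Lab 6 Malloc Lab: full set of 12 traces, GitHub download link

Then change the following definition in config.h to your own path.

#define TRACEDIR "/traces/"

Run the test program with ./mdriver -V

The output looks like this: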

Team Name:ateam
Member 1 :Harry Bovik:[email protected]
Using default tracefiles in /mnt/hgfs/CMU15-213/lab/6.malloclab-handout/traces/
Measuring performance with gettimeofday().

Testing mm malloc
Reading tracefile: amptjp-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: cccp-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: cp-decl-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: expr-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: coalescing-bal.rep
ERROR: mem_sbrk failed. Ran out of memory...
Checking mm_malloc for correctness, ERROR [trace 4, line 7673]: mm_malloc failed.
Reading tracefile: random-bal.rep
ERROR: mem_sbrk failed. Ran out of memory...
Checking mm_malloc for correctness, ERROR [trace 5, line 1662]: mm_malloc failed.
Reading tracefile: random2-bal.rep
ERROR: mem_sbrk failed. Ran out of memory...
Checking mm_malloc for correctness, ERROR [trace 6, line 1780]: mm_malloc failed.
Reading tracefile: binary-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: binary2-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: realloc-bal.rep
ERROR: mem_sbrk failed. Ran out of memory...
Checking mm_malloc for correctness, ERROR [trace 9, line 1705]: mm_realloc failed.
Reading tracefile: realloc2-bal.rep
ERROR: mem_sbrk failed. Ran out of memory...
Checking mm_malloc for correctness, ERROR [trace 10, line 6562]: mm_realloc failed.

Results for mm malloc:
trace  valid  util     ops      secs    Kops
 0       yes   23%    5694  0.000023  246494
 1       yes   19%    5848  0.000020  293869
 2       yes   30%    6648  0.000024  272459
 3       yes   40%    5380  0.000031  176393
 4        no     -       -         -       -
 5        no     -       -         -       -
 6        no     -       -         -       -
 7       yes   55%   12000  0.000041  290557
 8       yes   51%   24000  0.000079  303413
 9        no     -       -         -       -
10        no     -       -         -       -
Total            -       -         -       -

Terminated with 5 errors

With that, all the preparation is done.

Debugging with GDB

In the Makefile, change

CC = gcc 
CFLAGS = -Wall -O2 -m32

to

CC = gcc -g                  # add debugging information
CFLAGS = -Wall -O0 -m32      # lower the optimization level to make debugging easier

Implementation

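Naive method

Idea

The simplest allocator organizes the heap as one large byte array with a pointer p that initially points to the first byte of the array. To allocate size bytes, malloc saves the current value of p on the stack, increments p by size, and returns the old value of p to the caller. free simply returns to the caller without doing anything.

Pros and cons:

  • Throughput is excellent (malloc and free execute only a handful of instructions).
  • Memory utilization is extremely poor (the allocator never reuses any block).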

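Function design

  • int mm_init(void);
    • Nothing to do.
  • void *mm_malloc(size_t size);
    • Grow the heap and put the new block at its end: a header holding the payload size, followed by the size-byte payload (keep 8-byte alignment in mind).
  • void mm_free(void *ptr);
    • Does not release the block.
  • void *mm_realloc(void *ptr, size_t size);
    • Simply allocates a new block; the old block is not managed.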
Code

/*
 * mm-naive.c - The fastest, least memory-efficient malloc package.
 * 
 * In this naive approach, a block is allocated by simply incrementing
 * the brk pointer.  A block is pure payload. There are no headers or
 * footers.  Blocks are never coalesced or reused. Realloc is
 * implemented directly using mm_malloc and mm_free.
 *
 * NOTE TO STUDENTS: Replace this header comment with your own header
 * comment that gives a high level description of your solution.
 */
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <unistd.h>
#include <string.h>

#include "mm.h"
#include "memlib.h"

/*********************************************************
 * NOTE TO STUDENTS: Before you do anything else, please
 * provide your team information in the following struct.
 ********************************************************/
team_t team = {
    /* Team name */
    "ateam",
    /* First member's full name */
    "Harry Bovik",
    /* First member's email address */
    "[email protected]",
    /* Second member's full name (leave blank if none) */
    "",
    /* Second member's email address (leave blank if none) */
    ""
};

/* single word (4) or double word (8) alignment */
#define ALIGNMENT 8

/* rounds up to the nearest multiple of ALIGNMENT */
#define ALIGN(size) (((size) + (ALIGNMENT-1)) & ~0x7)


/* The size field is padded to a multiple of 8 as well, because it is the
 * payload (which follows the size field at the start of the block) that
 * must end up 8-byte aligned, not the start of the block. */
#define SIZE_T_SIZE (ALIGN(sizeof(size_t)))

/* 
 * mm_init - initialize the malloc package.
 */
int mm_init(void)
{
    return 0;
}

/* 
 * mm_malloc - Allocate a block by incrementing the brk pointer.
 *     Always allocate a block whose size is a multiple of the alignment.
 */
void *mm_malloc(size_t size)
{
    int newsize = ALIGN(size + SIZE_T_SIZE);
    void *p = mem_sbrk(newsize);  //p points to the first byte of the new block
    if (p == (void *)-1)          //the heap could not be extended
        return NULL;
    else {
        *(size_t *)p = size;      //record the payload size in the header
        return (void *)((char *)p + SIZE_T_SIZE);
    }
}

/*
 * mm_free - Freeing a block does nothing.
 */
void mm_free(void *ptr)
{
}

/*
 * mm_realloc - Implemented simply in terms of mm_malloc and mm_free
 */
//No reuse at all: always carve a new block off the end of the heap
void *mm_realloc(void *ptr, size_t size)
{
    void *oldptr = ptr;
    void *newptr;
    size_t copySize;    //payload size of the old block
    
    newptr = mm_malloc(size);
    if (newptr == NULL)
      return NULL;
    copySize = *(size_t *)((char *)oldptr - SIZE_T_SIZE);
    if (size < copySize)
      copySize = size;
    memcpy(newptr, oldptr, copySize);   //copy copySize bytes into the new block
    mm_free(oldptr);
    return newptr;
}

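Implicit free list

Idea

  • Free-block organization: implicit free list
  • Placement: first fit / next fit
  • Splitting: allocate from the front of the free block and leave the remainder free.
  • Coalescing: immediate coalescing

[Figure 1 from the original post]
[Figure 2 from the original post]

Pros and cons:

  • Allocation cost: linear in the total number of blocks
  • Free cost: constant time, coalescing included
  • Memory cost: depends on the placement policy: first fit / next fit / best fit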
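Why allocation is linear becomes obvious once you see the traversal. Below is a minimal sketch on a 64-byte toy heap, using macros in the textbook's style; demo_heap, demo_init and demo_first_fit are names I invented for this illustration.

```c
#include <assert.h>

#define WSIZE 4
#define DSIZE 8
#define PACK(size, alloc)  ((size) | (alloc))
#define GET(p)             (*(unsigned int *)(p))
#define PUT(p, val)        (*(unsigned int *)(p) = (val))
#define GET_SIZE(p)        (GET(p) & ~0x7u)
#define GET_ALLOC(p)       (GET(p) & 0x1u)
#define HDRP(bp)           ((char *)(bp) - WSIZE)
#define FTRP(bp)           ((char *)(bp) + GET_SIZE(HDRP(bp)) - DSIZE)
#define NEXT_BLKP(bp)      ((char *)(bp) + GET_SIZE(HDRP(bp)))

static unsigned int demo_heap[16];   /* 64 bytes of toy heap */

/* Layout: padding | prologue (8, alloc) | 16 bytes alloc | 32 bytes free | epilogue.
 * Returns the prologue's payload pointer, like heap_listp. */
static char *demo_init(void)
{
    char *base = (char *)demo_heap;
    PUT(base, 0);                            /* alignment padding */
    PUT(base + 1*WSIZE, PACK(8, 1));         /* prologue header */
    PUT(base + 2*WSIZE, PACK(8, 1));         /* prologue footer */
    char *bp = base + 4*WSIZE;               /* payload of the first real block */
    PUT(HDRP(bp), PACK(16, 1));
    PUT(FTRP(bp), PACK(16, 1));
    bp = NEXT_BLKP(bp);
    PUT(HDRP(bp), PACK(32, 0));
    PUT(FTRP(bp), PACK(32, 0));
    PUT(HDRP(NEXT_BLKP(bp)), PACK(0, 1));    /* epilogue header */
    return base + 2*WSIZE;
}

/* First fit: visit every block, allocated or not, until one is free and big enough. */
static char *demo_first_fit(char *heap_listp, unsigned int asize)
{
    for (char *bp = heap_listp; GET_SIZE(HDRP(bp)) > 0; bp = NEXT_BLKP(bp))
        if (!GET_ALLOC(HDRP(bp)) && GET_SIZE(HDRP(bp)) >= asize)
            return bp;
    return 0;
}
```

The search has to step over the prologue and the allocated block before it reaches the free one, which is exactly the cost the explicit free list later removes.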

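Function design

  • int mm_init(void);

    • Initialize the alignment padding word, the prologue block and the epilogue block, then extend the heap.
  • void *mm_malloc(size_t size);

    • Find a suitable free block and allocate it, calling the helper functions below along the way.
  • void mm_free(void *ptr);

    • Clear the block's allocated bit and coalesce.
  • void *mm_realloc(void *ptr, size_t size);

    • Resize the block as requested.
  • static void *extend_heap(size_t size);

    • Extend the heap and return a pointer to the free block at its end, after coalescing.
  • static void *find_fit(size_t size);

    • First-fit search.
  • static void place(char *bp, size_t size);

    • Place the data in a free block, splitting it into an allocated block and a free block; mind the minimum block size.
  • static void *coalesce(void *bp);

    • Coalesce free blocks (four cases) and return a pointer to the merged free block.
  • static void mm_printblock(int verbose, const char* func);

    • Print the layout of the whole heap, handy for debugging.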

Code and benchmark

Code: mm.c

/*
 * mm.c - Implicit free list allocator.
 *
 * Every block carries a 4-byte header and a 4-byte footer holding its size
 * and allocated bit. Free blocks are found with a first-fit scan over the
 * whole heap, split when the remainder is large enough, and coalesced
 * immediately when freed.
 */
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <unistd.h>
#include <string.h>

#include "mm.h"
#include "memlib.h"

/*********************************************************
 * NOTE TO STUDENTS: Before you do anything else, please
 * provide your team information in the following struct.
 ********************************************************/
team_t team = {
    /* Team name */
    "ateam",
    /* First member's full name */
    "Harry Bovik",
    /* First member's email address */
    "[email protected]",
    /* Second member's full name (leave blank if none) */
    "",
    /* Second member's email address (leave blank if none) */
    ""
};

#ifdef DEBUG
#define VERBOSE 1
#else
#define VERBOSE 0
#endif



/* single word (4) or double word (8) alignment */
#define ALIGNMENT 8

/* rounds up to the nearest multiple of ALIGNMENT */
#define ALIGN(size) (((size) + (ALIGNMENT-1)) & ~0x7)
#define SIZE_T_SIZE (ALIGN(sizeof(size_t)))

/* Helper macros for constants and pointer arithmetic */
#define WSIZE       4        //word size; also the header/footer size (bytes)
#define DSIZE       8        //double-word size (bytes)
#define CHUNKSIZE  (1<<12)   //default amount to extend the heap by
#define MINBLOCK (DSIZE + 2*WSIZE)

#define MAX(x, y)  ((x) > (y) ? (x) : (y))

#define PACK(size, alloc)  ((size) | (alloc))         //pack size and allocated bit into one word

#define GET(p)        (*(unsigned int *)(p))          //read a word at address p
#define PUT(p, val)   (*(unsigned int *)(p) = (val))  //write a word at address p

#define GET_SIZE(p)   (GET(p) & ~0x07)    //size field of the word at address p
#define GET_ALLOC(p)  (GET(p) & 0x1)      //allocated bit of the word at address p
//bp (block pointer) points to the first byte of a block's payload
#define HDRP(bp)     ((char*)(bp) - WSIZE)                       //address of the header
#define FTRP(bp)     ((char*)(bp) + GET_SIZE(HDRP(bp)) - DSIZE)  //address of the footer; depends on HDRP

#define NEXT_BLKP(bp)    ((char*)(bp) + GET_SIZE((char*)(bp) - WSIZE))  //payload pointer of the next block
#define PREV_BLKP(bp)    ((char*)(bp) - GET_SIZE((char*)(bp) - DSIZE))  //payload pointer of the previous block

static void* heap_listp;    //points to the prologue block
/* private functions */
static void *extend_heap(size_t size);     //extend the heap
static void *find_fit(size_t size);        //find a free block
static void place(char *bp, size_t size);  //place a request and split the free block
static void *coalesce(void *bp);           //coalesce free blocks
//check
/*
static void mm_check(int verbose, const char* func);                 //heap consistency checker
static void mm_checkblock(int verbose, const char* func, void *bp);
static int mm_checkheap(int verbose, const char* func);
*/
static void mm_printblock(int verbose, const char* func);

/* 
 * mm_init - initialize the malloc package.
 */
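//Set up the alignment padding word (4 B) in front of the prologue, the prologue block and the epilogue block: four 4-byte words in total
int mm_init(void)
{
    if ((heap_listp = mem_sbrk(4*WSIZE)) == (void*)-1) 
        return -1;
    PUT(heap_listp, 0);                     //alignment padding at the start of the heap, so that bp is 8-byte aligned
    PUT(heap_listp + 1*WSIZE, PACK(8, 1));  //prologue header
    PUT(heap_listp + 2*WSIZE, PACK(8, 1));  //prologue footer
    PUT(heap_listp + 3*WSIZE, PACK(0, 1));  //epilogue header
    heap_listp += (2*WSIZE);     //small trick: point heap_listp between the prologue header and footer

    if (extend_heap(CHUNKSIZE) == NULL)   //extend the heap
        return -1;
    mm_printblock(VERBOSE, __func__);
    return 0;
}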

static void *extend_heap(size_t size) {
    size_t asize;   
    void *bp;

    asize = ALIGN(size);
     //printf("extend %d\n", asize);
    if ((long)(bp = mem_sbrk(asize)) == -1)
        return NULL;

    PUT(HDRP(bp), PACK(asize, 0));          //HDRP(bp) is the old epilogue header
    PUT(FTRP(bp), PACK(asize, 0));          
    PUT(HDRP(NEXT_BLKP(bp)), PACK(0, 1));   //new epilogue header
    return coalesce(bp);
}

/* 
 * mm_malloc - Allocate a block of at least size bytes from the free list,
 *     extending the heap when no free block fits.
 */
void *mm_malloc(size_t size)
{
    size_t asize;     //adjusted size
    size_t extendsize;  //amount to extend the heap by when nothing fits
    void *bp = NULL;

    if (size == 0)    //ignore spurious requests
        return NULL;

    asize = ALIGN(size + 2*WSIZE);
    
    if ((bp = find_fit(asize)) != NULL) {
        place((char *)bp, asize);
        mm_printblock(VERBOSE, __func__);
        return bp;
    }
    
    //no free block is large enough, so extend the heap
    extendsize = MAX(asize, CHUNKSIZE);
    if ((bp = extend_heap(extendsize)) == NULL) {
        return NULL;
    }    
    place(bp, asize);
    mm_printblock(VERBOSE, __func__);
    return bp;
}

//Placement policy: first-fit search
static void *find_fit(size_t size) {         
    void *curbp;
    for (curbp = heap_listp; GET_SIZE(HDRP(curbp)) > 0; curbp = NEXT_BLKP(curbp)) {
        if (!GET_ALLOC(HDRP(curbp)) && (GET_SIZE(HDRP(curbp)) >= size)) return curbp;
    }
    return NULL;    //no fit
} 

//Place the request in a free block, splitting it when possible
static void place(char *bp, size_t asize) {     //mind the minimum block size (16 B == DSIZE + 2*WSIZE == MINBLOCK)
    size_t total_size = GET_SIZE(HDRP(bp));
    size_t remainder_size = total_size - asize;

    if (remainder_size >= MINBLOCK) {
        PUT(HDRP(bp), PACK(asize, 1));
        PUT(FTRP(bp), PACK(asize, 1));
        bp = NEXT_BLKP(bp);
        PUT(HDRP(bp), PACK(remainder_size, 0));
        PUT(FTRP(bp), PACK(remainder_size, 0));
    } else {          //the remainder would be smaller than the minimum block, so do not split
        PUT(HDRP(bp), PACK(total_size, 1));
        PUT(FTRP(bp), PACK(total_size, 1));
    }
}

/*
 * mm_free - Clear the block's allocated bit and coalesce it with free neighbors.
 */
void mm_free(void *ptr)
{
    size_t size = GET_SIZE(HDRP(ptr));
    PUT(HDRP(ptr), PACK(size, 0));
    PUT(FTRP(ptr), PACK(size, 0));
    coalesce(ptr);
    mm_printblock(VERBOSE, __func__);
}
/*
* coalesce - merge adjacent free blocks
*/
static void *coalesce(void *bp) {          
    int pre_alloc = GET_ALLOC(HDRP(PREV_BLKP(bp)));
    int post_alloc = GET_ALLOC(HDRP(NEXT_BLKP(bp)));
    size_t size = GET_SIZE(HDRP(bp));

    if (pre_alloc && post_alloc) {
        return bp;
    } else if (pre_alloc && !post_alloc) {   //merge with the next block
        size += GET_SIZE(HDRP(NEXT_BLKP(bp)));
    } else if (!pre_alloc && post_alloc) {   //merge with the previous block
        size += GET_SIZE(HDRP(PREV_BLKP(bp)));
        bp = PREV_BLKP(bp);
    } else {  //merge with both neighbors
        size += GET_SIZE(HDRP(PREV_BLKP(bp))) + GET_SIZE(FTRP(NEXT_BLKP(bp)));
        bp = PREV_BLKP(bp);
    }
    PUT(HDRP(bp), PACK(size, 0));
    PUT(FTRP(bp), PACK(size, 0));     //FTRP() reads the size from the header, so it already sees the new size here

    return bp;
}

/*
 * mm_realloc - Implemented simply in terms of mm_malloc and mm_free
 */
void *mm_realloc(void *ptr, size_t size)
{
    size_t old_size, new_size, extendsize;
    void *old_ptr, *new_ptr;

    if (ptr == NULL) {
        return mm_malloc(size);
    }
    if (size == 0) {
        mm_free(ptr);
        return NULL;
    }

    new_size = ALIGN(size + 2*WSIZE);
    old_size = GET_SIZE(HDRP(ptr));
    old_ptr = ptr;
    if (old_size >= new_size) {
        if (old_size - new_size >= MINBLOCK) {  //split the block
            place(old_ptr, new_size);
            mm_printblock(VERBOSE, __func__);
            return old_ptr;
        } else {   //the remainder is smaller than the minimum block, so do not split
            mm_printblock(VERBOSE, __func__);
            return old_ptr;
        }
    } else {  //find a new block, copy the payload, then free the old block
        if ((new_ptr = find_fit(new_size)) == NULL) {  //no suitable free block
            extendsize = MAX(new_size, CHUNKSIZE);
            if ((new_ptr = extend_heap(extendsize)) == NULL)   //extend the heap
                return NULL;
        }
        place(new_ptr, new_size);
        memcpy(new_ptr, old_ptr, old_size - 2*WSIZE);
        mm_free(old_ptr);
        mm_printblock(VERBOSE, __func__);
        return new_ptr;
    }
}

static void mm_printblock(int verbose, const char* func) {
    if (!verbose) return;
    char *curbp;
    printf("\n=========================== %s ===========================\n" ,func);
    for (curbp = heap_listp; GET_SIZE(HDRP(curbp)) > 0; curbp = NEXT_BLKP(curbp)) {
        printf("address = %p\n", curbp);
        printf("hsize = %d, fsize = %d\n", GET_SIZE(HDRP(curbp)), GET_SIZE(FTRP(curbp)));
        printf("halloc = %d, falloc = %d\n", GET_ALLOC(HDRP(curbp)), GET_ALLOC(FTRP(curbp)));
        printf("\n");
    }
    //epilogue blocks
    printf("address = %p\n", curbp);
    printf("hsize = %d\n", GET_SIZE(HDRP(curbp)));
    printf("halloc = %d\n", GET_ALLOC(HDRP(curbp)));
    printf("=========================== %s ===========================\n" ,func);
}

/*
static void mm_check(int verbose, const char* func) {
    if (!verbose)  return;
    if (mm_checkheap(verbose, func)) {
        void *curbp;
        for (curbp = heap_listp; GET_SIZE(HDRP(curbp)) > 0; curbp = NEXT_BLKP(curbp)) {
            mm_checkblock(verbose, func, curbp);
            }
        }
}

static void mm_checkblock(int verbose, const char* func, void* bp) {
    if (!verbose) return;
    if (GET(HDRP(bp)) != GET(FTRP(bp))) {
        printf("\n=========================== %s ===========================\n" ,func);
        printf("Error: %p's Header and footer are not match.\n", bp);
        printf("hsize = %d, fsize = %d\n", GET_SIZE(HDRP(bp)), GET_SIZE(FTRP(bp)));
        printf("halloc = %d, falloc = %d\n", GET_ALLOC(HDRP(bp)), GET_ALLOC(FTRP(bp)));
        printf("next_head_alloc = %d, next_footer_alloc = %d\n", GET_ALLOC(HDRP(NEXT_BLKP(bp))), GET_ALLOC(FTRP(NEXT_BLKP(bp))));
        printf("=========================== %s ===========================\n" ,func);
    }
    if ((int)bp % ALIGNMENT != 0) 
        printf("Error: %p's Payload area is not aligned.\n", bp);
    if (GET_SIZE(HDRP(bp)) % ALIGNMENT != 0)
        printf("Error: %p payload size is not doubleword aligned.\n", bp);
}

static int mm_checkheap(int verbose, const char* func) {
    char *endp = (char *)mem_heap_hi()+1;
    char *curbp;
    //check prologue blocks
    if (GET(HDRP(heap_listp)) != GET(FTRP(heap_listp))) {
        printf("Error: Prologue blocks don't have same size/alloc fields.\n");
        return 0;
    }
    if (GET_ALLOC(HDRP(heap_listp)) != 1 || GET_SIZE(HDRP(heap_listp)) != 8) {
        printf("Error: Prologue blocks don't have special size/alloc fields.\n");
        return 0;
    }

    //check epilogue blocks
    for (curbp = heap_listp; GET_SIZE(HDRP(curbp)) > 0; curbp = NEXT_BLKP(curbp)) {}
    if (curbp != endp) {
        printf("Error: A block with size 0 isn't endp\n");
        printf("Its size is %d, address is %p and alloc is %d\n", GET_SIZE(HDRP(curbp)), curbp, GET_ALLOC(HDRP(curbp)));
        return 0;
    }
    if (GET_ALLOC(HDRP(endp)) != 1 || GET_SIZE(HDRP(endp)) != 0) {
        printf("Error: Epilogue blocks are not at specific locations.\n");
        return 0;
    }
    return 1;
}
*/
#### Debug output

Part of the debug output (it shows the design quite clearly):

// ./mdriver -V -f short1-bal.rep > out.txt

=========================== mm_init ===========================
address = 0xf698f018
hsize = 8, fsize = 8
halloc = 1, falloc = 1

address = 0xf698f020
hsize = 4096, fsize = 4096
halloc = 0, falloc = 0

address = 0xf6990020
hsize = 0
halloc = 1
=========================== mm_init ===========================

=========================== mm_malloc ===========================
address = 0xf698f018
hsize = 8, fsize = 8
halloc = 1, falloc = 1

address = 0xf698f020
hsize = 2048, fsize = 2048
halloc = 1, falloc = 1

address = 0xf698f820
hsize = 2048, fsize = 2048
halloc = 0, falloc = 0

address = 0xf6990020
hsize = 0
halloc = 1
=========================== mm_malloc ===========================

Benchmark (implicit free list + first fit)

Results for mm malloc:
trace  valid  util     ops      secs    Kops
 0       yes   99%    5694  0.003636    1566
 1       yes   99%    5848  0.003695    1583
 2       yes   99%    6648  0.005280    1259
 3       yes  100%    5380  0.003648    1475
 4       yes   66%   14400  0.000089  161254
 5       yes   92%    4800  0.004965     967
 6       yes   92%    4800  0.004495    1068
 7       yes   55%   12000  0.086989     138
 8       yes   51%   24000  0.126514     190
 9       yes   27%   14401  0.028150     512
10       yes   30%   14401  0.000831   17330
Total          74%  112372  0.268292     419

Perf index = 44 (util) + 28 (thru) = 72/100

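Directions for optimization

Removing the footer from allocated blocks

  • Allocated blocks do not need a footer; only free blocks do (the footer is only read when the previous block is free and has to be coalesced).
  • Analysis: personally I don't expect this to improve memory utilization much in most cases. A footer is only 4 B, after all.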
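Dropping the footer only works if coalesce can still tell whether the previous block is free, and the usual way is to keep that information in a spare low bit of the current block's header. The sketch below shows the bit packing only; PREV_ALLOC_BIT, pack3 and the other names are my own, not part of the code in this post.

```c
#include <assert.h>

#define ALLOC_BIT       0x1u   /* this block is allocated */
#define PREV_ALLOC_BIT  0x2u   /* hypothetical: the previous block is allocated */

/* Pack a size (a multiple of 8) and both status bits into one header word. */
static unsigned int pack3(unsigned int size, int prev_alloc, int alloc)
{
    return size | (prev_alloc ? PREV_ALLOC_BIT : 0u) | (alloc ? ALLOC_BIT : 0u);
}

static unsigned int hdr_size(unsigned int hdr)       { return hdr & ~0x7u; }
static int          hdr_alloc(unsigned int hdr)      { return (hdr & ALLOC_BIT) != 0; }
static int          hdr_prev_alloc(unsigned int hdr) { return (hdr & PREV_ALLOC_BIT) != 0; }

/* Block size for a payload when allocated blocks carry a 4-byte header only.
 * The 16-byte floor keeps this post's minimum block size. */
static unsigned int asize_no_footer(unsigned int payload)
{
    unsigned int asize = (payload + 4u + 7u) & ~0x7u;
    return asize < 16u ? 16u : asize;
}
```

With this scheme a 12-byte request fits in a 16-byte block instead of the 24 bytes that ALIGN(size + 2*WSIZE) gives, so the gain shows up mostly on traces with many small blocks.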

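Using next fit as the placement policy

Caution! Next fit has a pitfall: the pointer saved from the previous search may no longer refer to a block by the time it is used again. This shows up when coalescing with the previous block, which swallows the block the pointer referred to.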
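One general way to guard against this is to check, after every coalesce, whether the saved pointer now lies strictly inside the merged free block and, if so, to pull it back to the start of that block. The sketch below is my own illustration of that range check (fix_rover is a made-up name); it also covers the case where the pointer referred to the next block that got merged.

```c
#include <assert.h>
#include <stddef.h>

/* After coalescing, the merged free block has payload pointer bp and spans
 * size bytes, so the next block's payload pointer is bp + size. If rover
 * points strictly inside that range, move it back to bp. */
static char *fix_rover(char *rover, char *bp, size_t size)
{
    if (rover > bp && rover < bp + size)
        return bp;
    return rover;
}
```

In an allocator this would be called at the end of coalesce, along the lines of next_fitp = fix_rover(next_fitp, bp, size).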

Modified code: find_fit and coalesce
static void* next_fitp;     //where the next-fit search resumes
//this pointer must be initialized in mm_init

//find_fit, version 1: a do-while loop
static void *find_fit(size_t size) {
    char *endp, *lastp;
    next_fitp = NEXT_BLKP(next_fitp);    //where this search starts
    endp = (char *)mem_heap_hi() + 1;
    lastp = next_fitp;

    do {
        if (next_fitp == endp) {
            next_fitp = heap_listp;
            continue;
        }
        if (!GET_ALLOC(HDRP(next_fitp)) && (GET_SIZE(HDRP(next_fitp)) >= size)) 
            return next_fitp;
        next_fitp = NEXT_BLKP(next_fitp);
    } while(next_fitp != lastp);

    return NULL;
}
//find_fit, version 2: the brute-force way, two nearly identical for loops
static void *find_fit(size_t size) {
    char *lastp;
    next_fitp = NEXT_BLKP(next_fitp);  //where this search starts
    lastp = next_fitp;
    for (;GET_SIZE(HDRP(next_fitp)) > 0; next_fitp = NEXT_BLKP(next_fitp)) {
        if (!GET_ALLOC(HDRP(next_fitp)) && (GET_SIZE(HDRP(next_fitp)) >= size)) {
            return next_fitp;
        }
    }
    next_fitp = NEXT_BLKP(heap_listp);
    for (;next_fitp != lastp; next_fitp = NEXT_BLKP(next_fitp)) {
        if (!GET_ALLOC(HDRP(next_fitp)) && (GET_SIZE(HDRP(next_fitp)) >= size)) {
            return next_fitp;
        }
    }
    return NULL;
}


//only the cases that merge with the previous block are changed
static void *coalesce(void *bp) {          
    int pre_alloc = GET_ALLOC(HDRP(PREV_BLKP(bp)));
    int post_alloc = GET_ALLOC(HDRP(NEXT_BLKP(bp)));
    size_t size = GET_SIZE(HDRP(bp));

    if (pre_alloc && post_alloc) {
        return bp;
    } else if (pre_alloc && !post_alloc) {   //merge with the next block
        size += GET_SIZE(HDRP(NEXT_BLKP(bp)));
    } else if (!pre_alloc && post_alloc) {   //merge with the previous block
        size += GET_SIZE(HDRP(PREV_BLKP(bp)));
        if (bp == next_fitp) {
            next_fitp = PREV_BLKP(bp);
        }
        bp = PREV_BLKP(bp);
    } else {  //merge with both neighbors
        size += GET_SIZE(HDRP(PREV_BLKP(bp))) + GET_SIZE(FTRP(NEXT_BLKP(bp)));
        if (bp == next_fitp) {
            next_fitp = PREV_BLKP(bp);
        }
        bp = PREV_BLKP(bp);
    }
    PUT(HDRP(bp), PACK(size, 0));
    PUT(FTRP(bp), PACK(size, 0));     //FTRP() reads the size from the header, so it already sees the new size here

    return bp;
}
Benchmark
Results for mm malloc:
trace  valid  util     ops      secs    Kops
 0       yes   90%    5694  0.001078    5280
 1       yes   91%    5848  0.000686    8526
 2       yes   95%    6648  0.001864    3567
 3       yes   96%    5380  0.001754    3067
 4       yes   66%   14400  0.000094  153518
 5       yes   91%    4800  0.002902    1654
 6       yes   89%    4800  0.002691    1783
 7       yes   55%   12000  0.009888    1214
 8       yes   51%   24000  0.003605    6658
 9       yes   27%   14401  0.028398     507
10       yes   45%   14401  0.000729   19754
Total          72%  112372  0.053688    2093

Perf index = 43 (util) + 40 (thru) = 83/100
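Analysis of the optimization

Score comparison: before, 44 (util) + 28 (thru) = 72/100; now, 43 (util) + 40 (thru) = 83/100.

First fit:

  • Pro: tends to keep large free blocks toward the end of the list.
  • Con: tends to leave "splinters" of small free blocks near the start of the list, which lengthens the search for larger blocks.

Next fit:

  • Pro: if the previous search found a fit in some free block, the next search is likely to find one in what remains of that block.
  • Con: studies suggest that next fit has considerably worse memory utilization than first fit.

Comparing the two runs, next fit clearly has much higher throughput than first fit, which is its main advantage, but its memory utilization is lower.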
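To make the difference between the placement policies concrete, here is a small sketch that runs them over a plain array of free-block sizes instead of a real heap. The function names are mine, and this next_fit resumes at the rover itself, whereas the find_fit code above resumes one block after it.

```c
#include <assert.h>
#include <stddef.h>

/* Each function returns the index of the chosen free block, or -1 if none fits. */

static int first_fit(const size_t *sizes, int n, size_t need)
{
    for (int i = 0; i < n; i++)
        if (sizes[i] >= need)
            return i;
    return -1;
}

/* rover is the index where the previous search ended; the scan wraps around once. */
static int next_fit(const size_t *sizes, int n, size_t need, int *rover)
{
    for (int k = 0; k < n; k++) {
        int i = (*rover + k) % n;
        if (sizes[i] >= need) {
            *rover = i;
            return i;
        }
    }
    return -1;
}

/* Best fit: examine every block and keep the smallest one that is large enough. */
static int best_fit(const size_t *sizes, int n, size_t need)
{
    int best = -1;
    for (int i = 0; i < n; i++)
        if (sizes[i] >= need && (best < 0 || sizes[i] < sizes[best]))
            best = i;
    return best;
}
```

With free sizes {64, 16, 32, 128} and a 24-byte request, first fit takes the 64-byte block at index 0, while best fit and a next fit whose rover sits at index 2 both take the 32-byte block.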

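The three placement policies:

Studies suggest that best fit generally achieves somewhat better memory utilization than first fit or next fit, but with a simple organization such as the implicit free list it requires an exhaustive search of the heap. The segregated free list organization discussed later approximates a best-fit policy without an exhaustive search.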

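Explicit free list

Idea

On top of the basic block layout, each free block additionally stores pointers to its predecessor and successor, so the allocator can maintain a free list and walk only the free blocks. This makes the search for a fit faster. The price is a larger minimum block size, since every block must be able to hold the two pointers.

  • Free-block organization: explicit free list; the alignment padding word stores the head of the free list.
  • Placement: last in, first out (LIFO); freed blocks are inserted at the head of the free list.
  • Splitting: allocate from the front of the free block and leave the remainder free.
  • Coalescing: immediate coalescing

Pros and cons:

  • Because of the pointer overhead, the minimum block size grows to 24 B, which potentially increases internal fragmentation. In exchange, allocation time drops from linear in the total number of blocks to linear in the number of free blocks, which is a substantial improvement. (The benchmark shows it clearly: memory utilization drops only slightly, while throughput rises a lot.)
  • Keeping two pointers in every free block adds to internal fragmentation, but lets a block be linked into or unlinked from the free list in constant time.

[Figure 3 from the original post]
[Figure 4 from the original post]

Splitting policy:

[Figure 5 from the original post]

Coalescing policy:

[Figure 6 from the original post]
[Figure 7 from the original post]
[Figure 8 from the original post]
[Figure 9 from the original post]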

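Function design

Compared with the implicit free list, all that is needed on top of it is the two pointers inside each free block and the operations on the free list.

Macros (for pointer manipulation)

#define MINBLOCK (DSIZE + 2*WSIZE + 2*WSIZE)  //header, footer, two pointers, 8 bytes of data
#define GETADDR(p)         (*(unsigned int **)(p))   //read a pointer at address p
#define PUTADDR(p, addr)   (*(unsigned int **)(p) = (unsigned int *)(addr))  //write a pointer at address p
#define PRED_POINT(bp)   (bp)            //address of the predecessor pointer
#define SUCC_POINT(bp)   ((char*)(bp) + WSIZE)  //address of the successor pointer

static void* head_free;     //head of the free list, stored in the alignment padding word at the start of the heap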

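Functions

  • static void insert_freelist(void *bp); inserts a free block at the head of the free list
  • static void remove_freelist(void *bp); removes a free block from the free list; used when coalescing (coalescing policy)
  • static void place_freelist(void *bp); replaces a block in the free list with the remainder left after splitting off its front (splitting policy)
  • static void *find_fit(size_t size); the search, changed to walk the free list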
Code and benchmark

Code: mm.c

/*
 * mm.c - Explicit free list allocator.
 *
 * Blocks keep the header/footer layout of the implicit version. Free blocks
 * additionally store predecessor and successor pointers in their payload and
 * are kept in a LIFO doubly linked free list, so the first-fit search only
 * visits free blocks. Freed blocks are coalesced immediately.
 */
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <unistd.h>
#include <string.h>

#include "mm.h"
#include "memlib.h"

/*********************************************************
 * NOTE TO STUDENTS: Before you do anything else, please
 * provide your team information in the following struct.
 ********************************************************/
team_t team = {
    /* Team name */
    "ateam",
    /* First member's full name */
    "Harry Bovik",
    /* First member's email address */
    "[email protected]",
    /* Second member's full name (leave blank if none) */
    "",
    /* Second member's email address (leave blank if none) */
    ""
};

#ifdef DEBUG
#define VERBOSE 1
#else
#define VERBOSE 0
#endif

/* single word (4) or double word (8) alignment */
#define ALIGNMENT 8

/* rounds up to the nearest multiple of ALIGNMENT */
#define ALIGN(size) (((size) + (ALIGNMENT-1)) & ~0x7)
#define SIZE_T_SIZE (ALIGN(sizeof(size_t)))

/* Helper macros for constants and pointer arithmetic */
#define WSIZE       4        //word size; also the header/footer size (bytes)
#define DSIZE       8        //double-word size (bytes)
#define CHUNKSIZE  (1<<12)   //default amount to extend the heap by
#define MINBLOCK (DSIZE + 2*WSIZE + 2*WSIZE)  //header, footer, two pointers, 8 bytes of data

#define MAX(x, y)  ((x) > (y) ? (x) : (y))

#define PACK(size, alloc)  ((size) | (alloc))         //pack size and allocated bit into one word

#define GET(p)             (*(unsigned int *)(p))          //read a word at address p
#define PUT(p, val)        (*(unsigned int *)(p) = (val))  //write a word at address p
#define GETADDR(p)         (*(unsigned int **)(p))   //read a pointer at address p
#define PUTADDR(p, addr)   (*(unsigned int **)(p) = (unsigned int *)(addr))  //write a pointer at address p


#define GET_SIZE(p)   (GET(p) & ~0x07)    //size field of the word at address p
#define GET_ALLOC(p)  (GET(p) & 0x1)      //allocated bit of the word at address p
//bp (block pointer) points to the first byte of a block's payload
#define HDRP(bp)     ((char*)(bp) - WSIZE)                       //address of the header
#define FTRP(bp)     ((char*)(bp) + GET_SIZE(HDRP(bp)) - DSIZE)  //address of the footer; depends on HDRP

#define NEXT_BLKP(bp)    ((char*)(bp) + GET_SIZE((char*)(bp) - WSIZE))  //payload pointer of the next block
#define PREV_BLKP(bp)    ((char*)(bp) - GET_SIZE((char*)(bp) - DSIZE))  //payload pointer of the previous block

#define PRED_POINT(bp)   (bp)            //address of the predecessor pointer
#define SUCC_POINT(bp)   ((char*)(bp) + WSIZE)  //address of the successor pointer

static void* heap_listp;    //points to the prologue block
static void* head_free;     //head of the free list, stored in the alignment padding word at the start of the heap
/* private functions */
static void *extend_heap(size_t size);     //extend the heap
static void *find_fit(size_t size);        //find a free block (first fit)
static void place(void *bp, size_t size);  //place a request and split the free block
static void *coalesce(void *bp);           //coalesce free blocks
/* free-list operations */
static void insert_freelist(void *bp);
static void remove_freelist(void *bp);
static void place_freelist(void *bp);
//check
static void mm_printblock(int verbose, const char* func);

/* 
 * mm_init - initialize the malloc package.
 */
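//Set up the alignment padding word (4 B) in front of the prologue, the prologue block and the epilogue block: four 4-byte words in total
int mm_init(void)
{
    if ((heap_listp = mem_sbrk(4*WSIZE)) == (void*)-1) 
        return -1;
    PUTADDR(heap_listp, NULL);              //alignment padding at the start of the heap, so that bp is 8-byte aligned
    PUT(heap_listp + 1*WSIZE, PACK(8, 1));  //prologue header
    PUT(heap_listp + 2*WSIZE, PACK(8, 1));  //prologue footer
    PUT(heap_listp + 3*WSIZE, PACK(0, 1));  //epilogue header
    head_free = heap_listp;                 //the padding word doubles as the head of the free list
    PUTADDR(head_free, NULL);
    heap_listp += (2*WSIZE);     //small trick: point heap_listp between the prologue header and footer

    if (extend_heap(CHUNKSIZE) == NULL)   //extend the heap
        return -1;
    mm_printblock(VERBOSE, __func__);
    return 0;
}

//Insert a free block at the head of the free list
static void insert_freelist(void *bp) {   //LIFO: last in, first out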
    if (GETADDR(head_free) == NULL) {
        PUTADDR(SUCC_POINT(bp), NULL);
        PUTADDR(PRED_POINT(bp), head_free);
        PUTADDR(head_free, bp);
    } else {
        void *tmp;
        tmp = GETADDR(head_free);
        PUTADDR(SUCC_POINT(bp), tmp);
        PUTADDR(PRED_POINT(bp), head_free);
        PUTADDR(head_free, bp);
        PUTADDR(PRED_POINT(tmp), bp);
        tmp = NULL;
    }
}

//Remove the free block bp from the free list (used when coalescing)
static void remove_freelist(void *bp) {
    void *pre_block, *post_block;
    pre_block = GETADDR(PRED_POINT(bp));
    post_block = GETADDR(SUCC_POINT(bp));
    //fix up the predecessor
    if (pre_block == head_free) {
        PUTADDR(head_free, post_block);
    } else {
        PUTADDR(SUCC_POINT(pre_block), post_block);
    }
    //fix up the successor
    if (post_block != NULL) {
        PUTADDR(PRED_POINT(post_block), pre_block);
    }
}

//Replace bp in the free list with the remainder left after splitting off its front
static void place_freelist(void *bp) {
    void *pre_block, *post_block, *next_bp;
    //save the addresses of both neighbors
    pre_block = GETADDR(PRED_POINT(bp));
    post_block = GETADDR(SUCC_POINT(bp));
    next_bp = NEXT_BLKP(bp);
    //link the remainder (next_bp) to both neighbors
    PUTADDR(PRED_POINT(next_bp), pre_block);
    PUTADDR(SUCC_POINT(next_bp), post_block);
    //fix up the predecessor; head_free as predecessor is a special case
    if (pre_block == head_free) {
        PUTADDR(head_free, next_bp);
    } else {
        PUTADDR(SUCC_POINT(pre_block), next_bp);
    }
    //fix up the successor
    if (post_block != NULL) {
        PUTADDR(PRED_POINT(post_block), next_bp);
    }
}

/* 
 * mm_malloc - Allocate a block of at least size bytes from the free list,
 *     extending the heap when no free block fits.
 */
void *mm_malloc(size_t size)
{
    size_t asize;     //adjusted size
    size_t extendsize;  //amount to extend the heap by when nothing fits
    void *bp = NULL;

    if (size == 0)    //ignore spurious requests
        return NULL;

    asize = ALIGN(size + 2*WSIZE);
    
    if ((bp = find_fit(asize)) != NULL) {
        place(bp, asize);
        return bp;
    }
    
    //no free block is large enough, so extend the heap
    extendsize = MAX(asize, CHUNKSIZE);
    if ((bp = extend_heap(extendsize)) == NULL) {
        return NULL;
    }    
    place(bp, asize);
    mm_printblock(VERBOSE, __func__);
    return bp;
}

static void *extend_heap(size_t size) {
    size_t asize;   
    void *bp;

    asize = ALIGN(size);
    if ((long)(bp = mem_sbrk(asize)) == -1)
        return NULL;

    PUT(HDRP(bp), PACK(asize, 0));          //HDRP(bp) is the old epilogue header
    PUT(FTRP(bp), PACK(asize, 0));          
    PUT(HDRP(NEXT_BLKP(bp)), PACK(0, 1));   //new epilogue header
 
    return coalesce(bp);
}

//Placement policy: first-fit search over the free list
static void *find_fit(size_t size) {         
    void *curbp;
    for (curbp = GETADDR(head_free); curbp != NULL; curbp = GETADDR(SUCC_POINT(curbp))) {
        if (GET_SIZE(HDRP(curbp)) >= size)
            return curbp;
    }
    return NULL;    //no fit
} 


//Place the request in a free block, splitting it when possible
static void place(void *bp, size_t asize) {     //mind the minimum block size (24 B == MINBLOCK)
    size_t total_size = GET_SIZE(HDRP(bp));
    size_t remainder_size = total_size - asize;
    if (remainder_size >= MINBLOCK) {
        PUT(HDRP(bp), PACK(asize, 1));
        PUT(FTRP(bp), PACK(asize, 1));
        void *next_bp = NEXT_BLKP(bp);
        PUT(HDRP(next_bp), PACK(remainder_size, 0));
        PUT(FTRP(next_bp), PACK(remainder_size, 0));
        place_freelist(bp);
    } else {          // the remainder would be smaller than the minimum block: use the whole block
        PUT(HDRP(bp), PACK(total_size, 1));
        PUT(FTRP(bp), PACK(total_size, 1));
        remove_freelist(bp);
    }
}

/*
 * mm_free - Mark the block as free and coalesce it with free neighbours.
 */
void mm_free(void *ptr)
{
    size_t size = GET_SIZE(HDRP(ptr));
    PUT(HDRP(ptr), PACK(size, 0));
    PUT(FTRP(ptr), PACK(size, 0));
    coalesce(ptr);
    mm_printblock(VERBOSE, __func__);
}
/*
 * coalesce - merge bp with any free neighbouring blocks
 */
static void *coalesce(void *bp) {
    char *pre_block, *post_block;
    int pre_alloc = GET_ALLOC(HDRP(PREV_BLKP(bp)));
    int post_alloc = GET_ALLOC(HDRP(NEXT_BLKP(bp)));
    size_t size = GET_SIZE(HDRP(bp));
    if (pre_alloc && post_alloc) {
        insert_freelist(bp);
        return bp;
    } else if (pre_alloc && !post_alloc) {   // merge with the next block
        size += GET_SIZE(HDRP(NEXT_BLKP(bp)));
        post_block = NEXT_BLKP(bp);  // remember the next block
        remove_freelist(post_block);
        insert_freelist(bp);
    } else if (!pre_alloc && post_alloc) {   // merge with the previous block
        size += GET_SIZE(HDRP(PREV_BLKP(bp)));
        bp = PREV_BLKP(bp);
        remove_freelist(bp);
        insert_freelist(bp);
    } else {  // merge with both neighbours
        size += GET_SIZE(HDRP(PREV_BLKP(bp))) + GET_SIZE(FTRP(NEXT_BLKP(bp)));
        pre_block = PREV_BLKP(bp);
        post_block = NEXT_BLKP(bp);
        bp = PREV_BLKP(bp);
        remove_freelist(pre_block);
        remove_freelist(post_block);
        insert_freelist(bp);
    }
    PUT(HDRP(bp), PACK(size, 0));
    PUT(FTRP(bp), PACK(size, 0));
    return bp;
}

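/*
 * mm_realloc - Resize the block at ptr to hold at least size bytes.
 *     Shrinks in place when possible; otherwise takes a new block,
 *     copies the old payload and frees the old block.
 */
void *mm_realloc(void *ptr, size_t size)
{
    size_t old_size, new_size, extendsize;
    void *old_ptr, *new_ptr;

    if (ptr == NULL) {
        return mm_malloc(size);
    }
    if (size == 0) {
        mm_free(ptr);
        return NULL;
    }

    new_size = ALIGN(size + 2*WSIZE);
    old_size = GET_SIZE(HDRP(ptr));
    old_ptr = ptr;
    if (old_size >= new_size) {
        if (old_size - new_size >= MINBLOCK) {  // shrink in place and free the tail
            // old_ptr is allocated, so it is not in the free list and must not
            // go through place(), which treats bp as a free-list node.
            PUT(HDRP(old_ptr), PACK(new_size, 1));
            PUT(FTRP(old_ptr), PACK(new_size, 1));
            void *rest = NEXT_BLKP(old_ptr);
            PUT(HDRP(rest), PACK(old_size - new_size, 0));
            PUT(FTRP(rest), PACK(old_size - new_size, 0));
            coalesce(rest);
        }
        // otherwise the remainder is smaller than MINBLOCK: keep the block as it is
        mm_printblock(VERBOSE, __func__);
        return old_ptr;
    } else {  // the block must grow: find a new block, copy, then free the old one
        if ((new_ptr = find_fit(new_size)) == NULL) {  // no suitable free block
            extendsize = MAX(new_size, CHUNKSIZE);
            if ((new_ptr = extend_heap(extendsize)) == NULL)   // extend the heap
                return NULL;
        }
        place(new_ptr, new_size);
        memcpy(new_ptr, old_ptr, old_size - 2*WSIZE);
        mm_free(old_ptr);
        mm_printblock(VERBOSE, __func__);
        return new_ptr;
    }
}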

static void mm_printblock(int verbose, const char* func) {
    if (!verbose) return;
    void *curbp;
    printf("\n=========================== $%s$ ===========================\n" ,func);
    printf("================ block ================\n");
    for (curbp = heap_listp; GET_SIZE(HDRP(curbp)) > 0; curbp = NEXT_BLKP(curbp)) {
        printf("address = %p\n", curbp);
        printf("hsize = %d, fsize = %d\n", GET_SIZE(HDRP(curbp)), GET_SIZE(FTRP(curbp)));
        printf("halloc = %d, falloc = %d\n", GET_ALLOC(HDRP(curbp)), GET_ALLOC(FTRP(curbp)));
        printf("\n");
    }
    //epilogue blocks
    printf("address = %p\n", curbp);
    printf("hsize = %d\n", GET_SIZE(HDRP(curbp)));
    printf("halloc = %d\n", GET_ALLOC(HDRP(curbp)));
    printf("================ block ================\n");
    printf("\n");
    printf("=============== freelist ===============\n");
    for (curbp = GETADDR(head_free); curbp != NULL; curbp = GETADDR(SUCC_POINT(curbp))) {
        printf("address = %p, size = %d,%d, alloc = %d,%d\n",
         curbp, GET_SIZE(HDRP(curbp)), GET_SIZE(FTRP(curbp)), GET_ALLOC(HDRP(curbp)), GET_ALLOC(FTRP(curbp)));
    }
    printf("address = %p\n", curbp);
    printf("=============== freelist ===============\n");
    printf("=========================== $%s$ ===========================\n" ,func);
}
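
As a quick check of the size arithmetic used by mm_malloc and place in the listing above, the following standalone sketch recomputes the adjusted block size and the split decision. The helper names adjusted_size and can_split are made up for illustration; the constants mirror the WSIZE, ALIGNMENT and MINBLOCK values used in this article and assume its 32-bit layout.

```c
#include <stddef.h>

/* Constants mirroring the article's macros (32-bit layout assumed). */
enum { WSIZE = 4, ALIGNMENT = 8, MINBLOCK = 24 };

/* Hypothetical helper: block size for a payload request,
 * i.e. ALIGN(size + 2*WSIZE) in the article's notation. */
static size_t adjusted_size(size_t payload)
{
    size_t with_overhead = payload + 2 * WSIZE;   /* header + footer */
    return (with_overhead + (ALIGNMENT - 1)) & ~(size_t)(ALIGNMENT - 1);
}

/* Hypothetical helper: would place() split a free block of
 * block_size bytes when asked for asize bytes? */
static int can_split(size_t block_size, size_t asize)
{
    return block_size >= asize && block_size - asize >= MINBLOCK;
}
```

For example, a request for 4072 bytes becomes a 4080-byte block, and a 4096-byte free block is not split for it because the 16-byte remainder is below MINBLOCK.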

Debug output

Part of the debug output (it shows the design clearly):

# Blocks and free list right after the heap is initialized

=========================== $mm_init$ ===========================
================ block ================
address = 0xf69aa018
hsize = 8, fsize = 8
halloc = 1, falloc = 1

address = 0xf69aa020
hsize = 4096, fsize = 4096
halloc = 0, falloc = 0

address = 0xf69ab020
hsize = 0
halloc = 1
================ block ================

=============== freelist ===============
address = 0xf69aa020, size = 4096,4096, alloc = 0,0
address = (nil)
=============== freelist ===============
=========================== $mm_init$ ===========================
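
Each hsize/halloc pair in this dump is decoded from a single 4-byte header or footer word. A minimal sketch of that encoding follows, with pack_word, word_size and word_alloc as illustrative stand-ins for the PACK, GET_SIZE and GET_ALLOC macros. It relies on block sizes being multiples of 8, so the low three bits are free to carry the allocated flag.

```c
#include <stdint.h>

/* Combine a block size (multiple of 8) and the allocated bit into one word. */
static uint32_t pack_word(uint32_t size, uint32_t alloc)
{
    return size | alloc;
}

/* Recover the size by masking off the low three bits. */
static uint32_t word_size(uint32_t w)
{
    return w & ~(uint32_t)0x7;
}

/* Recover the allocated bit. */
static uint32_t word_alloc(uint32_t w)
{
    return w & 0x1;
}
```

The prologue block in the dump (hsize = 8, halloc = 1) is therefore stored as the word 9, and the 4096-byte free block as the word 4096.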

Benchmark (explicit free list + LIFO)

# Explicit free list + LIFO
Results for mm malloc:
trace  valid  util     ops      secs  Kops
 0       yes   93%    5694  0.000093 60964
 1       yes   94%    5848  0.000088 66455
 2       yes   96%    6648  0.000150 44290
 3       yes   97%    5380  0.000158 34115
 4       yes   66%   14400  0.000131 109756
 5       yes   89%    4800  0.000306 15686
 6       yes   85%    4800  0.000360 13315
 7       yes   55%   12000  0.001054 11385
 8       yes   51%   24000  0.001897 12652
 9       yes   26%   14401  0.032686   441
10       yes   30%   14401  0.000994 14486
Total          71%  112372  0.037918  2964

Perf index = 43 (util) + 40 (thru) = 83/100
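
The Perf index line can be reproduced by hand. The sketch below assumes the weights from the handout's default config.h (UTIL_WEIGHT of 0.60 and AVG_LIBC_THRUPUT of 600E3 operations per second); if your copy of config.h differs, the numbers change accordingly. The function names are illustrative and not taken from mdriver.c, and because the driver works with unrounded averages, the last digit of its output can differ by one from this calculation.

```c
/* Assumed defaults from the handout's config.h; verify against your copy. */
#define UTIL_WEIGHT       0.60
#define AVG_LIBC_THRUPUT  600e3   /* ops/sec that earns full throughput credit */

/* Utilization part of the index in points (0..60), rounded for display. */
static int util_points(double avg_util)
{
    return (int)(UTIL_WEIGHT * avg_util * 100.0 + 0.5);
}

/* Throughput part of the index in points (0..40). The ratio is capped at 1,
 * so anything faster than AVG_LIBC_THRUPUT earns the same 40 points. */
static int thru_points(double ops_per_sec)
{
    double ratio = ops_per_sec / AVG_LIBC_THRUPUT;
    if (ratio > 1.0)
        ratio = 1.0;
    return (int)((1.0 - UTIL_WEIGHT) * ratio * 100.0 + 0.5);
}
```

With the totals above (71% utilization, 2964 Kops), util_points(0.71) gives 43 and thru_points(2964e3) gives 40. Under these assumptions any throughput above 600 Kops is already at the cap, so only utilization can still move the score.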

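Optimization direction

Maintain the list in address order instead of LIFO

  • Drawback: inserting a node goes from constant time to linear time.
  • Advantage: first fit over an address-ordered list has higher memory utilization than first fit over a LIFO list, approaching that of best fit.
Modified insert_freelist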
static void insert_freelist(void *bp) {   // keep the list in address order (linear time)
    void *pre_block, *post_block, *tmp;
    tmp = head_free;
    for (post_block = GETADDR(head_free); post_block != NULL; post_block = GETADDR(SUCC_POINT(post_block))) {
        if (post_block > bp) {   // first node at a higher address: insert bp before it
            pre_block = GETADDR(PRED_POINT(post_block));
            // set bp's predecessor and successor
            PUTADDR(PRED_POINT(bp), pre_block);
            PUTADDR(SUCC_POINT(bp), post_block);
            // fix up the predecessor
            if (pre_block == head_free) {
                PUTADDR(head_free, bp);
            } else {
                PUTADDR(SUCC_POINT(pre_block), bp);
            }
            // fix up the successor
            PUTADDR(PRED_POINT(post_block), bp);
            return;
        }
        tmp = post_block;  // remember the last node in case bp has to go at the tail
    }
    // post_block == NULL: bp has the highest address, so append it at the tail
    // the predecessor is the last node visited (or head_free for an empty list)
    pre_block = tmp;
    // set bp's predecessor and successor
    PUTADDR(PRED_POINT(bp), pre_block);
    PUTADDR(SUCC_POINT(bp), NULL);
    // fix up the predecessor
    if (pre_block == head_free) {
        PUTADDR(head_free, bp);
    } else {
        PUTADDR(SUCC_POINT(pre_block), bp);
    }
}

// The free-list part of the debug function was also adjusted to help with debugging
static void mm_printblock(int verbose, const char* func) {
    if (!verbose) return;
    void *curbp;
    printf("\n=========================== $%s$ ===========================\n" ,func);
    printf("================ block ================\n");
    for (curbp = heap_listp; GET_SIZE(HDRP(curbp)) > 0; curbp = NEXT_BLKP(curbp)) {
        printf("address = %p\n", curbp);
        printf("hsize = %d, fsize = %d\n", GET_SIZE(HDRP(curbp)), GET_SIZE(FTRP(curbp)));
        printf("halloc = %d, falloc = %d\n", GET_ALLOC(HDRP(curbp)), GET_ALLOC(FTRP(curbp)));
        printf("\n");
    }
    //epilogue blocks
    printf("address = %p\n", curbp);
    printf("hsize = %d\n", GET_SIZE(HDRP(curbp)));
    printf("halloc = %d\n", GET_ALLOC(HDRP(curbp)));
    printf("================ block ================\n");
    printf("\n");
    printf("=============== freelist ===============\n");
    printf("head_address = %p, next_address = %p\n", head_free, GETADDR(head_free));
    for (curbp = GETADDR(head_free); curbp != NULL; curbp = GETADDR(SUCC_POINT(curbp))) {
    printf("address      = %p, next_address = %p, size = %d\n",curbp, 
    GETADDR(SUCC_POINT(curbp)), GET_SIZE(HDRP(curbp)));
    }
    printf("address      = %p\n", curbp);
    printf("=============== freelist ===============\n");
    printf("=========================== $%s$ ===========================\n" ,func);
}
Debug output
# ./mdriver -V -f short1-bal.rep > out.txt
=========================== $mm_init$ ===========================
================ block ================
address = 0xf697d018
hsize = 8, fsize = 8
halloc = 1, falloc = 1

address = 0xf697d020
hsize = 4096, fsize = 4096
halloc = 0, falloc = 0

address = 0xf697e020
hsize = 0
halloc = 1
================ block ================

=============== freelist ===============
head_address = 0xf697d010, next_address = 0xf697d020
address      = 0xf697d020, next_address = (nil), size = 4096
address      = (nil)
=============== freelist ===============
=========================== $mm_init$ ===========================
Benchmark
# Explicit free list + address-ordered list
Results for mm malloc:
trace  valid  util     ops      secs  Kops
 0       yes   99%    5694  0.000137 41471
 1       yes   99%    5848  0.000123 47506
 2       yes   99%    6648  0.000168 39501
 3       yes   99%    5380  0.000136 39646
 4       yes   66%   14400  0.000129 111369
 5       yes   92%    4800  0.001686  2846
 6       yes   92%    4800  0.001677  2863
 7       yes   55%   12000  0.012042   996
 8       yes   51%   24000  0.071326   336
 9       yes   27%   14401  0.029718   485
10       yes   30%   14401  0.000880 16361
Total          74%  112372  0.118024   952

Perf index = 44 (util) + 40 (thru) = 84/100
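Analysis of the result

The gain is small: the score goes from 43 (util) + 40 (thru) = 83/100 to 44 (util) + 40 (thru) = 84/100.

Keeping the free list in address order brings first fit closer to best fit, so memory utilization improves (from 71% to 74% on average).

As for throughput, inserting into the list is now linear time instead of constant time, on top of the already linear search for a fitting block. Measured throughput falls from 2964 to 952 Kops, most visibly on traces 5 to 8, but the throughput score stays at 40 in both runs, so the index does not register the slowdown.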

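Segregated free lists

Idea

As we saw earlier, an allocator that keeps all free blocks in a single linked list needs time linear in the number of free blocks to allocate a block. To approximate best fit and to find a fitting block faster, we can maintain several free lists, one per size class. In this code every size class is a power of two: segList[i] holds the free blocks whose size lies in [2^i, 2^(i+1)).

Design

static void *segList[25]; // class i covers [2^i, 2^(i+1)); MAX_HEAP is 20*(1<<20), so every block is smaller than 1<<25

The main changes are to the list operations, which otherwise work as they did for the single list.

realloc changes as well. Previously, when the original block was too small, we went straight to searching for a fitting block and, failing that, extended the heap. What that overlooked is that if a block adjacent to the original one is free, the two can be merged to hold the new size. This raises memory utilization, hence the improved realloc.

Code and benchmark

Code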

/*
 * mm.c - Segregated free lists with first fit and an improved realloc.
 *
 * Every block carries a 4-byte header and a 4-byte footer that hold its
 * size and allocated bit. A free block additionally stores predecessor
 * and successor pointers in its payload and is linked into segList[i],
 * where class i holds sizes in [2^i, 2^(i+1)). Free blocks are coalesced
 * immediately. realloc first tries to grow into free neighbouring blocks
 * before falling back to a fresh allocation.
 */
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <unistd.h>
#include <string.h>

#include "mm.h"
#include "memlib.h"

/*********************************************************
 * NOTE TO STUDENTS: Before you do anything else, please
 * provide your team information in the following struct.
 ********************************************************/
team_t team = {
    /* Team name */
    "ateam",
    /* First member's full name */
    "Harry Bovik",
    /* First member's email address */
    "[email protected]",
    /* Second member's full name (leave blank if none) */
    "",
    /* Second member's email address (leave blank if none) */
    ""
};

/* Define VERBOSE exactly once to avoid a macro redefinition when DEBUG is set */
#ifdef DEBUG
#define VERBOSE 1
#else
#define VERBOSE 0
#endif

/* single word (4) or double word (8) alignment */
#define ALIGNMENT 8

/* rounds up to the nearest multiple of ALIGNMENT */
#define ALIGN(size) (((size) + (ALIGNMENT-1)) & ~0x7)
#define SIZE_T_SIZE (ALIGN(sizeof(size_t)))

/* Custom macros for constants and pointer arithmetic */
#define WSIZE       4        // size of a word, header or footer (bytes)
#define DSIZE       8        // size of a double word (bytes)
#define CHUNKSIZE  (1<<12)   // default amount by which the heap is extended
#define MINBLOCK (DSIZE + 2*WSIZE + 2*WSIZE)  // header, footer, two pointers and 8 bytes of data

#define MAX(x, y)  ((x) > (y) ? (x) : (y))

#define PACK(size, alloc)  ((size) | (alloc))         // combine size and allocated bit into one word

#define GET(p)             (*(unsigned int *)(p))          // read a word at address p
#define PUT(p, val)        (*(unsigned int *)(p) = (val))  // write a word at address p
#define GETADDR(p)         (*(unsigned int **)(p))   // read a pointer at address p
#define PUTADDR(p, addr)   (*(unsigned int **)(p) = (unsigned int *)(addr))  // write a pointer at address p


#define GET_SIZE(p)   (GET(p) & ~0x07)    // size stored at address p
#define GET_ALLOC(p)  (GET(p) & 0x1)      // allocated bit stored at address p
// bp (block pointer) points at the first payload byte of a block
#define HDRP(bp)     ((char*)(bp) - WSIZE)                       // address of the header
#define FTRP(bp)     ((char*)(bp) + GET_SIZE(HDRP(bp)) - DSIZE)  // address of the footer; depends on HDRP

#define NEXT_BLKP(bp)    ((char*)(bp) + GET_SIZE((char*)(bp) - WSIZE))  // address of the next block
#define PREV_BLKP(bp)    ((char*)(bp) - GET_SIZE((char*)(bp) - DSIZE))  // address of the previous block

#define PRED_POINT(bp)   (bp)            // address of the predecessor pointer
#define SUCC_POINT(bp)   ((char*)(bp) + WSIZE)  // address of the successor pointer (assumes 4-byte pointers)

static void *heap_listp;    // points at the prologue block
static void *segList[25];    // class i covers [2^i, 2^(i+1)); MAX_HEAP is 20*(1<<20), so sizes stay below 1<<25
/* private functions */
static void *extend_heap(size_t size);     // extend the heap
static void *find_fit(size_t size);        // find a free block (first fit)
static void place(void *bp, size_t size);  // place a request, splitting the free block if possible
static void *coalesce(void *bp);           // merge adjacent free blocks
static void *mm_realloc_coalesce(void *old_ptr, size_t new_size);   // merge step specialised for realloc
/* free-list operations */
static void insert_freelist(void *bp);
static void remove_freelist(void *bp);
static int isSegList(void *bp);  // does bp point at a list head inside segList?
// check
static void mm_printblock(int verbose, const char* func);

/* 
 * mm_init - initialize the malloc package.
 */
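// Lay out the alignment padding word (4 B), the prologue block (header and footer)
// and the epilogue header: four 4-byte words in total.
int mm_init(void)
{
    for (int index = 0; index < 25; index++) {
        segList[index] = NULL;
    }
    if ((heap_listp = mem_sbrk(4*WSIZE)) == (void*)-1)
        return -1;
    PUT(heap_listp, 0);                              // padding word at the start of the heap so that bp is 8-byte aligned
    PUT((char *)heap_listp + 1*WSIZE, PACK(8, 1));   // prologue header
    PUT((char *)heap_listp + 2*WSIZE, PACK(8, 1));   // prologue footer
    PUT((char *)heap_listp + 3*WSIZE, PACK(0, 1));   // epilogue header (the original wrote it at offset 0 by mistake)
    heap_listp = (char *)heap_listp + 2*WSIZE;       // trick: point heap_listp between the prologue header and footer

    if (extend_heap(CHUNKSIZE) == NULL)   // extend the heap
        return -1;
    mm_printblock(VERBOSE, __func__);
    return 0;
}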

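// True when bp points at one of the 25 list heads inside segList
// rather than at a block in the heap.
static int isSegList(void *bp) {
    void **p = (void **)bp;
    return p >= segList && p < segList + 25;   // covers segList[0] .. segList[24]
}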

// Find the size class for bp and insert bp at the head of that list (LIFO)
static void insert_freelist(void *bp) {
    size_t size;
    int index;

    size = GET_SIZE(HDRP(bp));
    for (index = 4; index < 25; index++) {   // the smallest block is 16 B, so classes start at index 4
        if ((1 << index) <= size && (1 << (index+1)) > size)
            break;
    }

    if (segList[index] == NULL) {
        PUTADDR(SUCC_POINT(bp), NULL);
        PUTADDR(PRED_POINT(bp), &segList[index]);   // the first node's predecessor is the list head itself, which isSegList() detects
        segList[index] = bp;
    } else {
        void *tmp;
        tmp = segList[index];
        PUTADDR(SUCC_POINT(bp), tmp);
        PUTADDR(PRED_POINT(bp), &segList[index]);
        segList[index] = bp;
        PUTADDR(PRED_POINT(tmp), bp);
    }
}

// Unlink the free block bp from its free list (used when coalescing and placing)
static void remove_freelist(void *bp) {
    void *pre_block, *post_block;
    pre_block = GETADDR(PRED_POINT(bp));
    post_block = GETADDR(SUCC_POINT(bp));
    // fix up the predecessor
    if (isSegList(pre_block)) {  // the predecessor is a list head
        PUTADDR(pre_block, post_block);
    } else {
        PUTADDR(SUCC_POINT(pre_block), post_block);
    }
    // fix up the successor
    if (post_block != NULL) {
        PUTADDR(PRED_POINT(post_block), pre_block);
    }
}

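/*
 * mm_malloc - Allocate a block with at least size bytes of payload.
 *     Search the segregated lists first and extend the heap only if nothing fits.
 *     The block size is always a multiple of the alignment.
 */
void *mm_malloc(size_t size)
{
    size_t asize;       // adjusted size: payload plus header and footer, rounded up to 8
    size_t extendsize;  // amount to extend the heap by when no block fits
    void *bp = NULL;

    if (size == 0)      // ignore spurious requests
        return NULL;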

    asize = ALIGN(size + 2*WSIZE);
    
    if ((bp = find_fit(asize)) != NULL) {
        place(bp, asize);
        mm_printblock(VERBOSE, __func__);
        return bp;
    }
    
    // No free block is large enough: extend the heap
    extendsize = MAX(asize, CHUNKSIZE);
    if ((bp = extend_heap(extendsize)) == NULL) {
        return NULL;
    }    
    place(bp, asize);
    mm_printblock(VERBOSE, __func__);
    return bp;
}

static void *extend_heap(size_t size) {
    size_t asize;   
    void *bp;

    asize = ALIGN(size);
    if ((long)(bp = mem_sbrk(asize)) == -1)
        return NULL;

    PUT(HDRP(bp), PACK(asize, 0));          // HDRP(bp) is the old epilogue header
    PUT(FTRP(bp), PACK(asize, 0));
    PUT(HDRP(NEXT_BLKP(bp)), PACK(0, 1));   // new epilogue header
 
    return coalesce(bp);
}

// Placement search: first fit within the segregated lists
static void *find_fit(size_t size) {
    for (int index = 4; index < 25; index++) {
        if (size < (1 << (index+1))) {   // skip classes whose blocks are all too small
            unsigned int *curbp;
            for (curbp = segList[index]; curbp != NULL; curbp = GETADDR(SUCC_POINT(curbp))) {
                if (size <= GET_SIZE(HDRP(curbp))) {
                    return curbp;
                }
            }
        }
    }
    return NULL;    // no fit
}


// Place the request in a free block, splitting it if the remainder is large enough
static void place(void *bp, size_t asize) {     // mind the minimum block size (MINBLOCK == 24 B)
    size_t total_size = GET_SIZE(HDRP(bp));
    size_t remainder_size = total_size - asize;
    if (remainder_size >= MINBLOCK) {
        PUT(HDRP(bp), PACK(asize, 1));
        PUT(FTRP(bp), PACK(asize, 1));
        remove_freelist(bp);
        void *next_bp = NEXT_BLKP(bp);
        PUT(HDRP(next_bp), PACK(remainder_size, 0));
        PUT(FTRP(next_bp), PACK(remainder_size, 0));
        insert_freelist(next_bp);
    } else {          // the remainder would be smaller than the minimum block: use the whole block
        PUT(HDRP(bp), PACK(total_size, 1));
        PUT(FTRP(bp), PACK(total_size, 1));
        remove_freelist(bp);
    }
}

/*
 * mm_free - Mark the block as free and coalesce it with free neighbours.
 */
void mm_free(void *ptr)
{
    size_t size = GET_SIZE(HDRP(ptr));
    PUT(HDRP(ptr), PACK(size, 0));
    PUT(FTRP(ptr), PACK(size, 0));
    coalesce(ptr);
    mm_printblock(VERBOSE, __func__);
}
/*
 * coalesce - merge bp with any free neighbouring blocks
 */
static void *coalesce(void *bp) {
    char *pre_block, *post_block;
    int pre_alloc = GET_ALLOC(HDRP(PREV_BLKP(bp)));
    int post_alloc = GET_ALLOC(HDRP(NEXT_BLKP(bp)));
    size_t size = GET_SIZE(HDRP(bp));
    if (pre_alloc && post_alloc) {
        insert_freelist(bp);
        return bp;
    } else if (!pre_alloc && post_alloc) {   // merge with the previous block
        size += GET_SIZE(HDRP(PREV_BLKP(bp)));
        bp = PREV_BLKP(bp);
        remove_freelist(bp);
    } else if (pre_alloc && !post_alloc) {   // merge with the next block
        size += GET_SIZE(HDRP(NEXT_BLKP(bp)));
        post_block = NEXT_BLKP(bp);  // remember the next block
        remove_freelist(post_block);
    } else {  // merge with both neighbours
        size += GET_SIZE(HDRP(PREV_BLKP(bp))) + GET_SIZE(FTRP(NEXT_BLKP(bp)));
        pre_block = PREV_BLKP(bp);
        post_block = NEXT_BLKP(bp);
        bp = PREV_BLKP(bp);
        remove_freelist(pre_block);
        remove_freelist(post_block);
    }
    PUT(HDRP(bp), PACK(size, 0));
    PUT(FTRP(bp), PACK(size, 0));
    insert_freelist(bp);
    return bp;
}

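/*
 * mm_realloc - Resize the block at ptr to hold at least size bytes.
 *     Shrinks in place when possible, then tries to grow into free
 *     neighbours, and only then falls back to a new block plus a copy.
 */
void *mm_realloc(void *ptr, size_t size)
{
    size_t old_size, new_size, extendsize;
    void *old_ptr, *new_ptr;

    if (ptr == NULL) {
        return mm_malloc(size);
    }
    if (size == 0) {
        mm_free(ptr);
        return NULL;
    }

    new_size = ALIGN(size + 2*WSIZE);
    old_size = GET_SIZE(HDRP(ptr));
    old_ptr = ptr;
    if (old_size >= new_size) {
        if (old_size - new_size >= MINBLOCK) {  // shrink in place and free the tail
            // old_ptr is allocated, so it is not in any free list and must not
            // go through place(), which calls remove_freelist() on it.
            PUT(HDRP(old_ptr), PACK(new_size, 1));
            PUT(FTRP(old_ptr), PACK(new_size, 1));
            void *rest = NEXT_BLKP(old_ptr);
            PUT(HDRP(rest), PACK(old_size - new_size, 0));
            PUT(FTRP(rest), PACK(old_size - new_size, 0));
            coalesce(rest);
        }
        // otherwise the remainder is smaller than MINBLOCK: keep the block as it is
        mm_printblock(VERBOSE, __func__);
        return old_ptr;
    } else {  // the block must grow: merge with a neighbour or move to a new block
        if ((new_ptr = mm_realloc_coalesce(old_ptr, new_size)) != NULL) {  // merged with a neighbour, data already moved
            mm_printblock(VERBOSE, __func__);
            return new_ptr;
        }
        if ((new_ptr = find_fit(new_size)) == NULL) {  // no suitable free block
            extendsize = MAX(new_size, CHUNKSIZE);
            if ((new_ptr = extend_heap(extendsize)) == NULL)   // extend the heap
                return NULL;
        }
        // move the data to the non-adjacent block
        place(new_ptr, new_size);
        memcpy(new_ptr, old_ptr, old_size - 2*WSIZE);
        mm_free(old_ptr);
        mm_printblock(VERBOSE, __func__);
        return new_ptr;
    }
}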

// Merge with the previous and/or next block on behalf of realloc
static void *mm_realloc_coalesce(void *old_ptr, size_t new_size) {
    void *pre_block, *post_block, *new_ptr;
    int pre_alloc, post_alloc;
    size_t pre_size, post_size, old_size, total_size;

    pre_block = PREV_BLKP(old_ptr);
    post_block = NEXT_BLKP(old_ptr);
    pre_alloc = GET_ALLOC(HDRP(pre_block));
    post_alloc = GET_ALLOC(HDRP(post_block));
    pre_size = GET_SIZE(HDRP(pre_block));
    post_size = GET_SIZE(HDRP(post_block));
    old_size = GET_SIZE(HDRP(old_ptr));

    if (!pre_alloc && ((total_size = old_size + pre_size) >= new_size)) {  // merge with the previous block
        new_ptr = pre_block;
        remove_freelist(pre_block);
    } else if (!post_alloc && ((total_size = old_size + post_size) >= new_size)){  // merge with the next block
        new_ptr = old_ptr;
        remove_freelist(post_block);
    } else if (!pre_alloc && !post_alloc && ((total_size = old_size + pre_size + post_size) >= new_size)){  // merge with both neighbours
        new_ptr = pre_block;
        remove_freelist(pre_block);
        remove_freelist(post_block);
    } else {   // no merge can satisfy the request
        return NULL;
    }
    // Source and destination overlap when the payload slides down into the
    // previous block, so memmove is required instead of memcpy.
    memmove(new_ptr, old_ptr, old_size - 2*WSIZE);
    if (total_size - new_size >= MINBLOCK) {   // split off the unused tail
        PUT(HDRP(new_ptr), PACK(new_size, 1));
        PUT(FTRP(new_ptr), PACK(new_size, 1));
        void *next_bp = NEXT_BLKP(new_ptr);
        PUT(HDRP(next_bp), PACK(total_size - new_size, 0));
        PUT(FTRP(next_bp), PACK(total_size - new_size, 0));
        coalesce(next_bp);
    } else {
        PUT(HDRP(new_ptr), PACK(total_size, 1));
        PUT(FTRP(new_ptr), PACK(total_size, 1));
    }
    return new_ptr;
}

static void mm_printblock(int verbose, const char* func) {
    if (!verbose) return;
    void *curbp;
    printf("\n=========================== $%s$ ===========================\n" ,func);
    printf("================ block ================\n");
    for (curbp = heap_listp; GET_SIZE(HDRP(curbp)) > 0; curbp = NEXT_BLKP(curbp)) {
        printf("address = %p\n", curbp);
        printf("size = %d, %d, alloc = %d, %d\n", GET_SIZE(HDRP(curbp)), GET_SIZE(FTRP(curbp)), 
        GET_ALLOC(HDRP(curbp)), GET_ALLOC(FTRP(curbp)));
        printf("\n");
    }
    //epilogue blocks
    printf("address = %p\n", curbp);
    printf("size = %d, alloc = %d\n", GET_SIZE(HDRP(curbp)), GET_ALLOC(HDRP(curbp)));
    printf("================ block ================\n");
    printf("\n");
    printf("=============== freelist ===============\n");
    for (int index = 4; index < 25; index++) {
        if (segList[index] == NULL) continue;
        printf("segList[%d]: [%d,%d)\n", index, (1 << index), (1 << (index+1)));
        for (curbp = segList[index]; curbp != NULL; curbp = GETADDR(SUCC_POINT(curbp))) {
            printf("address = %p, size = %d,%d, alloc = %d,%d\n",
            curbp, GET_SIZE(HDRP(curbp)), GET_SIZE(FTRP(curbp)), GET_ALLOC(HDRP(curbp)), GET_ALLOC(FTRP(curbp)));
        }
        printf("address = %p\n", curbp);
    }
    printf("=============== freelist ===============\n");
    printf("=========================== $%s$ ===========================\n" ,func);
}
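
The decision made at the top of mm_realloc_coalesce depends only on a handful of sizes, so it can be studied in isolation. The sketch below is a hypothetical pure-function version: the enum and the function name are not part of the allocator, and a size of 0 stands for a neighbour that is allocated. Unlike the listing, it tries the next block first, because growing into the next block leaves the payload where it is and needs no copy. That ordering is a suggestion to experiment with, not what the code above does.

```c
#include <stddef.h>

typedef enum { MERGE_NONE, MERGE_NEXT, MERGE_PREV, MERGE_BOTH } merge_kind;

/* Hypothetical helper: pick the cheapest merge that yields at least
 * `need` bytes. prev_free / next_free are the sizes of the free
 * neighbours, or 0 when that neighbour is allocated. */
static merge_kind pick_merge(size_t old_size, size_t prev_free,
                             size_t next_free, size_t need)
{
    if (next_free && old_size + next_free >= need)
        return MERGE_NEXT;          /* payload stays in place */
    if (prev_free && old_size + prev_free >= need)
        return MERGE_PREV;          /* payload must slide down */
    if (prev_free && next_free && old_size + prev_free + next_free >= need)
        return MERGE_BOTH;
    return MERGE_NONE;              /* fall back to find_fit / extend_heap */
}
```

Separating the decision from the pointer manipulation also makes it easy to test the branch order without building a heap.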

Debug output

=========================== $mm_malloc$ ===========================
================ block ================
address = 0xf69df018
size = 8, 8, alloc = 1, 1

address = 0xf69df020
size = 2104, 2104, alloc = 0, 0

address = 0xf69df858
size = 4080, 4080, alloc = 1, 1

address = 0xf69e0848
size = 4080, 4080, alloc = 1, 1

address = 0xf69e1838
size = 2024, 2024, alloc = 0, 0

address = 0xf69e2020
size = 0, alloc = 1
================ block ================

=============== freelist ===============
segList[10]: [1024,2048)
address = 0xf69e1838, size = 2024,2024, alloc = 0,0
address = (nil)
segList[11]: [2048,4096)
address = 0xf69df020, size = 2104,2104, alloc = 0,0
address = (nil)
=============== freelist ===============
=========================== $mm_malloc$ ===========================
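
The class boundaries in this dump (a 2024-byte block in segList[10], a 2104-byte block in segList[11]) follow from index = floor(log2(size)). A minimal sketch of that mapping, using a shift loop instead of testing each class in turn, could look as follows. class_index is an illustrative name, and the clamp to the last class is an added safety assumption rather than something the listing does.

```c
#include <stddef.h>

enum { NUM_CLASSES = 25 };

/* Hypothetical helper: index i such that 2^i <= size < 2^(i+1),
 * clamped to the last class so the result is always a valid index. */
static int class_index(size_t size)
{
    int index = 0;
    while (size > 1) {
        size >>= 1;
        index++;
    }
    return index < NUM_CLASSES ? index : NUM_CLASSES - 1;
}
```

Since the smallest block is 16 bytes, the lowest index this ever returns for a real block is 4, which matches the loops in insert_freelist and find_fit starting at 4.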

Benchmark

Results for mm malloc:
trace  valid  util     ops      secs  Kops
 0       yes   98%    5694  0.000287 19833
 1       yes   97%    5848  0.000311 18780
 2       yes   99%    6648  0.000322 20652
 3       yes   99%    5380  0.000278 19339
 4       yes   66%   14400  0.000528 27283
 5       yes   93%    4800  0.000459 10462
 6       yes   90%    4800  0.000377 12719
 7       yes   55%   12000  0.000447 26864
 8       yes   51%   24000  0.001032 23254
 9       yes   45%   14401  0.023348   617
10       yes   45%   14401  0.001142 12609
Total          76%  112372  0.028531  3939

Perf index = 46 (util) + 40 (thru) = 86/100

Optimization direction

Best fit (combined with segregated lists)

// Find the matching size class and insert bp in ascending size order. With each class sorted, segregated fit behaves like best fit without searching the whole heap.
static void insert_freelist(void *bp) {
    size_t size;
    int index;

    size = GET_SIZE(HDRP(bp));
    for (index = 4; index < 25; index++) {   // the smallest block is 16 B, so classes start at index 4
        if ((1 << index) <= size && (1 << (index+1)) > size)
            break;
    }
    void *pre_block, *post_block, *tmp;
    tmp = segList + index;
    for (post_block = segList[index]; post_block != NULL; post_block = GETADDR(SUCC_POINT(post_block))) {
        if (GET_SIZE(HDRP(post_block)) >= size) {   // first node that is at least as large: insert bp before it
            pre_block = GETADDR(PRED_POINT(post_block));
            // set bp's predecessor and successor
            PUTADDR(PRED_POINT(bp), pre_block);
            PUTADDR(SUCC_POINT(bp), post_block);
            // fix up the predecessor
            if (isSegList(pre_block)) {
                PUTADDR(pre_block, bp);
            } else {
                PUTADDR(SUCC_POINT(pre_block), bp);
            }
            // fix up the successor
            PUTADDR(PRED_POINT(post_block), bp);
            return;
        }
        tmp = post_block;  // remember the last node in case bp has to go at the tail
    }
    // bp is the largest block in its class: append it at the tail
    pre_block = tmp;
    // set bp's predecessor and successor
    PUTADDR(PRED_POINT(bp), pre_block);
    PUTADDR(SUCC_POINT(bp), NULL);
    // fix up the predecessor
    if (isSegList(pre_block)) {
        PUTADDR(pre_block, bp);
    } else {
        PUTADDR(SUCC_POINT(pre_block), bp);
    }
}
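
Why keeping each class sorted by size turns first fit into best fit can be seen on a plain array. The following is a minimal sketch, with first_fit_sorted as a hypothetical helper that stands in for the list walk in find_fit and an array standing in for one size class.

```c
#include <stddef.h>

/* Returns the index of the first block that fits, or -1 if none does.
 * When sizes[] is in ascending order, the first block that fits is also
 * the smallest block that fits, i.e. the best fit within this class. */
static int first_fit_sorted(const size_t *sizes, int n, size_t need)
{
    for (int i = 0; i < n; i++) {
        if (sizes[i] >= need)
            return i;
    }
    return -1;
}
```

If a class has no block that fits, the search moves on to the next class, whose blocks are all larger, so the first block found there is again the smallest one available.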

Benchmark

Results for mm malloc:
trace  valid  util     ops      secs  Kops
 0       yes   99%    5694  0.000171 33240
 1       yes   99%    5848  0.000147 39674
 2       yes   99%    6648  0.000192 34661
 3       yes   99%    5380  0.000138 38986
 4       yes   66%   14400  0.000226 63717
 5       yes   96%    4800  0.000343 13978
 6       yes   95%    4800  0.000341 14068
 7       yes   55%   12000  0.000345 34732
 8       yes   51%   24000  0.000608 39487
 9       yes   40%   14401  0.022937   628
10       yes   45%   14401  0.000986 14611
Total          77%  112372  0.026435  4251

Perf index = 46 (util) + 40 (thru) = 86/100

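Analysis of the result

The score is the same as for segregated lists + first fit. Average utilization only moved from 76% to 77%, which is not enough to change the 46-point utilization score.

Summary

Implicit free list + first fit + original realloc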

Results for mm malloc:
trace  valid  util     ops      secs  Kops
 0       yes   99%    5694  0.003636  1566
 1       yes   99%    5848  0.003695  1583
 2       yes   99%    6648  0.005280  1259
 3       yes  100%    5380  0.003648  1475
 4       yes   66%   14400  0.000089 161254
 5       yes   92%    4800  0.004965   967
 6       yes   92%    4800  0.004495  1068
 7       yes   55%   12000  0.086989   138
 8       yes   51%   24000  0.126514   190
 9       yes   27%   14401  0.028150   512
10       yes   30%   14401  0.000831 17330
Total          74%  112372  0.268292   419

Perf index = 44 (util) + 28 (thru) = 72/100

Implicit free list + next fit + original realloc

Results for mm malloc:
trace  valid  util     ops      secs  Kops
 0       yes   90%    5694  0.001078  5280
 1       yes   91%    5848  0.000686  8526
 2       yes   95%    6648  0.001864  3567
 3       yes   96%    5380  0.001754  3067
 4       yes   66%   14400  0.000094 153518
 5       yes   91%    4800  0.002902  1654
 6       yes   89%    4800  0.002691  1783
 7       yes   55%   12000  0.009888  1214
 8       yes   51%   24000  0.003605  6658
 9       yes   27%   14401  0.028398   507
10       yes   45%   14401  0.000729 19754
Total          72%  112372  0.053688  2093

Perf index = 43 (util) + 40 (thru) = 83/100

Explicit free list + LIFO (first fit) + original realloc

# Explicit free list + LIFO
Results for mm malloc:
trace  valid  util     ops      secs  Kops
 0       yes   93%    5694  0.000093 60964
 1       yes   94%    5848  0.000088 66455
 2       yes   96%    6648  0.000150 44290
 3       yes   97%    5380  0.000158 34115
 4       yes   66%   14400  0.000131 109756
 5       yes   89%    4800  0.000306 15686
 6       yes   85%    4800  0.000360 13315
 7       yes   55%   12000  0.001054 11385
 8       yes   51%   24000  0.001897 12652
 9       yes   26%   14401  0.032686   441
10       yes   30%   14401  0.000994 14486
Total          71%  112372  0.037918  2964

Perf index = 43 (util) + 40 (thru) = 83/100

Explicit free list + address order (first fit) + original realloc

# Explicit free list + address-ordered list
Results for mm malloc:
trace  valid  util     ops      secs  Kops
 0       yes   99%    5694  0.000137 41471
 1       yes   99%    5848  0.000123 47506
 2       yes   99%    6648  0.000168 39501
 3       yes   99%    5380  0.000136 39646
 4       yes   66%   14400  0.000129 111369
 5       yes   92%    4800  0.001686  2846
 6       yes   92%    4800  0.001677  2863
 7       yes   55%   12000  0.012042   996
 8       yes   51%   24000  0.071326   336
 9       yes   27%   14401  0.029718   485
10       yes   30%   14401  0.000880 16361
Total          74%  112372  0.118024   952

Perf index = 44 (util) + 40 (thru) = 84/100

Segregated fit + first fit + improved realloc

Results for mm malloc:
trace  valid  util     ops      secs  Kops
 0       yes   98%    5694  0.000287 19833
 1       yes   97%    5848  0.000311 18780
 2       yes   99%    6648  0.000322 20652
 3       yes   99%    5380  0.000278 19339
 4       yes   66%   14400  0.000528 27283
 5       yes   93%    4800  0.000459 10462
 6       yes   90%    4800  0.000377 12719
 7       yes   55%   12000  0.000447 26864
 8       yes   51%   24000  0.001032 23254
 9       yes   45%   14401  0.023348   617
10       yes   45%   14401  0.001142 12609
Total          76%  112372  0.028531  3939

Perf index = 46 (util) + 40 (thru) = 86/100

Segregated fit + best fit + improved realloc

Results for mm malloc:
trace  valid  util     ops      secs  Kops
 0       yes   99%    5694  0.000171 33240
 1       yes   99%    5848  0.000147 39674
 2       yes   99%    6648  0.000192 34661
 3       yes   99%    5380  0.000138 38986
 4       yes   66%   14400  0.000226 63717
 5       yes   96%    4800  0.000343 13978
 6       yes   95%    4800  0.000341 14068
 7       yes   55%   12000  0.000345 34732
 8       yes   51%   24000  0.000608 39487
 9       yes   40%   14401  0.022937   628
10       yes   45%   14401  0.000986 14611
Total          77%  112372  0.026435  4251

Perf index = 46 (util) + 40 (thru) = 86/100

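As the results show, full marks for throughput are easy to reach; memory utilization is what really separates the variants.
Yet even with all of these methods applied, I still cannot work out why memory utilization did not improve further…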
