* Quickstart
This library is all in one file to simplify the most common usage:
ftp it, compile it (-O3), and link it into another program. All of
the compile-time options default to reasonable values for use on
most platforms. You might later want to step through various
compile-time and dynamic tuning options.
For convenience, an include file for code using this malloc is at:
ftp://gee.cs.oswego.edu/pub/misc/malloc-2.8.6.h
You don't really need this .h file unless you call functions not
defined in your system include files. The .h file contains only the
excerpts from this file needed for using this malloc on ANSI C/C++
systems, so long as you haven't changed compile-time options about
naming and tuning parameters. If you do, then you can create your
own malloc.h that does include all settings by cutting at the point
indicated below. Note that you may already by default be using a C
library containing a malloc that is based on some version of this
malloc (for example in linux). You might still want to use the one
in this file to customize settings or to avoid overheads associated
with library versions.
这个库由1个单文件构成,编译时使用-O3优化等级。使用时需要与用户自己的代码进行链接。
一般情况下,不需要直接包含此头文件。系统中可能已经有地方包含了此头文件。
Supported pointer/size_t representation: 4 or 8 bytes
size_t MUST be an unsigned type of the same width as
pointers. (If you are using an ancient system that declares
size_t as a signed type, or need it to be a different width
than pointers, you can use a previous release of this malloc
(e.g. 2.7.2) supporting these.)
size_t和和指针变量的长度必须保持一致。都是4字节或者8字节。如果你的系统不满足这个约束,那么本库需要回退到2.7.2以前的版本才支持。
Alignment: 8 bytes (minimum)
This suffices for nearly all current machines and C compilers.
However, you can define MALLOC_ALIGNMENT to be wider than this
if necessary (up to 128bytes), at the expense of using more space.
可以定义64字节或者128字节对齐。128字节对齐会稍微浪费更多的空间
Minimum overhead per allocated chunk: 4 or 8 bytes (if 4byte sizes)
8 or 16 bytes (if 8byte sizes)
Each malloced chunk has a hidden word of overhead holding size
and status information, and additional cross-check word
if FOOTERS is defined.
每一次申请内存至少要比用户需求多消耗4字节~到16字节内存。用于维护此内存块的信息。
Minimum allocated size: 4-byte ptrs: 16 bytes (including overhead)
8-byte ptrs: 32 bytes (including overhead)
Even a request for zero bytes (i.e., malloc(0)) returns a
pointer to something of the minimum allocatable size.
The maximum overhead wastage (i.e., number of extra bytes
allocated than were requested in malloc) is less than or equal
to the minimum size, except for requests >= mmap_threshold that
are serviced via mmap(), where the worst case wastage is about
32 bytes plus the remainder from a system page (the minimal
mmap unit); typically 4096 or 8192 bytes.
最下的申请内存长度为16字节或者32字节(已包含内存块自身维护数据在内)
Security: static-safe; optionally more or less
The "security" of malloc refers to the ability of malicious
code to accentuate the effects of errors (for example, freeing
space that is not currently malloc'ed or overwriting past the
ends of chunks) in code that calls malloc. This malloc
guarantees not to modify any memory locations below the base of
heap, i.e., static variables, even in the presence of usage
errors. The routines additionally detect most improper frees
and reallocs. All this holds as long as the static bookkeeping
for malloc itself is not corrupted by some other means. This
is only one aspect of security -- these checks do not, and
cannot, detect all possible programming errors.
If FOOTERS is defined nonzero, then each allocated chunk
carries an additional check word to verify that it was malloced
from its space. These check words are the same within each
execution of a program using malloc, but differ across
executions, so externally crafted fake chunks cannot be
freed. This improves security by rejecting frees/reallocs that
could corrupt heap memory, in addition to the checks preventing
writes to statics that are always on. This may further improve
security at the expense of time and space overhead. (Note that
FOOTERS may also be worth using with MSPACES.)
By default detected errors cause the program to abort (calling
"abort()"). You can override this to instead proceed past
errors by defining PROCEED_ON_ERROR. In this case, a bad free
has no effect, and a malloc that encounters a bad address
caused by user overwrites will ignore the bad address by
dropping pointers and indices to all known memory. This may
be appropriate for programs that should continue if at all
possible in the face of programming errors, although they may
run out of memory because dropped memory is never reclaimed.
If you don't like either of these options, you can define
CORRUPTION_ERROR_ACTION and USAGE_ERROR_ACTION to do anything
else. And if if you are sure that your program using malloc has
no errors or vulnerabilities, you can define INSECURE to 1,
which might (or might not) provide a small performance improvement.
It is also possible to limit the maximum total allocatable
space, using malloc_set_footprint_limit. This is not
designed as a security feature in itself (calls to set limits
are not screened or privileged), but may be useful as one
aspect of a secure implementation.
安全性考虑。
静态安全性,或多或少有一些。
如应对释放没有申请的内存,或者内存写越界这样的问题,称为安全性问题。
在本算法中,静态变量是不会被乱写的。内部会检测释放和重分配是否合法。这些检查无法避免所有的错误。
如果FOOTERS被定义了的话,则被分配的内存增加了一个额外的信息用来做校验,这些数据在一次内存分配后,下一次内存分配前不会修订。如果在释放的时候发现这个数据被改了,则拒绝释放,否则会破坏堆内存。
缺省情况下,如果malloc发现内存被篡改,则会推出进程(abort)。用户也可以自定义处理方式,让程序继续运行而不推出,但这样做,会导致后期内存逐渐耗尽,因为被篡改的内存不会回收。空闲内存会逐渐减少。
最后最好限制一下可以申请的内存空间的总大小。
Thread-safety: NOT thread-safe unless USE_LOCKS defined non-zero
When USE_LOCKS is defined, each public call to malloc, free,
etc is surrounded with a lock. By default, this uses a plain
pthread mutex, win32 critical section, or a spin-lock if if
available for the platform and not disabled by setting
USE_SPIN_LOCKS=0. However, if USE_RECURSIVE_LOCKS is defined,
recursive versions are used instead (which are not required for
base functionality but may be needed in layered extensions).
Using a global lock is not especially fast, and can be a major
bottleneck. It is designed only to provide minimal protection
in concurrent environments, and to provide a basis for
extensions. If you are using malloc in a concurrent program,
consider instead using nedmalloc
(http://www.nedprod.com/programs/portable/nedmalloc/) or
ptmalloc (See http://www.malloc.de), which are derived from
versions of this malloc.
在高并发环境下,考虑使用其他的malloc库而不是当前的这个库,因为这个库的锁的粒度比较粗,性能较低。
System requirements: Any combination of MORECORE and/or MMAP/MUNMAP
This malloc can use unix sbrk or any emulation (invoked using
the CALL_MORECORE macro) and/or mmap/munmap or any emulation
(invoked using CALL_MMAP/CALL_MUNMAP) to get and release system
memory. On most unix systems, it tends to work best if both
MORECORE and MMAP are enabled. On Win32, it uses emulations
based on VirtualAlloc. It also uses common C library functions
like memset.
Compliance: I believe it is compliant with the Single Unix Specification
(See http://www.unix.org). Also SVID/XPG, ANSI C, and probably
others as well.
系统需求和兼容性。仅依赖于2到3个系统调用或者等价体。兼容C标准和单一UNIX规范。
* Overview of algorithms
This is not the fastest, most space-conserving, most portable, or
most tunable malloc ever written. However it is among the fastest
while also being among the most space-conserving, portable and
tunable. Consistent balance across these factors results in a good
general-purpose allocator for malloc-intensive programs.
In most ways, this malloc is a best-fit allocator. Generally, it
chooses the best-fitting existing chunk for a request, with ties
broken in approximately least-recently-used order. (This strategy
normally maintains low fragmentation.) However, for requests less
than 256bytes, it deviates from best-fit when there is not an
exactly fitting available chunk by preferring to use space adjacent
to that used for the previous small request, as well as by breaking
ties in approximately most-recently-used order. (These enhance
locality of series of small allocations.) And for very large requests
(>= 256Kb by default), it relies on system memory mapping
facilities, if supported. (This helps avoid carrying around and
possibly fragmenting memory used only for large chunks.)
All operations (except malloc_stats and mallinfo) have execution
times that are bounded by a constant factor of the number of bits in
a size_t, not counting any clearing in calloc or copying in realloc,
or actions surrounding MORECORE and MMAP that have times
proportional to the number of non-contiguous regions returned by
system allocation routines, which is often just 1. In real-time
applications, you can optionally suppress segment traversals using
NO_SEGMENT_TRAVERSAL, which assures bounded execution even when
system allocators return non-contiguous spaces, at the typical
expense of carrying around more memory and increased fragmentation.
The implementation is not very modular and seriously overuses
macros. Perhaps someday all C compilers will do as good a job
inlining modular code as can now be done by brute-force expansion,
but now, enough of them seem not to.
Some compilers issue a lot of warnings about code that is
dead/unreachable only on some platforms, and also about intentional
uses of negation on unsigned types. All known cases of each can be
ignored.
For a longer but out of date high-level description, see
http://gee.cs.oswego.edu/dl/html/malloc.html
这是一个性价比较高的内存分配算法。
它尽量从最近刚释放的内存中分配,增加cache命中率
大多数函数的执行时间是常量时间。
* MSPACES
If MSPACES is defined, then in addition to malloc, free, etc.,
this file also defines mspace_malloc, mspace_free, etc. These
are versions of malloc routines that take an "mspace" argument
obtained using create_mspace, to control all internal bookkeeping.
If ONLY_MSPACES is defined, only these versions are compiled.
So if you would like to use this allocator for only some allocations,
and your system malloc for others, you can compile with
ONLY_MSPACES and then do something like...
static mspace mymspace = create_mspace(0,0); // for example
#define mymalloc(bytes) mspace_malloc(mymspace, bytes)
(Note: If you only need one instance of an mspace, you can instead
use "USE_DL_PREFIX" to relabel the global malloc.)
You can similarly create thread-local allocators by storing
mspaces as thread-locals. For example:
static __thread mspace tlms = 0;
void* tlmalloc(size_t bytes) {
if (tlms == 0) tlms = create_mspace(0, 0);
return mspace_malloc(tlms, bytes);
}
void tlfree(void* mem) { mspace_free(tlms, mem); }
Unless FOOTERS is defined, each mspace is completely independent.
You cannot allocate from one and free to another (although
conformance is only weakly checked, so usage errors are not always
caught). If FOOTERS is defined, then each chunk carries around a tag
indicating its originating mspace, and frees are directed to their
originating spaces. Normally, this requires use of locks.
内存的申请和释放可以由多个mspace来进行管理
/* --- begin vpp customizations --- */
#if CLIB_DEBUG > 0
#define FOOTERS 1 /* extra debugging */
#define DLM_MAGIC_CONSTANT 0xdeaddabe
#endif
#define USE_LOCKS 1
#define DLM_ABORT {extern void os_panic(void); os_panic(); abort();}
#define ONLY_MSPACES 1
/* --- end vpp customizations --- */
在开启debug的情况下,每个内存块记录魔鬼数字,用于检查内存是否被写越界。
缺省的错误处理方法是使进程退出
/*
malloc(size_t n)
Returns a pointer to a newly allocated chunk of at least n bytes, or
null if no space is available, in which case errno is set to ENOMEM
on ANSI C systems.
If n is zero, malloc returns a minimum-sized chunk. (The minimum
size is 16 bytes on most 32bit systems, and 32 bytes on 64bit
systems.) Note that size_t is an unsigned type, so calls with
arguments that would be negative if signed are interpreted as
requests for huge amounts of space, which will often fail. The
maximum supported value of n differs across systems, but is in all
cases less than the maximum representable value of a size_t.
*/
DLMALLOC_EXPORT void* dlmalloc(size_t);
申请给定大小的内存空间,返回内存空间的首地址。大小参数不要填成了负数,否则会当成正数解释,而去获取一个很大的内存,导致失败。
/*
free(void* p)
Releases the chunk of memory pointed to by p, that had been previously
allocated using malloc or a related routine such as realloc.
It has no effect if p is null. If p was not malloced or already
freed, free(p) will by default cause the current program to abort.
*/
DLMALLOC_EXPORT void dlfree(void*);
释放之前申请的内存块。如果重复释放或者释放不存在的内存块,则进程一般情况下会退出。
/*
calloc(size_t n_elements, size_t element_size);
Returns a pointer to n_elements * element_size bytes, with all locations
set to zero.
*/
DLMALLOC_EXPORT void* dlcalloc(size_t, size_t);
申请1个数组的内存,并初始化所有元素为0
/*
realloc(void* p, size_t n)
Returns a pointer to a chunk of size n that contains the same data
as does chunk p up to the minimum of (n, p's size) bytes, or null
if no space is available.
The returned pointer may or may not be the same as p. The algorithm
prefers extending p in most cases when possible, otherwise it
employs the equivalent of a malloc-copy-free sequence.
If p is null, realloc is equivalent to malloc.
If space is not available, realloc returns null, errno is set (if on
ANSI) and p is NOT freed.
if n is for fewer bytes than already held by p, the newly unused
space is lopped off and freed if possible. realloc with a size
argument of zero (re)allocates a minimum-sized chunk.
The old unix realloc convention of allowing the last-free'd chunk
to be used as an argument to realloc is not supported.
*/
DLMALLOC_EXPORT void* dlrealloc(void*, size_t);
重新申请内存大小,即内存伸缩。可以简单理解成申请一块新内存,将旧内存的数据拷贝到新内存,释放旧内存。
/*
realloc_in_place(void* p, size_t n)
Resizes the space allocated for p to size n, only if this can be
done without moving p (i.e., only if there is adjacent space
available if n is greater than p's current allocated size, or n is
less than or equal to p's size). This may be used instead of plain
realloc if an alternative allocation strategy is needed upon failure
to expand space; for example, reallocation of a buffer that must be
memory-aligned or cleared. You can use realloc_in_place to trigger
these alternatives only when needed.
Returns p if successful; otherwise null.
*/
DLMALLOC_EXPORT void* dlrealloc_in_place(void*, size_t);
与上一个函数语义相同,不同的地方是,直接在当前内存上进行伸缩,而不是重新申请内存,效率较高。但这个容易失败,特别是内存扩展的时候,如果旁边的内存块不是空闲的话,想一想?
/*
memalign(size_t alignment, size_t n);
Returns a pointer to a newly allocated chunk of n bytes, aligned
in accord with the alignment argument.
The alignment argument should be a power of two. If the argument is
not a power of two, the nearest greater power is used.
8-byte alignment is guaranteed by normal malloc calls, so don't
bother calling memalign with an argument of 8 or less.
Overreliance on memalign is a sure way to fragment space.
*/
DLMALLOC_EXPORT void* dlmemalign(size_t, size_t);
申请一个有字节对齐要求的内存块。
/*
int posix_memalign(void** pp, size_t alignment, size_t n);
Allocates a chunk of n bytes, aligned in accord with the alignment
argument. Differs from memalign only in that it (1) assigns the
allocated memory to *pp rather than returning it, (2) fails and
returns EINVAL if the alignment is not a power of two (3) fails and
returns ENOMEM if memory cannot be allocated.
*/
DLMALLOC_EXPORT int dlposix_memalign(void**, size_t, size_t);
与上一个函数类似,内存块由输出参数返回,函数本身返回错误代码。
/*
valloc(size_t n);
Equivalent to memalign(pagesize, n), where pagesize is the page
size of the system. If the pagesize is unknown, 4096 is used.
*/
DLMALLOC_EXPORT void* dlvalloc(size_t);
同memalign, pagesize参数取系统默认值。
/*
mallopt(int parameter_number, int parameter_value)
Sets tunable parameters The format is to provide a
(parameter-number, parameter-value) pair. mallopt then sets the
corresponding parameter to the argument value if it can (i.e., so
long as the value is meaningful), and returns 1 if successful else
0. To workaround the fact that mallopt is specified to use int,
not size_t parameters, the value -1 is specially treated as the
maximum unsigned size_t value.
SVID/XPG/ANSI defines four standard param numbers for mallopt,
normally defined in malloc.h. None of these are use in this malloc,
so setting them has no effect. But this malloc also supports other
options in mallopt. See below for details. Briefly, supported
parameters are as follows (listed defaults are for "typical"
configurations).
Symbol param # default allowed param values
M_TRIM_THRESHOLD -1 2*1024*1024 any (-1 disables)
M_GRANULARITY -2 page size any power of 2 >= page size
M_MMAP_THRESHOLD -3 256*1024 any (or 0 if no MMAP support)
*/
DLMALLOC_EXPORT int dlmallopt(int, int);
设置malloc的全局参数,会影响malloc的行为。
/*
malloc_footprint();
Returns the number of bytes obtained from the system. The total
number of bytes allocated by malloc, realloc etc., is less than this
value. Unlike mallinfo, this function returns only a precomputed
result, so can be called frequently to monitor memory consumption.
Even if locks are otherwise defined, this function does not use them,
so results might not be up to date.
*/
DLMALLOC_EXPORT size_t dlmalloc_footprint(void);
当前malloc内存用量,这个统计不是实时的,有点延迟,大致准确。但是多线程安全。
/*
malloc_max_footprint();
Returns the maximum number of bytes obtained from the system. This
value will be greater than current footprint if deallocated space
has been reclaimed by the system. The peak number of bytes allocated
by malloc, realloc etc., is less than this value. Unlike mallinfo,
this function returns only a precomputed result, so can be called
frequently to monitor memory consumption. Even if locks are
otherwise defined, this function does not use them, so results might
not be up to date.
*/
DLMALLOC_EXPORT size_t dlmalloc_max_footprint(void);
malloc用量的历史最大值
/*
malloc_footprint_limit();
Returns the number of bytes that the heap is allowed to obtain from
the system, returning the last value returned by
malloc_set_footprint_limit, or the maximum size_t value if
never set. The returned value reflects a permission. There is no
guarantee that this number of bytes can actually be obtained from
the system.
*/
DLMALLOC_EXPORT size_t dlmalloc_footprint_limit();
malloc用量的理论极限值。
/*
malloc_set_footprint_limit();
Sets the maximum number of bytes to obtain from the system, causing
failure returns from malloc and related functions upon attempts to
exceed this value. The argument value may be subject to page
rounding to an enforceable limit; this actual value is returned.
Using an argument of the maximum possible size_t effectively
disables checks. If the argument is less than or equal to the
current malloc_footprint, then all future allocations that require
additional system memory will fail. However, invocation cannot
retroactively deallocate existing used memory.
*/
DLMALLOC_EXPORT size_t dlmalloc_set_footprint_limit(size_t bytes);
设置malloc用量的理论极限值
/*
malloc_inspect_all(void(*handler)(void *start,
void *end,
size_t used_bytes,
void* callback_arg),
void* arg);
Traverses the heap and calls the given handler for each managed
region, skipping all bytes that are (or may be) used for bookkeeping
purposes. Traversal does not include include chunks that have been
directly memory mapped. Each reported region begins at the start
address, and continues up to but not including the end address. The
first used_bytes of the region contain allocated data. If
used_bytes is zero, the region is unallocated. The handler is
invoked with the given callback argument. If locks are defined, they
are held during the entire traversal. It is a bad idea to invoke
other malloc functions from within the handler.
For example, to count the number of in-use chunks with size greater
than 1000, you could write:
static int count = 0;
void count_chunks(void* start, void* end, size_t used, void* arg) {
if (used >= 1000) ++count;
}
then:
malloc_inspect_all(count_chunks, NULL);
malloc_inspect_all is compiled only if MALLOC_INSPECT_ALL is defined.
*/
遍历所有的malloc内存块,执行指定的函数。
/*
mallinfo()
Returns (by copy) a struct containing various summary statistics:
arena: current total non-mmapped bytes allocated from system
ordblks: the number of free chunks
smblks: always zero.
hblks: current number of mmapped regions
hblkhd: total bytes held in mmapped regions
usmblks: the maximum total allocated space. This will be greater
than current total if trimming has occurred.
fsmblks: always zero
uordblks: current total allocated space (normal or mmapped)
fordblks: total free space
keepcost: the maximum number of bytes that could ideally be released
back to system via malloc_trim. ("ideally" means that
it ignores page restrictions etc.)
Because these fields are ints, but internal bookkeeping may
be kept as longs, the reported values may wrap around zero and
thus be inaccurate.
*/
DLMALLOC_EXPORT struct dlmallinfo dlmallinfo(void);
获取malloc相关的统计信息
/*
independent_calloc(size_t n_elements, size_t element_size, void* chunks[]);
independent_calloc is similar to calloc, but instead of returning a
single cleared space, it returns an array of pointers to n_elements
independent elements that can hold contents of size elem_size, each
of which starts out cleared, and can be independently freed,
realloc'ed etc. The elements are guaranteed to be adjacently
allocated (this is not guaranteed to occur with multiple callocs or
mallocs), which may also improve cache locality in some
applications.
The "chunks" argument is optional (i.e., may be null, which is
probably the most typical usage). If it is null, the returned array
is itself dynamically allocated and should also be freed when it is
no longer needed. Otherwise, the chunks array must be of at least
n_elements in length. It is filled in with the pointers to the
chunks.
In either case, independent_calloc returns this pointer array, or
null if the allocation failed. If n_elements is zero and "chunks"
is null, it returns a chunk representing an array with zero elements
(which should be freed if not wanted).
Each element must be freed when it is no longer needed. This can be
done all at once using bulk_free.
independent_calloc simplifies and speeds up implementations of many
kinds of pools. It may also be useful when constructing large data
structures that initially have a fixed number of fixed-sized nodes,
but the number is not known at compile time, and some of the nodes
may later need to be freed. For example:
struct Node { int item; struct Node* next; };
struct Node* build_list() {
struct Node** pool;
int n = read_number_of_nodes_needed();
if (n <= 0) return 0;
pool = (struct Node**)(independent_calloc(n, sizeof(struct Node), 0);
if (pool == 0) die();
// organize into a linked list...
struct Node* first = pool[0];
for (i = 0; i < n-1; ++i)
pool[i]->next = pool[i+1];
free(pool); // Can now free the array (or not, if it is needed later)
return first;
}
*/
DLMALLOC_EXPORT void** dlindependent_calloc(size_t, size_t, void**);
申请数组内存,以指针数组的形式返回。实际上申请了2级内存。
/*
independent_comalloc(size_t n_elements, size_t sizes[], void* chunks[]);
independent_comalloc allocates, all at once, a set of n_elements
chunks with sizes indicated in the "sizes" array. It returns
an array of pointers to these elements, each of which can be
independently freed, realloc'ed etc. The elements are guaranteed to
be adjacently allocated (this is not guaranteed to occur with
multiple callocs or mallocs), which may also improve cache locality
in some applications.
The "chunks" argument is optional (i.e., may be null). If it is null
the returned array is itself dynamically allocated and should also
be freed when it is no longer needed. Otherwise, the chunks array
must be of at least n_elements in length. It is filled in with the
pointers to the chunks.
In either case, independent_comalloc returns this pointer array, or
null if the allocation failed. If n_elements is zero and chunks is
null, it returns a chunk representing an array with zero elements
(which should be freed if not wanted).
Each element must be freed when it is no longer needed. This can be
done all at once using bulk_free.
independent_comallac differs from independent_calloc in that each
element may have a different size, and also that it does not
automatically clear elements.
independent_comalloc can be used to speed up allocation in cases
where several structs or objects must always be allocated at the
same time. For example:
struct Head { ... }
struct Foot { ... }
void send_message(char* msg) {
int msglen = strlen(msg);
size_t sizes[3] = { sizeof(struct Head), msglen, sizeof(struct Foot) };
void* chunks[3];
if (independent_comalloc(3, sizes, chunks) == 0)
die();
struct Head* head = (struct Head*)(chunks[0]);
char* body = (char*)(chunks[1]);
struct Foot* foot = (struct Foot*)(chunks[2]);
// ...
}
In general though, independent_comalloc is worth using only for
larger values of n_elements. For small values, you probably won't
detect enough difference from series of malloc calls to bother.
Overuse of independent_comalloc can increase overall memory usage,
since it cannot reuse existing noncontiguous small chunks that
might be available for some of the elements.
*/
DLMALLOC_EXPORT void** dlindependent_comalloc(size_t, size_t*, void**);
独立的通用内存申请。比独立的指针数组二级内存申请更加灵活。
相当于指针数组中,每一个指针元素指向的内存大小可以不同。
/*
bulk_free(void* array[], size_t n_elements)
Frees and clears (sets to null) each non-null pointer in the given
array. This is likely to be faster than freeing them one-by-one.
If footers are used, pointers that have been allocated in different
mspaces are not freed or cleared, and the count of all such pointers
is returned. For large arrays of pointers with poor locality, it
may be worthwhile to sort this array before calling bulk_free.
*/
DLMALLOC_EXPORT size_t dlbulk_free(void**, size_t n_elements);
释放一组内存,这些内存由指针数组记录。