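The virtual address space of a Linux process is laid out as shown in the figure: it is split into kernel space and user (process) space. On a 32-bit operating system the 4 GB address space is divided in two, with the low addresses 0 to 3 GB given to user space and the high addresses 3 GB to 4 GB given to kernel space.
Image source: http://www.dongcoder.com/detail-1060768.html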
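From the operating system's point of view, a process obtains memory in two ways, each backed by a system call: brk and mmap (shared memory is not considered here).
1) brk pushes _edata, the pointer to the highest address of the data segment (.data), toward higher addresses. In practice the pointer being moved is the program break, that is, the top of the heap.
2) mmap finds a free range of virtual memory in the process's virtual address space (in the area between the heap and the stack known as the "file mapping region").
Both calls hand out virtual memory only; no physical memory is allocated yet. The first access to an address in the newly allocated range raises a page fault, at which point the operating system allocates physical memory and establishes the mapping between virtual and physical memory (the translation itself is normally handled by the MMU hardware).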
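The lazy behaviour described above can be observed from Go. The following is a minimal sketch, assuming a Linux or other Unix-like system where the `syscall` package exposes `Mmap` and `MAP_ANON`; the helper name `reserveAnon` and the 1 MiB size are illustrative choices, not part of the original article.

```go
package main

import (
	"fmt"
	"syscall"
)

// reserveAnon (hypothetical helper) asks the kernel for an anonymous,
// private mapping of n bytes. The call hands out virtual address space;
// physical pages are attached later, when the pages are first touched.
func reserveAnon(n int) ([]byte, error) {
	return syscall.Mmap(-1, 0, n,
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
}

func main() {
	const size = 1 << 20 // 1 MiB, an arbitrary size for the demo
	mem, err := reserveAnon(size)
	if err != nil {
		fmt.Println("mmap failed:", err)
		return
	}
	defer syscall.Munmap(mem)

	// Writing one byte per page touches every page once. Each first
	// touch is where the kernel is expected to take a page fault and
	// back the page with physical memory.
	pageSize := syscall.Getpagesize()
	for off := 0; off < len(mem); off += pageSize {
		mem[off] = 1
	}
	fmt.Println("mapped and touched bytes:", len(mem))
}
```

Watching the process's resident set size before and after the loop (for example with `top`) is one way to see the physical memory appear only after the writes.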
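1) When the requested size is below 128 KB, malloc is backed by the brk() system call, which mainly moves the _enddata pointer (here _enddata means the end address of the heap segment in the Linux address space, not the end of the data segment).
2) When the requested size is above 128 KB, malloc uses the mmap() system call to find a range in the virtual address space (in the area between the heap and the stack known as the "file mapping region").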
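The two rules above amount to a single size comparison. Here is a small sketch of that decision; `backingCall` and `mmapThreshold` are hypothetical names, and the choice to send a request of exactly 128 KB to mmap follows glibc's documented M_MMAP_THRESHOLD behaviour rather than anything stated in the text. A real malloc also serves many requests from its free lists without any system call, which this sketch ignores.

```go
package main

import "fmt"

// mmapThreshold mirrors the 128 KB boundary described above
// (glibc's default M_MMAP_THRESHOLD).
const mmapThreshold = 128 * 1024

// backingCall (hypothetical helper) reports which system call would
// back a fresh request of the given size under the simplified rule.
func backingCall(size int) string {
	if size >= mmapThreshold {
		return "mmap"
	}
	return "brk"
}

func main() {
	for _, size := range []int{64, 100 * 1024, 128 * 1024, 1 << 20} {
		fmt.Printf("%8d bytes -> %s\n", size, backingCall(size))
	}
}
```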
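TCMalloc, short for Thread-Caching Malloc, is a memory allocator developed by Google. It is used in many projects, and Golang, for example, uses a similar algorithm for its own memory allocation. It has the basic traits of a modern allocator: it resists memory fragmentation and it scales on multi-core processors. Its allocation speed is reported to be several times that of the malloc shipped with glibc 2.3. It implements efficient multi-threaded memory management and is meant to replace the allocation functions (malloc, free, new, new[], and so on) provided by libraries such as glibc.
TCMalloc's memory management strategy can be summarized in one figure (source: https://wallenwang.com/2018/11/tcmalloc/)
TCMalloc is part of gperftools. Besides TCMalloc, gperftools also includes heap-checker, heap-profiler, and cpu-profiler.
Official documentation: http://goog-perftools.sourceforge.net/doc/tcmalloc.html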
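The core idea behind thread caching can be shown with a toy model. This is a sketch only, not TCMalloc's actual code: objects are represented by integer IDs, there is a single size class, and the names `centralList`, `threadCache`, and the batch size of 4 are all made up for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

const batch = 4 // objects fetched from the central list per refill (arbitrary)

// centralList is shared by all threads, so every access takes a lock.
type centralList struct {
	mu   sync.Mutex
	next int // next fresh object ID to hand out
}

// grab returns n fresh object IDs under the lock.
func (c *centralList) grab(n int) []int {
	c.mu.Lock()
	defer c.mu.Unlock()
	out := make([]int, 0, n)
	for len(out) < n {
		out = append(out, c.next)
		c.next++
	}
	return out
}

// threadCache is owned by exactly one thread, so its fast path
// needs no locking at all.
type threadCache struct {
	central *centralList
	local   []int // free objects cached for this thread
	refills int   // how many times the slow path was taken
}

// alloc pops from the local list and only falls back to the
// central list (and its lock) when the local list is empty.
func (t *threadCache) alloc() int {
	if len(t.local) == 0 {
		t.local = t.central.grab(batch)
		t.refills++
	}
	obj := t.local[len(t.local)-1]
	t.local = t.local[:len(t.local)-1]
	return obj
}

// free returns an object to the local list, again without locking.
func (t *threadCache) free(obj int) { t.local = append(t.local, obj) }

func main() {
	tc := &threadCache{central: &centralList{}}
	for i := 0; i < 10; i++ {
		tc.alloc()
	}
	fmt.Println("allocations: 10, trips to the central list:", tc.refills)
}
```

The point of the design is visible in the counters: most allocations are satisfied from the thread-local list, and the lock is taken only once per batch.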
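Golang's memory allocator is based on the TCMalloc design. At startup, a Go program requests one large block of memory from the operating system (the initial heap is probably around 64 MB) to serve as a memory pool. This memory is managed by a struct named mheap, which divides the block into regions (spans, bitmap, arena) and cuts part of it into suitably sized pieces that are handed out to user code.
mcache is the runtime's local allocation pool: each thread has its own local memory cache (in the Go scheduler the mcache is attached to each P). This is a key factor in goroutine concurrency, because small objects can be allocated straight from the mcache without taking a lock.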
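One way to see the memory the runtime has already obtained from the operating system is `runtime.ReadMemStats`. The sketch below prints two of its fields; the helper name `heapSnapshot` is illustrative, and the numbers will differ between Go versions and platforms, so no particular output is implied.

```go
package main

import (
	"fmt"
	"runtime"
)

// heapSnapshot (hypothetical helper) returns how many bytes of heap
// memory the runtime has obtained from the OS (HeapSys) and how many
// of those bytes sit in spans that are currently in use (HeapInuse).
func heapSnapshot() (sys, inuse uint64) {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapSys, m.HeapInuse
}

func main() {
	sys, inuse := heapSnapshot()
	fmt.Printf("HeapSys=%d KiB HeapInuse=%d KiB\n", sys/1024, inuse/1024)
}
```

Even in a program that has barely allocated anything, HeapSys is typically much larger than HeapInuse, which matches the pool-style reservation described above.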
When a Go program initializes, the runtime sets up the memory manager by calling mallocinit(),
which is defined in src/runtime/malloc.go:
func mallocinit() {
	if class_to_size[_TinySizeClass] != _TinySize {
		throw("bad TinySizeClass")
	}
	testdefersizes()
	if heapArenaBitmapBytes&(heapArenaBitmapBytes-1) != 0 {
		// heapBits expects modular arithmetic on bitmap
		// addresses to work.
		throw("heapArenaBitmapBytes not a power of 2")
	}
	// Copy class sizes out for statistics table.
	for i := range class_to_size {
		memstats.by_size[i].size = uint32(class_to_size[i])
	}
	// Check physPageSize.
	if physPageSize == 0 {
		// The OS init code failed to fetch the physical page size.
		throw("failed to get system page size")
	}
	if physPageSize < minPhysPageSize {
		print("system page size (", physPageSize, ") is smaller than minimum page size (", minPhysPageSize, ")\n")
		throw("bad system page size")
	}
	if physPageSize&(physPageSize-1) != 0 {
		print("system page size (", physPageSize, ") must be a power of 2\n")
		throw("bad system page size")
	}
	if physHugePageSize&(physHugePageSize-1) != 0 {
		print("system huge page size (", physHugePageSize, ") must be a power of 2\n")
		throw("bad system huge page size")
	}
	if physHugePageSize != 0 {
		// Since physHugePageSize is a power of 2, it suffices to increase
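		// physHugePageShift until 1<<physHugePageShift == physHugePageSize.
		for 1<<physHugePageShift != physHugePageSize {
			physHugePageShift++
		}
	}
	// ... (heap initialization lines are missing from this listing;
	// the two lines below are reconstructed from the upstream runtime source) ...
	if sys.PtrSize == 8 {
		for i := 0x7f; i >= 0; i-- {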
			var p uintptr
			switch {
			case GOARCH == "arm64" && GOOS == "darwin":
				p = uintptr(i)<<40 | uintptrMask&(0x0013<<28)
			case GOARCH == "arm64":
				p = uintptr(i)<<40 | uintptrMask&(0x0040<<32)
			case GOOS == "aix":
				if i == 0 {
					// We don't use addresses directly after 0x0A00000000000000
					// to avoid collisions with others mmaps done by non-go programs.
					continue
				}
				p = uintptr(i)<<40 | uintptrMask&(0xa0<<52)
			case raceenabled:
				// The TSAN runtime requires the heap
				// to be in the range [0x00c000000000,
				// 0x00e000000000).
				p = uintptr(i)<<32 | uintptrMask&(0x00c0<<32)
				if p >= uintptrMask&0x00e000000000 {
					continue
				}
			default:
				p = uintptr(i)<<40 | uintptrMask&(0x00c0<<32)
			}
			hint := (*arenaHint)(mheap_.arenaHintAlloc.alloc())
			hint.addr = p
			hint.next, mheap_.arenaHints = mheap_.arenaHints, hint
		}
	} else {
		// On a 32-bit machine, we're much more concerned
		// about keeping the usable heap contiguous.
		// Hence:
		//
		// 1. We reserve space for all heapArenas up front so
		// they don't get interleaved with the heap. They're
		// ~258MB, so this isn't too bad. (We could reserve a
		// smaller amount of space up front if this is a
		// problem.)
		//
		// 2. We hint the heap to start right above the end of
		// the binary so we have the best chance of keeping it
		// contiguous.
		//
		// 3. We try to stake out a reasonably large initial
		// heap reservation.
		const arenaMetaSize = (1 << arenaBits) * unsafe.Sizeof(heapArena{})
		meta := uintptr(sysReserve(nil, arenaMetaSize))
		if meta != 0 {
			mheap_.heapArenaAlloc.init(meta, arenaMetaSize)
		}
		// We want to start the arena low, but if we're linked
		// against C code, it's possible global constructors
		// have called malloc and adjusted the process' brk.
		// Query the brk so we can avoid trying to map the
		// region over it (which will cause the kernel to put
		// the region somewhere else, likely at a high
		// address).
		procBrk := sbrk0()
		// If we ask for the end of the data segment but the
		// operating system requires a little more space
		// before we can start allocating, it will give out a
		// slightly higher pointer. Except QEMU, which is
		// buggy, as usual: it won't adjust the pointer
		// upward. So adjust it upward a little bit ourselves:
		// 1/4 MB to get away from the running binary image.
		p := firstmoduledata.end
		if p < procBrk {
			p = procBrk
		}
		if mheap_.heapArenaAlloc.next <= p && p < mheap_.heapArenaAlloc.end {
			p = mheap_.heapArenaAlloc.end
		}
		p = alignUp(p+(256<<10), heapArenaBytes)
		// Because we're worried about fragmentation on
		// 32-bit, we try to make a large initial reservation.
		arenaSizes := []uintptr{
			512 << 20,
			256 << 20,
			128 << 20,
		}
		for _, arenaSize := range arenaSizes {
			a, size := sysReserveAligned(unsafe.Pointer(p), arenaSize, heapArenaBytes)
			if a != nil {
				mheap_.arena.init(uintptr(a), size)
				p = uintptr(a) + size // For hint below
				break
			}
		}
		hint := (*arenaHint)(mheap_.arenaHintAlloc.alloc())
		hint.addr = p
		hint.next, mheap_.arenaHints = mheap_.arenaHints, hint
	}
}
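The default 64-bit case in the listing computes each arena hint as `uintptr(i)<<40 | uintptrMask&(0x00c0<<32)`. The sketch below reproduces just that arithmetic with plain `uint64` values so it can be run outside the runtime. The function name `arenaHint` is illustrative, and the index range 0 to 0x7f is an assumption based on the upstream runtime source.

```go
package main

import "fmt"

// arenaHint (hypothetical helper) mirrors the default-case formula
// from mallocinit on a 64-bit platform, where uintptrMask is all ones
// and can therefore be dropped.
func arenaHint(i uint64) uint64 {
	return i<<40 | 0x00c0<<32
}

func main() {
	// Index 0 gives the lowest hint; larger indexes move the hint
	// up in steps of 1<<40 bytes (1 TiB).
	for _, i := range []uint64{0, 1, 0x7f} {
		fmt.Printf("i=%#04x -> hint %#014x\n", i, arenaHint(i))
	}
}
```

Because each new hint is pushed onto the front of mheap_.arenaHints and the loop counts down, the hint for i = 0 (0x00c000000000) ends up at the head of the list. That is consistent with the heap pointers printed in the gdb session in this article, such as 0xc0000aa020, which lie just above that address.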
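As mentioned earlier, a Go program requests a block of memory during initialization and manages it through mheap, as shown in the figure below:
The key parts of mheap are: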
type mheap struct {
	lock  mutex
	spans []*mspan

	// Malloc stats.
	largealloc  uint64 // bytes allocated for large objects
	nlargealloc uint64 // number of large object allocations
	largefree   uint64 // bytes freed for large objects (>maxsmallsize)
	nlargefree  uint64 // number of frees for large objects (>maxsmallsize)

	// range of addresses we might see in the heap
	bitmap         uintptr // Points to one byte past the end of the bitmap
	bitmap_mapped  uintptr
	arena_start    uintptr
	arena_used     uintptr // Set with setArenaUsed.
	arena_alloc    uintptr
	arena_end      uintptr
	arena_reserved bool

	central [numSpanClasses]struct {
		mcentral mcentral
		pad      [sys.CacheLineSize - unsafe.Sizeof(mcentral{})%sys.CacheLineSize]byte
	}
}
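The `pad` field in the `central` array deserves a note: it rounds every array element up to a multiple of the cache line size, so that two mcentral locks never share a cache line. Here is a standalone sketch of the same trick. The 64-byte line size and the `counter` type are assumptions made for the example; the runtime uses `sys.CacheLineSize` and `mcentral` instead.

```go
package main

import (
	"fmt"
	"unsafe"
)

const cacheLineSize = 64 // assumed value; common on x86-64

// counter stands in for mcentral: some small piece of state that
// different CPUs update independently.
type counter struct {
	n    uint64
	hits uint32
}

// paddedCounter uses the same expression as mheap.central to pad
// each element up to a whole number of cache lines.
type paddedCounter struct {
	c   counter
	pad [cacheLineSize - unsafe.Sizeof(counter{})%cacheLineSize]byte
}

// paddedSize reports the size of one padded element in bytes.
func paddedSize() uintptr { return unsafe.Sizeof(paddedCounter{}) }

func main() {
	fmt.Println("counter size:", unsafe.Sizeof(counter{}), "bytes")
	fmt.Println("padded size: ", paddedSize(), "bytes")
}
```

One property of this expression is that a struct whose size is already an exact multiple of the line size still receives a full extra line of padding, which costs some memory but keeps the formula simple.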
Small objects:
Large objects:
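The split between small and large objects comes down to a size check at allocation time. The sketch below is a simplified model of that check, assuming the thresholds used by the Go runtime of this era (16 bytes for pointer-free tiny objects, 32 KB for small objects); the function `allocPath` is hypothetical and the real mallocgc does considerably more work.

```go
package main

import "fmt"

const (
	maxTinySize  = 16       // assumed tiny-allocator limit, in bytes
	maxSmallSize = 32 << 10 // assumed small-object limit, 32 KB
)

// allocPath (hypothetical helper) names the path an allocation of the
// given size would take in this simplified model:
//   - tiny:  pointer-free objects under 16 bytes, packed together
//   - small: served from per-size-class spans via mcache
//   - large: allocated directly from mheap
func allocPath(size uintptr, hasPointers bool) string {
	switch {
	case size > maxSmallSize:
		return "large"
	case !hasPointers && size < maxTinySize:
		return "tiny"
	default:
		return "small"
	}
}

func main() {
	fmt.Println(allocPath(8, false))      // pointer-free 8-byte value
	fmt.Println(allocPath(8, true))       // 8-byte pointer
	fmt.Println(allocPath(4096, true))    // 4 KB struct
	fmt.Println(allocPath(64<<10, false)) // 64 KB buffer
}
```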
Code:
package main

import (
	"fmt"
)

func main() {
	aa := 1
	fmt.Println(aa)
	i := getObj()
	fmt.Println(i)
	ii := getHeapObj()
	fmt.Println(ii, *ii)
}

// getObj returns a value; i is allocated on the stack.
func getObj() int {
	i := 2
	return i
}

// getHeapObj returns a pointer; i is allocated on the heap.
func getHeapObj() *int {
	i := 3
	return &i
}
Debugging session:
gdb newobj
GNU gdb (GDB) 8.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin17.0.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from newobj...done.
Loading Go Runtime support.
(gdb)
(gdb) r
Starting program: /opt/newobj
[New Thread 0x2603 of process 93045]
warning: unhandled dyld version (15)
1
2
0xc0000aa020 3
[Inferior 1 (process 93045) exited normally]
(gdb) b main.main
Breakpoint 1 at 0x1094770: file /opt/newobj.go, line 7.
(gdb) n
The program is not being run.
(gdb) r
Starting program: /opt/newobj
[New Thread 0x1b03 of process 93047]
warning: unhandled dyld version (15)
[New Thread 0x190b of process 93047]
[New Thread 0x1c03 of process 93047]
[New Thread 0x1d03 of process 93047]
[New Thread 0x2303 of process 93047]
[New Thread 0x2403 of process 93047]
Thread 2 hit Breakpoint 1, main.main () at /opt/newobj.go:7
7 func main() {
(gdb) n
8 aa := 1
(gdb) s
10 fmt.Println(aa)
(gdb) n
1
12 i := getObj()
(gdb) s
main.getObj (~r0=824634101464) at /opt/newobj.go:22
22 func getObj() int{
(gdb)
23 i := 2
(gdb) s
25 return i
(gdb) s
1 : No such file or directory.
(gdb) n
main.main () at /opt/newobj.go:14
14 fmt.Println(i)
(gdb) n
2
16 ii := getHeapObj()
(gdb) s
main.getHeapObj (~r0=0xc00005ced8) at /opt/newobj.go:28
28 func getHeapObj() *int{
(gdb) s
29 i := 3
(gdb) s
runtime.newobject (typ=0x10a1e60 , ~r1=) at /opt/code/go/src/runtime/malloc.go:1150
1150 func newobject(typ *_type) unsafe.Pointer {
(gdb) n
1151 return mallocgc(typ.size, typ, true)
(gdb) n
main.getHeapObj (~r0=0x0) at /opt/newobj.go:31
31 return &i
(gdb) n
main.main () at /opt/newobj.go:18
18 fmt.Println(ii,*ii)
(gdb) n
0xc0000a6020 3
20 }
(gdb) n
runtime.main () at /opt/code/go/src/runtime/proc.go:212
212 if atomic.Load(&runningPanicDefers) != 0 {
(gdb) n
221 if atomic.Load(&panicking) != 0 {
(gdb) n
225 exit(0)
(gdb) n
[Inferior 1 (process 93047) exited normally]
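The session shows that getHeapObj returns the address of its local variable i, so i is allocated on the heap, and the allocation goes through the runtime function newobject (which in turn calls mallocgc).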
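The same conclusion can be reached without a debugger by counting allocations. The following sketch uses `testing.AllocsPerRun` from the standard library; the `//go:noinline` directives and the global `sink` are additions made here so that the compiler cannot inline the functions and optimize the heap allocation away, and the helper names `stackAllocs` and `heapAllocs` are illustrative.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the returned pointer reachable so the call is not
// optimized out.
var sink *int

//go:noinline
func getObj() int {
	i := 2 // stays on the stack: only the value leaves the function
	return i
}

//go:noinline
func getHeapObj() *int {
	i := 3 // escapes: its address outlives the function
	return &i
}

// stackAllocs reports the average number of heap allocations per
// call of getObj.
func stackAllocs() float64 {
	return testing.AllocsPerRun(1000, func() { _ = getObj() })
}

// heapAllocs reports the average number of heap allocations per
// call of getHeapObj.
func heapAllocs() float64 {
	return testing.AllocsPerRun(1000, func() { sink = getHeapObj() })
}

func main() {
	fmt.Println("getObj allocations per call:    ", stackAllocs())
	fmt.Println("getHeapObj allocations per call:", heapAllocs())
}
```

Building the original program with `go build -gcflags=-m` is another option: the compiler's escape analysis output reports which variables are moved to the heap.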
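References:
malloc: underlying implementation and principles (malloc 底层实现及原理)
TCMalloc illustrated (图解 TCMalloc)
Go source code analysis: the memory pool (go源码分析之内存池)
[golang source code analysis] Memory allocation and management (【golang 源码分析】内存分配与管理)
[Golang source code exploration] (3) How the GC is implemented (【Golang源码探索】(三) GC的实现原理)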