https://valgrind.org/docs/manual/ms-manual.html
valgrind --tool=massif [--massif-opts] prog [prog-args]
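malloc, free: the memory a program requests from malloc differs from the memory actually allocated, because of alignment padding, the allocator's per-block bookkeeping header, and similar overhead.
(Alignment reduces page faults and improves cache hit rates.)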
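The gap between requested and allocated bytes can be observed directly. The sketch below is glibc-specific: it assumes `malloc_usable_size` from `<malloc.h>`, which reports the usable size of a block. The allocator's header is not included in that figure, so the real footprint is larger still. The helper name `usable_bytes` is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <malloc.h>  // glibc-specific: declares malloc_usable_size

// Hypothetical helper: request `requested` bytes and report how many bytes
// the allocator actually made usable for that block.
std::size_t usable_bytes(std::size_t requested) {
    void* p = std::malloc(requested);
    assert(p != nullptr);  // allocation failure is not handled in this sketch
    std::size_t usable = malloc_usable_size(p);
    std::free(p);
    return usable;
}
```

On glibc, `usable_bytes(1)` typically returns more than 1; the exact value depends on the platform and allocator version.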
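massif records this information and takes a number of options:
# Whether to profile dynamic allocation functions (malloc, new, free, delete, and so on). Enabled by default.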
--heap=<yes|no> [default: yes]
Specifies whether heap profiling should be done.
# Estimated per-block administrative overhead of the allocator, in bytes
--heap-admin=<size> [default: 8]
If heap profiling is enabled, gives the number of administrative bytes per block to use. This should be an estimate of the average, since it may vary. For example, the allocator used by glibc on Linux requires somewhere between 4 to 15 bytes per block, depending on various factors. That allocator also requires admin space for freed blocks, but Massif cannot account for this.
# Whether to also track stack memory, i.e. the allocation and release of local variables
--stacks=<yes|no> [default: no]
Specifies whether stack profiling should be done. This option slows Massif down greatly, and so is off by default. Note that Massif assumes that the main stack has size zero at start-up. This is not true, but doing otherwise accurately is difficult. Furthermore, starting at zero better indicates the size of the part of the main stack that a user program actually has control over.
--pages-as-heap=<yes|no> [default: no]
Tells Massif to profile memory at the page level rather than at the malloc'd block level. See above for details.
# Maximum stack-trace depth
--depth=<number> [default: 30]
Maximum depth of the allocation trees recorded for detailed snapshots. Increasing it will make Massif run somewhat more slowly, use more memory, and produce bigger output files.
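# Register additional allocation functions. Allocation ultimately rests on system calls such as brk, so you can write your own allocator instead of using glibc's malloc and free.
--alloc-fn=<name>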
Functions specified with this option will be treated as though they were a heap allocation function such as malloc. This is useful for functions that are wrappers to malloc or new, which can fill up the allocation trees with uninteresting information. This option can be specified multiple times on the command line, to name multiple functions.
Note that the named function will only be treated this way if it is the top entry in a stack trace, or just below another function treated this way. For example, if you have a function malloc1 that wraps malloc, and malloc2 that wraps malloc1, just specifying --alloc-fn=malloc2 will have no effect. You need to specify --alloc-fn=malloc1 as well. This is a little inconvenient, but the reason is that checking for allocation functions is slow, and it saves a lot of time if Massif can stop looking through the stack trace entries as soon as it finds one that doesn't match rather than having to continue through all the entries.
Note that C++ names are demangled. Note also that overloaded C++ names must be written in full. Single quotes may be necessary to prevent the shell from breaking them up. For example:
--alloc-fn='operator new(unsigned, std::nothrow_t const&)'
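The malloc1/malloc2 case described above can be sketched as follows. The wrapper names come from the text; `use_wrappers` is a hypothetical caller. The wrappers are declared `extern "C"` as an assumption of this sketch, so that their names stay unmangled and the plain names can be passed on the command line.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// extern "C" keeps the symbol names unmangled, so the plain names
// malloc1 and malloc2 can be given to --alloc-fn.
extern "C" void* malloc1(std::size_t n) {  // wraps malloc
    return std::malloc(n);
}

extern "C" void* malloc2(std::size_t n) {  // wraps malloc1
    return malloc1(n);
}

// Hypothetical caller: this is the frame that should appear directly under
// the allocation functions once both wrappers are named via --alloc-fn.
void use_wrappers() {
    void* p = malloc2(64);
    assert(p != nullptr);
    std::free(p);
}
```

Run it with `valgrind --tool=massif --alloc-fn=malloc1 --alloc-fn=malloc2 prog`. Naming only malloc2 has no effect, because malloc1 sits between it and malloc in the stack trace. Build without optimization so the wrappers are not inlined away.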
--ignore-fn=<name>
Any direct heap allocation (i.e. a call to malloc, new, etc, or a call to a function named by an --alloc-fn option) that occurs in a function specified by this option will be ignored. This is mostly useful for testing purposes. This option can be specified multiple times on the command line, to name multiple functions.
Any realloc of an ignored block will also be ignored, even if the realloc call does not occur in an ignored function. This avoids the possibility of negative heap sizes if ignored blocks are shrunk with realloc.
The rules for writing C++ function names are the same as for --alloc-fn above.
# Threshold below which entries are aggregated instead of shown in detail
--threshold=<m.n> [default: 1.0]
The significance threshold for heap allocations, as a percentage of total memory size. Allocation tree entries that account for less than this will be aggregated. Note that this should be specified in tandem with ms_print's option of the same name.
--peak-inaccuracy=<m.n> [default: 1.0]
Massif does not necessarily record the actual global memory allocation peak; by default it records a peak only when the global memory allocation size exceeds the previous peak by at least 1.0%. This is because there can be many local allocation peaks along the way, and doing a detailed snapshot for every one would be expensive and wasteful, as all but one of them will be later discarded. This inaccuracy can be changed (even to 0.0%) via this option, but Massif will run drastically slower as the number approaches zero.
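The peak rule described above can be written as a one-line predicate. This is a sketch of the rule as the text states it, not Massif's actual code; the function name `is_new_peak` and its parameters are made up for illustration.

```cpp
// Returns true when `current` is a new maximum that exceeds `previous_peak`
// by at least `inaccuracy_pct` percent (the --peak-inaccuracy value).
constexpr bool is_new_peak(double current, double previous_peak,
                           double inaccuracy_pct = 1.0) {
    return current > previous_peak &&
           current >= previous_peak * (1.0 + inaccuracy_pct / 100.0);
}

static_assert(is_new_peak(1020, 1000), "2% above the old peak: recorded");
static_assert(!is_new_peak(1005, 1000), "0.5% above the old peak: skipped");
static_assert(is_new_peak(1005, 1000, 0.0), "with 0.0 every new maximum counts");
```

With the default of 1.0, a run that creeps from 1000 to 1005 bytes keeps reporting the 1000-byte peak, which is the inaccuracy the option name refers to.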
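# Unit of the x-axis. i: instructions, x advances by 1 per instruction executed. ms: x advances by 1 per millisecond. B: x advances by 1 per byte allocated or freed, so a snapshot's x coordinate is the total number of bytes malloc'd and freed before that snapshot (see the --time-unit=B example further down).
--time-unit=<i|ms|B> [default: i]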
The time unit used for the profiling. There are three possibilities: instructions executed (i), which is good for most cases; real (wallclock) time (ms, i.e. milliseconds), which is sometimes useful; and bytes allocated/deallocated on the heap and/or stack (B), which is useful for very short-run programs, and for testing purposes, because it is the most reproducible across different machines.
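# Frequency of detailed snapshots, default 10. Setting it to 1 makes every snapshot detailed: the output is larger and often repetitive, but more informative. Recommended.
--detailed-freq=<n> [default: 10]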
Frequency of detailed snapshots. With --detailed-freq=1, every snapshot is detailed.
# Maximum number of snapshots. Once the limit is reached, earlier snapshots are discarded, so consider raising it.
--max-snapshots=<n> [default: 100]
The maximum number of snapshots recorded. If set to N, for all programs except very short-running ones, the final number of snapshots will be between N/2 and N.
# Output file name. Keeping the default is recommended.
--massif-out-file=<file> [default: massif.out.%p]
Write the profile data to file rather than to the default output file, massif.out.<pid>. The %p and %q format specifiers can be used to embed the process ID and/or the contents of an environment variable in the name, as is the case for the core option --log-file.
ms_print: follows an MVC-like split, with the data (the massif.out file) kept separate from its presentation.
-h --help
Show the help message.
--version
Show the version number.
# Threshold for showing detail
--threshold=<m.n> [default: 1.0]
Same as Massif's --threshold option, but applied after profiling rather than during.
# Length of the x-axis
--x=<4..1000> [default: 72]
Width of the graph, in columns.
# Length of the y-axis
--y=<4..1000> [default: 20]
Height of the graph, in rows.
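Compile with -g so the binary carries debug information.
Run: valgrind --tool=massif prog
--massif-out-file specifies the output file name; the default is massif.out.<pid>.
(Snapshots are taken periodically, and the period can be controlled.)
ms_print massif.out.12345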
--------------------------------------------------------------------------------
Command: example
Massif arguments: (none)
ms_print arguments: massif.out.12797
--------------------------------------------------------------------------------
The header shows the arguments ms_print was run with.
KB
19.63^ #
| #
| #
| #
| #
| #
| #
| #
| #
| #
| #
| #
| #
| #
| #
| #
| #
| :#
| :#
| :#
0 +----------------------------------------------------------------------->ki
0 113.4
Number of snapshots: 25
Detailed snapshots: [9, 14 (peak), 24]
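x-axis, snapshots: each column is a snapshot taken at some point in time. ":" is a snapshot without a stack trace, "@" is a detailed snapshot with a stack trace, and "#" is the detailed snapshot at the peak.
Time axis units: i uses instructions executed; ms uses milliseconds; B accumulates every byte passed through malloc and free.
--------------------------------------------------------------------------------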
Command: ./a.out
Massif arguments: --detailed-freq=1 --threshold=0 --max-snapshots=256 --heap-admin=0 --time-unit=B
ms_print arguments: --threshold=0 massif.out.277019
--------------------------------------------------------------------------------
KB
1.000^ #
| @@#@
| @@@@#@@@
| @@@@@@#@@@@@
| @@@@@@@#@@@@@@
| @@@@@@@@@#@@@@@@@@@
| @@@@@@@@@@@#@@@@@@@@@@
| @@@@@@@@@@@@@#@@@@@@@@@@@@
| @@@@@@@@@@@@@@@#@@@@@@@@@@@@@@
| @@@@@@@@@@@@@@@@#@@@@@@@@@@@@@@@
| @@@@@@@@@@@@@@@@@@#@@@@@@@@@@@@@@@@@@
| @@@@@@@@@@@@@@@@@@@@#@@@@@@@@@@@@@@@@@@@
| @@@@@@@@@@@@@@@@@@@@@@#@@@@@@@@@@@@@@@@@@@@@
| @@@@@@@@@@@@@@@@@@@@@@@@#@@@@@@@@@@@@@@@@@@@@@@@
| @@@@@@@@@@@@@@@@@@@@@@@@@#@@@@@@@@@@@@@@@@@@@@@@@@
| @@@@@@@@@@@@@@@@@@@@@@@@@@@#@@@@@@@@@@@@@@@@@@@@@@@@@@@
| @@@@@@@@@@@@@@@@@@@@@@@@@@@@@#@@@@@@@@@@@@@@@@@@@@@@@@@@@@
| @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
| @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
| @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
0 +----------------------------------------------------------------------->KB
0 2.000
Number of snapshots: 130
Detailed snapshots: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65 (peak), 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129]
--------------------------------------------------------------------------------
n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B)
--------------------------------------------------------------------------------
0 0 0 0 0 0
00.00% (0B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
--------------------------------------------------------------------------------
n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B)
--------------------------------------------------------------------------------
1 16 16 1 15 0
Allocations and frees are paired, so the x-axis extent (2.000 KB) is exactly twice the y-axis peak (1.000 KB).
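#include <cstdlib>  // std::malloc, std::free

int main() {
    void* s[100]{};                    // only the first 64 slots are used
    for (int i = 0; i < 64; i++)
        s[i] = std::malloc(1);         // 64 one-byte allocations
    for (int i = 0; i < 64; i++)
        std::free(s[i]);               // release them in allocation order
}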
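Each allocation requests 1 byte and gets 15 bytes of alignment padding, so the peak is 64*16 == 1024 bytes, and malloc + free together move 2048 bytes along the x-axis.
--threshold=0 is used so that no entry is aggregated away.
The peak snapshot is taken when a free happens.
A new peak that exceeds the previous one by less than 1% is not recorded with a stack; see --peak-inaccuracy.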
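The arithmetic for this run can be checked at compile time. The sketch below uses a simplified model of a block's footprint (the request rounded up to the alignment, plus the --heap-admin estimate). The 16-byte alignment is an assumption inferred from the 1 useful + 15 extra bytes in snapshot 1, and all names are made up for illustration.

```cpp
#include <cstddef>

// Round `n` up to the next multiple of `align` (align must be non-zero).
constexpr std::size_t round_up(std::size_t n, std::size_t align) {
    return (n + align - 1) / align * align;
}

// Simplified model of one block's footprint: the request rounded up to the
// alignment, plus the --heap-admin estimate.
constexpr std::size_t block_footprint(std::size_t request, std::size_t align,
                                      std::size_t heap_admin) {
    return round_up(request, align) + heap_admin;
}

constexpr std::size_t kBlocks = 64;  // number of malloc(1) calls in the example
constexpr std::size_t kAlign = 16;   // assumption: inferred from 1 + 15 in snapshot 1
constexpr std::size_t kAdmin = 0;    // --heap-admin=0, as in the run above

constexpr std::size_t kPeak = kBlocks * block_footprint(1, kAlign, kAdmin);
constexpr std::size_t kTimeAxis = 2 * kPeak;  // each byte counts once on malloc, once on free

static_assert(block_footprint(1, kAlign, kAdmin) == 16, "1 useful + 15 extra");
static_assert(kPeak == 1024, "y-axis peak: 1.000 KB");
static_assert(kTimeAxis == 2048, "x-axis extent: 2.000 KB");
```

Under this model, leaving --heap-admin at its default of 8 would give 24 bytes per block, so the graph would no longer land on round numbers.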
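n: snapshot number.
time: position on the time axis; with --time-unit=B it equals the sum of all bytes malloc'd and freed so far.
total: total memory in use.
useful-heap: the bytes the program actually requested.
extra-heap: extra bytes for alignment or allocator bookkeeping; affected by --heap-admin, and by the core option --alignment=<number> for alignment.
stacks: stack memory, i.e. the allocation and release of local variables (only recorded with --stacks=yes).
--------------------------------------------------------------------------------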
n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B)
--------------------------------------------------------------------------------
10 10,080 10,080 10,000 80 0
11 12,088 12,088 12,000 88 0
12 16,096 16,096 16,000 96 0
13 20,104 20,104 20,000 104 0
14 20,104 20,104 20,000 104 0
99.48% (20,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->49.74% (10,000B) 0x804841A: main (example.c:20)
|
->39.79% (8,000B) 0x80483C2: g (example.c:5)
| ->19.90% (4,000B) 0x80483E2: f (example.c:11)
| | ->19.90% (4,000B) 0x8048431: main (example.c:23)
| |
| ->19.90% (4,000B) 0x8048436: main (example.c:25)
|
->09.95% (2,000B) 0x80483DA: f (example.c:10)
->09.95% (2,000B) 0x8048431: main (example.c:23)
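Snapshot 14 carries a detailed record.
Why is the top entry not 100%? As noted earlier, alignment and header overhead make the memory actually allocated slightly larger than the memory requested.
The top level is the allocation functions: new, malloc, functions named with --alloc-fn=xx, and so on.
Below it are the allocation stack traces of the memory that is currently live. Each stack could be printed on one line, but many stacks share the same frames, so they are merged into a tree to save space and read better.
Does free have a stack trace? Frees are sampled too, but they are attributed to the stack that allocated the block; the stack of the free call itself is not recorded.
The children of an entry add up to that entry's total; the further down the tree, the finer the split.
Because --threshold defaults to 1.0, stacks that account for less than 1% of memory are not shown; this can be changed with the option.
Why are there two more snapshots than malloc and free calls? One is the initial state (0) and one is the final state; the final state shows whether memory was leaked.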
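The percentages in the snapshot 14 tree can be reproduced from the byte counts: each entry is divided by total(B) = 20,104, not by useful-heap(B), which is why the top line reads 99.48% instead of 100%. The helper below is a hypothetical sketch; rounding to the nearest hundredth is an assumption that happens to match the printed values.

```cpp
// Percentage of `total`, in hundredths of a percent, rounded to nearest.
constexpr long long pct_x100(long long part, long long total) {
    return (part * 10000 + total / 2) / total;
}

constexpr long long kTotal = 20104;  // total(B) of snapshot 14

static_assert(pct_x100(20000, kTotal) == 9948, "99.48%: all heap allocation functions");
static_assert(pct_x100(10000, kTotal) == 4974, "49.74%: main (example.c:20)");
static_assert(pct_x100(8000, kTotal) == 3979, "39.79%: g (example.c:5)");
static_assert(pct_x100(4000, kTotal) == 1990, "19.90%: f (example.c:11)");
static_assert(pct_x100(2000, kTotal) == 995, "09.95%: f (example.c:10)");
```

The three direct children (10,000 + 8,000 + 2,000 bytes) add up to the 20,000 bytes of the top entry.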