Introduction to Valgrind Memcheck, and Installing and Using It on the HiSilicon himix410 Platform

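    • Overview
    • Introduction to Valgrind
    • The Memcheck tool
      • Memory leak categories
      • Leak example: definitely lost
      • Leak example: indirectly lost
      • Leak example: possibly lost
      • Leak example: still reachable
      • Summary of leak categories
    • Cross-compiling and installing Valgrind for HiSilicon himix410
      • Installing the cross-compilation toolchain
      • Patching the Valgrind 3.19.0 source
      • Building and installing Valgrind 3.19.0
      • Running Valgrind on the board: the debuginfo error
      • Fixing the glibc debuginfo error without rebuilding glibc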

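Overview

Valgrind's Memcheck is a powerful tool for detecting and analyzing memory errors. This article introduces the leak categories that Memcheck reports, with an example for each, and then gives the steps for building Valgrind with the HiSilicon himix410 toolchain, including a fix for the missing glibc debuginfo error that appears when it is run on the development board.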

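Introduction to Valgrind

Valgrind is an instrumentation framework for dynamic analysis tools. It consists of the coregrind core, the standalone IR system libVEX, and a set of tools built on top of coregrind and libVEX. The most commonly used tool is memcheck; others include callgrind and cachegrind.

Valgrind runs mainly on Linux. On Debian or Ubuntu it can be installed with sudo apt install valgrind.

An ADAS-related project of mine uses a HiSilicon development board whose cross-compilation toolchain is himix410. The goal is to cross-compile Valgrind with the himix410 compiler so that it can run on the board and check whether the programs running there leak memory, and where the leaks come from.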

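The Memcheck tool

Memcheck is the most commonly used Valgrind tool. It reports memory problems in the program it runs; for memory leaks, its summary distinguishes the following five categories: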

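Memory leak categories

Category          Meaning
definitely lost   Memory that has certainly been leaked
indirectly lost   Memory leaked indirectly, reachable only through another lost block
possibly lost     Memory that may have been leaked
still reachable   Memory still reachable at program exit; never freed, so also a leak
suppressed        Leaks hidden by a suppression rule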

Leak example: definitely lost

definitely lost means the memory has certainly been leaked: a free/delete/delete[] call is missing.
Example:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    float* data = (float*)malloc(sizeof(float)*5);

    return 0;
}

gcc example2.c -g
valgrind --leak-check=yes ./a.out

==23257== Memcheck, a memory error detector
==23257== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==23257== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==23257== Command: ./a.out
==23257==
==23257==
==23257== HEAP SUMMARY:
==23257==     in use at exit: 20 bytes in 1 blocks
==23257==   total heap usage: 1 allocs, 0 frees, 20 bytes allocated
==23257==
==23257== 20 bytes in 1 blocks are definitely lost in loss record 1 of 1
==23257==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==23257==    by 0x400577: main (example2.c:6)
==23257==
==23257== LEAK SUMMARY:
==23257==    definitely lost: 20 bytes in 1 blocks
==23257==    indirectly lost: 0 bytes in 0 blocks
==23257==      possibly lost: 0 bytes in 0 blocks
==23257==    still reachable: 0 bytes in 0 blocks
==23257==         suppressed: 0 bytes in 0 blocks
==23257==
==23257== For counts of detected and suppressed errors, rerun with: -v
==23257== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

Leak example: indirectly lost

The name indirectly lost sounds mysterious, so an example is more helpful:

#include <stdio.h>
#include <stdlib.h>

typedef struct Image {
    int h, w, c;
    float* data;
} Image;

int main()
{
    Image* im = (Image*)malloc(sizeof(Image));
    int h = 224, w = 224, c = 3;
    im->h = h;
    im->w = w;
    im->c = c;
    im->data = (float*)malloc(sizeof(float)*h*w*c);

    return 0;
}

gcc example2.c -g
valgrind --leak-check=yes ./a.out

==23580== Memcheck, a memory error detector
==23580== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==23580== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==23580== Command: ./a.out
==23580==
==23580==
==23580== HEAP SUMMARY:
==23580==     in use at exit: 602,136 bytes in 2 blocks
==23580==   total heap usage: 2 allocs, 0 frees, 602,136 bytes allocated
==23580==
==23580== 602,136 (24 direct, 602,112 indirect) bytes in 1 blocks are definitely lost in loss record 2 of 2
==23580==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==23580==    by 0x400537: main (example2.c:11)
==23580==
==23580== LEAK SUMMARY:
==23580==    definitely lost: 24 bytes in 1 blocks
==23580==    indirectly lost: 602,112 bytes in 1 blocks
==23580==      possibly lost: 0 bytes in 0 blocks
==23580==    still reachable: 0 bytes in 0 blocks
==23580==         suppressed: 0 bytes in 0 blocks
==23580==
==23580== For counts of detected and suppressed errors, rerun with: -v
==23580== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

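The sample code is missing:

free(im->data);
free(im);

Because free(im) is never called, the block that im points to is leaked. That block contains the data field, whose memory is not released either, so the 602,112 bytes, that is 224*224*3*sizeof(float), are indirectly lost. Note that definitely lost is 24 bytes at this point. If only free(im) were called, which removes the current definitely lost block, the indirectly lost block would be promoted to definitely lost.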

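Leak example: possibly lost

Examples of possibly lost are not easy to find. Going by the official description, if a pointer to a malloc'd block is advanced with ++ and then passed to free, the block is reported as possibly lost. Of the two examples below, which both follow that recipe, one is reported as possibly lost and the other as definitely lost. In both, free() receives an address inside the block rather than its start, so Memcheck also reports an invalid free and the block is never released.

possibly lost:

#include <stdio.h>
#include <stdlib.h>

int* g_p1;

void fun()
{
    g_p1 = (int*)malloc(sizeof(int)*10);
    g_p1++;
}

int main()
{
    fun();
    free(g_p1);

    return 0;
}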

==24652== Memcheck, a memory error detector
==24652== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==24652== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==24652== Command: ./a.out
==24652==
==24652== Invalid free() / delete / delete[] / realloc()
==24652==    at 0x4C2EDEB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24652==    by 0x4005AC: main (example2.c:15)
==24652==  Address 0x5204044 is 4 bytes inside a block of size 40 alloc'd
==24652==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24652==    by 0x400573: fun (example2.c:8)
==24652==    by 0x40059D: main (example2.c:14)
==24652==
==24652==
==24652== HEAP SUMMARY:
==24652==     in use at exit: 40 bytes in 1 blocks
==24652==   total heap usage: 1 allocs, 1 frees, 40 bytes allocated
==24652==
==24652== 40 bytes in 1 blocks are possibly lost in loss record 1 of 1
==24652==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24652==    by 0x400573: fun (example2.c:8)
==24652==    by 0x40059D: main (example2.c:14)
==24652==
==24652== LEAK SUMMARY:
==24652==    definitely lost: 0 bytes in 0 blocks
==24652==    indirectly lost: 0 bytes in 0 blocks
==24652==      possibly lost: 40 bytes in 1 blocks
==24652==    still reachable: 0 bytes in 0 blocks
==24652==         suppressed: 0 bytes in 0 blocks
==24652==
==24652== For counts of detected and suppressed errors, rerun with: -v
==24652== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

definitely lost:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    int* p1 = (int*)malloc(sizeof(int)*10);
    p1++;
    free(p1);

    return 0;
}

==24703== Memcheck, a memory error detector
==24703== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==24703== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==24703== Command: ./a.out
==24703==
==24703== Invalid free() / delete / delete[] / realloc()
==24703==    at 0x4C2EDEB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24703==    by 0x40058C: main (example22.c:8)
==24703==  Address 0x5204044 is 4 bytes inside a block of size 40 alloc'd
==24703==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24703==    by 0x400577: main (example22.c:6)
==24703==
==24703==
==24703== HEAP SUMMARY:
==24703==     in use at exit: 40 bytes in 1 blocks
==24703==   total heap usage: 1 allocs, 1 frees, 40 bytes allocated
==24703==
==24703== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1
==24703==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24703==    by 0x400577: main (example22.c:6)
==24703==
==24703== LEAK SUMMARY:
==24703==    definitely lost: 40 bytes in 1 blocks
==24703==    indirectly lost: 0 bytes in 0 blocks
==24703==      possibly lost: 0 bytes in 0 blocks
==24703==    still reachable: 0 bytes in 0 blocks
==24703==         suppressed: 0 bytes in 0 blocks
==24703==
==24703== For counts of detected and suppressed errors, rerun with: -v
==24703== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

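Leak example: still reachable

As with the possibly lost example, I believe the difference between a still reachable leak and a definitely lost leak again comes down to whether the pointer is a global variable or a local variable. Here are the examples:

still reachable

#include <stdio.h>
#include <stdlib.h>

int* g_p1;

void fun()
{
    g_p1 = (int*)malloc(sizeof(int)*10);
}

int main()
{
    fun();

    return 0;
}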

==24892== Memcheck, a memory error detector
==24892== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==24892== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==24892== Command: ./a.out
==24892==
==24892==
==24892== HEAP SUMMARY:
==24892==     in use at exit: 40 bytes in 1 blocks
==24892==   total heap usage: 1 allocs, 0 frees, 40 bytes allocated
==24892==
==24892== LEAK SUMMARY:
==24892==    definitely lost: 0 bytes in 0 blocks
==24892==    indirectly lost: 0 bytes in 0 blocks
==24892==      possibly lost: 0 bytes in 0 blocks
==24892==    still reachable: 40 bytes in 1 blocks
==24892==         suppressed: 0 bytes in 0 blocks
==24892== Reachable blocks (those to which a pointer was found) are not shown.
==24892== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==24892==
==24892== For counts of detected and suppressed errors, rerun with: -v
==24892== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

definitely lost
This is essentially the first definitely lost example again:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    int* g_p1 = (int*)malloc(sizeof(int)*10);

    return 0;
}

==24984== Memcheck, a memory error detector
==24984== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==24984== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==24984== Command: ./a.out
==24984==
==24984==
==24984== HEAP SUMMARY:
==24984==     in use at exit: 40 bytes in 1 blocks
==24984==   total heap usage: 1 allocs, 0 frees, 40 bytes allocated
==24984==
==24984== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1
==24984==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24984==    by 0x400537: main (example33.c:6)
==24984==
==24984== LEAK SUMMARY:
==24984==    definitely lost: 40 bytes in 1 blocks
==24984==    indirectly lost: 0 bytes in 0 blocks
==24984==      possibly lost: 0 bytes in 0 blocks
==24984==    still reachable: 0 bytes in 0 blocks
==24984==         suppressed: 0 bytes in 0 blocks
==24984==
==24984== For counts of detected and suppressed errors, rerun with: -v
==24984== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

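Summary of leak categories

  1. definitely lost: a local variable points to heap memory obtained from malloc/calloc that is never freed.
  2. indirectly lost: a struct object held by a local variable was itself allocated with malloc/calloc and is never freed; in addition, one of its members is a pointer to memory that was also allocated with malloc/calloc and is also never freed. The memory that member points to is indirectly lost. If the struct object is freed without freeing the member first, the memory the member points to becomes definitely lost.
  3. possibly lost: a global pointer is given heap memory by malloc/calloc inside some function, is then advanced but still points inside that block, and is finally passed to free. The block is possibly lost.
  4. still reachable: a global pointer is given heap memory by malloc/calloc inside some function and is not moved, but the program never frees it. The block is still reachable.

Whatever the category, the ideal and safest course is to fix every one of them so that nothing leaks. In practice that may not be easy: once your program uses a third-party library and that library leaks, you first have to work out whose bug it is. OpenCV is a typical example.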

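Cross-compiling and installing Valgrind for HiSilicon himix410

Installing the cross-compilation toolchain

Installation requires root privileges. For safety I used a docker environment, installed the toolchain into the default /opt/arm-himix410-linux directory, and saved the result as a docker image.

Patching the Valgrind 3.19.0 source

At the time of writing (2022.07.10) the latest Valgrind release is 3.19.0; download it from the official website.

Start docker with the unpacked valgrind-3.19.0 directory mapped into the container, change into valgrind-3.19.0, and run:

sudo ./autogen.sh

This generates the configure file.

Edit configure: himix410 is an armv7l architecture, which can be confirmed by connecting to the board and running uname -m. In configure, find armv7*) and change it to armv7*|arm).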

Building and installing Valgrind 3.19.0

# Export the toolchain environment variables; they are picked up by configure and by the makefiles in the later make step
export CC=/opt/arm-himix410-linux/bin/arm-himix410-linux-gcc
export CXX=/opt/arm-himix410-linux/bin/arm-himix410-linux-g++
export CPP=/opt/arm-himix410-linux/bin/arm-himix410-linux-cpp
export AR=/opt/arm-himix410-linux/bin/arm-himix410-linux-ar
export LD=/opt/arm-himix410-linux/bin/arm-himix410-linux-ld

# Directory on the development board where the built valgrind will finally live
export DEVBOARD_VALGRIND_DIR=/mnt/sd1/test/test-zz-new/valgrind

# Directory on the build machine (host) where make install places the files
export HOST_INSTALL_DIR=/home/install

./configure \
    --prefix=$DEVBOARD_VALGRIND_DIR \
    --target=armv7-himix410-linux \
    --host=armv7-himix410-linux \
    --program-prefix=hisi-

make -j4
make install DESTDIR=$HOST_INSTALL_DIR

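Notes:

  • --prefix must match the path that Valgrind is copied to on the board in the next step; otherwise it tends to fail to find its shared libraries or to hit errors in system calls such as fork/exec. Check it carefully and avoid changing it afterwards;
  • --program-prefix=hisi- means the generated executables carry the prefix hisi-, e.g. hisi-valgrind;
  • DESTDIR is the installation path on the host.

Copying Valgrind to the board
Mount a disk on the board over NFS.
Use cp / scp or a similar command to copy the valgrind installation subdirectory that the previous step placed under DESTDIR to the NFS-mounted disk.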

scp -r valgrind [email protected]:/home/xx/adas/nfs/xxx/test/test-zz-new-valgrind

Then telnet to the board and rearrange the directories so that the installation directory matches the DEVBOARD_VALGRIND_DIR set earlier.

Running Valgrind on the board: the debuginfo error

Invoke valgrind on the board:

$ cd /mnt/sd1/test/test-zz-new

$ ls
testbed-zz  valgrind

$ ./valgrind/bin/hisi-valgrind ./testbed-zz 

The output shows that valgrind itself runs (note that there is no complaint about memcheck not being found), but it reports that debug information is missing:

[root@mdvr:test-zz-new]# ./valgrind/bin/hisi-valgrind ./testbed-zz 
==10740== Memcheck, a memory error detector
==10740== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==10740== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==10740== Command: ./testbed-zz
==10740== 

valgrind:  Fatal error at startup: a function redirection
valgrind:  which is mandatory for this platform-tool combination
valgrind:  cannot be set up.  Details of the redirection are:
valgrind:  
valgrind:  A must-be-redirected function
valgrind:  whose name matches the pattern:      index
valgrind:  in an object with soname matching:   ld-linux.so.3
valgrind:  was not found whilst processing
valgrind:  symbols from the object with soname: ld-linux.so.3
valgrind:  
valgrind:  Possible fixes: (1, short term): install glibc's debuginfo
valgrind:  package on this machine.  (2, longer term): ask the packagers
valgrind:  for your Linux distribution to please in future ship a non-
valgrind:  stripped ld.so (or whatever the dynamic linker .so is called)
valgrind:  that exports the above-named function using the standard
valgrind:  calling conventions for this platform.  The package you need
valgrind:  to install for fix (1) is called
valgrind:  
valgrind:    On Debian, Ubuntu:                 libc6-dbg
valgrind:    On SuSE, openSuSE, Fedora, RHEL:   glibc-debuginfo
valgrind:  
valgrind:  Note that if you are debugging a 32 bit process on a
valgrind:  64 bit system, you will need a corresponding 32 bit debuginfo
valgrind:  package (e.g. libc6-dbg:i386).
valgrind:  
valgrind:  Cannot continue -- exiting now.  Sorry.

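Fixing the glibc debuginfo error without rebuilding glibc

The vast majority of Linux executables, including the himix410 executables that are to be checked for leaks with valgrind, are dynamically linked against glibc. These ELF files contain an .interp section, written at link time, that names the dynamic interpreter; here it is /lib/ld-2.29.so.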

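To report errors accurately, Valgrind, and memcheck in particular, needs debug information for the program under test and the libraries it loads; this is the debuginfo mentioned in the error above.

When valgrind is installed on Linux x86 with apt install valgrind, the related dependency libc6-dbg is installed automatically. It provides the debug symbols for glibc, so although the dynamic linker (such as /lib/ld-2.29.so) carries no debug symbols of its own, memcheck works once libc6-dbg is present.

On the HiSilicon board, /lib/ld-2.29.so sits on a read-only file system and cannot be modified. Rebuilding the system image with the makefiles changed to keep glibc's debug symbols would work, but the barrier is rather high.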

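man ld.so shows that an executable can be run by invoking ld.so explicitly with the program as its argument. Running ld.so some_elf instead of ./some_elf therefore makes it possible to switch temporarily to a different version of ld.so.

Find the ld-2.29.so with debug symbols in the toolchain and copy it to the board: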

find /opt/arm-himix410-linux/ -name '*ld*.so'
cp /opt/arm-himix410-linux/target/lib/ld-2.29.so $HOST_INSTALL_DIR/$DEVBOARD_VALGRIND_DIR/lib/valgrind
# Then cp / scp the whole DESTDIR directory to the board again

The fix on HiSilicon himix410 then follows naturally.

Original command:

./valgrind/bin/hisi-valgrind ./testbed-zz 

becomes

./valgrind/bin/hisi-valgrind ./valgrind/lib/valgrind/ld-2.29.so ./testbed-zz 

That is, the extra ./valgrind/lib/valgrind/ld-2.29.so (with debug symbols) stands in for the default /lib/ld-2.29.so (without debug symbols) recorded in testbed-zz.

The full leak-check command (log saved to memchk.log):

./test-zz-new/valgrind/bin/hisi-valgrind \
	--tool=memcheck \
	--leak-check=full \
	--show-reachable=yes \
	--track-origins=yes \
	--log-file=memchk.log  \
	-v ./test-zz-new/valgrind/lib/valgrind/ld-2.29.so  \
	./your_executable
