Run the program under valgrind with memcheck to find the id of the faulting thread.
./valgrind --leak-check=yes --show-reachable=yes ./test    # test is the target program
==1499== Thread 2:
==1499== Invalid read of size 4
==1499==    at 0x9D70: thread_exec (test.c:152)
==1499==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==1499==
==1499==
==1499== Process terminating with default action of signal 11 (SIGSEGV)
==1499==  Access not within mapped region at address 0x0
==1499==    at 0x9D70: thread_exec (test.c:152)
==1499==  If you believe this happened as a result of a stack
==1499==  overflow in your program's main thread (unlikely but
==1499==  possible), you can try to increase the size of the
==1499==  main thread stack using the --main-stacksize= flag.
==1499==  The main thread stack size used in this run was 8388608.
==1499==
==1499== HEAP SUMMARY:
==1499==     in use at exit: 272 bytes in 2 blocks
==1499==   total heap usage: 2 allocs, 0 frees, 272 bytes allocated
==1499==
==1499== Thread 1:
==1499== 272 bytes in 2 blocks are possibly lost in loss record 1 of 1
==1499==    at 0x4832240: calloc (vg_replace_malloc.c:593)
==1499==    by 0x4011203: _dl_allocate_tls (in /lib/ld-2.8.so)
==1499==    by 0x4A4ECA7: pthread_create (in /lib/libpthread-2.8.so)
==1499==
==1499== LEAK SUMMARY:
==1499==    definitely lost: 0 bytes in 0 blocks
==1499==    indirectly lost: 0 bytes in 0 blocks
==1499==      possibly lost: 272 bytes in 2 blocks
==1499==    still reachable: 0 bytes in 0 blocks
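Note: compile the source files with the -g option so that valgrind can report the exact file and function where the error occurs.
For programs that pull in certain shared libraries, valgrind may not be able to report the file and function.
In that case, the following steps yield more information.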
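1. Give every thread a name so that the threads can be told apart.
See: "Setting a name for a thread"
2. Use valgrind to find the id number of the faulting thread.
3. Once the program is running under valgrind, use ps to look up its process id, e.g.: ps | grep test
4. Change into the process's proc directory, /proc/xxxx/task,
and search for the name of each thread: grep "Name*" */status    For example: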
1499/status:Name: memcheck-arm-li
1500/status:Name: chk_state
1501/status:Name: chk_pakage
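This shows that
thread 1 of the process is the debugging thread (the main thread, shown under the name of the memcheck tool),
thread 2 is the thread chk_state,
thread 3 is the thread chk_pakage.
With the thread id found in step 2, the specific faulting thread can then be identified.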
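Steps 1 and 4 can also be sketched in C. The block below is only an illustration under stated assumptions: a Linux system with /proc mounted, and thread names set through prctl(PR_SET_NAME) (the linked article may use a different call). The function names set_thread_name, read_thread_name and list_thread_names are hypothetical helpers, not part of valgrind or of the original program. The kernel keeps at most 15 characters of a thread name, which is why the memcheck tool shows up truncated as memcheck-arm-li above.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Step 1: name the calling thread. Intended to be called at the start of
 * the thread's entry function. Returns 0 on success, -1 on error. */
int set_thread_name(const char *name)
{
    assert(name != NULL);
    return prctl(PR_SET_NAME, (unsigned long)name, 0, 0, 0);
}

/* Step 4: copy the "Name:" field of /proc/<pid>/task/<tid>/status into buf.
 * Returns 0 on success, -1 if the file or the field is missing. */
int read_thread_name(pid_t pid, pid_t tid, char *buf, size_t size)
{
    char path[64];
    char line[256];
    FILE *fp;
    int found = -1;

    assert(buf != NULL && size > 0);
    snprintf(path, sizeof(path), "/proc/%d/task/%d/status", (int)pid, (int)tid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof(line), fp) != NULL) {
        if (strncmp(line, "Name:", 5) == 0) {
            const char *p = line + 5;
            while (*p == ' ' || *p == '\t')
                p++;                       /* skip the separator after "Name:" */
            snprintf(buf, size, "%s", p);
            buf[strcspn(buf, "\n")] = '\0'; /* drop the trailing newline */
            found = 0;
            break;
        }
    }
    fclose(fp);
    return found;
}

/* Step 4: print "<tid>/status:Name: <name>" for every thread of pid, in the
 * order readdir() returns the entries. Returns the number of threads
 * printed, or -1 if /proc/<pid>/task cannot be opened. */
int list_thread_names(pid_t pid, FILE *out)
{
    char path[64];
    char name[64];
    DIR *dir;
    struct dirent *ent;
    int count = 0;

    snprintf(path, sizeof(path), "/proc/%d/task", (int)pid);
    dir = opendir(path);
    if (dir == NULL)
        return -1;
    while ((ent = readdir(dir)) != NULL) {
        if (!isdigit((unsigned char)ent->d_name[0]))
            continue;                      /* skip "." and ".." */
        if (read_thread_name(pid, (pid_t)atoi(ent->d_name), name, sizeof(name)) == 0) {
            fprintf(out, "%s/status:Name: %s\n", ent->d_name, name);
            count++;
        }
    }
    closedir(dir);
    return count;
}
```

A thread such as chk_state would call set_thread_name("chk_state") before doing any work, and a separate helper process (or the grep command from step 4) would then list the names. Matching valgrind's "Thread N" to the N-th entry of the listing relies on the same correspondence the steps above assume, so it is worth double-checking when threads are created and destroyed dynamically.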