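When a program exits abnormally, the system can save the process's memory image at that moment to a core file. Without enough logging from the scene, the cause of the crash is usually hard to find. In that situation, the core file together with gdb can solve the problem.
Possible causes of a segmentation fault and the resulting core dump: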
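1. Out-of-bounds memory access
   a) An array is indexed with a wrong subscript, so the access falls outside the array.
   b) A string search relies on the terminating NUL to detect the end of the string, but the string was never properly terminated.
   c) String functions such as strcpy, strcat, sprintf, strcmp and strcasecmp read or write past the end of a buffer. Use bounded functions such as strncpy, strlcpy, strncat, strlcat, snprintf, strncmp and strncasecmp to stay within bounds. Note that strncpy does not NUL-terminate the destination when the source is too long.
2. A multithreaded program calls functions that are not thread-safe.
3. Data read and written by multiple threads is not protected by a lock. Global data that several threads access at the same time should be protected by a lock; otherwise a core dump is likely.
4. Invalid pointers
   a) Dereferencing a null pointer.
   b) Careless pointer casts. Unless a block of memory was originally allocated as a given structure or type (or an array of it), do not cast a pointer to that memory into a pointer to that structure or type. Copy the memory into an object of that structure or type instead, and access the object. The reason is that if the block does not start at an address aligned for that structure or type, accessing it through the cast pointer can easily cause a bus error and a core dump.
5. Stack overflow
   Avoid large local variables, because local variables are allocated on the stack. They can easily overflow the stack and corrupt the stack and heap, which leads to baffling errors.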
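Faulty program (segment.c):
/* segment.c: crashes on purpose by writing through a null pointer. */
#include <stddef.h>   /* NULL */

/* Writes through a null pointer, which raises SIGSEGV. */
void func(void)
{
    char *p = NULL;

    /* Invalid write: no memory is mapped at address 0. */
    *p = 3;
}

int main(void)
{
    func();
    return 0;
}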
Compile:
gcc -g -o segment segment.c
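Core file:
1. Check whether the system allows core files to be generated
- #ulimit -a
- core file size (blocks, -c) 0
A core file size limit of 0 means that no core file can be generated.
2. Remove the limit with the following command so that the system can generate core files
- ulimit -c unlimited
Or set a specific core file size limit, for example 1024 blocks (bash in its default mode counts 1024-byte blocks, so this is 1 MiB rather than 1 KB)
- ulimit -c 1024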
Run the program:
# ./segment
Segmentation fault (core dumped)
A core file is generated in the program's current directory.
Debugging with GDB:
# gdb ./segment core
GNU gdb (GDB) 7.1-ubuntu
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /home/duanbb/test/segment/segment...done.
[New Thread 3655]
warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib/tls/i686/cmov/libc.so.6...Reading symbols from /usr/lib/debug/lib/tls/i686/cmov/libc-2.11.1.so...done.
done.
Loaded symbols for /lib/tls/i686/cmov/libc.so.6
Reading symbols from /lib/ld-linux.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.11.1.so...done.
done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `./segment'.
Program terminated with signal 11, Segmentation fault.
#0 0x080483c4 in func () at segment.c:10
10 *p = 3;
This clearly shows that the program hit a segmentation fault at line 10 of segment.c, inside func(), while executing "*p = 3".
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
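Debugging a core from a release build
It is common practice to ship release builds rather than debug builds, because a debug build carries debugging information that exposes details of the source code.
The following shows how to debug a core file generated by a release build.
Using the same program as above, compile a release build: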
# gcc -o segment segment.c
Run the program:
./segment
Segmentation fault (core dumped)
Debug the program:
$ gdb ./segment core
GNU gdb (GDB) 7.1-ubuntu
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /home/duanbb/test/segment/segment...(no debugging symbols found)...done.
[New Thread 4711]
warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib/tls/i686/cmov/libc.so.6...Reading symbols from /usr/lib/debug/lib/tls/i686/cmov/libc-2.11.1.so...done.
done.
Loaded symbols for /lib/tls/i686/cmov/libc.so.6
Reading symbols from /lib/ld-linux.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.11.1.so...done.
done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `./segment'.
Program terminated with signal 11, Segmentation fault.
#0 0x080483c4 in func ()
(gdb)
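Only the function in which the fault occurred is shown, not the exact location. What can be done?
Solution: take the source code of the release and keep the original compiler options exactly as they were (do not change any flag; if "-O2" is present, leave it alone), then add only "-g" and build the matching debug binary. Keeping the flags identical matters because the addresses recorded in the core file must match the code layout of the debug binary.
In this example: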
# gcc -g -o segment_d segment.c
Debug:
# gdb ./segment_d core
GNU gdb (GDB) 7.1-ubuntu
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /home/duanbb/test/segment/segment_d...done.
warning: core file may not match specified executable file.
[New Thread 4711]
warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib/tls/i686/cmov/libc.so.6...Reading symbols from /usr/lib/debug/lib/tls/i686/cmov/libc-2.11.1.so...done.
done.
Loaded symbols for /lib/tls/i686/cmov/libc.so.6
Reading symbols from /lib/ld-linux.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.11.1.so...done.
done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `./segment'.
Program terminated with signal 11, Segmentation fault.
#0 0x080483c4 in func () at segment.c:10
10 *p = 3;
The location of the fault is now identified.
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
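Core file naming and location
By default the core file is named "core" and is written to the current working directory of the process that crashed.
Two control files
(1) /proc/sys/kernel/core_uses_pid
0: default; the file name has no PID suffix
1: the file name gets a PID suffix, for example "core.2976"
To change it: "echo 1 > /proc/sys/kernel/core_uses_pid"
(2) /proc/sys/kernel/core_pattern
Controls the format of the file name
Default value: core
If the value is changed to "/home/duanbei/corefile/core-%e-%p-%t", all core files are saved in the directory "/home/duanbei/corefile" with names of the form "core-<program name>-<pid>-<timestamp>"
Format specifiers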
- %p - insert pid into filename
- %u - insert current uid into filename
- %g - insert current gid into filename
- %s - insert signal that caused the coredump into the filename
- %t - insert UNIX time that the coredump occurred into filename
- %h - insert hostname where the coredump happened into filename
- %e - insert coredumping executable name into filename
Note: if the pattern already contains "%p" and /proc/sys/kernel/core_uses_pid is 1, no PID suffix is appended, because the file name already contains the PID.
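PS:
To get a core file, do not catch the SIGSEGV signal with signal() in the program. Otherwise no core file is generated, because the handler replaces the default action that writes it.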