Assorted oddities: memory errors, intermittent segfaults, vanishing call stacks

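1. After changing a lot of files, then running make and executing the program, it always reports a segmentation fault, and debugging does not reveal the cause (under gdb it typically segfaults right on entering some ordinary function).

Fix: a full rebuild (make clean; make) may solve the problem. The cause was never pinned down; one plausible explanation is that the Makefile does not track header dependencies, so stale object files built against an old class layout get linked with newly built ones.

Environment: g++/gcc (GCC) 4.4.4 20100726 (Red Hat 4.4.4-13), CentOS 6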


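2. The program suddenly exits, but does not exit when run under the debugger.

Fix: this may be caused by the SIGPIPE signal. On Linux the default action for SIGPIPE is to terminate the process, whereas under gdb the debugger intercepts the signal first, so the process does not silently die.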
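
A minimal sketch of the usual workaround, assuming the program would rather get an error code from write() than be killed: ignore SIGPIPE once at startup, after which writing to a closed pipe or socket fails with EPIPE. The helper names ignore_sigpipe and write_to_broken_pipe are made up for this illustration.

```cpp
#include <cassert>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

// Hypothetical helper: ignore SIGPIPE for the whole process.
// Returns true if the disposition was changed successfully.
bool ignore_sigpipe() {
    return signal(SIGPIPE, SIG_IGN) != SIG_ERR;
}

// Hypothetical demo: write to a pipe whose read end is already closed.
// Returns the errno reported by write(), 0 if the write unexpectedly
// succeeded, or -1 if the pipe could not be created.
int write_to_broken_pipe() {
    int fds[2];
    if (pipe(fds) != 0) {
        return -1;
    }
    close(fds[0]);                       // nobody will ever read from this pipe
    errno = 0;
    ssize_t n = write(fds[1], "x", 1);   // without SIG_IGN this would kill the process
    int err = (n < 0) ? errno : 0;
    close(fds[1]);
    return err;
}
```

Calling ignore_sigpipe() early in main() and then checking the return value of every write() or send() turns the silent exit into an ordinary error path.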


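3. The call stack suddenly disappears:

Fix: this is usually an out-of-bounds write, and specifically one on a local (stack) variable. For example, if you declare char buf[256] and then write more than 256 characters into it, you may overwrite the function's own stack frame, including the return address the debugger needs to walk the stack.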
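
A small sketch of the bounded alternative, assuming truncation is acceptable for the data in question: pass the buffer size along and let snprintf stop at the boundary instead of using strcpy or sprintf. The names bounded_copy and demo_stack_buffer are hypothetical.

```cpp
#include <cassert>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper: copy src into dst without ever writing more than
// dst_size bytes (terminating NUL included). Returns the stored length.
size_t bounded_copy(char* dst, size_t dst_size, const char* src) {
    if (dst_size == 0) {
        return 0;
    }
    int wanted = snprintf(dst, dst_size, "%s", src);
    if (wanted < 0) {            // encoding error: leave an empty string
        dst[0] = '\0';
        return 0;
    }
    return strlen(dst);          // at most dst_size - 1
}

// Hypothetical demo: a 1023-character input into a 256-byte stack buffer.
size_t demo_stack_buffer() {
    char buf[256];
    char big[1024];
    memset(big, 'A', sizeof(big) - 1);
    big[sizeof(big) - 1] = '\0';
    // strcpy(buf, big) here would write 1024 bytes into a 256-byte array
    // and could overwrite the saved return address of this function.
    return bounded_copy(buf, sizeof(buf), big);
}
```

The input is truncated to 255 characters plus the terminator, so the stack frame stays intact; whether silent truncation is the right policy depends on the caller.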


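4. Problems when compiling on CentOS 4:

4.1 The gcc shipped with CentOS 4 (3.4.6) does not support the -std=c++0x option.

4.2 For multithreaded code on CentOS 4, a mutex could only be initialized with "pthread_mutex_init(&mutex, NULL);"; the macro "PTHREAD_MUTEX_INITIALIZER" could not be used.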
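
One way to keep the pthread_mutex_init call from being forgotten is to wrap it in a small class, sketched here in pre-C++11 style so that it should also build with an old gcc. The class name Mutex and the function locked_increment are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>

// Hypothetical wrapper: initializes the mutex at run time with
// pthread_mutex_init instead of the PTHREAD_MUTEX_INITIALIZER macro.
class Mutex {
private:
    pthread_mutex_t mutex;
    // Non-copyable: a copied pthread_mutex_t is not a valid mutex.
    Mutex(const Mutex&);
    Mutex& operator=(const Mutex&);
public:
    Mutex() {
        int rc = pthread_mutex_init(&mutex, NULL);
        assert(rc == 0);
        (void)rc;
    }
    ~Mutex() {
        pthread_mutex_destroy(&mutex);
    }
    void Lock() {
        pthread_mutex_lock(&mutex);
    }
    void Unlock() {
        pthread_mutex_unlock(&mutex);
    }
};

// Hypothetical usage: guard a trivial update with the wrapper.
int locked_increment(int value) {
    Mutex m;
    m.Lock();
    ++value;
    m.Unlock();
    return value;
}
```

In a real program the Mutex would be a member of the shared object rather than a local variable; the local here only keeps the sketch short.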


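5. Segmentation fault inside new:

This is the hardest kind of problem to solve, because the memory was corrupted by an out-of-bounds write somewhere else, and the failure merely surfaces at this new.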

Program received signal SIGSEGV, Segmentation fault.
0x00000037234787ee in _int_malloc () from /lib64/libc.so.6
(gdb) bt
#0  0x00000037234787ee in _int_malloc () from /lib64/libc.so.6
#1  0x0000003723479aed in malloc () from /lib64/libc.so.6
#2  0x00000037300bd0ed in operator new(unsigned long) () from /usr/lib64/libstdc++.so.6
#3  0x000000000040db32 in ConnectionFarm::AddClient (this=0xae34e0, client_fd=6) at connection.cpp:312

connection.cpp:312 is a plain new expression:
Connection* client = new Connection(this);

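Running into this is rather frightening. The culprit could be an overrun in memcpy, memset, strcpy, strcat and so on, anywhere in the program.

A veteran game developer (游晶) suggested that tcmalloc can help here. It does not prevent the overrun; it catches it at the moment the out-of-bounds access happens, for example by using guard pages. The references he sent:
http://www.google.com/#hl=zh-CN&q=tcmalloc+TCMALLOC_PAGE_FENCE&oq=tcmalloc+TCMALLOC_PAGE_FENCE&gs_l=serp.3...139786.139786.4.140233.1.1.0.0.0.0.134.134.0j1.1.0...0.0...1c.9XWRVXVVt3c&bav=on.2,or.r_gc.r_pw.&fp=3b4abcf478f33cbd&biw=1440&bih=763
http://www.cppblog.com/feixuwu/archive/2011/05/14/146395.html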

I tried it, and it did indeed stop right at the out-of-bounds access:

Download:
http://code.google.com/p/gperftools/downloads/detail?name=gperftools-2.0.tar.gz

http://download.csdn.net/download/winlinvip/4475430

wget http://gperftools.googlecode.com/files/gperftools-2.0.tar.gz

#!/bin/bash

# unpack the tarball only if the source directory does not exist yet
test -d "gperftools-2.0" || tar xf gperftools-2.0.tar.gz

echo "please modify the src/debugallocation.cc"
echo "   DEFINE_bool(malloc_page_fence,"
echo "            EnvToBool(\"TCMALLOC_PAGE_FENCE\", false),"
echo "            \"Enables putting of memory allocations at page boundaries \""
echo "            \"with a guard page following the allocation (to catch buffer \""
echo "            \"overruns right when they happen).\");"
echo "to EnvToBool(\"TCMALLOC_PAGE_FENCE\", true) and link with -ltcmalloc_debug"

echo ""
echo "build and install:"
echo "cd  gperftools-2.0"
echo "./configure --enable-frame-pointers"
echo "make"
echo "sudo make install"


Symlink the shared library into /lib64 so the dynamic loader can find it:
sudo ln -sf /usr/local/lib/libtcmalloc_debug.so.4 /lib64/libtcmalloc_debug.so.4

Add to the compile options:

-fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free

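Add to the link options:

-ltcmalloc_debug

Run the program under gdb and it will stop at the out-of-bounds access. Since the flag is read through EnvToBool("TCMALLOC_PAGE_FENCE", ...), setting the TCMALLOC_PAGE_FENCE environment variable to true when launching the program should have the same effect without patching the source.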

Here is an example from a real project (reduced to a prototype):

/**
# to build:
g++ -g -O0 -c memcorrupt.cpp -o memcorrupt.o -fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free; g++ memcorrupt.o -o memcorrupt -ltcmalloc_debug ; ./memcorrupt
*/
#include <cstddef> // for NULL

class Connection;
class State
{
private:
    Connection* conn;
public:
    State(Connection* c) : conn(c){
    }
    virtual ~State(){
    }
    void action();
};

class Manager;
class Connection
{
private:
    State* state;
    Manager* manager;
public:
    Connection(){
        state = NULL;
    }
    virtual ~Connection(){
        if(state != NULL){
            delete state;
            state = NULL;
        }
    }
public:
    void SetManager(Manager* m){
        manager = m;
    }
    Manager* GetManager(){
        return manager;
    }
    void SetState(State* s){
        state = s;
    }
};

class Manager
{
private:
    Connection* conn;
public:
    Manager(){
        conn = NULL;
    }
    virtual ~Manager(){
    }
public:
    void Destroy(){
        if(conn != NULL){
            delete conn;
            conn = NULL;
        }
    }
    Connection* GetConnection(){
        return conn;
    }
    void SetConnection(Connection* c){
        conn = c;
        conn->SetManager(this);
    }
};

void State::action(){
    if(conn == NULL){
        return;
    }
    
    conn->GetManager()->Destroy();
    this->conn = NULL;
}

int main(int /*argc*/, char** /*argv*/){
    Manager manager;
    Connection* connection = new Connection();
    State* state = new State(connection);
    
    connection->SetState(state);
    manager.SetConnection(connection);
    
    state->action();
    
    return 0;
}

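At first glance nothing looks wrong with this code: State is a terminal state that asks the Manager to destroy the Connection object.

But there is one invalid access. Manager::Destroy() deletes the Connection, and the Connection's destructor deletes the State. After Destroy() returns, the State's this pointer is dangling, so this->conn = NULL writes into freed memory. Sometimes this causes trouble and sometimes it does not, which is what makes it so dangerous.

With tcmalloc, gdb stops right at the invalid access: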

0x000000000040079a in State::action (this=0x2aaaaaab0ff0) at memcorrupt.cpp:80
80          this->conn = NULL;
(gdb) bt
#0  0x000000000040079a in State::action (this=0x2aaaaaab0ff0) at memcorrupt.cpp:80
#1  0x0000000000400816 in main () at memcorrupt.cpp:91
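
One possible fix, sketched under the assumption that State needs nothing from itself once teardown has started: finish every member access first, keep the Manager pointer in a local variable, and make the Destroy() call the last thing action() does. The classes are trimmed copies of the ones above, and run_demo is a hypothetical driver standing in for main.

```cpp
#include <cassert>
#include <cstddef>

class Connection;
class Manager;

class State {
    Connection* conn;
public:
    explicit State(Connection* c) : conn(c) {}
    void action();
};

class Connection {
    State* state;
    Manager* manager;
public:
    Connection() : state(NULL), manager(NULL) {}
    ~Connection() { delete state; }          // the Connection owns its State
    void SetManager(Manager* m) { manager = m; }
    Manager* GetManager() { return manager; }
    void SetState(State* s) { state = s; }
};

class Manager {
    Connection* conn;
public:
    Manager() : conn(NULL) {}
    void Destroy() { delete conn; conn = NULL; }
    Connection* GetConnection() { return conn; }
    void SetConnection(Connection* c) { conn = c; conn->SetManager(this); }
};

void State::action() {
    if (conn == NULL) {
        return;
    }
    Manager* m = conn->GetManager();  // copy what is needed into a local
    conn = NULL;                      // last access to a member of *this
    m->Destroy();                     // deletes the Connection, which deletes *this
    // No member may be touched past this point: 'this' is dangling.
}

// Hypothetical driver mirroring main() above.
// Returns true if the Manager no longer holds a Connection afterwards.
bool run_demo() {
    Manager manager;
    Connection* connection = new Connection();
    State* state = new State(connection);
    connection->SetState(state);
    manager.SetConnection(connection);
    state->action();
    return manager.GetConnection() == NULL;
}
```

Reordering is the smallest change that removes the write to freed memory. A sturdier design would avoid having an object trigger its own deletion at all, for instance by letting the caller perform the Destroy() after action() returns.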
