1. Symptom:
While parsing an XML file at startup, the program aborts with the error malloc(): memory corruption (fast): 0x09a5e3e8. The same code runs without error on Windows.
The full error output is:
*** glibc detected *** ./test_61850: malloc(): memory corruption (fast): 0x095133e8 ***
======= Backtrace: =========
/lib/libc.so.6[0xa55d96]
/lib/libc.so.6[0xa5684e]
/lib/libc.so.6(realloc+0xe6)[0xa574a6]
../lib/lib61850.so[0x85d1f7]
../lib/libutils.so[0x275c00]
../lib/libexpat.so[0x8d5ea2]
../lib/libexpat.so[0x8d6d81]
../lib/libexpat.so(XML_ParseBuffer+0x7c)[0x8cf83c]
../lib/libexpat.so(XML_Parse+0xd3)[0x8d0cf3]
../lib/libutils.so(dcfg_read+0x188)[0x275e1d]
../lib/lib61850.so(dsscd_read+0xc5)[0x85da2b]
../lib/lib61850.so(StartTasks+0x3c)[0x87bc90]
../lib/lib61850.so(BIO_Init+0x3b)[0x8448b1]
./test_61850[0x804c059]
./test_61850(__gxx_personality_v0+0x17d)[0x80491c5]
./test_61850(__gxx_personality_v0+0x237)[0x804927f]
/lib/libc.so.6(__libc_start_main+0xdc)[0xa00e9c]
./test_61850(__gxx_personality_v0+0x79)[0x80490c1]
======= Memory map: ========
00110000-00112000 r-xp 00000000 08:02 426398 /users/oracle/lib61850/debug/linux-i386/lib/libcharset.so.1
00112000-00113000 rwxp 00001000 08:02 426398 /users/oracle/lib61850/debug/linux-i386/lib/libcharset.so.1
00139000-00140000 r-xp 00000000 08:02 426395 /users/oracle/lib61850/debug/linux-i386/lib/libfmap.so
00140000-00141000 rwxp 00006000 08:02 426395 /users/oracle/lib61850/debug/linux-i386/lib/libfmap.so
00237000-0023e000 r-xp 00000000 08:01 885981 /lib/librt-2.5.so
0023e000-0023f000 r-xp 00007000 08:01 885981 /lib/librt-2.5.so
0023f000-00240000 rwxp 00008000 08:01 885981 /lib/librt-2.5.so
00274000-00278000 r-xp 00000000 08:02 426368 /users/oracle/lib61850/debug/linux-i386/lib/libutils.so
00278000-00279000 rwxp 00003000 08:02 426368 /users/oracle/lib61850/debug/linux-i386/lib/libutils.so
002e7000-002e8000 r-xp 002e7000 00:00 0 [vdso]
003da000-0046a000 r-xp 00000000 08:02 426392 /users/oracle/lib61850/debug/linux-i386/lib/libssacsi.so
0046a000-0046e000 rwxp 0008f000 08:02 426392 /users/oracle/lib61850/debug/linux-i386/lib/libssacsi.so
0046e000-00472000 rwxp 0046e000 00:00 0
0059f000-005a2000 r-xp 00000000 08:02 426396 /users/oracle/lib61850/debug/linux-i386/lib/libnetlog.so
005a2000-005a3000 rwxp 00003000 08:02 426396 /users/oracle/lib61850/debug/linux-i386/lib/libnetlog.so
005f2000-006d1000 r-xp 00000000 08:02 426401 /users/oracle/lib61850/debug/linux-i386/lib/libiconv.so.2
006d1000-006d2000 rwxp 000df000 08:02 426401 /users/oracle/lib61850/debug/linux-i386/lib/libiconv.so.2
0082f000-00894000 r-xp 00000000 08:02 426397 /users/oracle/lib61850/debug/linux-i386/lib/lib61850.so
00894000-00897000 rwxp 00065000 08:02 426397 /users/oracle/lib61850/debug/linux-i386/lib/lib61850.so
00897000-00898000 rwxp 00897000 00:00 0
008cd000-008ef000 r-xp 00000000 08:02 426369 /users/oracle/lib61850/debug/linux-i386/lib/libexpat.so
008ef000-008f0000 --xp 00022000 08:02 426369 /users/oracle/lib61850/debug/linux-i386/lib/libexpat.so
008f0000-008f2000 r-xp 00022000 08:02 426369 /users/oracle/lib61850/debug/linux-i386/lib/libexpat.so
008f2000-008f3000 rwxp 00024000 08:02 426369 /users/oracle/lib61850/debug/linux-i386/lib/libexpat.so
009cc000-009e7000 r-xp 00000000 08:01 885957 /lib/ld-2.5.so
009e7000-009e8000 r-xp 0001a000 08:01 885957 /lib/ld-2.5.so
009e8000-009e9000 rwxp 0001b000 08:01 885957 /lib/ld-2.5.so
009eb000-00b3e000 r-xp 00000000 08:01 885978 /lib/libc-2.5.so
00b3e000-00b40000 r-xp 00153000 08:01 885978 /lib/libc-2.5.so
00b40000-00b41000 rwxp 00155000 08:01 885978 /lib/libc-2.5.so
00b41000-00b44000 rwxp 00b41000 00:00 0
00b46000-00b6d000 r-xp 00000000 08:01 885985 /lib/libm-2.5.so
00b6d000-00b6e000 r-xp 00026000 08:01 885985 /lib/libm-2.5.so
00b6e000-00b6f000 rwxp 00027000 08:01 885985 /lib/libm-2.5.so
00b78000-00b8d000 r-xp 00000000 08:01 885980 /lib/libpthread-2.5.so
00b8d000-00b8e000 r-xp 00015000 08:01 885980 /lib/libpthread-2.5.so
00b8e000-00b8f000 rwxp 00016000 08:01 885980 /lib/libpthread-2.5.so
00b8f000-00b91000 rwxp 00b8f000 00:00 0
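2. Problem analysis
The core file was loaded with gdb test_61850 core.
Tracing with the where command showed that the program crashed inside a realloc call in the paser_fcda_info function. Added print statements showed that the error only appears after several successful allocations (p_fcda = 9512f80, dsNum = b). This raised the suspicion that the function performs an illegal memory operation, such as a memory leak or an out-of-bounds read or write, which then crashes the program.
Reviewing the function's code turned up the following two problems: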
line834:
ptr_data = pInfo->doName;
if(ptr_data != NULL){
ptr = (char *)realloc(ptr,(curLen+strlen(ptr_data))*sizeof(char));
memcpy(ptr+curLen,ptr_data,strlen(ptr_data)+1);//writes one byte more than was allocated
curLen += strlen(ptr_data);
}
line850:
ptr_data = pInfo->daName;
if(ptr_data != NULL){
ptr = (char *)realloc(ptr,(curLen+strlen(ptr_data))*sizeof(char));
memcpy(ptr+curLen,ptr_data,strlen(ptr_data)+1);//writes one byte more than was allocated
curLen += strlen(ptr_data);
}
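Each memcpy call writes strlen(ptr_data)+1 bytes (the string plus its terminating '\0'), but realloc only requested curLen+strlen(ptr_data) bytes, so the terminator lands one byte past the end of the allocated block. That byte does not belong to the allocation. Writing it may overwrite useful data, possibly including the allocator's own bookkeeping, which would be consistent with glibc reporting the corruption only in a later realloc call.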
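The size mismatch can be checked with plain arithmetic. The sketch below is only an illustration: the function names are hypothetical and do not appear in the original code. It compares the size requested from realloc with the end offset of the memcpy write.

```c
#include <stddef.h>
#include <string.h>

/* Size passed to realloc by the original code: no room for the '\0'. */
static size_t requested_original(size_t cur_len, const char *s)
{
    return cur_len + strlen(s);
}

/* Size passed to realloc by the corrected code: one extra byte for '\0'. */
static size_t requested_fixed(size_t cur_len, const char *s)
{
    return cur_len + strlen(s) + 1;
}

/* Number of bytes, counted from the start of the buffer, that are in use
 * after memcpy(ptr + cur_len, s, strlen(s) + 1) has run. */
static size_t write_end(size_t cur_len, const char *s)
{
    return cur_len + strlen(s) + 1;
}
```

With a hypothetical curLen of 5 and the string "Pos", the original code requests 8 bytes while memcpy uses 9, so exactly one byte falls outside the block. With the extra +1 the two values are equal.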
3. Solution
Change the two code fragments above to:
line834:
ptr_data = pInfo->doName;
if(ptr_data != NULL){
ptr = (char *)realloc(ptr,(curLen+strlen(ptr_data)+1)*sizeof(char));
memcpy(ptr+curLen,ptr_data,strlen(ptr_data)+1);//the write now stays within the allocated space
curLen += strlen(ptr_data);
}
line850:
ptr_data = pInfo->daName;
if(ptr_data != NULL){
ptr = (char *)realloc(ptr,(curLen+strlen(ptr_data)+1)*sizeof(char));
memcpy(ptr+curLen,ptr_data,strlen(ptr_data)+1);//the write now stays within the allocated space
curLen += strlen(ptr_data);
}
After rerunning the program, the error no longer occurs.
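Since the same append logic appears twice, it could also be factored into one helper. The following is a minimal sketch, not the project's actual code: append_str and its parameters are hypothetical names. It additionally checks the realloc result, which the fragments above do not do, so that a failed reallocation does not lose the old pointer.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Append src to the heap string *buf, whose current length (excluding
 * the terminator) is *len. Returns 0 on success and -1 on failure.
 * On failure *buf and *len are left unchanged.
 */
static int append_str(char **buf, size_t *len, const char *src)
{
    size_t add;
    char *tmp;

    if (buf == NULL || len == NULL || src == NULL)
        return -1;

    add = strlen(src);
    tmp = realloc(*buf, *len + add + 1);  /* +1 for the terminating '\0' */
    if (tmp == NULL)
        return -1;                        /* the old block is still valid */

    memcpy(tmp + *len, src, add + 1);     /* copies the terminator as well */
    *buf = tmp;
    *len += add;
    return 0;
}
```

Under these assumptions, the two fragments would reduce to calls such as append_str(&ptr, &curLen, pInfo->doName) and append_str(&ptr, &curLen, pInfo->daName), with curLen declared as size_t.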