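A while ago I wrote a DXF converter in C++ that turns DXF drawing data into MDB files in ESRI Personal GeoDatabase format. I don't write much C++, so I wasn't entirely confident in my own code. It ran fine under the debugger, but the Output window was full of first-chance exception messages, which fall into two main kinds: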
First-chance exception at 0x7c812a5b (kernel32.dll) in CASSBroker.exe: Microsoft C++ exception: long at memory location 0x0012d7a8...
First-chance exception at 0x67a76330 (AfCore.dll) in CASSBroker.exe: 0xC0000005: Access violation reading location 0x7f944000.
The first kind of exception is raised in the following code:
0x7c812a5b
7C800000-7C91C000
kernel32.dll
Offset: 0x00012a5b
00012A09 RaiseException
***********************************************************************************
7C812A09: mov edi, edi
7C812A0B: push ebp
7C812A0C: mov ebp, esp
7C812A0E: sub esp, 50h
7C812A11: mov eax, dword ptr [ebp+8]
7C812A14: and dword ptr [ebp-48h],0
7C812A18: mov dword ptr [ebp-50h],eax
7C812A1B: mov eax,dword ptr [ebp+0Ch]
7C812A1E: push esi
7C812A1F: mov esi,dword ptr [ebp+14h]
7C812A22: and eax,1
7C812A25: test esi,esi
7C812A27: mov dword ptr [ebp-4Ch],eax
7C812A2A: mov dword ptr [ebp-44h],7C812A09h
7C812A31: je 7C812AD0
7C812A37: mov ecx,dword ptr [ebp+10h]
7C812A3A: cmp ecx,0Fh
7C812A3D: ja 7C844790
7C812A43: test ecx,ecx
7C812A45: mov dword ptr [ebp-40h],ecx
7C812A48: je 7C812A51
7C812A4A: push edi
7C812A4B: lea edi,[ebp-3Ch]
7C812A4E: rep movs dword ptr es:[edi],dword ptr [esi]
7C812A50: pop edi
7C812A51: lea eax,[ebp-50h]
7C812A54: push eax
7C812A55: call dword ptr ds:[7C801508h]
->
7C812A5B: pop esi
7C812A5C: leave
7C812A5D: ret 10h
-----------------------------------------------------------------------------------------------
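Judging from the wording, my first guess was a data alignment problem: a start address that is not even, or not a word address divisible by 4. But the pop esi instruction and the memory location 0x0012d7a8 rule that out. pop esi reads stack memory, and checking confirmed it: that address is the value of esp. But what does "long at" mean? Does long have some other meaning here? It is baffling. The error is not serious and could be ignored, but there are a lot of them: Count All in UltraEdit reports 540, which pains me, because that many exceptions must cost a good deal of performance. I have no good fix yet, so I'll set it aside and look at the second problem first.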
The second kind of exception is raised in the following assembly code:
0x67a76330
67A50000-67B15000
AfCore.dll
Offset: 0x00026330
67A76235-67A50000 = 0x00026235
00026250 ?Alloc@FixedBlockHeap@@QAEPAXXZ
67A76250 public: void * __thiscall FixedBlockHeap::Alloc(void)
*********************************************************************************************************************
67A76250: push ebp
67A76251: mov ebp,esp
67A76253: push 0FFFFFFFFh
67A76255: push 67ADA4B9h
67A7625A: mov eax,dword ptr fs:[00000000h]
67A76260: push eax
67A76261: mov dword ptr fs:[00000000h],esp
67A76268: push ecx
67A76269: sub esp,40h
67A7626C: push ebx
67A7626D: push esi
67A7626E: push edi
67A7626F: mov dword ptr [ebp-10h],esp
67A76272: mov dword ptr [ebp-50h],ecx
67A76275: mov ecx,67B07CC0h
67A7627A: call 67A766F0
67A7627F: mov dword ptr [ebp-38h],eax
67A76282: mov eax,dword ptr [ebp-38h]
67A76285: mov dword ptr [ebp-1Ch],eax
67A76288: mov byte ptr [ebp-18h],0
67A7628C: mov ecx,dword ptr [ebp-1Ch]
67A7628F: push ecx
67A76290: call dword ptr ds:[67ADC1E0h]
67A76296: mov dword ptr [ebp-4],0
67A7629D: mov edx,dword ptr [ebp-50h]
67A762A0: xor eax,eax
67A762A2: mov al,byte ptr [edx+0Ch]
67A762A5: test eax,eax
67A762A7: jne 67A762B8
67A762A9: mov ecx,dword ptr [ebp-50h]
67A762AC: call 67A764B0
67A762B1: mov ecx,dword ptr [ebp-50h]
67A762B4: mov byte ptr [ecx+0Ch],1
67A762B8: mov edx,dword ptr [ebp-50h]
67A762BB: xor eax,eax
67A762BD: mov al,byte ptr [edx+0Dh]
67A762C0: test eax,eax
67A762C2: je 67A7630F
67A762C4: mov dword ptr [ebp-28h],0
67A762CB: mov dword ptr [ebp-4],0FFFFFFFFh
67A762D2: mov ecx,dword ptr [ebp-18h]
67A762D5: and ecx,0FFh
67A762DB: test ecx,ecx
67A762DD: je 67A762FD
67A762DF: mov edx,dword ptr [ebp-1Ch]
67A762E2: push edx
67A762E3: call dword ptr ds:[67ADC1E4h]
67A762E9: mov eax,dword ptr [ebp-1Ch]
67A762EC: mov dword ptr [ebp-3Ch],eax
67A762EF: mov ecx,dword ptr [ebp-3Ch]
67A762F2: push ecx
67A762F3: call 67A992D6
67A762F8: add esp,4
67A762FB: jmp 67A76307
67A762FD: mov edx,dword ptr [ebp-1Ch]
67A76300: push edx
67A76301: call dword ptr ds:[67ADC1DCh]
67A76307: mov eax,dword ptr [ebp-28h]
67A7630A: jmp 67A76452
67A7630F: mov byte ptr [ebp-4],1
67A76313: mov eax,dword ptr [ebp-50h]
67A76316: mov ecx,dword ptr [eax+14h]
67A76319: mov dword ptr [ebp-20h],ecx
67A7631C: mov edx,dword ptr [ebp-50h]
67A7631F: mov eax,dword ptr [edx+14h]
67A76322: mov ecx,dword ptr [ebp-50h]
67A76325: mov edx,dword ptr [eax]
67A76327: mov dword ptr [ecx+14h],edx
67A7632A: mov eax,dword ptr [ebp-50h]
67A7632D: mov ecx,dword ptr [eax+14h]
->
67A76330: mov dl,byte ptr [ecx]
67A76332: mov byte ptr [ebp-14h],dl
67A76335: mov eax,dword ptr [ebp-20h]
67A76338: mov dword ptr [ebp-2Ch],eax
67A7633B: mov dword ptr [ebp-4],0FFFFFFFFh
67A76342: mov ecx,dword ptr [ebp-18h]
67A76345: and ecx,0FFh
67A7634B: test ecx,ecx
67A7634D: je 67A7636D
67A7634F: mov edx,dword ptr [ebp-1Ch]
67A76352: push edx
67A76353: call dword ptr ds:[67ADC1E4h]
67A76359: mov eax,dword ptr [ebp-1Ch]
67A7635C: mov dword ptr [ebp-40h],eax
67A7635F: mov ecx,dword ptr [ebp-40h]
67A76362: push ecx
67A76363: call 67A992D6
67A76368: add esp,4
67A7636B: jmp 67A76377
67A7636D: mov edx,dword ptr [ebp-1Ch]
67A76370: push edx
67A76371: call dword ptr ds:[67ADC1DCh]
67A76377: mov eax,dword ptr [ebp-2Ch]
67A7637A: jmp 67A76452
67A7637F: call dword ptr ds:[67ADC354h]
67A76385: mov eax,dword ptr [ebp-50h]
67A76388: mov ecx,dword ptr [eax+14h]
67A7638B: push ecx
67A7638C: mov ecx,dword ptr [ebp-50h]
67A7638F: call 67A76560
67A76394: and eax,0FFh
67A76399: test eax,eax
67A7639B: jne 67A763F1
67A7639D: mov edx,dword ptr [ebp-50h]
67A763A0: mov byte ptr [edx+0Dh],1
67A763A4: mov dword ptr [ebp-30h],0
67A763AB: mov eax,67A763B1h
67A763B0: ret
----------------------------------------------------------------------------------------------------
My knowledge of assembly is superficial at best. Locating the faulting code in IDA Pro gives a slightly more readable listing:
.text:67A7630F loc_67A7630F:
.text:67A7630F mov byte ptr [ebp+var_4], 1
.text:67A76313 mov eax, [ebp+var_50]
.text:67A76316 mov ecx, [eax+14h]
.text:67A76319 mov [ebp+var_20], ecx
.text:67A7631C mov edx, [ebp+var_50]
.text:67A7631F mov eax, [edx+14h]
.text:67A76322 mov ecx, [ebp+var_50]
.text:67A76325 mov edx, [eax]
.text:67A76327 mov [ecx+14h], edx
.text:67A7632A mov eax, [ebp+var_50]
.text:67A7632D mov ecx, [eax+14h]
.text:67A76330 mov dl, [ecx]
.text:67A76332 mov [ebp+var_14], dl
.text:67A76335 mov eax, [ebp+var_20]
.text:67A76338 mov [ebp+var_2C], eax
.text:67A7633B mov [ebp+var_4], 0FFFFFFFFh
.text:67A76342 mov ecx, [ebp+var_18]
.text:67A76345 and ecx, 0FFh
.text:67A7634B test ecx, ecx
.text:67A7634D jz short loc_67A7636D
.text:67A7634F mov edx, [ebp+lpCriticalSection]
.text:67A76352 push edx ; lpCriticalSection
.text:67A76353 call ds:DeleteCriticalSection
.text:67A76359 mov eax, [ebp+lpCriticalSection]
.text:67A7635C mov [ebp+var_40], eax
.text:67A7635F mov ecx, [ebp+var_40]
.text:67A76362 push ecx
.text:67A76363 call ??3@YAXPAX@Z ; operator delete(void *)
.text:67A76368 add esp, 4
.text:67A7636B jmp short loc_67A76377
------------------------------------------------------------------------------------------
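A pattern starts to emerge. [ebp-50h] is a local variable: ebp normally points to the base of the stack frame, parameters are read at positive offsets from it, and locals are addressed in the ebp-disp form. Near the top of the function (67A76272) this slot is filled from ecx, and since Alloc is a __thiscall method, it holds the this pointer to a structure. Another important finding is that at offset 14h of this structure there is a 4-byte pointer, and what it points to is itself a pointer: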
67A76313: mov eax,dword ptr [ebp-50h] ;eax = var50 (the saved this pointer)
67A76316: mov ecx,dword ptr [eax+14h] ;ecx = var50->m14
67A76319: mov dword ptr [ebp-20h],ecx ;var20 = ecx
67A7631C: mov edx,dword ptr [ebp-50h] ;edx = var50
67A7631F: mov eax,dword ptr [edx+14h] ;eax = var50->m14
67A76322: mov ecx,dword ptr [ebp-50h] ;ecx = var50
67A76325: mov edx,dword ptr [eax] ;edx = *(var50->m14)
67A76327: mov dword ptr [ecx+14h],edx ;var50->m14 = edx = *(var50->m14) //a pointer to a pointer
67A7632A: mov eax,dword ptr [ebp-50h] ;eax = var50
67A7632D: mov ecx,dword ptr [eax+14h] ;ecx = var50->m14 (the new value)
->
67A76330: mov dl,byte ptr [ecx] ;dl = *(byte *)(var50->m14)
--------------------------------------------------------------------------------------------
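I also set a data breakpoint to see when this value gets written. I saw some code calling VirtualAlloc and VirtualQuery, and then the address was modified many times, growing a little each time, until I ran out of patience and stopped debugging.
In theory I could solve this problem, but I am not clear enough on assembly or on how the operating system allocates virtual memory, so I have given up for now. With the source code it would be much simpler. MSDN has a good passage under "How Can I Debug an Access Violation?", though I had to read it several times before it sank in, because I kept hoping for a simpler method.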
Use the Call Stack window to work your way back up the call stack, looking for corrupted data being passed as a parameter to a function. If that fails, try setting a breakpoint at a point before the location where the access violation occurs. Check to see if data is good at that point. If so, try stepping your way toward the location where the access violation occurred. If you can identify a single action, such as a menu command that led to the access violation, you can try another technique: set a breakpoint between the action (in this example, the menu command) and the access violation. You can then look at the state of your program during the moments leading up to the access violation.
You can use a combination of these techniques to work forward and backward until you have isolated the location where the access violation occurred. For more information, see Using the Call Stack Window.
=================================================================================
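Also, when the exception is raised the call stack is a mess and cannot be traced back to my own source code. I eventually stepped out of several function calls from the exception site and found that the failing source code is mainly these two statements: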
pAnnoFeature->put_Annotation(pElement);
hr = pSeg.CreateInstance(CLSID_Line);
The second statement fails more often. I also noticed a pattern: the faulting memory addresses are all integer multiples of 0x4000:
First-chance exception at 0x67a76330 (AfCore.dll) in CASSBroker.exe: 0xC0000005: Access violation reading location 0x7f944000.
First-chance exception at 0x67a76330 (AfCore.dll) in CASSBroker.exe: 0xC0000005: Access violation reading location 0x7f978000.
First-chance exception at 0x67a76330 (AfCore.dll) in CASSBroker.exe: 0xC0000005: Access violation reading location 0x7f9ac000.
First-chance exception at 0x67a76330 (AfCore.dll) in CASSBroker.exe: 0xC0000005: Access violation reading location 0x7f9e0000.
First-chance exception at 0x67a76330 (AfCore.dll) in CASSBroker.exe: 0xC0000005: Access violation reading location 0x7fa14000.
First-chance exception at 0x67a76330 (AfCore.dll) in CASSBroker.exe: 0xC0000005: Access violation reading location 0x7fa48000.
First-chance exception at 0x67a76330 (AfCore.dll) in CASSBroker.exe: 0xC0000005: Access violation reading location 0x7fa7c000.
First-chance exception at 0x67a76330 (AfCore.dll) in CASSBroker.exe: 0xC0000005: Access violation reading location 0x7fab0000.
First-chance exception at 0x67a76330 (AfCore.dll) in CASSBroker.exe: 0xC0000005: Access violation reading location 0x7fae4000.
----------------------------------------------------------------------------------------------------------------------------------------
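Windows uses paged virtual memory management with a page size of 4 KB (0x1000). The access violations occur at page boundaries, on access to the next page, which the process does not yet have permission to use. I say "not yet" because I later saw content written to these addresses.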
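I also searched every book I could get hold of and everything I could find online. Because the undersea cable was cut recently, many foreign websites are hard to reach, so I could not read posts from abroad.
A first-chance exception is a notification the debugger receives from the operating system, telling it that the program being debugged has raised an exception. If no second-chance (last-chance) exception follows, the program handled the exception itself and it can be ignored, although a very large number of them may cause performance problems. The "long at memory location" exception is still baffling, though. I looked through the exception types in MSDN and found nothing related to it. I also told the debugger to break on data misalignment exceptions, to no effect. It is listed as a C++ Exception, yet it belongs to none of the subcategories. Weird!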