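The previous post showed how to use WinDbg to analyze a thread-blocking problem:
WinDbg Debugging Series 3: Thread Blocking
This post continues the series with attaching to a running process for real-time debugging (live debugging).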
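First, the scenarios where attaching WinDbg to a live process is useful, and the caveats:
Scenarios:
- Integration test environment: after an exception occurs, analyze the exception together with the call stack and arguments of the thread that threw it;
- Production environment: debug an exception for a short time and inspect its context and arguments, but do not keep the session open for long.
Caveat: attaching a debugger blocks request processing. While the process is stopped in the debugger, new requests hang and front-end callers are affected. Weigh the trade-off carefully: this is fine in development and test environments, but use it with caution in production.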
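The routine for live debugging an attached process:
- F6: attach to the process
- Load SOS: .loadby sos clr
- Enable breaking on CLR exceptions: sxe clr
- Resume, then catch and inspect the exception: g, !pe
- View the stack of the throwing thread with !clrstack; view its parameters and locals with !clrstack -a / !dso
- Disable breaking on CLR exceptions: sxd clr
- Quit and detach: qd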
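The routine above can also be kept as a small checklist generator, so the same commands are typed in the same order every time. This is only a sketch: the function name `live_debug_commands` and the `show_locals` switch are made up for illustration, and the output is meant to be read or pasted into the WinDbg command window, not executed automatically.

```python
from typing import List


def live_debug_commands(show_locals: bool = True) -> List[str]:
    """Return the WinDbg commands of the routine, in the order they are typed."""
    commands = [
        ".loadby sos clr",  # load SOS from the directory clr.dll was loaded from
        "sxe clr",          # break on first-chance CLR exceptions
        "g",                # resume the target until the next break
        "!pe",              # print the exception on the current thread
        "!clrstack",        # managed call stack of the current thread
    ]
    if show_locals:
        # Optional deeper inspection: parameters/locals and stack object references.
        commands += ["!clrstack -a", "!dso"]
    commands += [
        "sxd clr",  # stop breaking on first-chance CLR exceptions
        "qd",       # quit the debugger and detach, leaving the process running
    ]
    return commands


if __name__ == "__main__":
    print("\n".join(live_debug_commands()))
```

Keeping `sxd clr` and `qd` at the end of the list is deliberate: they are the two steps that are easiest to forget under time pressure in production.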
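Hands-on:
1. F6: attach to the process
Open WinDbg, press F6, and select the process to debug; you can also enter the process ID directly:
2. Load SOS: .loadby sos clr
.loadby sos clr
3. Enable breaking on CLR exceptions: sxe clr
sxe clr
4. Resume, then catch and inspect the exception: g, !pe
g
Once the debugger breaks in, !pe shows the exception: a SocketException with the message "An existing connection was forcibly closed by the remote host."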
0:032> !pe
Exception object: 000000d79aedb7a8
Exception type: System.Net.Sockets.SocketException
Message: 远程主机强迫关闭了一个现有的连接。
InnerException:
StackTrace (generated):
StackTraceString:
HResult: 80004005
Memory address of the exception object:
000000d79aedb7a8
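When the debugger log is saved to a file, the key fields of the !pe output can be pulled out with a few lines of Python. The sketch below assumes the usual one-field-per-line layout that !pe prints; `parse_pe`, `inspect_command` and `WANTED_KEYS` are illustrative names, not part of WinDbg or SOS.

```python
from typing import Dict

# Fields of interest in the !pe output; other lines are ignored.
WANTED_KEYS = ("Exception object", "Exception type", "Message", "HResult")

SAMPLE_PE = """\
0:032> !pe
Exception object: 000000d79aedb7a8
Exception type: System.Net.Sockets.SocketException
Message: 远程主机强迫关闭了一个现有的连接。
HResult: 80004005
"""


def parse_pe(text: str) -> Dict[str, str]:
    """Extract the wanted 'key: value' pairs from saved !pe output."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in WANTED_KEYS:
            fields[key.strip()] = value.strip()
    return fields


def inspect_command(fields: Dict[str, str]) -> str:
    """Build the !do command that dumps the exception object itself."""
    return "!do " + fields["Exception object"]


if __name__ == "__main__":
    info = parse_pe(SAMPLE_PE)
    print(info["Exception type"])
    print(inspect_command(info))
```

The generated `!do <address>` line can be pasted back into the debugger to dump the exception object's fields.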
5. View the stack of the throwing thread with !clrstack; view its parameters and locals with !clrstack -a / !dso
0:032> !clrstack
OS Thread Id: 0x15968 (32)
Child SP IP Call Site
000000d7b927ea88 00007ffe61c395fc [HelperMethodFrame: 000000d7b927ea88]
*** WARNING: Unable to verify checksum for C:\Windows\assembly\NativeImages_v4.0.30319_64\System\47e0be927382f169f5de470fab0ceb7d\System.ni.dll
000000d7b927eb70 00007ffe52e68f0f System.Net.Sockets.NetworkStream.Read(Byte[], Int32, Int32) [f:\dd\NDP\fx\src\net\System\Net\Sockets\NetworkStream.cs @ 513]
*** WARNING: Unable to verify checksum for C:\Windows\assembly\NativeImages_v4.0.30319_64\mscorlib\5d0c037297cc1a64b52ce43b45c2ac2e\mscorlib.ni.dll
000000d7b927ebe0 00007ffe53e59c04 System.IO.BufferedStream.ReadByte() [f:\dd\ndp\clr\src\BCL\system\io\bufferedstream.cs @ 814]
000000d7b927ec10 00007ffe53e2183c System.IO.BinaryReader.ReadByte() [f:\dd\ndp\clr\src\BCL\system\io\binaryreader.cs @ 143]
000000d7b927ec40 00007ffdf6188eb5 RabbitMQ.Client.Impl.Frame.ReadFrom(RabbitMQ.Util.NetworkBinaryReader)
000000d7b927ecb0 00007ffdf6188e13 RabbitMQ.Client.Impl.SocketFrameHandler.ReadFrame()
000000d7b927ed10 00007ffdf6188cfc RabbitMQ.Client.Framing.Impl.Connection.MainLoopIteration()
000000d7b927ed50 00007ffdf6188a9f RabbitMQ.Client.Framing.Impl.Connection.MainLoop()
000000d7b927edb0 00007ffe53e34740 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 954]
000000d7b927ee80 00007ffe53e345d4 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 902]
000000d7b927eeb0 00007ffe53e345a2 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 891]
000000d7b927ef00 00007ffe53ebcf62 System.Threading.ThreadHelper.ThreadStart() [f:\dd\ndp\clr\src\BCL\system\threading\thread.cs @ 111]
000000d7b927f158 00007ffe55325a03 [GCFrame: 000000d7b927f158]
000000d7b927f4a8 00007ffe55325a03 [DebuggerU2MCatchHandlerFrame: 000000d7b927f4a8]
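The output shows that the exception was thrown on thread 32. Its stack points to a RabbitMQ communication failure: the heartbeat thread's connection to the RabbitMQ server broke while Connection.MainLoop was reading a frame from the socket.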
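For long stacks it can help to reduce saved !clrstack output to just the managed call sites and then pick out the first frame from the library under suspicion. The following is a rough sketch under the assumption that each frame line starts with two 16-digit hex columns (Child SP and IP), as in 64-bit output; `managed_frames` and `first_frame_in` are hypothetical helper names.

```python
import re
from typing import List, Optional

# Child SP, IP, then the call site (64-bit addresses assumed).
FRAME_RE = re.compile(r"^([0-9a-fA-F]{16})\s+([0-9a-fA-F]{16})\s+(.*)$")

SAMPLE_STACK = r"""OS Thread Id: 0x15968 (32)
Child SP IP Call Site
000000d7b927ea88 00007ffe61c395fc [HelperMethodFrame: 000000d7b927ea88]
000000d7b927ebe0 00007ffe53e59c04 System.IO.BufferedStream.ReadByte() [f:\dd\ndp\clr\src\BCL\system\io\bufferedstream.cs @ 814]
000000d7b927ec40 00007ffdf6188eb5 RabbitMQ.Client.Impl.Frame.ReadFrom(RabbitMQ.Util.NetworkBinaryReader)
000000d7b927ed50 00007ffdf6188a9f RabbitMQ.Client.Framing.Impl.Connection.MainLoop()
"""


def managed_frames(text: str) -> List[str]:
    """Return call sites top-down, skipping special [..Frame] entries."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue
        site = match.group(3)
        if site.startswith("["):  # HelperMethodFrame, GCFrame, ...
            continue
        # Drop the trailing "[source file @ line]" annotation if present.
        frames.append(re.sub(r"\s+\[.*\]$", "", site))
    return frames


def first_frame_in(frames: List[str], prefix: str) -> Optional[str]:
    """First (innermost) frame whose call site starts with the given namespace."""
    return next((f for f in frames if f.startswith(prefix)), None)


if __name__ == "__main__":
    for frame in managed_frames(SAMPLE_STACK):
        print(frame)
```

Here `first_frame_in(frames, "RabbitMQ.")` returns the innermost RabbitMQ frame, which is usually the best starting point for reading the client library's source.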
Next, look at the variables on the thread's stack:
!clrstack -a
0:032> !clrstack -a
OS Thread Id: 0x15968 (32)
Child SP IP Call Site
000000d7b927ea88 00007ffe61c395fc [HelperMethodFrame: 000000d7b927ea88]
000000d7b927eb70 00007ffe52e68f0f System.Net.Sockets.NetworkStream.Read(Byte[], Int32, Int32) [f:\dd\NDP\fx\src\net\System\Net\Sockets\NetworkStream.cs @ 513]
    PARAMETERS:
        this = <no data>
        buffer = <no data>
        offset = <no data>
        size = <no data>
    LOCALS:
000000d7b927ebe0 00007ffe53e59c04 System.IO.BufferedStream.ReadByte() [f:\dd\ndp\clr\src\BCL\system\io\bufferedstream.cs @ 814]
    PARAMETERS:
        this ( ) = 0x000000d79a3c2f18
    LOCALS:
000000d7b927ec10 00007ffe53e2183c System.IO.BinaryReader.ReadByte() [f:\dd\ndp\clr\src\BCL\system\io\binaryreader.cs @ 143]
    PARAMETERS:
        this = <no data>
    LOCALS:
000000d7b927ec40 00007ffdf6188eb5 RabbitMQ.Client.Impl.Frame.ReadFrom(RabbitMQ.Util.NetworkBinaryReader)
    PARAMETERS:
        reader ( ) = 0x000000d79a3c2f70
    LOCALS:
000000d7b927ecb0 00007ffdf6188e13 RabbitMQ.Client.Impl.SocketFrameHandler.ReadFrame()
    PARAMETERS:
        this = <no data>
    LOCALS:
        0x000000d7b927ece0 = 0x0000000000000000
        0x000000d7b927ecd8 = 0x000000d79a3c2f70
000000d7b927ed10 00007ffdf6188cfc RabbitMQ.Client.Framing.Impl.Connection.MainLoopIteration()
    PARAMETERS:
        this ( ) = 0x000000d79a3c31b8
    LOCALS:
000000d7b927ed50 00007ffdf6188a9f RabbitMQ.Client.Framing.Impl.Connection.MainLoop()
    PARAMETERS:
        this (0x000000d7b927edb0) = 0x000000d79a3c31b8
    LOCALS:
        0x000000d7b927ed8c = 0x0000000000000000
000000d7b927edb0 00007ffe53e34740 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 954]
    PARAMETERS:
        executionContext = <no data>
        callback = <no data>
        state = <no data>
        preserveSyncCtx = <no data>
    LOCALS:
000000d7b927ee80 00007ffe53e345d4 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 902]
    PARAMETERS:
        executionContext = <no data>
        callback = <no data>
        state = <no data>
        preserveSyncCtx = <no data>
000000d7b927eeb0 00007ffe53e345a2 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 891]
    PARAMETERS:
        executionContext = <no data>
        callback = <no data>
        state = <no data>
000000d7b927ef00 00007ffe53ebcf62 System.Threading.ThreadHelper.ThreadStart() [f:\dd\ndp\clr\src\BCL\system\threading\thread.cs @ 111]
    PARAMETERS:
        this = <no data>
000000d7b927f158 00007ffe55325a03 [GCFrame: 000000d7b927f158]
000000d7b927f4a8 00007ffe55325a03 [DebuggerU2MCatchHandlerFrame: 000000d7b927f4a8]
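The addresses of some parameters are visible, but many show <no data>. So let's try another command, which lists every object referenced from the thread's stack: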
!dso
0:032> !dso
OS Thread Id: 0x15968 (32)
RSP/REG Object Name
000000D7B927E8D0 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927E948 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927E9A8 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927E9B0 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927E9E0 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927E9F0 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927EA78 000000d79aedb850 System.Text.StringBuilder
000000D7B927EAC0 000000d79aedb850 System.Text.StringBuilder
000000D7B927EAD0 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927EAD8 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927EB20 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927EB30 000000d79a3c2ed8 System.Net.Sockets.NetworkStream
000000D7B927EB60 000000d79a809eb0 System.Byte[]
000000D7B927EB70 000000d799fd1420 System.String
000000D7B927EB78 000000d79aeda2e8 RabbitMQ.Client.Framing.BasicProperties
000000D7B927EB80 000000d79aeda568 System.Byte[]
000000D7B927EBB0 000000d79a017a90 System.Threading.ContextCallback
000000D7B927EBB8 000000d79a3c2f18 System.IO.BufferedStream
000000D7B927EBC0 000000d79a3c2f70 RabbitMQ.Util.NetworkBinaryReader
000000D7B927EBC8 000000d79a3c3ff0 System.Threading.ThreadHelper
000000D7B927EBF0 000000d79aeda0a8 System.String HSF-ServiceState
000000D7B927EBF8 000000d799fd1420 System.String
000000D7B927EC00 000000d79a3c2f70 RabbitMQ.Util.NetworkBinaryReader
000000D7B927EC10 000000d79aeda2e8 RabbitMQ.Client.Framing.BasicProperties
000000D7B927EC20 000000d79aeda2e8 RabbitMQ.Client.Framing.BasicProperties
000000D7B927EC28 000000d79a017a90 System.Threading.ContextCallback
000000D7B927EC30 000000d79a3c6a00 RabbitMQ.Client.Framing.Impl.Model
000000D7B927EC50 000000d79a3c3ff0 System.Threading.ThreadHelper
000000D7B927EC78 000000d79a017a90 System.Threading.ContextCallback
000000D7B927EC80 000000d79a3c2bc0 RabbitMQ.Client.Impl.SocketFrameHandler
000000D7B927EC88 000000d79a3c2f70 RabbitMQ.Util.NetworkBinaryReader
000000D7B927EC90 000000d79a3c3ff0 System.Threading.ThreadHelper
000000D7B927ECB0 000000d79a017a90 System.Threading.ContextCallback
000000D7B927ECC0 000000d79a3c6688 RabbitMQ.Client.Impl.Session
000000D7B927ECC8 000000d79aedb430 RabbitMQ.Client.Impl.Frame
000000D7B927ECD8 000000d79a3c2f70 RabbitMQ.Util.NetworkBinaryReader
000000D7B927ECF0 000000d79a3c31b8 RabbitMQ.Client.Framing.Impl.Connection
000000D7B927ED80 000000d79a017a90 System.Threading.ContextCallback
000000D7B927EDB0 000000d79a3c31b8 RabbitMQ.Client.Framing.Impl.Connection
000000D7B927EDF0 000000d79a3c3f90 System.Threading.Thread
000000D7B927EE48 000000d79a3c3ff0 System.Threading.ThreadHelper
000000D7B927EE50 000000d79a3c4150 System.Threading.ExecutionContext
000000D7B927EE58 000000d79a017a90 System.Threading.ContextCallback
000000D7B927EEE8 000000d79a3c3ff0 System.Threading.ThreadHelper
000000D7B927EEF0 000000d79a3c4150 System.Threading.ExecutionContext
000000D7B927F100 000000d79a3c4018 System.Threading.ThreadStart
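Now the address of every object referenced from the thread's stack is visible. The next step is to inspect individual objects, which you probably already know how to do with !do <objectAddress>. If the MEX extension is loaded, the more readable !do2 is available as well.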
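The !dso listing repeats the same object many times, once per stack slot that references it. A small script can collapse saved output to unique objects and generate the follow-up !do commands for one namespace. This is a sketch only: `unique_objects` and `do_commands` are illustrative names, and the parser assumes 16-digit hex columns as in 64-bit output.

```python
import re
from typing import Dict, List

# Stack slot (RSP/REG), object address, type name; a string value may follow.
DSO_RE = re.compile(r"^([0-9A-Fa-f]{16})\s+([0-9A-Fa-f]{16})\s+(\S+)")

SAMPLE_DSO = """\
RSP/REG Object Name
000000D7B927E8D0 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927E948 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927ECC0 000000d79a3c6688 RabbitMQ.Client.Impl.Session
000000D7B927ECF0 000000d79a3c31b8 RabbitMQ.Client.Framing.Impl.Connection
000000D7B927EDB0 000000d79a3c31b8 RabbitMQ.Client.Framing.Impl.Connection
"""


def unique_objects(text: str) -> Dict[str, str]:
    """Map object address -> type name, keeping first-seen order."""
    objects: Dict[str, str] = {}
    for line in text.splitlines():
        match = DSO_RE.match(line.strip())
        if match:
            objects.setdefault(match.group(2), match.group(3))
    return objects


def do_commands(objects: Dict[str, str], type_prefix: str) -> List[str]:
    """Build one '!do <address>' command per object in the given namespace."""
    return [
        "!do " + address
        for address, type_name in objects.items()
        if type_name.startswith(type_prefix)
    ]


if __name__ == "__main__":
    for command in do_commands(unique_objects(SAMPLE_DSO), "RabbitMQ."):
        print(command)
```

Filtering by namespace keeps the list short, which matters when the process is paused and every extra minute in the debugger blocks requests.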
0:032> !do2 000000d79a3c6688
0x000000d79a3c6688 RabbitMQ.Client.Impl.Session
0000 _shutdownLock : 000000d79a3c66d0 (System.Object)
0008 _sessionShutdown : 000000d79a3c6cf8 (System.EventHandler)
0010 k__BackingField : NULL
0018 k__BackingField : 000000d79a3c6c50 (System.Action )
0020 k__BackingField : 000000d79a3c31b8 (RabbitMQ.Client.Framing.Impl.Connection)
0028 k__BackingField : 1 (System.Int32)
0030 m_assembler : 000000d79a3c6790 (RabbitMQ.Client.Impl.CommandAssembler)
0:032> !do 000000d79a3c6688
Name: RabbitMQ.Client.Impl.Session
MethodTable: 00007ffdf6257888
EEClass: 00007ffdf6264ce8
Size: 72(0x48) bytes
File: C:\TeldApp\HSF\HSF.Host-BillCM-9086\RabbitMQ.Client.dll
Fields:
MT Field Offset Type VT Attr Value Name
00007ffe53fd6fd8 4000283 8 System.Object 0 instance 000000d79a3c66d0 _shutdownLock
00007ffdf6250760 4000284 10 ...RabbitMQ.Client]] 0 instance 000000d79a3c6cf8 _sessionShutdown
00007ffe53fd9428 4000285 30 System.Int32 1 instance 1 k__BackingField
00007ffdf62506c0 4000286 18 ...ShutdownEventArgs 0 instance 0000000000000000 k__BackingField
00007ffdf6272e78 4000287 20 ...RabbitMQ.Client]] 0 instance 000000d79a3c6c50 k__BackingField
00007ffdf62549a0 4000288 28 ...g.Impl.Connection 0 instance 000000d79a3c31b8 k__BackingField
00007ffdf62713b0 4000289 38 ....CommandAssembler 0 instance 000000d79a3c6790 m_assembler
Next, look at the Connection object: 000000d79a3c31b8
0:032> !do 000000d79a3c31b8
Name: RabbitMQ.Client.Framing.Impl.Connection
MethodTable: 00007ffdf62549a0
EEClass: 00007ffdf6260b78
Size: 232(0xe8) bytes
File: C:\TeldApp\HSF\HSF.Host-BillCM-9086\RabbitMQ.Client.dll
Fields:
MT Field Offset Type VT Attr Value Name
00007ffe53fd6fd8 4000247 8 System.Object 0 instance 000000d79a3c32a0 m_eventLock
00007ffdf6257c48 4000248 10 ...Client.Impl.Frame 0 instance 000000d79a3c32b8 m_heartbeatFrame
00007ffe53fe7378 4000249 18 ....ManualResetEvent 0 instance 000000d79a3c32f8 m_appContinuation
0000000000000000 400024a 20 0 instance 0000000000000000 m_callbackException
00007ffe53954820 400024b 28 ...bject, mscorlib]] 0 instance 000000d79a3c51d8 m_clientProperties
00007ffdf62506c0 400024c 30 ...ShutdownEventArgs 0 instance 0000000000000000 m_closeReason
00007ffe53ff9a38 400024d c2 System.Boolean 1 instance 0 m_closed
0000000000000000 400024e 38 0 instance 0000000000000000 m_connectionBlocked
00007ffdf6250760 400024f 40 ...RabbitMQ.Client]] 0 instance 000000d79a3c6750 m_connectionShutdown
00007ffe53953ff0 4000250 48 ...tArgs, mscorlib]] 0 instance 0000000000000000 m_connectionUnblocked
00007ffdf622f7d0 4000251 50 ...ConnectionFactory 0 instance 000000d79a3c2698 m_factory
00007ffdf6255cb8 4000252 58 ...mpl.IFrameHandler 0 instance 000000d79a3c2bc0 m_frameHandler
00007ffe53ff40a0 4000253 c8 System.Guid 1 instance 000000d79a3c3280 m_id
00007ffdf625a5d0 4000254 60 ...nt.Impl.ModelBase 0 instance 000000d79a3c3860 m_model0
00007ffe53ff9a38 4000255 c3 System.Boolean 1 instance 1 m_running
00007ffdf6257a48 4000256 68 ....Impl.MainSession 0 instance 000000d79a3c35f0 m_session0
00007ffdf6259050 4000257 70 ...pl.SessionManager 0 instance 000000d79a3c6018 m_sessionManager
00007ffdf6258068 4000258 78 ...RabbitMQ.Client]] 0 instance 000000d79a3c3370 m_shutdownReport
00007ffe53fd5570 4000259 c0 System.UInt16 1 instance 60 m_heartbeat
00007ffe53ff3f28 400025a d8 System.TimeSpan 1 instance 000000d79a3c3290 m_heartbeatTimeSpan
00007ffe53fd9428 400025b b8 System.Int32 1 instance 0 m_missedHeartbeats
00007ffe53ff4248 400025c 80 ...m.Threading.Timer 0 instance 000000d79a3c61f0 _heartbeatWriteTimer
00007ffe53ff4248 400025d 88 ...m.Threading.Timer 0 instance 000000d79a3c6428 _heartbeatReadTimer
00007ffe53ff4de8 400025e 90 ...ng.AutoResetEvent 0 instance 000000d79a3c33a8 m_heartbeatRead
00007ffe53ff4de8 400025f 98 ...ng.AutoResetEvent 0 instance 000000d79a3c33f8 m_heartbeatWrite
00007ffe53ff9a38 4000260 c4 System.Boolean 1 instance 0 m_inConnectionNegotiation
00007ffdf6258e90 4000261 a0 ...nsumerWorkService 0 instance 000000d79a3c3448 k__BackingField
00007ffe53ff21f0 4000262 bc System.UInt32 1 instance 131072 k__BackingField
00007ffdf625ace0 4000263 a8 ...AmqpTcpEndpoint[] 0 instance 0000000000000000 k__BackingField
00007ffe53954820 4000264 b0 ...bject, mscorlib]] 0 instance 000000d79a3c5468 k__BackingField
0:032> !do2 000000d79a3c31b8
0x000000d79a3c31b8 RabbitMQ.Client.Framing.Impl.Connection
0000 m_eventLock : 000000d79a3c32a0 (System.Object)
0008 m_heartbeatFrame : 000000d79a3c32b8 (RabbitMQ.Client.Impl.Frame)
0010 m_appContinuation : 000000d79a3c32f8 (System.Threading.ManualResetEvent)
0018 m_callbackException : NULL
0020 m_clientProperties : 000000d79a3c51d8 (System.Collections.Generic.Dictionary )
0028 m_closeReason : NULL
0030 m_connectionBlocked : NULL
0038 m_connectionShutdown : 000000d79a3c6750 (System.EventHandler )
0040 m_connectionUnblocked : NULL
0048 m_factory : 000000d79a3c2698 (RabbitMQ.Client.ConnectionFactory)
0050 m_frameHandler : 000000d79a3c2bc0 (RabbitMQ.Client.Impl.SocketFrameHandler)
0058 m_model0 : 000000d79a3c3860 (RabbitMQ.Client.Framing.Impl.Model)
0060 m_session0 : 000000d79a3c35f0 (RabbitMQ.Client.Impl.MainSession)
0068 m_sessionManager : 000000d79a3c6018 (RabbitMQ.Client.Impl.SessionManager)
0070 m_shutdownReport : 000000d79a3c3370 (RabbitMQ.Util.SynchronizedList )
0078 _heartbeatWriteTimer : 000000d79a3c61f0 (System.Threading.Timer)
0080 _heartbeatReadTimer : 000000d79a3c6428 (System.Threading.Timer)
0088 m_heartbeatRead : 000000d79a3c33a8 (System.Threading.AutoResetEvent)
0090 m_heartbeatWrite : 000000d79a3c33f8 (System.Threading.AutoResetEvent)
0098 k__BackingField : 000000d79a3c3448 (RabbitMQ.Client.ConsumerWorkService)
00a0 k__BackingField : NULL
00a8 k__BackingField : 000000d79a3c5468 (System.Collections.Generic.Dictionary )
00b0 m_missedHeartbeats : 0 (System.Int32)
00b4 k__BackingField : 131072 (System.UInt32)
00b8 m_heartbeat : 60 (System.UInt16)
00ba m_closed : False (System.Boolean)
00bb m_running : True (System.Boolean)
00bc m_inConnectionNegotiation : False (System.Boolean)
00c0 m_id : 000000d79a3c3280 1c8d8e96-0c6a-4c62-84b7-9feb241b67ad (System.Guid)
00d0 m_heartbeatTimeSpan : 000000d79a3c3290 00:00:15 (System.TimeSpan)
Other objects in memory can be inspected in the same way to help analyze the problem.
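To compare field values across several objects, the Fields table printed by !do can be turned into a name-to-value mapping. The sketch below reads each row from the right, because the Type column may be empty or contain spaces; `parse_do_fields` is a made-up helper, and only rows marked `instance` are handled.

```python
import re
from typing import Dict

HEX16 = re.compile(r"[0-9a-fA-F]{16}")

SAMPLE_DO = """\
MT Field Offset Type VT Attr Value Name
00007ffe53ff9a38 400024d c2 System.Boolean 1 instance 0 m_closed
00007ffe53ff9a38 4000255 c3 System.Boolean 1 instance 1 m_running
00007ffe53fd5570 4000259 c0 System.UInt16 1 instance 60 m_heartbeat
00007ffe53fd9428 400025b b8 System.Int32 1 instance 0 m_missedHeartbeats
0000000000000000 400024a 20 0 instance 0000000000000000 m_callbackException
00007ffe53954820 400024b 28 ...bject, mscorlib]] 0 instance 000000d79a3c51d8 m_clientProperties
"""


def parse_do_fields(text: str) -> Dict[str, str]:
    """Map field name -> value for instance fields in saved !do output."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        tokens = line.split()
        # A field row starts with the MethodTable address and ends with
        # "... <VT> instance <Value> <Name>"; the Type column in between
        # may be missing or contain spaces, so index from the right.
        if len(tokens) < 7 or not HEX16.fullmatch(tokens[0]):
            continue
        if tokens[-3] != "instance":
            continue
        fields[tokens[-1]] = tokens[-2]
    return fields


if __name__ == "__main__":
    for name, value in parse_do_fields(SAMPLE_DO).items():
        print(name, "=", value)
```

For the Connection object in this post, the mapping makes it easy to see at a glance that `m_running` is 1, `m_closed` is 0 and `m_missedHeartbeats` is 0 at the moment the exception was caught.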
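Once the rough cause has been found, disable exception breaking and leave the debugger. Be careful here: if you simply close WinDbg, the attached process is forcibly terminated. Use the following commands instead, where qd quits the debugger and detaches so that the process keeps running:
6. Disable breaking on CLR exceptions: sxd clr
sxd clr
7. Quit and detach: qd
qd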
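That is live debugging of an attached process. To summarize the routine:
- F6: attach to the process
- Load SOS: .loadby sos clr
- Enable breaking on CLR exceptions: sxe clr
- Resume, then catch and inspect the exception: g, !pe
- View the stack of the throwing thread with !clrstack; view its parameters and locals with !clrstack -a / !dso
- Disable breaking on CLR exceptions: sxd clr
- Quit and detach: qd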
Hope you find it useful.
周国庆
2018/11/4