Author: 破砂锅
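The open-source GDB is widely used on Linux, OS X, Unix, and all kinds of embedded systems (mobile phones, for example), and this time it brings us another pleasant surprise.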
The pain of multithreaded debugging
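Debuggers such as VS2008 and older versions of GDB often support only all-stop mode: when you debug a multithreaded program and one thread hits a breakpoint, the debugger freezes the whole program, and the other threads resume only when you continue that thread. Because of this limitation, the program being debugged cannot behave as it would in a real environment, where the other threads keep running in parallel while one thread is stopped at a breakpoint.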
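The non-stop mode introduced in GDB v7.0 solves this problem neatly. In this mode,
- when one or more threads are stopped at a breakpoint, the other threads keep running in parallel
- you can select any stopped thread and let it continue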
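Imagine what this feature makes possible:
- while other threads are stopped at a breakpoint, the program's timer thread can keep running normally, which avoids unnecessary timeouts
- while other threads are stopped at a breakpoint, the program's watchdog thread can keep running normally, so embedded hardware does not mistake the pause for a system crash and reboot
- you can control the order in which threads run, and so reproduce deadlock scenarios. Because GDB debugging can be driven by Python scripts, in theory you could test a program automatically under different thread execution orders.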
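To make the deadlock point concrete, here is a minimal sketch of a lock-order inversion, the kind of bug that only shows up under one particular thread interleaving. It is not from the original demo: it assumes a C++11 compiler with `std::thread` (built with `-pthread`), and every name in it is hypothetical. The handshakes force in code the same ordering you would force by hand in non-stop mode, by holding one thread at a breakpoint after it takes its first lock and letting the other one run. `try_lock()` stands in for the `lock()` call that would hang in a real program, so the sketch can report the inversion instead of deadlocking.

```cpp
#include <cassert>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical helper: returns true when each thread finds the lock it
// wants next already held by the other thread (the deadlock condition).
bool lock_order_inversion_detected()
{
    std::mutex a, b;
    std::promise<void> t1_has_a, t2_has_b, t1_tried, t2_tried;
    // Fetch the futures before any thread starts, so the promise objects
    // are never touched by get_future() and set_value() concurrently.
    std::future<void> a_held = t1_has_a.get_future();
    std::future<void> b_held = t2_has_b.get_future();
    std::future<void> t1_done = t1_tried.get_future();
    std::future<void> t2_done = t2_tried.get_future();
    bool t1_got_b = true, t2_got_a = true;

    std::thread t1([&] {
        std::lock_guard<std::mutex> hold_a(a);  // t1 takes A first
        t1_has_a.set_value();
        b_held.wait();               // "breakpoint": wait until t2 owns B
        t1_got_b = b.try_lock();     // a plain lock() would block forever
        if (t1_got_b) b.unlock();
        t1_tried.set_value();
        t2_done.wait();              // keep A until t2 has made its attempt
    });
    std::thread t2([&] {
        std::lock_guard<std::mutex> hold_b(b);  // t2 takes B first
        t2_has_b.set_value();
        a_held.wait();               // wait until t1 owns A
        t2_got_a = a.try_lock();     // opposite order: B, then A
        if (t2_got_a) a.unlock();
        t2_tried.set_value();
        t1_done.wait();              // keep B until t1 has made its attempt
    });
    t1.join();
    t2.join();
    return !t1_got_b && !t2_got_a;
}
```

Calling `lock_order_inversion_detected()` from `main` should return true on every run, because the handshakes leave no other interleaving. In a real program the ordering is left to the scheduler, which is why being able to stop and resume individual threads from the debugger helps.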
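Non-stop mode therefore deserves to be called the "killer move" of multithreaded debugging. Linux distributions released from the second half of 2009 onward ship GDB v7.0 or later. I am curious whether VS2010 supports a similar debugging mode.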
Demonstrating GDB's non-stop mode
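Let 破砂锅 demo this killer move with a small C++ program on Ubuntu Linux 9.10. My demo uses command-line gdb, but if you prefer a graphical debugger, Eclipse versions released after May 2009 can use this feature easily; for details see http://live.eclipse.org/node/723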
1. Compile the following program, nonstop
// gdb non-stop mode demo
// build instruction: g++ -g -o nonstop nonstop.cpp -lboost_thread

#include <iostream>
#include <boost/thread/thread.hpp>

struct op
{
    op(int id): m_id(id) {}

    void operator()()
    {
        std::cout << m_id << " begin" << std::endl;
        std::cout << m_id << " end" << std::endl;   // line 14: breakpoint in each worker
    }

    int m_id;
};

int main(int argc, char** argv)
{
    boost::thread t1(op(1)), t2(op(2)), t3(op(3));   // start three worker threads
    t1.join(); t2.join(); t3.join();
    return 0;                                        // line 24: breakpoint in the main thread
}
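The demo above uses Boost.Thread. For readers without Boost installed, the following is a rough port to C++11 `std::thread`, assuming a build along the lines of `g++ -std=c++11 -g -pthread`. The `g_finished` counter and the `run_workers` helper are hypothetical additions that make the result easy to check; they are not part of the original program. The `break 14` and `break 24` commands in the walkthrough refer to the original file, so adjust the line numbers if you debug this variant.

```cpp
#include <atomic>
#include <cassert>
#include <iostream>
#include <thread>

std::atomic<int> g_finished(0);  // hypothetical counter, not in the original demo

struct op
{
    explicit op(int id) : m_id(id) {}

    void operator()() const
    {
        std::cout << m_id << " begin" << std::endl;
        std::cout << m_id << " end" << std::endl;  // counterpart of line 14 in the original
        ++g_finished;                              // record that this worker completed
    }

    int m_id;
};

// Hypothetical helper: starts the three workers, joins them, and returns
// how many of them ran to completion.
int run_workers()
{
    g_finished = 0;
    std::thread t1(op(1)), t2(op(2)), t3(op(3));
    t1.join(); t2.join(); t3.join();
    return g_finished.load();
}
```

A `main` that simply calls `run_workers()` and returns 0 mirrors the original program.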
2. Add the following 3 lines to ~/.gdbinit to turn on non-stop mode
set target-async 1
set pagination off
set non-stop on
3. Start gdb, set the breakpoints, and run. Main thread 1 is running and all three child threads are stopped at the breakpoint, rather than just one of them.
~/devroot/nonstop$ gdb ./nonstop
GNU gdb (GDB) 7.0-ubuntu
Reading symbols from /home/frankwu/devroot/nonstop/nonstop...done.
(gdb) break 14
Breakpoint 1 at 0x402058: file nonstop.cpp, line 14.
(gdb) break 24
Breakpoint 3 at 0x401805: file nonstop.cpp, line 24.
(gdb) run
Starting program: /home/frankwu/devroot/nonstop/nonstop
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff6c89910 (LWP 2762)]
[New Thread 0x7ffff6488910 (LWP 2763)]
1 begin
Breakpoint 1, op::operator() (this=0x605118) at nonstop.cpp:14
14          std::cout << m_id << " end" << std::endl;
2 begin
Breakpoint 1, op::operator() (this=0x605388) at nonstop.cpp:14
14          std::cout << m_id << " end" << std::endl;
[New Thread 0x7ffff5c87910 (LWP 2764)]
3 begin
Breakpoint 1, op::operator() (this=0x605618) at nonstop.cpp:14
14          std::cout << m_id << " end" << std::endl;
(gdb) info threads
  4 Thread 0x7ffff5c87910 (LWP 2764) op::operator() (this=0x605618) at nonstop.cpp:14
  3 Thread 0x7ffff6488910 (LWP 2763) op::operator() (this=0x605388) at nonstop.cpp:14
  2 Thread 0x7ffff6c89910 (LWP 2762) op::operator() (this=0x605118) at nonstop.cpp:14
* 1 Thread 0x7ffff7fe3710 (LWP 2759) (running)
4. Let thread 3 continue. Note that I deliberately continue main thread 1 as well; this is a workaround I found, since otherwise gdb cannot switch back to thread 1.
(gdb) thread apply 3 1 continue
Thread 3 (Thread 0x7ffff6488910 (LWP 2763)):
Continuing.
Thread 1 (Thread 0x7ffff7fe3710 (LWP 2759)):
Continuing.
Cannot execute this command while the selected thread is running.
2 end
[Thread 0x7ffff6488910 (LWP 2763) exited]
warning: Unknown thread 3.
Thread 1 (Thread 0x7ffff7fe3710 (LWP 2759)):
Continuing.
Cannot execute this command while the selected thread is running.
(gdb) info threads
  4 Thread 0x7ffff5c87910 (LWP 2764) op::operator() (this=0x605618) at nonstop.cpp:14
  2 Thread 0x7ffff6c89910 (LWP 2762) op::operator() (this=0x605118) at nonstop.cpp:14
* 1 Thread 0x7ffff7fe3710 (LWP 2759) (running)
5. Let the other two threads continue and run to completion; the main thread then stops at line 24 and finally exits.
(gdb) thread apply 4 2 1 continue
Thread 4 (Thread 0x7ffff5c87910 (LWP 2764)):
Continuing.
Thread 2 (Thread 0x7ffff6c89910 (LWP 2762)):
Continuing.
Thread 1 (Thread 0x7ffff7fe3710 (LWP 2759)):
Continuing.
Cannot execute this command while the selected thread is running.
3 end
1 end
[Thread 0x7ffff5c87910 (LWP 2764) exited]
[Thread 0x7ffff6c89910 (LWP 2762) exited]
Breakpoint 3, main (argc=1, argv=0x7fffffffe348) at nonstop.cpp:24
24          return 0;
(gdb) continue
Thread 1 (Thread 0x7ffff7fe3710 (LWP 2759)):
Continuing.
Program exited normally.
References
Debugging with GDB
Reverse Debugging, Multi-Process and Non-Stop Debugging Come to the CDT