How does Chrome catch program exceptions?
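When a C++ program hits an exception such as a memory access violation, the CPU hardware detects the fault and raises an exception (you can think of it as an interrupt), then transfers control to the exception-handling service routine. The operating system's routine first checks whether the current process is being debugged. If it is, the debugger is notified of the exception. If it is not, or the debugger declines the exception, the operating system checks whether the current thread has installed a chain of exception frames. If SEH handlers are installed (__try ... __except), they are called, and their return values decide whether a global or a local unwind is performed. If no SEH handler in the chain handles the exception and the process is being debugged, the operating system notifies the debugger again (the second-chance exception). If still nobody handles it, the operating system calls its default handler, UnhandledExceptionFilter. The operating system lets you hook this step by installing your own filter with SetUnhandledExceptionFilter, and most exceptions can be caught this way.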
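The dispatch order in the paragraph above can be sketched as a small portable model. This is only a toy illustration of the order as described here, not the Windows implementation: ProcessModel, Dispatch and all member names are hypothetical, and no Windows API is called.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy model of the dispatch order described in the text.
// Every name here is hypothetical; nothing is a real Windows API.
struct ProcessModel {
  bool debugger_attached = false;
  // Returns true if the debugger handles the exception at this chance.
  std::function<bool(bool first_chance)> debugger;
  // Frame-based handlers, innermost first; true means "handled".
  std::vector<std::function<bool()>> seh_frames;
  // Stand-in for the filter installed with SetUnhandledExceptionFilter.
  std::function<bool()> unhandled_filter;
};

// Returns a label naming who ended up handling the exception.
std::string Dispatch(const ProcessModel& p) {
  // 1. First chance: an attached debugger is asked first.
  if (p.debugger_attached && p.debugger && p.debugger(true))
    return "debugger (first chance)";
  // 2. Walk the thread's chain of exception frames.
  for (const auto& frame : p.seh_frames)
    if (frame()) return "SEH frame";
  // 3. Second chance: the debugger is asked again.
  if (p.debugger_attached && p.debugger && p.debugger(false))
    return "debugger (second chance)";
  // 4. The application's top-level filter, if one was installed.
  if (p.unhandled_filter && p.unhandled_filter())
    return "unhandled exception filter";
  // 5. Nobody handled it.
  return "default OS handling";
}
```

With no debugger attached and no frame willing to handle the exception, Dispatch falls through to the unhandled filter, which is the hook a crash reporter relies on.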
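Starting with Visual C++ 2005, however, Microsoft changed some security-related code in the CRT (C runtime library), most notably by adding buffer-overrun checks. When such an error occurs, the new CRT forces the exception to the default debugger (Dr. Watson, if none is configured) and no longer notifies the exception filter installed by the application. This mainly happens in the following two cases.
(1) An _invalid_parameter error occurs and the application has not called _set_invalid_parameter_handler to install its own handler.
(2) A pure virtual function call error occurs and the application has not called _set_purecall_handler to install its own handler.
Chrome handles both cases specially by installing two dedicated callbacks to catch them.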
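As a rough illustration of the three hooks mentioned so far, the sketch below installs an unhandled exception filter, an invalid parameter handler and a purecall handler. It is a minimal sketch, not Chrome's code: the handler bodies and the names OnUnhandledException, OnInvalidParameter, OnPureCall, InstallCrashHandlers and kHandlersSupported are hypothetical, and the Windows part only compiles with MSVC (elsewhere the function is a stub that returns false).

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

#if defined(_MSC_VER)
#include <windows.h>

// Hypothetical handlers: a real crash reporter would write a minidump here.
static LONG WINAPI OnUnhandledException(EXCEPTION_POINTERS* /*info*/) {
  return EXCEPTION_EXECUTE_HANDLER;  // let the process terminate
}

static void OnInvalidParameter(const wchar_t* /*expression*/,
                               const wchar_t* /*function*/,
                               const wchar_t* /*file*/,
                               unsigned int /*line*/,
                               uintptr_t /*reserved*/) {
  std::abort();
}

static void OnPureCall() { std::abort(); }

constexpr bool kHandlersSupported = true;

// Installs all three hooks; each call returns the previously installed
// handler, which this sketch ignores.
bool InstallCrashHandlers() {
  SetUnhandledExceptionFilter(OnUnhandledException);
  _set_invalid_parameter_handler(OnInvalidParameter);
  _set_purecall_handler(OnPureCall);
  return true;
}
#else
constexpr bool kHandlersSupported = false;

// Stub for non-MSVC builds so the sketch still compiles.
bool InstallCrashHandlers() { return false; }
#endif
```

A program would call InstallCrashHandlers() once, as early as possible in startup, so that crashes during initialization are also covered.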
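Main flow of Chrome's crash report
Chrome supports two dump modes.
Out-of-process dump: a separate crash handler process generates the dump. When the main process hits an exception, it notifies the crash handler process over IPC, and the crash_generation_server inside that process writes the dump file. The flow is roughly as follows:
In this flow, crash_generation_client and crash_generation_server communicate through inter-process communication (IPC). crash_report_sender sends the dump to Google's crash report server (https://clients2.google.com/cr/report).
In-process dump: similar to the out-of-process mode, except that the Browser process runs an extra thread, crash_handle_thread, which writes the dump. The basic flow is as follows:
Implementation of crash_generation_client
Key synchronization handles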
HANDLE server_alive_;
Indicates whether the crash handler process is still alive.
HANDLE crash_event_;
Event that signals that an exception has occurred in crash_generation_client. After the IPC channel between crash_generation_client and crash_generation_server is set up, crash_generation_server waits on this event.
HANDLE crash_generated_;
Event that signals that crash_generation_server has finished writing the dump file. crash_generation_server sets it once the dump file is written.
Key variables
CustomClientInfo custom_info_;
Describes the process in which the exception occurred, which may be the Browser process or a Render process.
EXCEPTION_POINTERS* exception_pointers_;
When an exception occurs, all exception information is stored in the memory this pointer refers to.
MDRawAssertionInfo assert_info_;
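Information about an assertion failure.
When crash_generation_client initializes, it registers with crash_generation_server, sets up the IPC channel, and sends the addresses of the variables above to crash_generation_server. When an exception later occurs in crash_generation_client, crash_generation_server reads the information from those addresses and generates the dump file. (This is the out-of-process mode; in the in-process mode a separate thread inside the browser process does this work.)
A key function
The key function is SignalCrashEventAndWait: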
bool CrashGenerationClient::SignalCrashEventAndWait() {
  assert(crash_event_);
  assert(crash_generated_);
  assert(server_alive_);

  // Reset the dump generated event before signaling the crash
  // event so that the server can set the dump generated event
  // once it is done generating the event.
  if (!ResetEvent(crash_generated_)) {
    return false;
  }

  if (!SetEvent(crash_event_)) {
    return false;
  }

  HANDLE wait_handles[kWaitEventCount] = {crash_generated_, server_alive_};

  DWORD result = WaitForMultipleObjects(kWaitEventCount,
                                        wait_handles,
                                        FALSE,
                                        kWaitForServerTimeoutMs);

  // Crash dump was successfully generated only if the server
  // signaled the crash generated event.
  return result == WAIT_OBJECT_0;
}
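This function shows how crash_generation_client interacts with the server when an exception occurs. Most of it was already covered in the description of the variables above.
How crash_generation_client catches exceptions
The principle was described at the beginning of this article. Now let's look at the implementation.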
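The reset, signal, wait handshake can be tried out without Windows using standard C++ primitives. The sketch below is a portable model under stated assumptions: Event stands in for a Windows manual-reset event, and Channel, SignalCrashAndWait, ServeOneCrash and RunHandshake are hypothetical names. The server_alive_ handle is left out, so a missing server shows up only as a timeout.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Portable stand-in for a Windows manual-reset event.
class Event {
 public:
  void Set() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      signaled_ = true;
    }
    cv_.notify_all();
  }
  void Reset() {
    std::lock_guard<std::mutex> lock(mu_);
    signaled_ = false;
  }
  // Returns true if the event became signaled before the timeout.
  bool WaitFor(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mu_);
    return cv_.wait_for(lock, timeout, [this] { return signaled_; });
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  bool signaled_ = false;
};

struct Channel {
  Event crash_event;      // client -> server: "I crashed"
  Event crash_generated;  // server -> client: "dump written"
};

// Client side: same order of operations as SignalCrashEventAndWait.
bool SignalCrashAndWait(Channel& ch, std::chrono::milliseconds timeout) {
  ch.crash_generated.Reset();  // clear any stale "done" signal first
  ch.crash_event.Set();        // tell the server a crash happened
  return ch.crash_generated.WaitFor(timeout);
}

// Server side: wait for a crash, "write" the dump, report completion.
void ServeOneCrash(Channel& ch, std::chrono::milliseconds timeout) {
  if (ch.crash_event.WaitFor(timeout)) {
    // A real server would write the minidump here.
    ch.crash_generated.Set();
  }
}

// Runs one round trip and returns the client's result.
bool RunHandshake(bool start_server) {
  Channel ch;
  std::thread server;
  if (start_server) {
    server = std::thread(ServeOneCrash, std::ref(ch),
                         std::chrono::milliseconds(2000));
  }
  bool ok = SignalCrashAndWait(
      ch, std::chrono::milliseconds(start_server ? 2000 : 50));
  if (server.joinable()) server.join();
  return ok;
}
```

Resetting the "done" event before signaling the crash matters: because the server only sets it after seeing the crash event, the client can never mistake a leftover signal from an earlier dump for the completion of the current one.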
void ExceptionHandler::Initialize(const wstring& dump_path,
                                  FilterCallback filter,
                                  MinidumpCallback callback,
                                  void* callback_context,
                                  int handler_types,
                                  MINIDUMP_TYPE dump_type,
                                  const wchar_t* pipe_name,
                                  const CustomClientInfo* custom_info) {
  LONG instance_count = InterlockedIncrement(&instance_count_);
  filter_ = filter;
  callback_ = callback;
  callback_context_ = callback_context;
  dump_path_c_ = NULL;
  next_minidump_id_c_ = NULL;
  next_minidump_path_c_ = NULL;
  dbghelp_module_ = NULL;
  minidump_write_dump_ = NULL;
  dump_type_ = dump_type;
  rpcrt4_module_ = NULL;
  uuid_create_ = NULL;
  handler_types_ = handler_types;
  previous_filter_ = NULL;
#if _MSC_VER >= 1400  // MSVC 2005/8
  previous_iph_ = NULL;
#endif  // _MSC_VER >= 1400
  previous_pch_ = NULL;
  handler_thread_ = NULL;
  is_shutdown_ = false;
  handler_start_semaphore_ = NULL;
  handler_finish_semaphore_ = NULL;
  requesting_thread_id_ = 0;
  exception_info_ = NULL;
  assertion_ = NULL;
  handler_return_value_ = false;
  handle_debug_exceptions_ = false;

  // Attempt to use out-of-process if user has specified pipe name.
  if (pipe_name != NULL) {
    scoped_ptr<CrashGenerationClient> client(
        new CrashGenerationClient(pipe_name,
                                  dump_type_,
                                  custom_info));

    // If successful in registering with the monitoring process,
    // there is no need to setup in-process crash generation.
    if (client->Register()) {
      crash_generation_client_.reset(client.release());
    }
  }

  if (!IsOutOfProcess()) {
    // Either client did not ask for out-of-process crash generation
    // or registration with the server process failed. In either case,
    // setup to do in-process crash generation.

    // Set synchronization primitives and the handler thread. Each
    // ExceptionHandler object gets its own handler thread because that's the
    // only way to reliably guarantee sufficient stack space in an exception,
    // and it allows an easy way to get a snapshot of the requesting thread's
    // context outside of an exception.
    InitializeCriticalSection(&handler_critical_section_);
    handler_start_semaphore_ = CreateSemaphore(NULL, 0, 1, NULL);
    assert(handler_start_semaphore_ != NULL);

    handler_finish_semaphore_ = CreateSemaphore(NULL, 0, 1, NULL);
    assert(handler_finish_semaphore_ != NULL);

    // Don't attempt to create the thread if we could not create the semaphores.
    if (handler_finish_semaphore_ != NULL && handler_start_semaphore_ != NULL) {
      DWORD thread_id;
      handler_thread_ = CreateThread(NULL,  // lpThreadAttributes
                                     kExceptionHandlerThreadInitialStackSize,
                                     ExceptionHandlerThreadMain,
                                     this,  // lpParameter
                                     0,     // dwCreationFlags
                                     &thread_id);
      assert(handler_thread_ != NULL);
    }

    dbghelp_module_ = LoadLibrary(L"dbghelp.dll");
    if (dbghelp_module_) {
      minidump_write_dump_ = reinterpret_cast<MiniDumpWriteDump_type>(
          GetProcAddress(dbghelp_module_, "MiniDumpWriteDump"));
    }

    // Load this library dynamically to not affect existing projects. Most
    // projects don't link against this directly, it's usually dynamically
    // loaded by dependent code.
    rpcrt4_module_ = LoadLibrary(L"rpcrt4.dll");
    if (rpcrt4_module_) {
      uuid_create_ = reinterpret_cast<UuidCreate_type>(
          GetProcAddress(rpcrt4_module_, "UuidCreate"));
    }

    // set_dump_path calls UpdateNextID. This sets up all of the path and id
    // strings, and their equivalent c_str pointers.
    set_dump_path(dump_path);
  }

  // There is a race condition here. If the first instance has not yet
  // initialized the critical section, the second (and later) instances may
  // try to use uninitialized critical section object. The feature of multiple
  // instances in one module is not used much, so leave it as is for now.
  // One way to solve this in the current design (that is, keeping the static
  // handler stack) is to use spin locks with volatile bools to synchronize
  // the handler stack. This works only if the compiler guarantees to generate
  // cache coherent code for volatile.
  // TODO(munjal): Fix this in a better way by changing the design if possible.

  // Lazy initialization of the handler_stack_critical_section_
  if (instance_count == 1) {
    InitializeCriticalSection(&handler_stack_critical_section_);
  }

  if (handler_types != HANDLER_NONE) {
    EnterCriticalSection(&handler_stack_critical_section_);

    // The first time an ExceptionHandler that installs a handler is
    // created, set up the handler stack.
    if (!handler_stack_) {
      handler_stack_ = new vector<ExceptionHandler*>();
    }
    handler_stack_->push_back(this);

    if (handler_types & HANDLER_EXCEPTION)
      previous_filter_ = SetUnhandledExceptionFilter(HandleException);

#if _MSC_VER >= 1400  // MSVC 2005/8
    if (handler_types & HANDLER_INVALID_PARAMETER)
      previous_iph_ = _set_invalid_parameter_handler(HandleInvalidParameter);
#endif  // _MSC_VER >= 1400

    if (handler_types & HANDLER_PURECALL)
      previous_pch_ = _set_purecall_handler(HandlePureVirtualCall);

    LeaveCriticalSection(&handler_stack_critical_section_);
  }
}
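Toward the end of this function, under the HANDLER_EXCEPTION check, SetUnhandledExceptionFilter is called to install the callback (HandleException) that handles the exception.
In addition, the invalid parameter and purecall cases, which since VC2005 no longer reach the unhandled exception filter, are handled specially through _set_invalid_parameter_handler and _set_purecall_handler.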
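The handler stack and the saved previous_filter_ suggest a save-and-restore pattern: each handler remembers what it replaced so it can be put back later. The sketch below models that idea in portable C++ and is not Breakpad's code. The global slot g_current_filter and the names Filter, SetFilter, ScopedHandler, FilterA and FilterB are hypothetical stand-ins; only HANDLER_NONE and HANDLER_EXCEPTION are taken from the listing above, and the sketch assumes handlers are destroyed in reverse order of creation.

```cpp
#include <cassert>
#include <vector>

using Filter = int (*)();  // hypothetical stand-in for a top-level filter

// Stand-in for the process-wide slot that SetUnhandledExceptionFilter swaps.
Filter g_current_filter = nullptr;

// Installs f and returns the previously installed filter.
Filter SetFilter(Filter f) {
  Filter previous = g_current_filter;
  g_current_filter = f;
  return previous;
}

enum HandlerType { HANDLER_NONE = 0, HANDLER_EXCEPTION = 1 << 0 };

class ScopedHandler {
 public:
  ScopedHandler(Filter filter, int handler_types)
      : handler_types_(handler_types) {
    if (handler_types_ & HANDLER_EXCEPTION) {
      stack().push_back(this);
      previous_ = SetFilter(filter);  // remember what we replaced
    }
  }
  ~ScopedHandler() {
    if (handler_types_ & HANDLER_EXCEPTION) {
      SetFilter(previous_);  // restore the filter we replaced
      stack().pop_back();    // assumes LIFO destruction order
    }
  }
  ScopedHandler(const ScopedHandler&) = delete;
  ScopedHandler& operator=(const ScopedHandler&) = delete;

  // Shared stack of all handlers that installed a filter.
  static std::vector<ScopedHandler*>& stack() {
    static std::vector<ScopedHandler*> s;
    return s;
  }

 private:
  int handler_types_;
  Filter previous_ = nullptr;
};

int FilterA() { return 1; }
int FilterB() { return 2; }
```

Creating a ScopedHandler with HANDLER_NONE leaves the installed filter untouched, which mirrors the `handler_types != HANDLER_NONE` check in Initialize.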
Implementation of crash_generation_server
crash_generation_server is essentially an IPC server that listens for requests from the crash_generation_client instances.
Its key function is a simple state machine:
void CrashGenerationServer::HandleConnectionRequest() {
  // If we are shutting down then get into ERROR state, reset the event so more
  // workers don't run and return immediately.
  if (shutting_down_) {
    server_state_ = IPC_SERVER_STATE_ERROR;
    ResetEvent(overlapped_.hEvent);
    return;
  }
  switch (server_state_) {
    case IPC_SERVER_STATE_ERROR:
      HandleErrorState();
      break;
    case IPC_SERVER_STATE_INITIAL:
      HandleInitialState();
      break;
    case IPC_SERVER_STATE_CONNECTING:
      HandleConnectingState();
      break;
    case IPC_SERVER_STATE_CONNECTED:
      HandleConnectedState();
      break;
    case IPC_SERVER_STATE_READING:
      HandleReadingState();
      break;
    case IPC_SERVER_STATE_READ_DONE:
      HandleReadDoneState();
      break;
    case IPC_SERVER_STATE_WRITING:
      HandleWritingState();
      break;
    case IPC_SERVER_STATE_WRITE_DONE:
      HandleWriteDoneState();
      break;
    case IPC_SERVER_STATE_READING_ACK:
      HandleReadingAckState();
      break;
    case IPC_SERVER_STATE_DISCONNECTING:
      HandleDisconnectingState();
      break;
    default:
      assert(false);
      // This indicates that we added one more state without
      // adding handling code.
      server_state_ = IPC_SERVER_STATE_ERROR;
      break;
  }
}
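This function tracks the state of the IPC connection and dispatches to the matching handler for each state. It is straightforward and needs no further explanation.
Implementation of crash_report_sender
The implementation is very simple. It simulates a form submission: the minidump is packaged as a MIME part and posted to the server over HTTP. Google's crash report server (https://clients2.google.com/cr/report) is presumably just a simple web handler script, and the upload can be treated as an ordinary form submission.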
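To make the state machine more concrete, the sketch below walks one connection through its states. The transition order is an assumption read off the state names (connect, read the request, write the reply, read the acknowledgement, disconnect), not something confirmed from the server source. IpcState, NextState and StepsUntilInitial are hypothetical names, and the error state is treated as terminal because its recovery is not modeled.

```cpp
#include <cassert>

enum class IpcState {
  kError,
  kInitial,
  kConnecting,
  kConnected,
  kReading,
  kReadDone,
  kWriting,
  kWriteDone,
  kReadingAck,
  kDisconnecting
};

// Happy-path successor of each state (assumed ordering).
IpcState NextState(IpcState s) {
  switch (s) {
    case IpcState::kInitial:       return IpcState::kConnecting;
    case IpcState::kConnecting:    return IpcState::kConnected;
    case IpcState::kConnected:     return IpcState::kReading;
    case IpcState::kReading:       return IpcState::kReadDone;
    case IpcState::kReadDone:      return IpcState::kWriting;
    case IpcState::kWriting:       return IpcState::kWriteDone;
    case IpcState::kWriteDone:     return IpcState::kReadingAck;
    case IpcState::kReadingAck:    return IpcState::kDisconnecting;
    case IpcState::kDisconnecting: return IpcState::kInitial;
    case IpcState::kError:         return IpcState::kError;  // terminal here
  }
  return IpcState::kError;  // unknown state: same fallback as the default case
}

// Counts transitions until the machine is back in kInitial.
// Returns -1 if it does not get there within max_steps.
int StepsUntilInitial(IpcState start, int max_steps = 32) {
  IpcState s = start;
  for (int steps = 1; steps <= max_steps; ++steps) {
    s = NextState(s);
    if (s == IpcState::kInitial) return steps;
  }
  return -1;
}
```

Under these assumptions one full client session takes nine transitions before the server is ready to accept the next connection.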
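The form-style upload can be illustrated by building a multipart/form-data body by hand. This is a minimal sketch rather than the sender's actual code: BuildMultipartBody is a hypothetical helper, and the field names, file name and boundary used with it (for example "prod" and "upload_file_minidump") are illustrative assumptions.

```cpp
#include <cassert>
#include <map>
#include <string>

// Builds a multipart/form-data body with text fields and one file part.
// The caller must pick a boundary that does not occur in the data.
std::string BuildMultipartBody(const std::map<std::string, std::string>& params,
                               const std::string& file_field,
                               const std::string& file_name,
                               const std::string& file_bytes,
                               const std::string& boundary) {
  std::string body;
  // One part per text field.
  for (const auto& kv : params) {
    body += "--" + boundary + "\r\n";
    body += "Content-Disposition: form-data; name=\"" + kv.first + "\"\r\n\r\n";
    body += kv.second + "\r\n";
  }
  // The file part carries the raw dump bytes.
  body += "--" + boundary + "\r\n";
  body += "Content-Disposition: form-data; name=\"" + file_field +
          "\"; filename=\"" + file_name + "\"\r\n";
  body += "Content-Type: application/octet-stream\r\n\r\n";
  body += file_bytes + "\r\n";
  // Closing delimiter.
  body += "--" + boundary + "--\r\n";
  return body;
}
```

The resulting string would be sent as the body of an HTTP POST whose Content-Type header is `multipart/form-data; boundary=<boundary>`, which is the same wire format a browser produces for a form with a file input.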
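How the Browser uses the crash report service
First, the crash handler process is a standalone program that listens for requests from the chrome processes.
Second, the Browser creates a crash_generation_client instance during initialization.
Chrome's main entry point contains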
// Initialize the crash reporter.
InitCrashReporterWithDllPath(dll_full_path);
This call creates a global variable:
g_breakpad = new google_breakpad::ExceptionHandler(temp_dir, NULL, callback,
NULL, google_breakpad::ExceptionHandler::HANDLER_ALL,
dump_type, pipe_name.c_str(), info->custom_info);
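The ExceptionHandler class contains a CrashGenerationClient instance.
Since the crash report service should start as early as possible, Chrome initializes this variable very early in startup.
Summary
Key points of Google's crash_report service:
1. A customizable minidump handling mechanism.
2. The out-of-process dump writing mechanism.
3. How Chrome catches exceptions.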