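A safepoint is a position in Java code at which a thread may pause execution. At a safepoint, runtime information is available that is not available at other positions: everything in the thread's context is recorded, including objects and internal pointers to objects or non-objects.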
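What can be done while the JVM is at a safepoint?
A comment in the JVM source notes that when the system is brought to a safepoint, Java threads in different states are paused by different mechanisms:
How do Java threads enter a safepoint?
A safepoint is usually in one of three states: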
When running a program, you can set the JVM options -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1 to print safepoint statistics.
When the JVM needs to pause threads for a stop-the-world (STW) operation, it calls SafepointSynchronize::begin, which is defined in safepoint.cpp.
void SafepointSynchronize::begin() {
  Thread* myThread = Thread::current();
  assert(myThread->is_VM_thread(), "Only VM thread may execute a safepoint");
  if (PrintSafepointStatistics || PrintSafepointStatisticsTimeout > 0) { // print statistics
    _safepoint_begin_time = os::javaTimeNanos();
    _ts_of_current_safepoint = tty->time_stamp().seconds();
  }
  // ... (rest of the method omitted)
}
A comment in safepoint.cpp describes how threads in different states are brought to a safepoint:
// Begin the process of bringing the system to a safepoint.
// Java threads can be in several different states and are
// stopped by different mechanisms:
//
// 1. Running interpreted
// The interpeter dispatch table is changed to force it to
// check for a safepoint condition between bytecodes.
// 2. Running in native code
// When returning from the native code, a Java thread must check
// the safepoint _state to see if we must block. If the
// VM thread sees a Java thread in native, it does
// not wait for this thread to block. The order of the memory
// writes and reads of both the safepoint state and the Java
// threads state is critical. In order to guarantee that the
// memory writes are serialized with respect to each other,
// the VM thread issues a memory barrier instruction
// (on MP systems). In order to avoid the overhead of issuing
// a memory barrier for each Java thread making native calls, each Java
// thread performs a write to a single memory page after changing
// the thread state. The VM thread performs a sequence of
// mprotect OS calls which forces all previous writes from all
// Java threads to be serialized. This is done in the
// os::serialize_thread_states() call. This has proven to be
// much more efficient than executing a membar instruction
// on every call to native code.
// 3. Running compiled Code
// Compiled code reads a global (Safepoint Polling) page that
// is set to fault if we are trying to get to a safepoint.
// 4. Blocked
// A thread which is blocked will not be allowed to return from the
// block condition until the safepoint operation is complete.
// 5. In VM or Transitioning between states
// If a Java thread is currently running in the VM or transitioning
// between states, the safepointing code will wait for the thread to
// block itself when it attempts transitions to a new state.
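A thread that is interpreting bytecode is brought to a safepoint by swapping the dispatch table, which is the table the JVM uses to record the addresses it jumps to. The interpreter keeps three DispatchTable instances: _active_table, _normal_table, and _safept_table.
The code that modifies the dispatch table is shown below. On entering a safepoint, notice_safepoints is called and copies _safept_table into _active_table. Its counterpart, ignore_safepoints, is called on leaving the safepoint and copies _normal_table back into _active_table.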
DispatchTable TemplateInterpreter::_active_table;
DispatchTable TemplateInterpreter::_normal_table;
DispatchTable TemplateInterpreter::_safept_table;

void TemplateInterpreter::notice_safepoints() {
  if (!_notice_safepoints) {
    // switch to safepoint dispatch table
    _notice_safepoints = true;
    copy_table((address*)&_safept_table, (address*)&_active_table, sizeof(_active_table) / sizeof(address));
  }
}

void TemplateInterpreter::ignore_safepoints() {
  if (_notice_safepoints) {
    if (!JvmtiExport::should_post_single_step()) {
      // switch to normal dispatch table
      _notice_safepoints = false;
      copy_table((address*)&_normal_table, (address*)&_active_table, sizeof(_active_table) / sizeof(address));
    }
  }
}
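The table swap can be illustrated with a toy interpreter. This is only a sketch under simplifying assumptions: ToyInterpreter, its two bytecodes, and the handler functions are hypothetical names invented for illustration, and the "safepoint check" is reduced to incrementing a counter. It is not how HotSpot's template interpreter is implemented; it only shows the idea of dispatching through an active table that can be replaced between bytecodes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy interpreter; none of these names come from HotSpot.
enum Bytecode { PUSH1 = 0, ADD = 1 };
constexpr std::size_t kNumBytecodes = 2;

struct ToyInterpreter;
using Handler = void (*)(ToyInterpreter&);
using DispatchTable = std::array<Handler, kNumBytecodes>;

struct ToyInterpreter {
    std::vector<int> stack;
    int safepoint_checks = 0;      // how many times a safepoint entry ran
    DispatchTable normal_table{};  // plain bytecode handlers
    DispatchTable safept_table{};  // handlers that check for a safepoint first
    DispatchTable active_table{};  // the table the dispatch loop actually reads
    bool noticing = false;

    ToyInterpreter();

    // Switch to the safepoint table (idempotent, like the flag in the excerpt).
    void notice_safepoints() {
        if (!noticing) { noticing = true; active_table = safept_table; }
    }
    // Switch back to the normal table.
    void ignore_safepoints() {
        if (noticing) { noticing = false; active_table = normal_table; }
    }
    // Dispatch every bytecode through whatever table is currently active.
    void run(const std::vector<Bytecode>& code) {
        for (Bytecode bc : code) active_table[bc](*this);
    }
};

static void do_push1(ToyInterpreter& in) { in.stack.push_back(1); }

static void do_add(ToyInterpreter& in) {
    assert(in.stack.size() >= 2);
    int b = in.stack.back(); in.stack.pop_back();
    int a = in.stack.back(); in.stack.pop_back();
    in.stack.push_back(a + b);
}

// Safepoint variants: perform the (simulated) check, then the normal work.
static void safept_push1(ToyInterpreter& in) { ++in.safepoint_checks; do_push1(in); }
static void safept_add(ToyInterpreter& in)   { ++in.safepoint_checks; do_add(in); }

ToyInterpreter::ToyInterpreter() {
    normal_table = {do_push1, do_add};
    safept_table = {safept_push1, safept_add};
    active_table = normal_table;
}
```

The point of the design is that the dispatch loop itself never tests a flag: in normal operation it pays nothing for safepoint support, and the check only appears once the active table has been swapped.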
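For a thread that is executing native code, the VM thread does not need to wait for it to finish. When the thread returns to Java code, it checks the safepoint state. The check lives in check_safepoint_and_suspend_for_native_trans in thread.cpp, which calls SafepointSynchronize::do_call_back; if the current state is not _not_synchronized, the thread blocks.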
// in thread.cpp
void JavaThread::check_safepoint_and_suspend_for_native_trans(JavaThread *thread) {
  // ... (earlier code omitted; curJT is defined there)
  if (SafepointSynchronize::do_call_back()) {
    // If we are safepointing, then block the caller which may not be
    // the same as the target thread (see above).
    SafepointSynchronize::block(curJT);
  }
}

// in safepoint.cpp
inline static bool do_call_back() {
  return (_state != _not_synchronized);
}
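The "check on return from native" idea can be modeled with standard C++ threading primitives. The following is a minimal sketch, not HotSpot code: ToySafepoint, check_on_native_return, and blocked_count are hypothetical names, the state enum is an assumption modeled on the _not_synchronized value in the excerpt, and a mutex plus condition variable stand in for HotSpot's own blocking and memory-ordering machinery.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical model; state names are assumptions based on the excerpt.
enum SyncState { not_synchronized, synchronizing, synchronized };

class ToySafepoint {
public:
    // Mirrors the excerpt: true whenever a safepoint is pending or active.
    bool do_call_back() const { return state_.load() != not_synchronized; }

    // "VM thread" side: request a safepoint.
    void begin() { state_.store(synchronized); }

    // "VM thread" side: end the safepoint and wake every blocked thread.
    void end() {
        assert(state_.load() != not_synchronized);
        {
            std::lock_guard<std::mutex> lock(mu_);
            state_.store(not_synchronized);
        }
        cv_.notify_all();
    }

    // Called by a thread returning from native code.
    // Returns true if a safepoint was in progress and the thread had to wait.
    bool check_on_native_return() {
        if (!do_call_back()) return false;
        blocked_.fetch_add(1);
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return state_.load() == not_synchronized; });
        return true;
    }

    // Number of threads that have reached the blocking path so far.
    int blocked_count() const { return blocked_.load(); }

private:
    std::atomic<SyncState> state_{not_synchronized};
    std::atomic<int> blocked_{0};
    std::mutex mu_;
    std::condition_variable cv_;
};
```

In this model the native code itself is never interrupted; the returning thread only pays for the check at the transition, which matches the behavior described above where the VM thread does not wait for threads that are in native code.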
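A thread that is executing JIT-compiled code decides whether to enter a safepoint by reading the polling page and seeing whether that read succeeds. SafepointSynchronize::begin calls make_polling_page_unreadable, which ultimately uses mprotect to change the access permissions of the polling page.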
// in safepoint.cpp
void SafepointSynchronize::begin() {
  // ...
  if (UseCompilerSafepoints && int(iterations) == DeferPollingPageLoopCount) {
    guarantee (PageArmed == 0, "invariant") ;
    PageArmed = 1 ;
    os::make_polling_page_unreadable();
  }
  // ...
}

// in os_bsd.cpp
void os::make_polling_page_unreadable(void) {
  if( !guard_memory((char*)_polling_page, Bsd::page_size()) )
    fatal("Could not disable polling page");
}

bool os::guard_memory(char* addr, size_t size) {
  return bsd_mprotect(addr, size, PROT_NONE);
}

static bool bsd_mprotect(char* addr, size_t size, int prot) {
  char* bottom = (char*)align_size_down((intptr_t)addr, os::Bsd::page_size());
  assert(addr == bottom, "sanity check");
  size = align_size_up(pointer_delta(addr, bottom, 1) + size, os::Bsd::page_size());
  return ::mprotect(bottom, size, prot) == 0;
}
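When compiled code reads the now unreadable polling page, a SIGSEGV signal is raised. After the signal is handled, SafepointSynchronize::handle_polling_page_exception is eventually called, and that method in turn calls SafepointSynchronize::block to block the thread.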
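The polling-page mechanism can be reproduced in miniature on a POSIX system. The sketch below is single-threaded and makes several simplifying assumptions: all function and variable names are hypothetical, the handler does not block the thread as HotSpot does but merely counts the fault and makes the page readable again so the faulting load can be retried, and it calls mprotect from a signal handler, which works on common platforms but is not on the POSIX list of async-signal-safe functions.

```cpp
#include <cassert>
#include <cstddef>
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical names throughout; a POSIX-only, single-threaded sketch.
static char* g_polling_page = nullptr;
static std::size_t g_page_size = 0;
static volatile sig_atomic_t g_poll_faults = 0;

// Fault handler: treat faults on the polling page as "safepoint requested".
static void on_fault(int, siginfo_t* info, void*) {
    char* addr = static_cast<char*>(info->si_addr);
    if (addr < g_polling_page || addr >= g_polling_page + g_page_size) {
        _exit(1);  // a genuine crash, not a safepoint poll
    }
    g_poll_faults = g_poll_faults + 1;
    // HotSpot would block the thread here until the safepoint ends.
    // The sketch only disarms the page so the faulting load can be retried.
    mprotect(g_polling_page, g_page_size, PROT_READ);
}

// Map one readable page and install the fault handler.
bool init_polling_page() {
    g_page_size = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    void* p = mmap(nullptr, g_page_size, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return false;
    g_polling_page = static_cast<char*>(p);

    struct sigaction sa = {};
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    // Some systems report this kind of fault as SIGBUS rather than SIGSEGV.
    return sigaction(SIGSEGV, &sa, nullptr) == 0 &&
           sigaction(SIGBUS, &sa, nullptr) == 0;
}

// "VM thread" side: make the page unreadable to request a safepoint.
bool arm_polling_page() {
    assert(g_polling_page != nullptr);
    return mprotect(g_polling_page, g_page_size, PROT_NONE) == 0;
}

// What compiled code does at a poll site: a plain load from the page.
void safepoint_poll() {
    volatile char* p = g_polling_page;
    char c = *p;
    (void)c;
}

int poll_faults() { return g_poll_faults; }
```

A caller would run init_polling_page once, call safepoint_poll at each poll site, and call arm_polling_page when a safepoint is wanted. While the page is readable the poll is just an ordinary load, which is why this scheme is cheap for compiled code.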