Prior to C++11, it was difficult to implement the singleton pattern in a way that is both thread-safe and efficient, let alone portable.
The cruel truth that the double-checked locking pattern (DCLP), once proposed as a way to implement singletons safely and efficiently, is broken should be familiar to all of us. Still, it is worth reviewing what makes DCLP vulnerable before we get to the game-changing improvements brought by C++11.
NOTE: a full account of why DCLP fails can be found in Meyers and Alexandrescu's paper, "C++ and the Perils of Double-Checked Locking".
The classic DCLP implementation is as follows:
C++
Singleton* Singleton::instance()
{
    if (pInstance == 0) { // 1st test
        Lock lock;
        if (pInstance == 0) { // 2nd test
            pInstance = new Singleton;
        }
    }
    return pInstance;
}
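The statement that constructs the singleton object (the one right after the second test) actually consists of three steps: (1) allocate memory to hold a Singleton object, (2) construct a Singleton object in the allocated memory, and (3) make pInstance point to the allocated memory.
Unfortunately, a compiler is allowed to swap steps 2 and 3 (this is called instruction reordering), producing code essentially equivalent to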
C++
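// pseudo-code
Singleton* Singleton::instance()
{
    if (pInstance == 0) { // 1st test
        Lock lock;
        if (pInstance == 0) { // 2nd test
            // allocate raw memory and publish the pointer right away
            pInstance = static_cast<Singleton*>(operator new(sizeof(Singleton)));
            // construct the object only afterwards
            new (pInstance) Singleton;
        }
    }
    return pInstance;
}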
After reordering, this function carries a real risk: another thread calling it may see a non-null pInstance at the first test and end up using a singleton object that has not been constructed yet.
Compilers rearrange statements or instructions to exploit CPU execution parallelism, to avoid spilling data from registers, and to keep the instruction pipeline full. They apply such optimizations only if doing so does not change the observable behavior of the program.
However, older versions of C++ had no notion of multithreading, so optimizers reasoned about observable behavior only in terms of a single thread.
On computers with multiple processors, the situation is even worse: instruction execution can also be reordered at the CPU level.
For example, suppose one processor modifies a shared variable x and then another shared variable y. It may be more efficient for the hardware to flush the value of y ahead of x, and if that happens, other processors may see y's value change before x's.
This possibility is a severe problem for a safe DCLP, because it can cause the same bug as reordering at the compiler level.
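To make the x/y scenario concrete, here is a minimal sketch written with the C++11 atomics that this post turns to later. The names `x`, `y`, `writer`, `reader` and `run_demo` are made up for illustration. The release store to `y` and the acquire load of `y` are what rule out the "y becomes visible before x" outcome; with two plain variables, nothing in the language would forbid it.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int x = 0;                // ordinary shared data
std::atomic<int> y{0};    // flag that is published after x

void writer()
{
    x = 42;                                  // modify x first
    y.store(1, std::memory_order_release);   // then y; the write to x cannot move below this store
}

int reader()
{
    while (y.load(std::memory_order_acquire) == 0) {
        std::this_thread::yield();           // wait until the change to y is visible
    }
    return x;   // the acquire load synchronizes with the release store, so x is 42 here
}

// Hypothetical driver: runs reader and writer on two threads and
// returns the value of x that the reader observed.
int run_demo()
{
    int seen = 0;
    std::thread r([&seen] { seen = reader(); });
    std::thread w(writer);
    w.join();
    r.join();
    assert(seen == 42);
    return seen;
}
```

If both memory orders were weakened to `std::memory_order_relaxed`, the language would no longer guarantee that the reader sees 42, which is the situation the paragraph above describes.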
Once again, older versions of C++ had no notion of multithreading.
The classic DCLP implementation fails because it tries to constrain instruction reordering using only language facilities, which turns out to be impossible.
If you want precise control over this reordering, you must go outside standard C++, either by writing assembly or by calling system APIs that are themselves implemented in assembly.
For example, a thread-safe, lazily initialized singleton in the spirit of DCLP can be implemented on Windows as follows (code excerpted from the base library in Chromium):
C++
OSInfo* OSInfo::GetInstance()
{
    static OSInfo* info = nullptr;
    if (!info) {
        OSInfo* new_info = new OSInfo();
        if (InterlockedCompareExchangePointer(
                reinterpret_cast<PVOID*>(&info), new_info, nullptr)) {
            delete new_info;
        }
    }
    return info;
}
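It relies on an atomic compare-and-swap API, which also acts as a full memory barrier, to enforce the ordering of instruction execution. If another thread has already published an instance, the exchange fails and the freshly created object is deleted.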
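Looking ahead for a moment, the same compare-and-swap idea can be expressed without platform-specific calls once C++11 atomics are available. The following is only a sketch, not Chromium's code: `Config` is a hypothetical class standing in for `OSInfo`. As in the excerpt above, a thread that loses the race deletes its own object, so the constructor may run more than once even though only one instance is ever published.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical class used only for illustration
class Config {
public:
    static Config* GetInstance()
    {
        static std::atomic<Config*> info{nullptr};
        Config* current = info.load(std::memory_order_acquire);
        if (!current) {
            Config* fresh = new Config();
            // Try to publish fresh; on failure, current receives the winner's pointer
            if (info.compare_exchange_strong(current, fresh,
                                             std::memory_order_acq_rel,
                                             std::memory_order_acquire)) {
                current = fresh;
            } else {
                delete fresh;   // another thread published its instance first
            }
        }
        assert(current != nullptr);
        return current;
    }

private:
    Config() = default;
};
```

This design avoids a lock entirely, at the price of occasionally constructing and discarding an extra object, so it suits only types whose construction is cheap and free of side effects.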
Luckily, with the advent of C++11, the language itself finally supports multithreading and specifies a memory model, so we can at last implement the singleton pattern safely and efficiently.
Moreover, there is more than one way to do so.
By using atomic operations with acquire and release ordering, you can implement DCLP safely and efficiently.
Sample code is excerpted from http://preshing.com/20130930/double-checked-locking-is-fixed-in-cpp11/
C++
std::atomic<Singleton*> Singleton::m_instance;
std::mutex Singleton::m_mutex;

Singleton* Singleton::getInstance()
{
    Singleton* tmp = m_instance.load(std::memory_order_acquire);
    if (tmp == nullptr) {
        std::lock_guard<std::mutex> lock(m_mutex);
        tmp = m_instance.load(std::memory_order_relaxed);
        if (tmp == nullptr) {
            tmp = new Singleton;
            m_instance.store(tmp, std::memory_order_release);
        }
    }
    return tmp;
}
By default, atomic operations use sequentially consistent ordering (std::memory_order_seq_cst), which on many platforms amounts to a full memory barrier and may be overkill compared with acquire-release ordering.
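The sample above attaches the ordering constraints to the atomic load and store themselves. The same guarantee can also be expressed with standalone fences, `std::atomic_thread_fence`, placed around relaxed operations. The sketch below is a self-contained variant along those lines; the class name `FencedSingleton` and the `constructions` counter are invented here so that the result can be checked.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

class FencedSingleton {
public:
    static FencedSingleton* getInstance()
    {
        FencedSingleton* tmp = m_instance.load(std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_acquire);   // pairs with the release fence below
        if (tmp == nullptr) {                                   // 1st test, without the lock
            std::lock_guard<std::mutex> lock(m_mutex);
            tmp = m_instance.load(std::memory_order_relaxed);
            if (tmp == nullptr) {                               // 2nd test, under the lock
                tmp = new FencedSingleton;
                // Construction must be complete before the pointer is published
                std::atomic_thread_fence(std::memory_order_release);
                m_instance.store(tmp, std::memory_order_relaxed);
            }
        }
        assert(tmp != nullptr);
        return tmp;
    }

    // Number of times the constructor has run (illustration only)
    static int constructions() { return m_constructions.load(); }

private:
    FencedSingleton() { ++m_constructions; }

    static std::atomic<FencedSingleton*> m_instance;
    static std::mutex m_mutex;
    static std::atomic<int> m_constructions;
};

std::atomic<FencedSingleton*> FencedSingleton::m_instance{nullptr};
std::mutex FencedSingleton::m_mutex;
std::atomic<int> FencedSingleton::m_constructions{0};
```

Repeated calls to `FencedSingleton::getInstance()` return the same pointer, and `FencedSingleton::constructions()` stays at 1.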
Yet another approach is to use the std::call_once function.
It executes a callable object exactly once, even if called from several threads concurrently, and the resulting code is more succinct than DCLP.
The sample code below is a variant; the original version is at https://github.com/kingsamchen/KBase_Demo/blob/master/kbase/memory/singleton.h
C++
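#include <mutex>

template <typename T>
class Singleton {
public:
    static T* instance()
    {
        // The lambda runs exactly once; concurrent callers wait until it finishes
        std::call_once(flag_, [] { instance_ = new T; });
        return instance_;
    }

private:
    static T* instance_;
    static std::once_flag flag_;
};

template <typename T> T* Singleton<T>::instance_ = nullptr;
template <typename T> std::once_flag Singleton<T>::flag_;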
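To see the "exactly once" guarantee in action, the following sketch calls `std::call_once` from several threads and counts how often the initializer runs. `init_flag`, `init_count`, `initialize` and `run_from_threads` are hypothetical names used only for this demonstration.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

std::once_flag init_flag;
std::atomic<int> init_count{0};

void initialize()
{
    ++init_count;   // stands in for expensive one-time setup
}

// Hypothetical driver: races n threads on the same once_flag and
// returns how many times initialize() actually ran.
int run_from_threads(int n)
{
    assert(n > 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) {
        threads.emplace_back([] { std::call_once(init_flag, initialize); });
    }
    for (auto& t : threads) {
        t.join();
    }
    return init_count.load();
}
```

However many threads take part, `run_from_threads` reports a single call to `initialize`.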
Surprisingly, the simplest way to get a thread-safe singleton in C++11 is to use a function-local static variable.
C++
Singleton& Singleton::Instance()
{
    static Singleton instance;
    return instance;
}
This almost magical improvement is backed by the C++11 standard (§6.7 [stmt.dcl]):
If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization.
However, this method works only with compilers that fully implement this part of C++11. Unfortunately, Visual Studio 2013 is not one of them.
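As a quick check of this guarantee on a given toolchain, the sketch below counts constructor calls while several threads race into the function. `Logger`, `constructions` and `race_for_instance` are hypothetical names, not part of any library; a conforming C++11 compiler should report exactly one construction.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

class Logger {
public:
    static Logger& Instance()
    {
        static Logger instance;   // initialized once, even under concurrent first calls
        return instance;
    }

    // Number of times the constructor has run (illustration only)
    static int constructions() { return count_.load(); }

    Logger(const Logger&) = delete;
    Logger& operator=(const Logger&) = delete;

private:
    Logger() { ++count_; }
    static std::atomic<int> count_;
};

std::atomic<int> Logger::count_{0};

// Hypothetical driver: n threads call Instance() at roughly the same time
int race_for_instance(int n)
{
    assert(n > 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) {
        threads.emplace_back([] { Logger::Instance(); });
    }
    for (auto& t : threads) {
        t.join();
    }
    return Logger::constructions();
}
```

A result other than 1 from `race_for_instance` would suggest that the compiler does not implement thread-safe initialization of local statics.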
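A while ago I was coaching my girl for the English part of the postgraduate entrance exam, and in the process I found that my own skills had deteriorated quite a bit from lack of practice, writing in particular. So I wrote up, in English, some thoughts I had worked through long ago but never published, simply as writing practice.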
c++11,singleton,thread safety