In the GCC 5.1 release libstdc++ introduced a new library ABI that includes new implementations of std::string and std::list. These changes were necessary to conform to the 2011 C++ standard which forbids Copy-On-Write strings and requires lists to keep track of their size. - Dual ABI
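A digression first: the COW in older GCC versions confirms a point I made in "Some Thoughts": the world inside a computer is pragmatic, chases maximum benefit, and is full of deception. Lazy thinking is everywhere, with all kinds of robbing Peter to pay Paul, and the most typical example is the virtual address space. Of course, data in a computer has no mind of its own, so this is exactly how things should be done.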
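COW stands for Copy-On-Write. C++ performs well (programmers can manage memory by hand, and the code is compiled ahead of time), yet some of the cost still goes into pointless copies, deep copies in particular. For the copies caused by temporary objects, compilers have long been able to avoid the cost through optimizations such as NRVO and RVO. C++11 then added rvalue references and move semantics, handing control to programmers so that they can expose more optimization opportunities themselves; std::string_view (added in C++17) is in the same spirit.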
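COW is another optimization done on the programmer's behalf, similar to NRVO and RVO, but it lives in the standard library's implementation of std::string rather than in the compiler. It also differs from those two in kind: NRVO and RVO are forms of copy elision, which simply omits the intermediate constructor call (the standard explicitly allows this even when it changes observable behavior, so it goes beyond the as-if rule). COW instead "customizes" the implementation of std::string, which arguably does follow the as-if rule.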
Allows any and all code transformations that do not change the observable behavior of the program. - The as-if rule
The idea behind COW is as follows:
Basic idea: to share a data buffer among string instances, and only make a copy for a specific instance (the copy on write) when that instance’s data is modified. - Why COW was deemed ungood for std::string
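In the code below, for example, str and other share one memory region until other is modified. This kind of lazy thinking is common in the computing world; fork on Linux works in a similar way.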
// debian8
#include <iostream>
#include <string>
int main() {
    std::string str = "hello world"; // str owns the string 'hello world'
    std::string other = str; // no copy occurs, more like shallow copy
    std::cout << (void*) str.data() << std::endl;
    std::cout << (void*) other.data() << std::endl;
}
// Output:
$ 0x1a4a028
$ 0x1a4a028
When a fork() system call is issued, a copy of all the pages corresponding to the parent process is created, loaded into a separate memory location by the OS for the child process. But this is not needed in certain cases. Consider the case when a child executes an “exec” system call (which is used to execute any executable file from within a C program) or exits very soon after the fork(). When the child is needed just to execute a command for the parent process, there is no need for copying the parent process’ pages, since exec replaces the address space of the process which invoked it with the command to be executed.
In such cases, a technique called copy-on-write (COW) is used. With this technique, when a fork occurs, the parent process’s pages are not copied for the child process. Instead, the pages are shared between the child and the parent process. Whenever a process (parent or child) modifies a page, a separate copy of that particular page alone is made for that process (parent or child) which performed the modification. This process will then use the newly copied page rather than the shared one in all future references. The other process (the one which did not modify the shared page) continues to use the original copy of the page (which is now no longer shared). This technique is called copy-on-write since the page is copied when some process writes to it.
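Because compiler optimizations such as NRVO and RVO became so widespread, C++11 uses the term Copy Elision for this class of techniques and writes it into the C++ standard (since C++17 it is even mandatory in some cases, which is what the quote below describes). If you are interested, compile with -fno-elide-constructors to see the difference between eliding copies and not eliding them.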
Under the following circumstances, the compilers are required to omit the copy and move construction of class objects, even if the copy/move constructor and the destructor have observable side-effects. The objects are constructed directly into the storage where they would otherwise be copied/moved to. - Copy elision
I wrote a rough implementation of the idea; it is far from complete and surely still has bugs.
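#include <cstddef>
#include <memory>
#include <vector>

class my_string {
    // The buffer is shared between copies; shared_ptr keeps the reference count.
    std::shared_ptr<std::vector<char>> ptr;

    // Give this object a private copy of the buffer if anyone else shares it.
    void unshare() {
        if (ptr.use_count() > 1) {
            ptr = std::make_shared<std::vector<char>>(*ptr);
        }
    }
public:
    my_string(const char* str) : ptr(std::make_shared<std::vector<char>>()) {
        while (*str != '\0') {
            ptr->push_back(*str++);
        }
    }
    // Copying only shares the buffer (shallow copy).
    my_string(const my_string& rhs) : ptr(rhs.ptr) {}
    char& operator[](std::size_t i) {
        // This exposes an internal address, so the current object must own the buffer exclusively.
        unshare();
        return (*ptr)[i];
    }
    my_string& operator=(const char* str) {
        // Start from a fresh buffer so that other sharers keep the old contents.
        ptr = std::make_shared<std::vector<char>>();
        while (*str != '\0') {
            ptr->push_back(*str++);
        }
        return *this;
    }
    // Read-only access does not need to unshare.
    const char* data() const {
        return ptr->data();
    }
};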
Why COW was deemed ungood for std::string gives a somewhat less crude version; the principle is much the same.
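Here I use the libstdc++ that corresponds to gcc-4.6.2 to show how libstdc++ implements Copy-On-Write on top of std::string. The principle is close to the implementation above: a reference count records how many objects share the current buffer, and the actual copy is made only when an API that might modify the std::string is called. See basic_string.h for the code.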
Note: std::string is simply std::basic_string<char>.
So what is wrong with COW? Stack Overflow has an example showing that COW can leave a dangling pointer, and it is a really clever one.
std::string s("str");
const char *p = s.data();
{
std::string s2(s);
(void) s[0]; // This line unshares the buffer, so s gets its own copy
}
std::cout << *p << '\n';
Stack Overflow gives a rough explanation:
What happens is that when s2 is constructed it shares the data with s, but obtaining a non-const reference via s[0] requires the data to be unshared, so s does a “copy on write” because the reference s[0] could potentially be used to write into s, then s2 goes out of scope, destroying the array pointed to by p.
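The whole process is shown in the figure below:
Note: the diagram above is rather rough; the real implementation is more complicated and contains some optimizations.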
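If you want to read the code, some keywords to search for are _M_mutate(), _M_dispose, and _M_destroy. The pointer is invalidated mainly because of how COW is implemented. The comments under the discussion go into detail on how COW could be changed to avoid invalidating pointers; the conclusion is that the implementation that invalidates pointers is still the best one. I will add more on this later.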
So where does COW conflict with C++11? Mainly on whether operator[] and data() may invalidate pointers and the like.
The C++03 standard explicitly permits that behaviour in 21.3 [lib.basic.string] p5 where it says that subsequent to a call to data() the first call to operator[]() may invalidate pointers, references and iterators. - Legality of COW std::string implementation in C++11
The example code calls s.data() first and then s.operator[]; C++03 allows this sequence to invalidate the pointer. That permission no longer exists in C++11.
The C++11 standard no longer permits that behaviour, because no call to operator[]() may invalidate pointers, references or iterators, irrespective of whether they follow a call to data(). - Legality of COW std::string implementation in C++11
I also went through the latest draft of the standard (dated February 1, 2020); it likewise says that operator[] and data() must never invalidate references, pointers, or iterators.
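Another important problem is that COW performs poorly in multithreaded programs. The paper Concurrency Modifications to Basic String explains this in detail and therefore proposes to disallow copy-on-write implementations outright.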
I used vagrant to add two boxes, generic/debian9 and generic/debian8. gcc is version 6.3.0 on generic/debian9 and version 4.9.2 on generic/debian8; the former has no COW, the latter does.
First, I test COW on generic/debian8 and generic/debian9 with the code below. On debian8 the copy is shallow, and the deep copy is made only later, when a write is needed.
#include <iostream>
#include <string>
int main() {
    std::string str = "hello world"; // str owns the string 'hello world'
    std::string other = str;
    std::cout << (void*) other.data() << std::endl;
    std::cout << (void*) str.data() << std::endl;
    other.append("!");
    std::cout << (void*) other.data() << std::endl;
}
// g++ -std=c++03 test.cpp
// Output on debian8
$ 0xd88028
$ 0xd88028
$ 0xc47058
// g++ -std=c++03 test.cpp
// Output on debian9
$ 0x7ffe04a1ce80
$ 0x7ffe04a1ce60
$ 0x7ffe04a1ce60
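Note: both systems are 64-bit. The addresses look different most likely because the new std::string uses the small-string optimization: "hello world" (11 characters) fits into the buffer embedded in the string object itself, so data() points into the stack (0x7ffe...) instead of the heap.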
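The core problem, though, is how to reproduce the Stack Overflow code and confirm that the pointer p really has been invalidated. The test code is below.
Note: I originally wanted to debug libstdc++ itself, but that seemed rather troublesome, so for now I use the method below.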
// debian8, gcc
#include <iostream>
#include <string>

// Print the single byte stored at p-4, p-8, ..., p-24.
// Reading outside the string's buffer is undefined behavior; this is for inspection only.
static void dump_before(const char* p) {
    for (int off = 4; off <= 24; off += 4) {
        std::cout << (void*)(p - off) << ": " << std::hex << (int)*(p - off) << std::endl;
    }
}

int main() {
    std::string str("str");
    const char* p = str.data();
    std::cout << "p pointer: " << (void*)p << std::endl;
    dump_before(p);
    {
        std::string other(str);
        std::cout << "other pointer: " << (void*)other.data() << ", str pointer; " << (void*)str.data() << std::endl;
        (void) str[0]; // with COW, str unshares here and other keeps the old buffer
        std::cout << "other pointer: " << (void*)other.data() << ", str pointer; " << (void*)str.data() << std::endl;
    }
    dump_before(p);
    std::cout << *p << std::endl; // with COW, p may be dangling here (undefined behavior)
}
// debian8 output
$ p pointer: 0xcfd028
$ 0xcfd024: 0
$ 0xcfd020: 0
$ 0xcfd01c: 0
$ 0xcfd018: 3
$ 0xcfd014: 0
$ 0xcfd010: 3
$ other pointer: 0xcfd028, str pointer; 0xcfd028
$ other pointer: 0xcfd028, str pointer; 0xcfd058
$ 0xcfd024: 0
$ 0xcfd020: ffffffff
$ 0xcfd01c: 0
$ 0xcfd018: 3
$ 0xcfd014: 0
$ 0xcfd010: 0
s
// debian9 output
$ p pointer: 0x7ffeab701aa0
$ 0x7ffeab701a9c: 0
$ 0x7ffeab701a98: 3
$ 0x7ffeab701a94: fffffffe
$ 0x7ffeab701a90: ffffffa0
$ 0x7ffeab701a8c: 46
$ 0x7ffeab701a88: ffffffbd
$ other pointer: 0x7ffeab701a80, str pointer; 0x7ffeab701aa0
$ other pointer: 0x7ffeab701a80, str pointer; 0x7ffeab701aa0
$ 0x7ffeab701a9c: 0
$ 0x7ffeab701a98: 3
$ 0x7ffeab701a94: fffffffe
$ 0x7ffeab701a90: ffffffa0
$ 0x7ffeab701a8c: 46
$ 0x7ffeab701a88: ffffffbd
s
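The crux now is "how do you tell whether a pointer is dangling?" I did not find a particularly good way, so here I use the bookkeeping information (safeguard bytes) that the memory allocator keeps when allocating memory to judge whether the memory the pointer refers to has been freed.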
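The two outputs show that on debian8, although *p still prints s, the bookkeeping information (or guard bytes) in front of that memory has been invalidated. To pin down exactly what the bytes mean, one still has to understand how the allocator treats guard bytes when allocating and freeing memory. (With the COW string, the 24 bytes right in front of the characters are most likely libstdc++'s own header holding length, capacity, and reference count rather than allocator data, which would fit the values 3, 3, and 0 seen before the inner block runs.)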
I covered the bookkeeping information involved in memory allocation a few years ago in my blog post "new/delete and new[]/delete[] in C++".
Does the allocation ultimately go through malloc in glibc? Maybe.
std::move and copy elision
RVO V.S. std::move
Updated version of the paper
What exactly is the “as-if” rule?
“Instruction Re-ordering Everywhere: The C++ ‘As-If’ Rule and the Role of Sequence"