// take input from file iname and produce output on file oname
//从文件 iname 读入,输出到文件 oname
class Filter {
public:
    Filter(const string& iname, const string& oname); // constructor
    ~Filter(); // destructor
    // ...
private:
    ifstream is;
    ofstream os;
    // ...
};

void user()
{
    Filter flt {"books","authors"};
    Filter* p = new Filter{"novels","favorites"};
    // use flt and *p
    delete p;
}

void user2()
{
    Filter flt {"books","authors"};
    unique_ptr<Filter> p {new Filter{"novels","favorites"}};
    // use flt and *p
}

void user3()
{
    Filter flt {"books","authors"};
    auto p = make_unique<Filter>("novels","favorites");
    // use flt and *p
}

void user3()
{
    Filter flt {"books","authors"};
    Filter flt2 {"novels","favorites"};
    // use flt and flt2
}

class Filter { // take input from file iname and produce output on file oname
public:
    Filter(const string& iname, const string& oname);
    // ...
private:
    ifstream is;
    ofstream os;
    // ...
};

void user3()
{
    Filter flt {"books","authors"};
    Filter flt2 {"novels","favorites"};
    // use flt and flt2
}
This happens to be simpler than what you would write in most garbage collected languages (e.g., Java or C#) and it is not open to leaks caused by forgetful programmers. It is also faster than the obvious alternatives (no spurious use of the free/dynamic store and no need to run a garbage collector). Typically, RAII also decreases the resource retention time relative to manual approaches.
这比那些支持垃圾回收的语言写起来更简洁,对于健忘的程序员,也不会导致溢出。显然也比其他可选方案快很多(无需模拟自由、动态内存的存储,无需运行垃圾回收机制)。相对于手动操作,RAII 也降低了资源滞留的时间。
This is my ideal for resource management. It handles not just memory, but general (non-memory) resources, such as file handles, thread handles, and locks. But is it really general? How about objects that need to be passed around from function to function? What about objects that don’t have an obvious single owner?
这是我理想的资源管理方法,不仅用于内存,还可以用于普通资源像文件句柄,线程句柄,锁等等。但它真的通用了吗?如果对象需要在函数间传递呢?如果对象没有一个明确的单一所属呢?
4.1 Transferring Ownership: move
所有权的移交:move
Let us first consider the problem of moving objects around from scope to scope. The critical question is how to get a lot of information out of a scope without serious overhead from copying or error-prone pointer use. The traditional approach is to use a pointer:
我们先来思考一下在域间移动对象的问题。关键点在于在不避免拷贝或易错指针等重大开销的情况下怎么在域外获取其信息。传统方法是使用指针:
X* make_X()
{
    X* p = new X;
    // ... fill X ..
    return p;
}

void user()
{
    X* q = make_X();
    // ... use *q ...
    delete q;
}
Now who is responsible for deleting the object? In this simple case, obviously the caller of make_X() is, but in general the answer is not obvious. What if make_X() keeps a cache of objects to minimize allocation overhead? What if user() passed the pointer to some other_user()? The potential for confusion is large and leaks are not uncommon in this style of program.
现在谁负责指针的删除工作呢?在上例中,显然是 make_X 的调用者,但通常答案并不明确。如果为了降低开销,make_X 需要对象的缓存呢?如果 user 将指针传递给其他 other_user 呢?在这种编程风格中,极易混乱和溢出。
I could use a shared_ptr or a unique_ptr to be explicit about the ownership of the created object. For example:
我可以使用 shared_ptr 或者 unique_ptr 显式的表明已有对象的归属。举例:
unique_ptr<X> make_X();
But why use a pointer (smart or not) at all? Often, I don’t want a pointer and often a pointer would distract from the conventional use of an object. For example, a Matrix addition function creates a new object (the sum) from two arguments, but returning a pointer would lead to seriously odd code:
但是为嘛非要用指针(智能或非智能)呢?通常我也不想用指针,和传统的使用对象比较,返回指针有点多余(看下面好像是这个意思),比如说,Matrix 类型的加法函数,计算2个参数的和,但却返回一个指针,这看起来好奇怪。
unique_ptr<Matrix> operator+(const Matrix& a, const Matrix& b);
Matrix res = *(a+b);

double sqrt(double); // a square root function
double s2 = sqrt(2); // get the square root of 2

Matrix operator+(const Matrix& a, const Matrix& b); // return the sum of a and b
Matrix r = x+y;

Matrix operator+(const Matrix& a, const Matrix& b)
{
    Matrix res;
    // ... fill res with element sums ...
    return res;
}
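
class Matrix {
    double* elem; // pointer to elements
    int nrow; // number of rows
    int ncol; // number of columns
public:
    Matrix(int nr, int nc) // constructor: allocate elements
        :elem{new double[nr*nc]}, nrow{nr}, ncol{nc}
    {
        for(int i=0; i<nr*nc; ++i) elem[i]=0; // initialize elements
    }
    Matrix(const Matrix&); // copy constructor
    Matrix& operator=(const Matrix&); // copy assignment
    Matrix(Matrix&&); // move constructor
    Matrix& operator=(Matrix&&); // move assignment
    ~Matrix() { delete[] elem; } // destructor: free the elements
    // ...
};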
A copy operation is recognized by its reference (&) argument. Similarly, a move operation is recognized by its rvalue reference (&&) argument. A move operation is supposed to “steal” the representation and leave an “empty object” behind. For Matrix, that means something like this:
通过判断参数是左值引用或右值引用来区别 copy 和 move 移动。move “窃取信息”后,源对象就成了“空壳”。拿 Matrix 来说,就是这样的:
Matrix::Matrix(Matrix&& a) // move constructor
    :elem{a.elem}, nrow{a.nrow}, ncol{a.ncol} // “steal” the representation
{
    a.elem = nullptr; // leave “nothing” behind
}
That’s it! When the compiler sees the return res; it realizes that res is soon to be destroyed. That is, res will not be used after the return. Therefore it applies the move constructor, rather than the copy constructor to transfer the return value. In particular, for
就这么简单!当编译器执行到 "return res;",会意识到 res 很快就会被销毁。那样的话,在 return 后,res 就不能使用了。于是,编译器使用 move 构造而不是 copy 构造转移返回值。
Matrix r = a+b;
the res inside operator+() becomes empty -- giving the destructor a trivial task -- and res’s elements are now owned by r. We have managed to get the elements of the result -- potentially megabytes of memory -- out of the function (operator+()) and into the caller’s variable. We have done that at a minimal cost (probably four word assignments).
特别注意的是,此时 operator+() 中的 res 已经空了,留下一点析构的善后工作,res 所有的元素现在归 r 所有。我们已经将operator+ 中的结果(或许有几兆)转移到调用者的变量中了,我们只用了一点成本,可能只是4行赋值语句。
Expert C++ users have pointed out that there are cases where a good compiler can eliminate the copy on return completely (in this case saving the four word moves and the destructor call). However, that is implementation dependent, and I don’t like the performance of my basic programming techniques to depend on the degree of cleverness of individual compilers. Furthermore, a compiler that can eliminate the copy, can as easily eliminate the move. What we have here is a simple, reliable, and general way of eliminating complexity and cost of moving a lot of information from one scope to another.
已经有专业用户指出,某些情况下,好的编译器可以清除返回的 copy 信息(这中情况下,会保存4行 move 操作和析构调用)。然而这是对现实的依赖,我不喜欢由个别编译器的智能程度来决定我的基础编程能力的性能。而且能清除 copy 的编译器肯定能清除 move. 我们现在有一套简单可行通用的方法去消除域间移动大数据时带来的复杂性和开销。
Often, we don’t even need to define all those copy and move operations. If a class is composed out of members that behave as desired, we can simply rely on the operations generated by default. Consider:
通常,我们不必定义所有的 copy move 操作,如果一个类缺少所需的成员操作,我们可以依赖默认生成的操作。
class Matrix {
    vector<double> elem; // elements
    int nrow; // number of rows
    int ncol; // number of columns
public:
    Matrix(int nr, int nc) // constructor: allocate elements
        :elem(nr*nc), nrow{nr}, ncol{nc}
    { }
    // ...
};
This version of Matrix behaves like the version above except that it copes slightly better with errors and has a slightly larger representation (a vector is usually three words).
这个版本很像上面的,除了对错误稍微的处理和更多的描述(没看明白这句啥意思)
What about objects that are not handles? If they are small, like an int or a complex
那些不是句柄的对象呢?如果他们像 int 那么小,或者 complex
Unfortunately, a Matrix like the one I used in the example is not part of the ISO C++ standard library, but several are available (open source and commercial). For example, search the Web for “Origin Matrix Sutton” and see Chapter 29 of my The C++ Programming Language (Fourth Edition) [11] for a discussion of the design of such a matrix.
不幸的是,上面使用的 Matrix 并不是标准库里的,但是很多都可用。在网上搜索“Origin Matrix Sutton”,你可以看见在我的书The C++ Programming Language (Fourth Edition)的第29章在讨论如何设计这样的一个矩阵。
4.2 Shared Ownership: shared_ptr
共享所有
In discussions about garbage collection it is often observed that not every object has a unique owner. That means that we have to be able to ensure that an object is destroyed/freed when the last reference to it disappears. In the model here, we have to have a mechanism to ensure that an object is destroyed when its last owner is destroyed. That is, we need a form of shared ownership. Say, we have a synchronized queue, a sync_queue, used to communicate between tasks. A producer and a consumer are each given a pointer to the sync_queue:
在讨论垃圾回收机制时,常常观察到不是所有的对象都有唯一的所有者。这就意味着当最后一个引用销毁后,我们必须确保该对象正确销毁释放。在这个例子中,我们必须有一套机制以保证最后一个所有者销毁后,该对象也会被销毁。我们需要一套所有权共享机制。这里,我们有一个用于任务间通讯的同步队列 sync_queue,提供者和使用者同时拥有指向 sync_queue 指针:
void startup()
{
    sync_queue* p = new sync_queue{200}; // trouble ahead!
    thread t1 {task1,iqueue,p}; // task1 reads from *iqueue and writes to *p
    thread t2 {task2,p,oqueue}; // task2 reads from *p and writes to *oqueue
    t1.detach();
    t2.detach();
}
I assume that task1, task2, iqueue, and oqueue have been suitably defined elsewhere and apologize for letting the threads outlive the scope in which they were created (using detach()). Also, you may imagine pipelines with many more tasks and sync_queues. However, here I am only interested in one question: “Who deletes the sync_queue created in startup()?” As written, there is only one good answer: “Whoever is the last to use the sync_queue.” This is a classic motivating case for garbage collection. The original form of garbage collection was counted pointers: maintain a use count for the object and when the count is about to go to zero delete the object. Many languages today rely on a variant of this idea and C++11 supports it in the form of shared_ptr. The example becomes:
我假设 task1 task2 iqueue oqueue 已经在其他地方定义,通过使用 detatch() 使线程的生命周期比它所在的域更长。你可能想到了多任务管道 和 sync_queues。可是在这里,我只对一件事感兴趣:谁删除了 startup() 中创建的sync_queue。只有一个正确的答案,那就是 sync_queue 最后的使用者。这是一个典型的垃圾回收机制的案列。垃圾回收的原型是计数指针:记录被使用的对象数,当计数为 0 时,删除对象。许多语言都是以这个原型演变来的,c++11中使用 shared_ptr 的形式 ,例子变为:
void startup()
{
    auto p = make_shared<sync_queue>(200); // make a sync_queue and return a shared_ptr to it
    thread t1 {task1,iqueue,p}; // task1 reads from *iqueue and writes to *p
    thread t2 {task2,p,oqueue}; // task2 reads from *p and writes to *oqueue
    t1.detach();
    t2.detach();
}
Now the destructors for task1 and task2 can destroy their shared_ptrs (and will do so implicitly in most good designs) and the last task to do so will destroy the sync_queue.
现在 task1 task2 的析构函数可以销毁他们的 shared_ptr(在多数好的设计中,这会做得很隐蔽),最后一个这个做得会销毁 sync_queue 对象。
This is simple and reasonably efficient. It does not imply a complicated run-time system with a garbage collector. Importantly, it does not just reclaim the memory associated with the sync_queue. It reclaims the synchronization object (mutex, lock, or whatever) embedded in the sync_queue to manage the synchronization of the two threads running the two tasks. What we have here is again not just memory management, it is general resource management. That “hidden” synchronization object is handled exactly as the file handles and stream buffers were handled in the earlier example.
这简单合理高效。这不不是说一个复杂的运行系统一定要一个垃圾回收器。他不仅仅可以回收与 sync_queue 关联的内存,还能回收sync_queue中用于管理不同任务的多线程同步性的同步对象(互斥,锁等),不仅管理内存,还可以管理资源。隐藏的同步对象可以精确处理前面例子中的文件句柄和流句柄。
We could try to eliminate the use of shared_ptr by introducing a unique owner in some scope that encloses the tasks, but doing so is not always simple, so C++11 provides both unique_ptr (for unique ownership) and shared_ptr (for shared ownership).
我们可以尝试通过引入唯一所有者在封装的域中淘汰 shared_ptr 。但这并不简单,所以 c++11 同时提供了 unique_ptr 和 shared_ptr。
4.3 Type safety
类型安全
Here, I have only addressed garbage collection in connection with resource management. It also has a role to play in type safety. As long as we have an explicit delete operation, it can be misused. For example:
这里,我只谈到了和资源管理相关的垃圾回收机制,它同样在类型安全中起了重要作用。只要我们显式使用 delete 操作,就可能出现失误。例如:
X* p = new X;
X* q = p;
delete p;
// ...
// the memory that held *p may have been re-used
// p 指向的内存已经被回收了
q->do_something();
Don’t do that. Naked deletes are dangerous -- and unnecessary in general/user code. Leave deletes inside resource management classes, such as string, ostream, thread, unique_ptr, and shared_ptr. There, deletes are carefully matched with news and harmless.
千万不要那么做。在一般的用户代码中,delete 的使用的危险多余的。在 string ostream thread unique_ptr shared_ptr 的资源管理类中,不要使用 delete。因此小心配合 new 使用 delete 以确保无害。
4.4 Summary: Resource Management Ideals
总结:资源管理理念
For resource management, I consider garbage collection a last choice, rather than “the solution” or an ideal:
对于资源管理,我会把作为最后的选择,而不是解决方案或理念
Use appropriate abstractions that recursively and implicitly handle their own resources. Prefer such objects to be scoped variables.
作用域变量对象优先使用合适的抽象递归地隐式的处理它们的资源。
When you need pointer/reference semantics, use “smart pointers” such as unique_ptr and shared_ptr to represent ownership.
当你需要指针或引用时,使用像 unique_ptr shared_ptr 的智能指针表示其所有关系。
If everything else fails (e.g., because your code is part of a program using a mess of pointers without a language-supported strategy for resource management and error handling), try to handle non-memory resources “by hand” and plug in a conservative garbage collector to handle the almost inevitable memory leaks.
如果所有方法都失败了,(比如,你在没有资源管理策略和错误处理支持的语言代码中使用了大量指针),尝试手动处理非内存资源并插入一套垃圾回收机制去处理不可避免的内存溢出。
Is this strategy perfect? No, but it is general and simple. Traditional garbage-collection based strategies are not perfect either, and they don’t directly address non-memory resources.
这种策略完美吗?不,但它简单实用。基于传统垃圾回收的策略并不完美,它并不能直接解决非内存资源的问题。