Translated from "Five Popular Myths about C++" by Bjarne Stroustrup (Part 3)

Myth 3: "For reliable software, you need Garbage Collection"

Garbage collection does a good, but not perfect, job at reclaiming unused memory. It is not a panacea. Memory can be retained indirectly and many resources are not plain memory. Consider:
// take input from file iname and produce output on file oname
class Filter { 
public:
  Filter(const string& iname, const string& oname); // constructor
  ~Filter();                                        // destructor
  // ...
private:
  ifstream is;
  ofstream os;
  // ...
};

This Filter’s constructor opens two files. That done, the Filter performs some task on input from its input file, producing output on its output file. The task could be hardwired into Filter, supplied as a lambda, or supplied by a derived class overriding a virtual function. Those details are not important in a discussion of resource management. We can create Filters like this:
void user()
{
  Filter flt {"books","authors"};
  Filter* p = new Filter{"novels","favorites"};
  // use flt and *p
  delete p;
}

From a resource management point of view, the problem here is how to guarantee that the files are closed and the resources associated with the two streams are properly reclaimed for potential re-use.

The conventional solution in languages and systems relying on garbage collection is to eliminate the delete (which is easily forgotten, leading to leaks) and the destructor (because garbage collected languages rarely have destructors and “finalizers” are best avoided because they can be logically tricky and often damage performance). A garbage collector can reclaim all memory, but we need user actions (code) to close the files and to release any non-memory resources (such as locks) associated with the streams. Thus memory is automatically (and in this case perfectly) reclaimed, but the management of other resources is manual and therefore open to errors and leaks.

The common and recommended C++ approach is to rely on destructors to ensure that resources are reclaimed. Typically, such resources are acquired in a constructor, leading to the awkward name “Resource Acquisition Is Initialization” (RAII) for this simple and general technique. In user(), the destructor for flt implicitly calls the destructors for the streams is and os. These destructors in turn close the files and release the resources associated with the streams. The delete would do the same for *p.
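To make the destructor chain described above visible, here is a minimal sketch. Resource, Owner, events() and run_scope() are hypothetical names invented for this illustration; Resource merely stands in for a stream and records when it is "opened" and "closed".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event log, used only to make the order of releases observable.
std::vector<std::string>& events()
{
    static std::vector<std::string> log;
    return log;
}

// Stand-in for a stream: "opens" in the constructor, "closes" in the destructor.
class Resource {
    std::string name;
public:
    explicit Resource(const std::string& n) : name{n} { events().push_back("open " + name); }
    ~Resource() { events().push_back("close " + name); }
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;
};

// Like Filter: owns two resources and declares no destructor of its own.
class Owner {
    Resource in;
    Resource out;
public:
    Owner(const std::string& i, const std::string& o) : in{i}, out{o} {}
};

// Returns the events recorded while an Owner is created, used, and leaves scope.
std::vector<std::string> run_scope()
{
    events().clear();
    {
        Owner o {"books", "authors"};
        events().push_back("use");
    }   // o is destroyed here; its members are destroyed in reverse order of declaration
    return events();
}
```

Calling run_scope() should record the two "open" events, then "use", then the two "close" events in reverse order, without Owner containing any cleanup code.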

Experienced users of modern C++ will have noticed that user() is rather clumsy and unnecessarily error-prone. This would be better:
void user2()
{
  Filter flt {"books","authors"};
  unique_ptr<Filter> p {new Filter{"novels","favorites"}};
  // use flt and *p
}

Now *p will be implicitly released whenever user2() is exited. The programmer cannot forget to do so. The unique_ptr is a standard-library class designed to ensure resource release without runtime or space overheads compared to the use of built-in “naked” pointers.
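A small sketch of both claims, under stated assumptions: Tracked and the two helper functions are hypothetical, and the size comparison reflects what common standard-library implementations do for the default deleter rather than a guarantee of the standard.

```cpp
#include <cassert>
#include <memory>

// Hypothetical tracked type: counts live objects so that release can be observed.
struct Tracked {
    static int live;
    Tracked()  { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Returns the number of live Tracked objects while the unique_ptr is in scope.
int live_inside_scope()
{
    std::unique_ptr<Tracked> p {new Tracked};
    return Tracked::live;   // p's destructor runs after the return value is computed
}

// Typical, but not guaranteed by the standard: with the default deleter,
// a unique_ptr occupies the same space as a built-in pointer.
bool same_size_as_raw_pointer()
{
    return sizeof(std::unique_ptr<Tracked>) == sizeof(Tracked*);
}
```

After live_inside_scope() returns, Tracked::live is back to zero even though no delete appears anywhere in the code.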

However, we can still see the new; this solution is a bit verbose (the type Filter is repeated), and separating the construction of the ordinary pointer (using new) and the smart pointer (here, unique_ptr) inhibits some significant optimizations. We can improve this by using a C++14 helper function make_unique that constructs an object of a specified type and returns a unique_ptr to it:
void user3()
{
  Filter flt {"books","authors"};
  auto p = make_unique<Filter>("novels","favorites");
  // use flt and *p
}
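One case where the implicit release matters is an exit by exception. The following sketch assumes a hypothetical type Named and hypothetical functions may_throw() and released_after_exception(); it needs C++14 for make_unique.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical type with a live-object counter, standing in for Filter.
struct Named {
    static int live;
    std::string in, out;
    Named(const std::string& i, const std::string& o) : in{i}, out{o} { ++live; }
    ~Named() { --live; }
};
int Named::live = 0;

void may_throw(bool fail)
{
    // make_unique forwards its arguments to Named's constructor; no new, no repeated type name
    auto p = std::make_unique<Named>("novels", "favorites");
    if (fail) throw std::runtime_error{"processing failed"};
    // ... use *p ...
}   // *p is released here on normal exit and during stack unwinding alike

bool released_after_exception()
{
    try { may_throw(true); }
    catch (const std::runtime_error&) { }
    return Named::live == 0;
}
```

With a naked new and a trailing delete, the throw would skip the delete and leak the object.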

Unless we really needed the second Filter to have pointer semantics (which is unlikely) this would be better still:
void user3()
{
  Filter flt {"books","authors"};
  Filter flt2 {"novels","favorites"};
  // use flt and flt2
}

This last version is shorter, simpler, clearer, and faster than the original.
But what does Filter’s destructor do? It releases the resources owned by a Filter; that is, it closes the files (by invoking their destructors). In fact, that is done implicitly, so unless something else is needed for Filter, we could eliminate the explicit mention of the Filter destructor and let the compiler handle it all. So, what I would have written was just:
class Filter { // take input from file iname and produce output on file oname
public:
  Filter(const string& iname, const string& oname);
  // ...
private:
  ifstream is;
  ofstream os;
  // ...
};

void user3()
{
  Filter flt {"books","authors"};
  Filter flt2 {"novels","favorites"};
  // use flt and flt2
}


This happens to be simpler than what you would write in most garbage collected languages (e.g., Java or C#) and it is not open to leaks caused by forgetful programmers. It is also faster than the obvious alternatives (no spurious use of the free/dynamic store and no need to run a garbage collector). Typically, RAII also decreases the resource retention time relative to manual approaches.
This is my ideal for resource management. It handles not just memory, but general (non-memory) resources, such as file handles, thread handles, and locks. But is it really general? How about objects that need to be passed around from function to function? What about objects that don’t have an obvious single owner?


4.1 Transferring Ownership: move


Let us first consider the problem of moving objects around from scope to scope. The critical question is how to get a lot of information out of a scope without serious overhead from copying or error-prone pointer use. The traditional approach is to use a pointer:

X* make_X()
{
  X* p = new X;
  // ... fill X ..
  return p;
}

void user()
{
  X* q = make_X();
  // ... use *q ...
  delete q;
}


Now who is responsible for deleting the object? In this simple case, obviously the caller of make_X() is, but in general the answer is not obvious. What if make_X() keeps a cache of objects to minimize allocation overhead? What if user() passed the pointer to some other_user()? The potential for confusion is large and leaks are not uncommon in this style of program.


I could use a shared_ptr or a unique_ptr to be explicit about the ownership of the created object. For example:

unique_ptr<X> make_X();
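A possible way to fill in that declaration and its callers, as a sketch only: the member value and the body of other_user() are assumptions made for illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical payload type standing in for X.
struct X {
    int value = 0;
};

// The return type states that the caller becomes the sole owner.
std::unique_ptr<X> make_X()
{
    auto p = std::make_unique<X>();
    p->value = 42;      // ... fill X ...
    return p;           // ownership moves out to the caller
}

// Taking a unique_ptr by value states that this function takes over ownership.
int other_user(std::unique_ptr<X> p)
{
    return p->value;    // *p is deleted when p goes out of scope here
}

int user()
{
    std::unique_ptr<X> q = make_X();
    // auto copy = q;                  // would not compile: unique_ptr cannot be copied
    int v = other_user(std::move(q));  // explicit hand-over of ownership
    assert(q == nullptr);              // q no longer owns anything
    return v;
}
```

The question "who deletes?" is now answered by the types: at every point exactly one unique_ptr owns the X, and handing it on requires a visible std::move.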


But why use a pointer (smart or not) at all? Often, I don’t want a pointer and often a pointer would distract from the conventional use of an object. For example, a Matrix addition function creates a new object (the sum) from two arguments, but returning a pointer would lead to seriously odd code:

unique_ptr<Matrix> operator+(const Matrix& a, const Matrix& b);
Matrix res = *(a+b);

That * is needed to get the sum, rather than a pointer to it. What I really want in many cases is an object, rather than a pointer to an object. Most often, I can easily get that. In particular, small objects are cheap to copy and I wouldn’t dream of using a pointer:
double sqrt(double); // a square root function
double s2 = sqrt(2); // get the square root of 2

On the other hand, objects holding lots of data are typically handles to most of that data. Consider istream, string, vector, list, and thread. They are all just a few words of data ensuring proper access to potentially large amounts of data. Consider again the Matrix addition. What we want is
Matrix operator+(const Matrix& a, const Matrix& b); // return the sum of a and b
Matrix r = x+y;

We can easily get that.
Matrix operator+(const Matrix& a, const Matrix& b)
{
  Matrix res;
  // ... fill res with element sums ...
  return res;
}

By default, this copies the elements of res into r, but since res is just about to be destroyed and the memory holding its elements is to be freed, there is no need to copy: we can “steal” the elements. Anybody could have done that since the first days of C++, and many did, but it was tricky to implement and the technique was not widely understood. C++11 directly supports “stealing the representation” from a handle in the form of move operations that transfer ownership. Consider a simple 2-D Matrix of doubles:
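class Matrix {
  double* elem; // pointer to elements
  int nrow;     // number of rows
  int ncol;     // number of columns
public:
  Matrix(int nr, int nc)                  // constructor: allocate elements
    :elem{new double[nr*nc]}, nrow{nr}, ncol{nc}
  {
    for(int i=0; i<nr*nc; ++i) elem[i]=0; // initialize elements
  }
  Matrix(const Matrix&);                  // copy constructor
  Matrix& operator=(const Matrix&);       // copy assignment
  Matrix(Matrix&&);                       // move constructor
  Matrix& operator=(Matrix&&);            // move assignment
  ~Matrix() { delete[] elem; }            // destructor: free the elements
  // ...
};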


A copy operation is recognized by its reference (&) argument. Similarly, a move operation is recognized by its rvalue reference (&&) argument. A move operation is supposed to “steal” the representation and leave an “empty object” behind. For Matrix, that means something like this:

Matrix::Matrix(Matrix&& a)                   // move constructor
  :elem{a.elem}, nrow{a.nrow}, ncol{a.ncol}  // "steal" the representation
{
  a.elem = nullptr;                          // leave "nothing" behind
}
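Put together, a compilable version of the idea might look like the sketch below. The accessors data(), rows() and operator(), the helper make_identity(), and the choice to delete the copy operations are assumptions made to keep the example short and checkable; resetting nrow and ncol in the moved-from object goes beyond the constructor shown above.

```cpp
#include <cassert>
#include <utility>

class Matrix {
    double* elem;   // pointer to elements
    int nrow;       // number of rows
    int ncol;       // number of columns
public:
    Matrix(int nr, int nc)
        : elem{new double[nr*nc]}, nrow{nr}, ncol{nc}
    {
        for (int i = 0; i < nr*nc; ++i) elem[i] = 0;
    }

    Matrix(const Matrix&) = delete;             // copying left out of this sketch
    Matrix& operator=(const Matrix&) = delete;

    Matrix(Matrix&& a)                          // move constructor
        : elem{a.elem}, nrow{a.nrow}, ncol{a.ncol}
    {
        a.elem = nullptr;                       // leave an empty object behind
        a.nrow = 0;
        a.ncol = 0;
    }

    ~Matrix() { delete[] elem; }                // delete[] on nullptr is a no-op

    // Hypothetical accessors, added only so the effect of a move can be inspected.
    const double* data() const { return elem; }
    int rows() const { return nrow; }
    double& operator()(int r, int c) { return elem[r*ncol + c]; }
};

// Returns a local Matrix by value: moved (or elided), never copied,
// since the copy constructor is deleted.
Matrix make_identity(int n)
{
    Matrix res {n, n};
    for (int i = 0; i < n; ++i) res(i, i) = 1;
    return res;
}
```

Moving a Matrix hands over the element pointer unchanged, so the cost does not depend on how many elements there are.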


That’s it! When the compiler sees the return res; it realizes that res is soon to be destroyed. That is, res will not be used after the return. Therefore it applies the move constructor, rather than the copy constructor, to transfer the return value. In particular, for

Matrix r = a+b;


the res inside operator+() becomes empty -- giving the destructor a trivial task -- and res’s elements are now owned by r. We have managed to get the elements of the result -- potentially megabytes of memory -- out of the function (operator+()) and into the caller’s variable. We have done that at a minimal cost (probably four word assignments).


Expert C++ users have pointed out that there are cases where a good compiler can eliminate the copy on return completely (in this case saving the four word moves and the destructor call). However, that is implementation dependent, and I don’t like the performance of my basic programming techniques to depend on the degree of cleverness of individual compilers. Furthermore, a compiler that can eliminate the copy can as easily eliminate the move. What we have here is a simple, reliable, and general way of eliminating complexity and cost of moving a lot of information from one scope to another.


Often, we don’t even need to define all those copy and move operations. If a class is composed out of members that behave as desired, we can simply rely on the operations generated by default. Consider:

class Matrix {
    vector<double> elem; // elements
    int nrow;            // number of rows
    int ncol;            // number of columns
public:
    Matrix(int nr, int nc)    // constructor: allocate elements
      :elem(nr*nc), nrow{nr}, ncol{nc}
    { }

    // ...
};


This version of Matrix behaves like the version above except that it copes slightly better with errors and has a slightly larger representation (a vector is usually three words).
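As a rough check that the generated operations do what is wanted, consider the following sketch. The accessors data() and operator() are hypothetical additions for inspection. That the element buffer keeps its address across a move is what standard-library implementations do for vector with the default allocator.

```cpp
#include <cassert>
#include <utility>
#include <vector>

class Matrix {
    std::vector<double> elem;   // elements
    int nrow;                   // number of rows
    int ncol;                   // number of columns
public:
    Matrix(int nr, int nc) : elem(nr*nc), nrow{nr}, ncol{nc} { }

    // No destructor, copy, or move operations are declared:
    // the compiler generates them member by member.

    // Hypothetical accessors, added only so the sketch can be checked.
    const double* data() const { return elem.data(); }
    double& operator()(int r, int c) { return elem[r*ncol + c]; }
};
```

Copying such a Matrix yields an independent set of elements, while moving it hands the vector's buffer to the new object, all without a single hand-written special member function.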


What about objects that are not handles? If they are small, like an int or a complex, don’t worry. Otherwise, make them handles or return them using “smart” pointers, such as unique_ptr and shared_ptr. Don’t mess with “naked” new and delete operations.


Unfortunately, a Matrix like the one I used in the example is not part of the ISO C++ standard library, but several are available (open source and commercial). For example, search the Web for “Origin Matrix Sutton” and see Chapter 29 of my The C++ Programming Language (Fourth Edition) [11] for a discussion of the design of such a matrix.


4.2 Shared Ownership: shared_ptr


In discussions about garbage collection it is often observed that not every object has a unique owner. That means that we have to be able to ensure that an object is destroyed/freed when the last reference to it disappears. In the model here, we have to have a mechanism to ensure that an object is destroyed when its last owner is destroyed. That is, we need a form of shared ownership. Say, we have a synchronized queue, a sync_queue, used to communicate between tasks. A producer and a consumer are each given a pointer to the sync_queue:

void startup()
{
  sync_queue* p  = new sync_queue{200};  // trouble ahead!
  thread t1 {task1,iqueue,p};  // task1 reads from *iqueue and writes to *p
  thread t2 {task2,p,oqueue};  // task2 reads from *p and writes to *oqueue
  t1.detach();
  t2.detach();
}


I assume that task1, task2, iqueue, and oqueue have been suitably defined elsewhere and apologize for letting the threads outlive the scope in which they were created (using detach()). Also, you may imagine pipelines with many more tasks and sync_queues. However, here I am only interested in one question: “Who deletes the sync_queue created in startup()?” As written, there is only one good answer: “Whoever is the last to use the sync_queue.” This is a classic motivating case for garbage collection. The original form of garbage collection was counted pointers: maintain a use count for the object and when the count is about to go to zero, delete the object. Many languages today rely on a variant of this idea and C++11 supports it in the form of shared_ptr. The example becomes:

void startup()
{
  auto p = make_shared<sync_queue>(200);  // make a sync_queue and return a shared_ptr to it
  thread t1 {task1,iqueue,p};  // task1 reads from *iqueue and writes to *p
  thread t2 {task2,p,oqueue};  // task2 reads from *p and writes to *oqueue
  t1.detach();
  t2.detach();
}


Now the destructors for task1 and task2 can destroy their shared_ptrs (and will do so implicitly in most good designs) and the last task to do so will destroy the sync_queue.
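The following sketch makes the "last owner destroys" behaviour checkable. Queue, task() and run_pipeline() are hypothetical stand-ins; the threads are joined rather than detached purely so that the result can be inspected afterwards.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>

// Hypothetical stand-in for sync_queue: it only tracks its own lifetime and use.
struct Queue {
    static std::atomic<int> live;   // number of Queue objects currently alive
    std::atomic<int> uses{0};       // how many tasks touched this queue
    explicit Queue(int /*capacity*/) { ++live; }
    ~Queue() { --live; }
};
std::atomic<int> Queue::live{0};

// Each task holds its own shared_ptr; the share is dropped when the task ends.
void task(std::shared_ptr<Queue> q)
{
    ++q->uses;   // ... use *q ...
}

// Returns the number of live queues once both tasks have finished.
int run_pipeline()
{
    auto p = std::make_shared<Queue>(200);
    std::thread t1 {task, p};   // each thread receives a copy of the shared_ptr
    std::thread t2 {task, p};
    p.reset();                  // the creator gives up its share; the tasks keep the queue alive
    t1.join();                  // joined (not detached) so the outcome can be checked
    t2.join();
    return Queue::live;         // 0 once the last owner has destroyed the queue
}
```

No part of this code says which task deletes the queue; whichever owner lets go last does so.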


This is simple and reasonably efficient. It does not imply a complicated run-time system with a garbage collector. Importantly, it does not just reclaim the memory associated with the sync_queue. It reclaims the synchronization object (mutex, lock, or whatever) embedded in the sync_queue to manage the synchronization of the two threads running the two tasks. What we have here is again not just memory management, it is general resource management. That “hidden” synchronization object is handled exactly as the file handles and stream buffers were handled in the earlier example.


We could try to eliminate the use of shared_ptr by introducing a unique owner in some scope that encloses the tasks, but doing so is not always simple, so C++11 provides both unique_ptr (for unique ownership) and shared_ptr (for shared ownership).


4.3 Type safety


Here, I have only addressed garbage collection in connection with resource management. It also has a role to play in type safety. As long as we have an explicit delete operation, it can be misused. For example:
X* p = new X;
X* q = p;
delete p;
// ...
// the memory that held *p may have been re-used
q->do_something(); 


Don’t do that. Naked deletes are dangerous -- and unnecessary in general/user code. Leave deletes inside resource management classes, such as string, ostream, thread, unique_ptr, and shared_ptr. There, deletes are carefully matched with news and harmless.
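For comparison, here is one way the fragment above could be written without a naked delete, assuming shared ownership is what the two pointers were meant to express. The body of do_something() and the function use_after_release() are made up for the illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical X with an arbitrary member function.
struct X {
    int do_something() const { return 7; }
};

int use_after_release()
{
    auto p = std::make_shared<X>();
    std::shared_ptr<X> q = p;   // q shares ownership instead of merely aliasing *p
    p.reset();                  // plays the role of "delete p", but the object survives
    assert(q.use_count() == 1);
    // ...
    return q->do_something();   // safe: q is now the sole owner and *q is still alive
}
```

The object is destroyed only when q goes out of scope, so the memory cannot be re-used while a pointer to it is still in use.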


4.4 Summary: Resource Management Ideals


For resource management, I consider garbage collection a last choice, rather than “the solution” or an ideal:


Use appropriate abstractions that recursively and implicitly handle their own resources. Prefer such objects to be scoped variables.


When you need pointer/reference semantics, use “smart pointers” such as unique_ptr and shared_ptr to represent ownership.


If everything else fails (e.g., because your code is part of a program using a mess of pointers without a language supported strategy for resource management and error handling), try to handle non-memory resources “by hand” and plug in a conservative garbage collector to handle the almost inevitable memory leaks.


Is this strategy perfect? No, but it is general and simple. Traditional garbage-collection based strategies are not perfect either, and they don’t directly address non-memory resources.
