Smart Pointers - What, Why, Which?

Source: http://ootips.org/yonat/4dev/smart-pointers.html

Smart pointers are objects that look and feel like pointers, but are smarter. What does this mean?

To look and feel like pointers, smart pointers need to have the same interface that pointers do: they need to support pointer operations like dereferencing (operator *) and indirection (operator ->). An object that looks and feels like something else is called a proxy object, or just proxy. The proxy pattern and its many uses are described in the books Design Patterns and Pattern Oriented Software Architecture.

To be smarter than regular pointers, smart pointers need to do things that regular pointers don't. What could these things be? Probably the most common bugs in C++ (and C) are related to pointers and memory management: dangling pointers, memory leaks, allocation failures and other joys. Having a smart pointer take care of these things can save a lot of aspirin...

The simplest example of a smart pointer is auto_ptr, which is included in the standard C++ library. You can find it in the header <memory>, or take a look at Scott Meyers' auto_ptr implementation. Here is part of auto_ptr's implementation, to illustrate what it does:

template <class T> class auto_ptr
{
    T* ptr;
public:
    explicit auto_ptr(T* p = 0) : ptr(p) {}
    ~auto_ptr()                 {delete ptr;}
    T& operator*()              {return *ptr;}
    T* operator->()             {return ptr;}
    // ...
};
As you can see, auto_ptr is a simple wrapper around a regular pointer. It forwards all meaningful operations to this pointer (dereferencing and indirection). Its smartness is in the destructor: the destructor takes care of deleting the pointed-to object.

For the user of auto_ptr, this means that instead of writing:

void foo()
{
    MyClass* p(new MyClass);
    p->DoSomething();
    delete p;
}
You can write:
void foo()
{
    auto_ptr<MyClass> p(new MyClass);
    p->DoSomething();
}
And trust p to clean up after itself.
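
The cleanup can be observed directly. The following is a minimal sketch, not part of the original article: it assumes a C++11 or later compiler and uses std::unique_ptr in place of auto_ptr (std::auto_ptr was deprecated in C++11 and removed in C++17). The Tracked class and the live_after_scope function are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical class that counts its live instances so cleanup can be observed.
struct Tracked {
    static int live;
    Tracked()  { ++live; }
    ~Tracked() { --live; }
    void DoSomething() {}
};
int Tracked::live = 0;

// Returns the number of live Tracked objects after the inner scope has ended.
int live_after_scope()
{
    {
        std::unique_ptr<Tracked> p(new Tracked);
        p->DoSomething();
        assert(Tracked::live == 1);   // the object is alive inside the scope
    }                                 // p's destructor deletes the object here
    return Tracked::live;
}
```

Calling live_after_scope() should return 0: no explicit delete appears anywhere, yet the object is gone once p leaves scope.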

What does this buy you? See the next section.

Why would I use them?

Obviously, different smart pointers offer different reasons for use. Here are some common reasons for using smart pointers in C++.

Why: Fewer bugs

Automatic cleanup. As the code above illustrates, using smart pointers that clean up after themselves can save a few lines of code. The importance here is not so much in the keystrokes saved, but in reducing the probability of bugs: you don't need to remember to free the pointer, and so there is no chance you will forget about it.

Automatic initialization. Another nice thing is that you don't need to initialize the auto_ptr to NULL, since the default constructor does that for you. This is one less thing for the programmer to forget.

Dangling pointers. A common pitfall of regular pointers is the dangling pointer: a pointer that points to an object that is already deleted. The following code illustrates this situation:

MyClass* p(new MyClass);
MyClass* q = p;
delete p;
p->DoSomething();   // Watch out! p is now dangling!
p = NULL;           // p is no longer dangling
q->DoSomething();   // Ouch! q is still dangling!
For auto_ptr, this is solved by setting its pointer to NULL when it is copied:
template <class T>
auto_ptr<T>& auto_ptr<T>::operator=(auto_ptr<T>& rhs)
{
    if (this != &rhs) {
        delete ptr;
        ptr = rhs.ptr;
        rhs.ptr = NULL;
    }
    return *this;
}
Other smart pointers may do other things when they are copied. Here are some possible strategies for handling the statement q = p, where p and q are smart pointers:
  • Create a new copy of the object pointed to by p, and have q point to this copy. This strategy is implemented in copied_ptr.h.
  • Ownership transfer: Let both p and q point to the same object, but transfer the responsibility for cleaning up ("ownership") from p to q. This strategy is implemented in owned_ptr.h.
  • Reference counting: Maintain a count of the smart pointers that point to the same object, and delete the object when this count becomes zero. So the statement q = p causes the count of the object pointed to by p to increase by one. This strategy is implemented in counted_ptr.h. Scott Meyers offers another reference counting implementation in his book More Effective C++.
  • Reference linking: The same as reference counting, only instead of a count, maintain a circular doubly linked list of all smart pointers that point to the same object. This strategy is implemented in linked_ptr.h.
  • Copy on write: Use reference counting or linking as long as the pointed object is not modified. When it is about to be modified, copy it and modify the copy. This strategy is implemented in cow_ptr.h.
All these techniques help in the battle against dangling pointers. Each has its own benefits and liabilities. The Which section of this article discusses the suitability of different smart pointers for various situations.
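
Two of these strategies have close counterparts in the C++11 standard library, which makes them easy to try out. The sketch below is an illustration under that assumption, not the code of owned_ptr.h or counted_ptr.h: std::unique_ptr stands in for an ownership-transferring pointer and std::shared_ptr for a reference-counted one. The function names are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Ownership transfer: after the move, p is empty and q owns the object.
bool transfer_empties_source()
{
    std::unique_ptr<int> p(new int(42));
    std::unique_ptr<int> q;
    q = std::move(p);             // the transfer must be requested explicitly
    return p.get() == nullptr && *q == 42;
}

// Reference counting: q = p shares the object and increases the count by one.
long count_after_copy()
{
    std::shared_ptr<int> p(new int(42));
    std::shared_ptr<int> q;
    q = p;                        // both now point to the same int
    assert(p.get() == q.get());
    return p.use_count();         // counts p and q
}
```

In both cases no pointer is left dangling: the moved-from pointer is null, and the shared object lives until the last counted pointer goes away.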

Why: Exception Safety

Let's take another look at this simple example:
void foo()
{
    MyClass* p(new MyClass);
    p->DoSomething();
    delete p;
}
What happens if DoSomething() throws an exception? All the lines after it will not get executed and p will never get deleted! If we're lucky, this leads only to memory leaks. However, MyClass may free some other resources in its destructor (file handles, threads, transactions, COM references, mutexes) and so not calling it may cause severe resource locks.

If we use a smart pointer, however, p will be cleaned up whenever it gets out of scope, whether it was during the normal path of execution or during the stack unwinding caused by throwing an exception.
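
A small experiment makes this concrete. The sketch below assumes C++11 and uses std::unique_ptr as the smart pointer; Resource and live_after_throw are hypothetical names, and DoSomething() is made to throw unconditionally so that the unwinding path is the one being exercised.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical class that counts live instances and always fails in DoSomething().
struct Resource {
    static int live;
    Resource()  { ++live; }
    ~Resource() { --live; }
    void DoSomething() { throw std::runtime_error("DoSomething failed"); }
};
int Resource::live = 0;

void foo()
{
    std::unique_ptr<Resource> p(new Resource);
    p->DoSomething();   // throws; p is still destroyed during stack unwinding
}

// Returns the number of live Resource objects after foo() has thrown.
int live_after_throw()
{
    try {
        foo();
    } catch (const std::runtime_error&) {
        // the exception has left foo(), so its local variables are gone
    }
    return Resource::live;
}
```

live_after_throw() should return 0, even though foo() contains neither a try block nor a delete.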

But isn't it possible to write exception safe code with regular pointers? Sure, but it is so painful that I doubt anyone actually does this when there is an alternative. Here is what you would do in this simple case:

void foo()
{
    MyClass* p = NULL;  // initialized, so the handler never deletes a garbage pointer if new throws
    try {
        p = new MyClass;
        p->DoSomething();
        delete p;
    }
    catch (...) {
        delete p;
        throw;
    }
}
Now imagine what would happen if we had some if's and for's in there...

Why: Garbage collection

Since C++ does not provide automatic garbage collection like some other languages, smart pointers can be used for that purpose. The simplest garbage collection scheme is reference counting or reference linking, but it is quite possible to implement more sophisticated garbage collection schemes with smart pointers. For more information see the garbage collection FAQ.

Why: Efficiency

Smart pointers can be used to make more efficient use of available memory and to shorten allocation and deallocation time.

A common strategy for using memory more efficiently is copy on write (COW). This means that the same object is shared by many COW pointers as long as it is only read and not modified. When some part of the program tries to modify the object ("write"), the COW pointer creates a new copy of the object and modifies this copy instead of the original object. The standard string class (see the <string> header) was commonly implemented using COW semantics in pre-C++11 libraries; the C++11 requirements on std::string effectively rule this implementation out. With a COW string, the following happens:

string s("Hello");

string t = s;       // t and s point to the same buffer of characters

t += " there!";     // a new buffer is allocated for t before
                    // appending " there!", so s is unchanged.
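
The same behaviour can be sketched for arbitrary types. The class below is a hypothetical, single-threaded illustration built on std::shared_ptr's reference count (C++11 assumed); it is not the interface of cow_ptr.h, and the names cow_sketch, read, write and shares_with are invented for this example.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical minimal COW pointer; assumes use from a single thread.
template <class T>
class cow_sketch {
    std::shared_ptr<T> ptr;
public:
    explicit cow_sketch(T* p) : ptr(p) {}
    const T& read() const { return *ptr; }   // reading never copies
    T& write()                                // writing copies only if shared
    {
        if (ptr.use_count() > 1)
            ptr = std::make_shared<T>(*ptr);  // detach from the other owners
        return *ptr;
    }
    bool shares_with(const cow_sketch& other) const { return ptr == other.ptr; }
};

bool cow_demo()
{
    cow_sketch<std::string> s(new std::string("Hello"));
    cow_sketch<std::string> t = s;            // t and s share one string
    bool shared_before = t.shares_with(s);
    t.write() += " there!";                   // t gets its own copy first
    return shared_before && !t.shares_with(s)
        && s.read() == "Hello" && t.read() == "Hello there!";
}
```

The design choice is to split access into read() and write(): a single non-const operator* could not tell whether the caller intends to modify the object, and would have to copy on every access to a shared object.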

Optimized allocation schemes are possible when you can make some assumptions about the objects to be allocated or the operating environment. For example, you may know that all the objects will have the same size, or that they will all live in a single thread. Although it is possible to implement optimized allocation schemes using class-specific new and delete operators, smart pointers give you the freedom to choose whether to use the optimized scheme for each object, instead of having the scheme set for all objects of a class. It is therefore possible to match the allocation scheme to different operating environments and applications, without modifying the code for the entire class.

Why: STL containers

The C++ standard library includes a set of containers and algorithms known as the standard template library (STL). STL is designed to be generic (can be used with any kind of object) and efficient (does not incur time overhead compared to alternatives). To achieve these two design goals, STL containers store their objects by value. This means that if you have an STL container that stores objects of class Base, it cannot store objects of classes derived from Base: a derived object pushed into it is sliced, and only its Base part is copied.
class Base { /*...*/ };
class Derived : public Base { /*...*/ };

Base b;
Derived d;
vector<Base> v;

v.push_back(b); // OK
v.push_back(d); // compiles, but d is sliced: only its Base part is stored
What can you do if you need a collection of objects from different classes? The simplest solution is to have a collection of pointers:
vector<Base*> v;

v.push_back(new Base);      // OK
v.push_back(new Derived);   // OK too

// cleanup:
for (vector<Base*>::iterator i = v.begin(); i != v.end(); ++i)
    delete *i;
The problem with this solution is that after you're done with the container, you need to manually clean up the objects stored in it. This is both error prone and not exception safe.

Smart pointers are a possible solution, as illustrated below. (An alternative solution is a smart container, like the one implemented in pointainer.h.)

vector< linked_ptr<Base> > v;
v.push_back(linked_ptr<Base>(new Base));      // OK
v.push_back(linked_ptr<Base>(new Derived));   // OK too

// cleanup is automatic
Since the smart pointer automatically cleans up after itself, there is no need to manually delete the pointed objects.
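
For readers without linked_ptr.h at hand, the same idea can be tried with the standard library. This sketch assumes C++11 and substitutes std::shared_ptr (a reference-counted pointer) for linked_ptr; the id() member and the live counter are hypothetical additions that make the behaviour checkable.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Base {
    static int live;                       // number of live objects
    Base()          { ++live; }
    virtual ~Base() { --live; }            // virtual: deleted through Base
    virtual int id() const { return 1; }
};
int Base::live = 0;

struct Derived : Base {
    int id() const override { return 2; }
};

// Returns the number of live objects after the container has been destroyed.
int live_after_container()
{
    {
        std::vector<std::shared_ptr<Base> > v;
        v.push_back(std::make_shared<Base>());
        v.push_back(std::make_shared<Derived>());
        assert(v[0]->id() == 1);
        assert(v[1]->id() == 2);           // no slicing: virtual dispatch works
        assert(Base::live == 2);
    }   // the vector is destroyed and each element releases its object
    return Base::live;
}
```

live_after_container() should return 0 without any cleanup loop.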

Note: STL containers may copy and delete their elements behind the scenes (for example, when they resize themselves). Therefore, all copies of an element must be equivalent, or the wrong copy may be the one to survive all this copying and deleting. This means that some smart pointers cannot be used within STL containers, specifically the standard auto_ptr and any ownership-transferring pointer. For more info about this issue, see C++ Guru of the Week #25.

Which one should I use?

Are you confused enough? Well, this summary should help.

Which: Local variables

The standard auto_ptr is the simplest smart pointer, and it is also, well, standard. If there are no special requirements, you should use it. For local variables, it is usually the right choice.

Which: Class members

Although you can use auto_ptr as a class member (and save yourself the trouble of freeing objects in the destructor), copying one object to another will nullify the pointer, as illustrated below.
class MyClass
{
    auto_ptr<int> p;
    // ...
};

MyClass x;
// do some meaningful things with x
MyClass y = x; // x.p now has a NULL pointer
Using a copied pointer instead of auto_ptr solves this problem: the copied object (y) gets a new copy of the member.
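
A copied pointer can be sketched in a few lines. The class below is a hypothetical illustration (C++11 assumed), not the interface of copied_ptr.h; it copies with new T(*ptr), so it assumes T is not a polymorphic base class, since copying through a base pointer this way would slice.

```cpp
#include <cassert>

// Hypothetical deep-copying pointer: every copy owns its own object.
template <class T>
class copied_sketch {
    T* ptr;
public:
    explicit copied_sketch(T* p = nullptr) : ptr(p) {}
    copied_sketch(const copied_sketch& rhs)
        : ptr(rhs.ptr ? new T(*rhs.ptr) : nullptr) {}
    copied_sketch& operator=(const copied_sketch& rhs)
    {
        if (this != &rhs) {
            T* fresh = rhs.ptr ? new T(*rhs.ptr) : nullptr;  // copy before deleting
            delete ptr;
            ptr = fresh;
        }
        return *this;
    }
    ~copied_sketch() { delete ptr; }
    T& operator*() const { return *ptr; }
    T* get() const { return ptr; }
};

// Hypothetical class with a copied pointer as a member.
struct Holder {
    copied_sketch<int> p;
    Holder() : p(new int(7)) {}
};

bool copy_demo()
{
    Holder x;
    Holder y = x;   // y.p points to its own copy of the int
    *y.p = 8;       // does not affect x
    return x.p.get() != nullptr && y.p.get() != x.p.get()
        && *x.p == 7 && *y.p == 8;
}
```

Holder needs no hand-written copy constructor, assignment operator or destructor: the compiler-generated ones do the right thing because the member does.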

Note that using a reference counted or reference linked pointer means that if y changes the member, this change will also affect x! Therefore, if you want to save memory, you should use a COW pointer and not a simple reference counted/linked pointer.

Which: STL containers

As explained above, using garbage-collected pointers with STL containers lets you store objects from different classes in the same container.

It is important to consider the characteristics of the specific garbage collection scheme used. Specifically, reference counting/linking can leak in the case of circular references (i.e., when the pointed object itself contains a counted pointer, which points to an object that contains the original counted pointer). The advantage of reference counting/linking over other schemes is that it is both simple to implement and deterministic. The deterministic behavior may be important in some real time systems, where you cannot allow the system to suddenly wait while the garbage collector performs its housekeeping duties.
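
The circular reference problem is easy to reproduce with std::shared_ptr, the reference-counted pointer of the C++11 standard library (an assumption of this sketch; the article's own counted_ptr.h is not used here). Node and both function names are hypothetical. The second function shows one common way out: making the back link a non-owning std::weak_ptr.

```cpp
#include <cassert>
#include <memory>

struct Node {
    static int live;
    std::shared_ptr<Node> strong;   // counted (owning) link
    std::weak_ptr<Node>   weak;     // non-owning link
    Node()  { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Two nodes that own each other: returns how many survive their last outside owner.
int leaked_by_strong_cycle()
{
    const int before = Node::live;
    std::weak_ptr<Node> observer;   // lets us reach the cycle afterwards
    {
        std::shared_ptr<Node> a(new Node), b(new Node);
        a->strong = b;
        b->strong = a;              // cycle: neither count can reach zero
        observer = a;
    }
    const int leaked = Node::live - before;
    if (std::shared_ptr<Node> a = observer.lock())
        a->strong.reset();          // break the cycle by hand so the sketch does not leak
    return leaked;
}

// Same shape, but the back link is weak, so nothing survives.
int leaked_with_weak_back_link()
{
    const int before = Node::live;
    {
        std::shared_ptr<Node> a(new Node), b(new Node);
        a->strong = b;
        b->weak = a;                // does not contribute to a's count
    }
    return Node::live - before;
}
```

The first function should report two surviving nodes, the second none.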

Generally speaking, there are two ways to implement reference counting: intrusive and non-intrusive. Intrusive means that the pointed object itself contains the count. Therefore, you cannot use intrusive reference counting with third-party classes that do not already have this feature. You can, however, derive a new class from the third-party class and add the count to it. Non-intrusive reference counting requires an allocation of a count for each counted object. counted_ptr.h is an example of non-intrusive reference counting.
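
The intrusive variant, including the derivation trick for third-party classes, might look like the following. This is a single-threaded sketch under C++11; RefCounted, intrusive_sketch, ThirdParty and CountedThirdParty are all hypothetical names, and the count is a plain int with no locking.

```cpp
#include <cassert>

// Intrusive counting: the pointed-to object carries its own count.
class RefCounted {
    int count_;
public:
    RefCounted() : count_(0) {}
    RefCounted(const RefCounted&) : count_(0) {}           // a copy starts uncounted
    RefCounted& operator=(const RefCounted&) { return *this; }
    virtual ~RefCounted() {}
    void add_ref()          { ++count_; }
    bool release()          { return --count_ == 0; }      // true when the last owner is gone
    int  use_count() const  { return count_; }
};

template <class T>   // T must derive from RefCounted
class intrusive_sketch {
    T* ptr;
public:
    explicit intrusive_sketch(T* p = nullptr) : ptr(p) { if (ptr) ptr->add_ref(); }
    intrusive_sketch(const intrusive_sketch& rhs) : ptr(rhs.ptr) { if (ptr) ptr->add_ref(); }
    intrusive_sketch& operator=(const intrusive_sketch& rhs)
    {
        if (rhs.ptr) rhs.ptr->add_ref();        // add first: safe for self-assignment
        if (ptr && ptr->release()) delete ptr;
        ptr = rhs.ptr;
        return *this;
    }
    ~intrusive_sketch() { if (ptr && ptr->release()) delete ptr; }
    T* operator->() const { return ptr; }
};

// A third-party class without a count, and a derived class that adds one.
struct ThirdParty { int value = 5; };
struct CountedThirdParty : ThirdParty, RefCounted {};

int intrusive_demo()
{
    intrusive_sketch<CountedThirdParty> p(new CountedThirdParty);
    intrusive_sketch<CountedThirdParty> q = p;
    return p->use_count();   // read from the object itself
}
```

No separate allocation is made for the count, which is the main attraction of the intrusive approach; the price is that every counted class has to cooperate.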

Reference linking does not require any changes to be made to the pointed objects, nor does it require any additional allocations. A reference linked pointer takes a little more space than a reference counted pointer - just enough to store one or two more pointers.
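
To make the ring structure concrete, here is a hypothetical single-threaded sketch of a reference-linked pointer (C++11 assumed). It is not the code of linked_ptr.h; linked_sketch, Probe and linked_demo are names invented for this illustration. Each pointer stores the object pointer plus two links, and the object is deleted by whichever pointer finds itself alone in the ring.

```cpp
#include <cassert>

template <class T>
class linked_sketch {
    T* ptr;
    mutable const linked_sketch* prev;   // mutable: copying from a const source relinks it
    mutable const linked_sketch* next;

    void join(const linked_sketch& rhs)  // insert *this after rhs in rhs's ring
    {
        ptr = rhs.ptr;
        next = rhs.next;
        next->prev = this;
        prev = &rhs;
        rhs.next = this;
    }
    void leave()                         // unlink; delete the object if we were alone
    {
        if (next == this) {
            delete ptr;
        } else {
            prev->next = next;
            next->prev = prev;
        }
        prev = next = this;
        ptr = nullptr;
    }
public:
    explicit linked_sketch(T* p = nullptr) : ptr(p), prev(this), next(this) {}
    linked_sketch(const linked_sketch& rhs) { join(rhs); }
    linked_sketch& operator=(const linked_sketch& rhs)
    {
        if (this != &rhs) { leave(); join(rhs); }
        return *this;
    }
    ~linked_sketch() { leave(); }
    T& operator*() const { return *ptr; }
    bool unique() const { return next == this; }
};

// Hypothetical class that counts live instances.
struct Probe {
    static int live;
    Probe()  { ++live; }
    ~Probe() { --live; }
};
int Probe::live = 0;

bool linked_demo()
{
    {
        linked_sketch<Probe> p(new Probe);
        {
            linked_sketch<Probe> q = p;   // the ring is now p <-> q
            assert(!p.unique());
        }                                 // q leaves the ring; the object survives
        assert(p.unique() && Probe::live == 1);
    }                                     // the last member leaves: object deleted
    return Probe::live == 0;
}
```

Note that nothing is allocated besides the object itself, which matches the trade-off described above: two extra pointers per smart pointer instead of one shared count on the heap.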

Both reference counting and reference linking require using locks if the pointers are used by more than one thread of execution.

Which: Explicit ownership transfer

Sometimes, you want to receive a pointer as a function argument, and take over the ownership of this pointer (i.e. the control over its lifetime) yourself. One way to do this is to use consistent naming conventions for such cases. Taligent's Guide to Designing Programs recommends using "adopt" to mark that a function adopts ownership of a pointer.

Using an owned pointer as the function argument is an explicit statement that the function is taking ownership of the pointer.
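
With C++11, the same statement can be made with std::unique_ptr, which is used here as a stand-in for an owned pointer (an assumption of this sketch; owned_ptr.h is not shown). Widget, Registry and adopt_demo are hypothetical names; the member function is called adopt to follow the naming convention mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Widget {
    int id;
    explicit Widget(int i) : id(i) {}
};

// Hypothetical owner of widgets.
class Registry {
    std::vector<std::unique_ptr<Widget> > items;
public:
    // Taking the smart pointer by value states that this function takes ownership.
    void adopt(std::unique_ptr<Widget> w) { items.push_back(std::move(w)); }
    std::size_t size() const { return items.size(); }
};

bool adopt_demo()
{
    Registry r;
    std::unique_ptr<Widget> w(new Widget(1));
    r.adopt(std::move(w));   // the caller has to hand the object over explicitly
    return w == nullptr && r.size() == 1;
}
```

Unlike a naming convention, this is checked by the compiler: passing w without std::move does not compile, so ownership cannot change hands by accident.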

Which: Big objects

If you have objects that take a lot of space, you can save some of this space by using COW pointers. This way, an object will be copied only when necessary, and shared otherwise. The sharing is implemented using some garbage collection scheme, like reference counting or linking.

Which: Summary

For this:                      Use that:
Local variables                auto_ptr
Class members                  Copied pointer
STL containers                 Garbage collected pointer (e.g. reference counting/linking)
Explicit ownership transfer    Owned pointer
Big objects                    Copy on write

Conclusion

Smart pointers are useful tools for writing safe and efficient code in C++. Like any tool, they should be used with appropriate care, thought and knowledge. For a comprehensive and in-depth analysis of the issues concerning smart pointers, I recommend reading Andrei Alexandrescu's chapter about smart pointers in his book Modern C++ Design.

Feel free to use my own smart pointers in your code.
The Boost C++ libraries include some smart pointers, which are more rigorously tested and actively maintained. Do try them first, if they are appropriate for your needs.



