In terms of time and space, a contiguous array of any kind is just about the optimal construct for accessing a sequence of objects in memory, and if you are serious about performance in any language you will “often” use arrays.
从时间和空间的角度来看,任何类型的连续数组都是访问内存中对象序列的最佳构造,如果真的考虑任何语言的性能,你都会“经常”使用数组。
However, there are good arrays (e.g., containers with contiguous storage such as std::array and std::vector) and there are bad arrays (e.g., C [] arrays). Simple C [] arrays are evil because a C array is a very low level data structure with a vast potential for misuse and errors, and in essentially all cases there are better alternatives, where “better” means easier to write, easier to read, less error prone, and as fast.
然而,有好的数组(如具有连续存储的容器 std::array 和 std::vector),也有坏的数组(如 C 的 [] 数组)。简单的 C [] 数组是很邪恶的,因为 C 数组是一种非常低级的数据结构,非常容易误用和出错,本质上所有情况下都有更好的替代方案,“更好”意味着更容易编写、更容易阅读、更不容易出错,并且速度同样快。
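The two fundamental problems with C arrays are that a C array doesn’t know its own size, and that the name of a C array converts to a pointer to its first element at the slightest provocation.
C 数组的两个基本问题是:C 数组不知道自己的大小;C 数组的名字稍有机会就会转换为指向其首元素的指针。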
Consider some examples:
void f(int a[], int s)
{
// do something with a; the size of a is s
for (int i = 0; i < s; ++i) a[i] = i;
}
int arr1[20];
int arr2[10];
void g()
{
f(arr1,20);
f(arr2,20);
}
The second call will scribble all over memory that doesn’t belong to arr2. Naturally, a programmer usually gets the size right, but it’s extra work and every so often someone makes a mistake. You should prefer the simpler and cleaner version using the standard library vector or array:
第二个调用会在不属于 arr2 的内存中到处乱写。自然地,程序员通常都能处理好大小,但这是额外的工作,而且不时会有人犯错。你应该更倾向使用标准库中的 vector 或 array 来实现更简单、更简洁的版本。
void f(vector<int>& v)
{
// do something with v
for (int i = 0; i < v.size(); ++i) v[i] = i;
}
vector<int> v1(20);
vector<int> v2(10);
void g()
{
f(v1);
f(v2);
}
template<size_t N> void f(array<int, N>& v)
{
// do something with v
for (int i = 0; i < N; ++i) v[i] = i;
}
array<int, 20> v1;
array<int, 10> v2;
void g()
{
f(v1);
f(v2);
}
Since a C array doesn’t know its size, there can be no array assignment:
由于C语言的数组不知道其大小,因此不能对数组进行赋值:
void f(int a[], int b[], int size)
{
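a = b; // not array assignment: this only copies the pointer
memcpy(a, b, size * sizeof(int)); // what a = b would have to do; memcpy counts bytes, not elements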
// ...
}
Again, prefer vector or array:
// This one can result in a changing size
void g(vector<int>& a, vector<int>& b)
{
a = b;
// ...
}
// In this one, a and b must be the same size
template<size_t N>
void g(array<int, N>& a, array<int, N>& b)
{
a = b;
// ...
}
Another advantage of vector here is that memcpy() is not going to do the right thing for elements with copy constructors, such as strings.
在这里,vector 的另一个优点是 memcpy() 不会对具有复制构造函数的元素(如字符串)进行正确的操作。
void f(string a[], string b[], int size)
{
a = b; // not array assignment
memcpy(a,b,size); // disaster
// ...
}
void g(vector<string>& a, vector<string>& b)
{
a = b;
// ...
}
A normal C array is of a fixed size determined at compile time (ignoring C99 VLAs, which currently have no analog in ISO C++):
普通的C数组在编译时具有固定的大小(忽略C99 VLAs,目前在ISO c++中没有类似的):
const int S = 10;
void f(int s)
{
int a1[s]; // error
int a2[S]; // ok
// if I want to extend a2, I'll have to change to an array
// allocated on free store using malloc() and use realloc()
// ...
}
To contrast:
对比:
const int S = 10;
void g(int s)
{
vector<int> v1(s); // ok
vector<int> v2(S); // ok
v2.resize(v2.size()*2);
// ...
}
C99 allows variable array bounds for local arrays, but those VLAs have their own problems. The way that array names “decay” into pointers is fundamental to their use in C and C++. However, array decay interacts very badly with inheritance. Consider:
C99允许局部数组有可变的数组边界,但这些VLA有自己的问题。数组名“退化”为指针的方式是数组在C和C++中使用的基础。然而,数组退化与继承的交互非常糟糕。考虑:
class Base { public: void fct(); /* ... */ };
class Derived : public Base { /* ... */ };
void f(Base* p, int sz)
{
for (int i=0; i<sz; ++i) p[i].fct();
}
Base ab[20];
Derived ad[20];
void g()
{
f(ab,20);
f(ad,20); // disaster!
}
In the last call, the Derived[] is treated as a Base[] and the subscripting no longer works correctly when sizeof(Derived)!=sizeof(Base), as will be the case in most cases of interest. If we used vectors instead, the error would be caught at compile time:
在最后一个调用中,Derived[] 被视为 Base[],当 sizeof(Derived)!=sizeof(Base) 时,下标将不再正确工作,而这正是我们感兴趣的大多数情况。如果我们使用 vector,错误将在编译时捕获:
void f(vector<Base>& v)
{
for (int i=0; i<v.size(); ++i) v[i].fct();
}
vector<Base> ab(20);
vector<Derived> ad(20);
void g()
{
f(ab);
f(ad); // error: cannot convert a vector<Derived> to a vector<Base>
}
We find that an astonishing number of novice programming errors in C and C++ relate to (mis)uses of C arrays. Use std::vector or std::array instead.
我们发现,C和C++中惊人数量的初学者编程错误与(错误)使用C数组有关。请使用 std::vector 或 std::array。
Let’s assume the best case scenario: you’re an experienced C programmer, which almost by definition means you’re pretty good at working with arrays. You know you can handle the complexity; you’ve done it for years. And you’re smart — the smartest on the team — the smartest in the whole company. But even given all that, please read this entire FAQ and think very carefully about it before you go into “business as usual” mode.
让我们假设最好的情况:你是一名经验丰富的C程序员,这几乎意味着你非常擅长使用数组。你知道你可以处理复杂的问题;你已经做了很多年了。你很聪明,是团队中最聪明的,是整个公司最聪明的。但即便如此,在你进入“正常业务”模式之前,请仔细阅读整个FAQ并仔细思考。
Fundamentally it boils down to this simple fact: C++ is not C. That means (this might be painful for you!!) you’ll need to set aside some of your hard earned wisdom from your vast experience in C. The two languages simply are different. The “best” way to do something in C is not always the same as the “best” way to do it in C++. If you really want to program in C, please do yourself a favor and program in C. But if you want to be really good at C++, then learn the C++ ways of doing things. You may be a C guru, but if you’re just learning C++, you’re just learning C++ — you’re a newbie. (Ouch; I know that had to hurt. Sorry.)
从根本上说,这可以归结为一个简单的事实:C++不是C。这意味着(这对你来说可能很痛苦!!)你需要将C中的丰富经验中一些来之不易的智慧放到一边。这两种语言完全不同。用C语言做某事的“最佳”方式并不总是与用C++做某事的“最佳”方式相同。如果你真的想用C语言编程,请帮自己一个忙,用C语言编程。但是如果你想真正擅长C++,那就学习C++的做事方式。你可能是一个C大师,但如果你只是学习C++,那么你只是一个初学者。(哎哟;我知道那一定很痛苦。抱歉。)
Here’s what you need to realize about containers vs. arrays: 【以下是关于容器和数组你需要了解的:】
Here are some specific problems with arrays: 【下面是数组的一些具体问题。】
4. Subscripts don’t get checked to see if they are out of bounds. (Note that some container classes, such as std::vector, have methods to access elements with or without bounds checking on subscripts.) 【下标不会被检查是否越界。(注意,某些容器类,如 std::vector,有访问元素的方法,可以对下标进行边界检查,也可以不进行边界检查。)】
5. Arrays often require you to allocate memory from the heap (see below for examples), in which case you must manually make sure the allocation is eventually deleted (even when someone throws an exception). When you use container classes, this memory management is handled automatically, but when you use arrays, you have to manually write a bunch of code (and unfortunately that code is often subtle and tricky) to deal with this. For example, in addition to writing the code that destroys all the objects and deletes the memory, arrays often also force you to write an extra try block with a catch clause that destroys all the objects, deletes the memory, then re-throws the exception. This is a real pain in the neck. When using container classes, things are much easier. 【数组经常需要你从堆中分配内存(见下面的例子),在这种情况下,你必须手动确保分配最终被删除(即使有人抛出异常)。当你使用容器类时,这种内存管理是自动处理的,但当你使用数组时,你必须手动编写一堆代码(不幸的是,这些代码通常是微妙和棘手的)来处理它。例如,除了编写销毁所有对象并删除内存的代码外,数组通常还迫使你编写一个额外的 try 块,其中包含一个 catch 子句,用于销毁所有对象,删除内存,然后重新抛出异常。这非常麻烦。使用容器类时,事情会简单得多。】
6. You can’t insert an element into the middle of the array, or even add one at the end, unless you allocate the array via the heap, and even then you must allocate a new array and copy the elements.【不能在数组中间插入元素,甚至不能在末尾添加元素, 除非通过堆分配数组,但即使这样,也必须分配一个新数组并复制元素。】
7. Container classes give you the choice of passing them by reference or by value, but arrays do not give you that choice: they are always passed by reference. If you want to simulate pass-by-value with an array, you have to manually write code that explicitly copies the array’s elements (possibly allocating from the heap), along with code to clean up the copy when you’re done with it. All this is handled automatically for you if you use a container class. 【容器类可以选择按引用传递,也可以选择按值传递,但数组没有这个选择:它们总是按引用传递。如果你想模拟数组的值传递,就必须手动编写代码显式复制数组的元素(可能从堆中分配),以及在完成复制后清理副本的代码。如果您使用容器类,所有这些都会自动为您处理。】
8. If your function has a non-static local array (i.e., an “automatic” array), you cannot return that array, whereas the same is not true for objects of container classes.【如果函数有一个非静态的局部数组(即“自动”数组),则不能返回该数组,而对于容器类的对象则不是这样。】
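Several of the points above (checked subscripts, pass-by-value, returning from a function) can be sketched in a few lines. This is only an illustration; the function names make_squares, sum_after_doubling, and subscript_is_valid are made up for this example and are not part of the FAQ.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returning a container by value is fine; a local C array cannot be returned.
std::vector<int> make_squares(std::size_t n)
{
    std::vector<int> result;
    for (std::size_t i = 0; i < n; ++i)
        result.push_back(static_cast<int>(i * i));
    return result;
}

// Pass by value: the callee modifies its own copy, not the caller's array.
int sum_after_doubling(std::array<int, 3> copy)
{
    int sum = 0;
    for (int& x : copy) {
        x *= 2;
        sum += x;
    }
    return sum;
}

// at() checks the subscript and throws std::out_of_range when it is out of bounds.
bool subscript_is_valid(const std::vector<int>& v, std::size_t i)
{
    try {
        (void)v.at(i);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

With a C array, each of these would need hand-written copying, cleanup, or size bookkeeping.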
Here are some things to think about when using containers: 【以下是使用容器时需要考虑的一些事情:】
Different C++ containers have different strengths and weaknesses, but for any given job there’s usually one of them that is better (clearer, safer, easier/cheaper to maintain, and often more efficient) than an array. For instance, 【不同的C++容器有不同的优缺点,但对于任何给定的任务来说,通常有一种容器比数组更好:更清晰、更安全、更容易/更便宜维护,而且通常更高效。例如,】
Consider a std::map instead of manually writing code for a lookup table. 【可以考虑使用 std::map,而不是手动编写查找表代码。】
std::map might also be used for a sparse array or sparse matrix. 【std::map 也可以用于稀疏数组或稀疏矩阵。】
std::array is the most array-like of the standard container classes, but it also offers various extra features such as bounds checking via the at() member function, automatic memory management even if someone throws an exception, ability to be passed both by reference and by value, etc. 【std::array 是最像数组的标准容器类,但它还提供了各种额外的特性,例如通过 at() 成员函数进行边界检查,即使抛出异常也能自动管理内存,能够按引用和按值传递,等等。】
std::vector is the second-most array-like of the standard container classes, and offers extra features over std::array such as insertions/removals of elements. 【std::vector 是标准容器类中第二像数组的类,它提供了比 std::array 更多的特性,例如插入/删除元素。】
std::string is almost always better than an array of char (you can think of a std::string as a “container class” for the sake of this discussion). 【std::string 几乎总是比 char 数组好(为了讨论这个问题,你可以把 std::string 想象成一个“容器类”)。】
Container classes aren’t best for everything, and sometimes you may need to use arrays. But that should be very rare, and if/when it happens: 【容器类并不适合所有情况,有时可能需要使用数组。但这种情况应该非常罕见,如果/当它发生时:】
Please design the container class’s public interface in such a way that the code that uses the container class is unaware of the fact that there is an array inside. 【请设计容器类的公共接口,令使用容器类的代码不知道其中有一个数组。】
To net this out, arrays really are evil. You may not think so if you’re new to C++. But after you write a big pile of code that uses arrays (especially if you make your code leak-proof and exception-safe), you’ll learn the hard way. Or you’ll learn the easy way by believing those who’ve already done things like that. The choice is yours.
总而言之,数组真的很邪恶。如果你刚接触C++,可能不会这么想。但是,当你编写了一大堆使用数组的代码(特别是当你要让代码防泄漏且异常安全时)后,你将以艰难的方式明白这一点。或者你可以通过相信那些已经做过类似事情的人,以简单的方式学会这一点。选择权在你。
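As one illustration of the std::map suggestion above, a sparse matrix can be sketched by keying the map on a (row, column) pair and storing only the non-zero cells. The SparseMatrix class and its member names are hypothetical, chosen for this example; a production version would likely want iteration and arithmetic as well.

```cpp
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical sparse matrix: only non-zero cells occupy memory.
class SparseMatrix {
public:
    void set(std::size_t row, std::size_t col, double value)
    {
        const Key key{row, col};
        if (value == 0.0)
            cells_.erase(key);      // storing zero means "forget this cell"
        else
            cells_[key] = value;
    }

    double get(std::size_t row, std::size_t col) const
    {
        const Key key{row, col};
        auto it = cells_.find(key); // find() does not insert, unlike operator[]
        return it == cells_.end() ? 0.0 : it->second;
    }

    std::size_t stored_cells() const { return cells_.size(); }

private:
    using Key = std::pair<std::size_t, std::size_t>;
    std::map<Key, double> cells_;
};
```

Note that get() uses find() rather than operator[], so reading an absent cell does not grow the map.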
How can I make a perl-like associative array in C++? (如何在C++中创建类似perl的关联数组?)
Use the standard class template std::map:
#include <iostream>
#include <map>
#include <string>
bool todayIsFredsBirthday(); // assumed to be defined elsewhere
int main()
{
// age is a map from string to int
std::map<std::string, int, std::less<std::string>> age;
age["Fred"] = 42; // Fred is 42 years old
age["Barney"] = 37; // Barney is 37
if (todayIsFredsBirthday()) // On Fred's birthday...
++ age["Fred"]; // ...increment Fred's age
std::cout << "Fred is " << age["Fred"] << " years old\n";
// ...
}
Nit: the elements of a std::map are kept in sort order based on the key, so from a strict standpoint, that is different from Perl’s associative arrays, which are unordered. If you want an unsorted version for a closer match, you can use std::unordered_map instead.
std::map 中元素的顺序是基于键的排序顺序,所以从严格的角度来看,这与Perl的无序关联数组是不同的。如果你想要一个未排序的版本来进行更紧密的匹配,可以使用 std::unordered_map。
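A minimal sketch of the unordered variant, assuming a made-up helper count_words that tallies how often each word appears:

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Count occurrences of each word; iteration order of the result is unspecified.
std::unordered_map<std::string, int> count_words(const std::vector<std::string>& words)
{
    std::unordered_map<std::string, int> counts;
    for (const std::string& w : words)
        ++counts[w];  // operator[] value-initializes a missing entry to 0 first
    return counts;
}
```

The interface is nearly identical to std::map; the main visible difference is that iterating over the result yields the keys in no particular order.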
Is the storage for a std::vector or a std::array guaranteed to be contiguous? (std::vector 或 std::array 的存储空间保证是连续的吗?)
Yes.
是的。
This means the following technique is safe:
#include <vector>
#include <array>
#include "Foo.h" /* get class Foo */
// old-style code that wants an array
void f(Foo* array, unsigned numFoos);
void g()
{
std::vector<Foo> v;
std::array<Foo, 10> a;
// ...
f(v.data(), v.size()); // Safe
f(a.data(), a.size()); // Safe
}
In general, it means you are guaranteed that &v[0] + n == &v[n], where v is a non-empty std::vector or std::array and n is an integer in the range 0 .. v.size()-1.
一般来说,这意味着保证 &v[0] + n == &v[n],其中 v 是非空的 std::vector 或 std::array,n 是范围为 0 ~ v.size()-1 的整数。
However, v.begin() is not guaranteed to be a T*, which means v.begin() is not guaranteed to be the same as &v[0]:
但是 v.begin() 不保证是 T*,这意味着 v.begin() 不保证与 &v[0] 相同:
void g()
{
std::vector<Foo> v;
// ...
f(v.begin(), v.size()); // error, not guaranteed to be the same as &v[0]
// ^^^^^^^^^ cough, choke, gag; use v.data() instead
}
Also, using &v[0] is undefined behavior if the std::vector or std::array is empty, while it is always safe to use the .data() function.
另外,如果 std::vector 或 std::array 为空,那么使用 &v[0] 就是未定义行为,而使用 .data() 函数总是安全的。
Note: It’s possible the above code might compile for you today. If your compiler vendor happens to implement std::vector or std::array iterators as T*’s, the above may happen to work on that compiler, and at that, possibly only in release builds, because vendors often supply debug iterators that carry more information than a T*. But even if this code happens to compile for you today, it’s only by chance because of a particular implementation. It’s not portable C++ code. Use .data() for these situations.
注意:上面的代码可能今天就能编译成功。如果你的编译器厂商碰巧将 std::vector 或 std::array 的迭代器实现为 T*,那么上述代码可能在该编译器上有效,而且可能只在发布版本中有效,因为厂商提供的调试迭代器通常携带的信息比 T* 更多。但即使这段代码今天能通过编译,这也只是偶然的,因为有特定的实现。它不是可移植的C++代码。在这些情况下使用 .data()。
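To make the .data() advice concrete, here is a small sketch. The function sum_ints is a hypothetical stand-in for old-style code that wants a pointer and a count, and sum_vector is the wrapper that feeds it from a std::vector.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for legacy code that takes a pointer and an element count (hypothetical).
long sum_ints(const int* values, std::size_t count)
{
    long total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += values[i];
    return total;
}

long sum_vector(const std::vector<int>& v)
{
    // data() may be called on an empty vector; with count == 0 the pointer
    // is never dereferenced. &v[0] would be undefined behavior in that case.
    return sum_ints(v.data(), v.size());
}
```

The same call works unchanged for a std::array<int, N>, since it also provides data() and size().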
The C++ standard library provides a set of useful, statically type-safe, and efficient containers. Examples are vector, list, and map:
C++ 标准库提供了一组有用的、静态类型安全的、高效的容器。例如 vector、list 和 map:
vector<int> vi(10);
vector<Shape*> vs;
list<string> lst;
list<double> l2;
map<string,Record*> tbl;
unordered_map<Key,vector<Record*> > t2;
These containers are described in all good C++ textbooks, and should be preferred over arrays and “home cooked” containers unless there is a good reason not to.
这些容器在所有优秀的C++教科书中都有介绍,应该优先使用,而不是数组和“家常”容器,除非有充分的理由不使用。
These containers are homogeneous; that is, they hold elements of the same type. If you want a container to hold elements of several different types, you must express that either as a union or (usually much better) as a container of pointers to a polymorphic type. The classical example is:
这些容器是同构的;也就是说,它们持有相同类型的元素。如果想在容器中保存几种不同类型的元素,则必须表示为联合,或者(通常更好)表示为指向多态类型的指针的容器。经典的例子是:
vector<Shape*> vi; // vector of pointers to Shapes
Here, vi can hold elements of any type derived from Shape. That is, vi is homogeneous in that all its elements are Shapes (to be precise, pointers to Shapes) and heterogeneous in the sense that vi can hold elements of a wide variety of Shapes, such as Circles, Triangles, etc.
在这里,vi 可以保存继承自 Shape 的任何类型的元素。也就是说,vi 是同构的,因为它的所有元素都是 Shape(准确地说,是指向 Shape 的指针);在某种意义上,也可以说 vi 是异构的,因为 vi 可以包含各种 Shape 的元素,如 Circle、Triangle 等。
So, in a sense all containers (in every language) are homogeneous because to use them there must be a common interface to all elements for users to rely on. Languages that provide containers deemed heterogeneous simply provide containers of elements that all provide a standard interface. For example, Java collections provide containers of (references to) Objects and you use the (common) Object interface to discover the real type of an element.
因此,从某种意义上说,所有容器(每种语言中的容器)都是同构的,因为要使用它们,必须为用户依赖的所有元素提供一个公共接口。提供容器的语言被认为是异构的,只是提供了元素的容器,这些元素都提供了标准接口。例如,Java集合提供了对象的容器(引用),你可以使用(公共)Object接口来发现元素的真正类型。
The C++ standard library provides homogeneous containers because those are the easiest to use in the vast majority of cases, give the best compile-time error messages, and impose no unnecessary run-time overheads.
C++标准库提供了同构的容器,因为这些容器在大多数情况下都是最容易使用的,提供了最好的编译时错误消息,并且没有不必要的运行时开销。
If you need a heterogeneous container in C++, define a common interface for all the elements and make a container of those. For example:
如果你需要在C++中使用一个异构容器,可以为所有元素定义一个公共接口,并将这些元素制成一个容器。例如:
class Io_obj { /* ... */ }; // the interface needed to take part in object I/O
vector<Io_obj*> vio; // if you want to manage the pointers directly
vector<shared_ptr<Io_obj>> v2; // if you want a "smart pointer" to handle the objects
Don’t drop to the lowest level of implementation detail unless you have to:
除非迫不得已,否则不要把实现细节降到最低:
vector<void*> memory; // rarely needed
A good indication that you have “gone too low level” is that your code gets littered with casts.
如果你的代码被强制转换弄得一团糟,就说明你的代码级别太低了。
Using an Any class, such as boost::any, can be an alternative in some programs:
在某些程序中,使用 Any 类(如 boost::any)是一种替代方案:
vector<any> v = { 5, "xyzzy", 3.14159 };
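Since C++17 the standard library has its own std::any, so the same idea can be sketched without Boost. Note that this swaps std::any in for boost::any, and the describe function is a made-up helper. It probes for the few types this example happens to store.

```cpp
#include <any>
#include <string>
#include <vector>

// Describe one element by probing for the types this sketch knows about.
// The pointer form of any_cast returns nullptr on a type mismatch instead of throwing.
std::string describe(const std::any& a)
{
    if (const int* i = std::any_cast<int>(&a))
        return "int " + std::to_string(*i);
    if (const char* const* s = std::any_cast<const char*>(&a))
        return std::string("text ") + *s;   // a string literal is stored as const char*
    if (std::any_cast<double>(&a))
        return "double";
    return "unknown";
}
```

Every access needs such a probe or cast, which is why this suits only some programs.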
If all objects you want to store in a container are publicly derived from a common base class, you can then declare/define your container to hold pointers to the base class. You indirectly store a derived class object in a container by storing the object’s address as an element in the container. You can then access objects in the container indirectly through the pointers (enjoying polymorphic behavior). If you need to know the exact type of the object in the container you can use dynamic_cast<> or typeid(). You’ll probably need the Virtual Constructor Idiom to copy a container of disparate object types. The downside of this approach is that it makes memory management a little more problematic (who “owns” the pointed-to objects? if you delete these pointed-to objects when you destroy the container, how can you guarantee that no one else has a copy of one of these pointers? if you don’t delete these pointed-to objects when you destroy the container, how can you be sure that someone else will eventually do the deleteing?). It also makes copying the container more complex (may actually break the container’s copying functions since you don’t want to copy the pointers, at least not when the container “owns” the pointed-to objects). In that case, you can use std::shared_ptr to manage the objects, and the containers will copy correctly.
如果你想存储在容器中的所有对象都公有派生自一个共同的基类,你可以声明/定义容器来保存指向基类的指针。通过将派生类对象的地址作为元素存储在容器中,可以间接地将该类对象存储在容器中。然后可以通过指针间接地访问容器中的对象(享受多态行为)。如果你想知道容器中对象的确切类型,可以使用 dynamic_cast<> 或 typeid()。你可能需要虚构造函数惯用法来复制包含不同对象类型的容器。这种方法的缺点是它使得内存管理更有问题(谁“拥有”指向的对象?如果在销毁容器的同时删除了这些指针指向的对象,如何保证没有其他人持有这些指针的副本呢?如果在销毁容器时不删除这些指向的对象,又怎么能确定最终会有其他人来删除呢?)。它还会使复制容器变得更加复杂(实际上可能会破坏容器的复制函数,因为你不想复制指针,至少在容器“拥有”指向的对象时不想)。在这种情况下,可以使用 std::shared_ptr 来管理对象,容器将能够正确地进行复制。
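The Virtual Constructor Idiom mentioned above can be sketched as a virtual clone() function. The example below is an assumption-laden illustration: the Shape hierarchy, the ShapeList alias, and deep_copy are invented here, and std::unique_ptr is used to make ownership explicit (the text suggests std::shared_ptr when sharing is acceptable).

```cpp
#include <memory>
#include <string>
#include <vector>

class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string name() const = 0;
    virtual std::unique_ptr<Shape> clone() const = 0;  // the "virtual constructor"
};

class Circle : public Shape {
public:
    std::string name() const override { return "Circle"; }
    std::unique_ptr<Shape> clone() const override { return std::make_unique<Circle>(*this); }
};

class Triangle : public Shape {
public:
    std::string name() const override { return "Triangle"; }
    std::unique_ptr<Shape> clone() const override { return std::make_unique<Triangle>(*this); }
};

using ShapeList = std::vector<std::unique_ptr<Shape>>;

// Copy the pointed-to objects, not just the pointers, preserving each dynamic type.
ShapeList deep_copy(const ShapeList& source)
{
    ShapeList copy;
    copy.reserve(source.size());
    for (const auto& shape : source)
        copy.push_back(shape->clone());
    return copy;
}
```

Because the container owns its elements through unique_ptr, destroying either list deletes exactly its own objects, which answers the "who owns what" questions raised above for this particular design.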
The second case occurs when the object types are disjoint — they do not share a common base class. The approach here is to use a handle class. The container is a container of handle objects (by value or by pointer, your choice; by value is easier). Each handle object knows how to “hold on to” (i.e., maintain a pointer to) one of the objects you want to put in the container. You can use either a single handle class with several different types of pointers as instance data, or a hierarchy of handle classes that shadow the various types you wish to contain (requires the container be of handle base class pointers). The downside of this approach is that it opens up the handle class(es) to maintenance every time you change the set of types that can be contained. The benefit is that you can use the handle class(es) to encapsulate most of the ugliness of memory management and object lifetime.
第二种情况发生在对象类型不相交的时候——它们不共享一个共同的基类。这里的方法是使用一个handle类。容器是handle对象的容器(由值或指针决定;按值更简单)。每个handle对象都知道如何“保持”(即维护一个指向要放入容器中的对象的指针)。可以使用具有几种不同类型指针的单个handle类作为实例数据,也可以使用handle类的层次结构来遮蔽您希望包含的各种类型(要求容器为handle基类指针)。这种方法的缺点是,每次更改可以包含的类型集时,它都会打开handle类来维护。这样做的好处是,你可以使用handle类来封装内存管理和对象生存期的大部分麻烦。
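One way to get a by-value handle for disjoint types in C++17 is std::variant. To be clear, this is a substitute technique rather than the hand-written handle class described above: the variant holds the object itself instead of a pointer to it. The types Invoice and Sensor and the Describe visitor are hypothetical.

```cpp
#include <string>
#include <variant>
#include <vector>

// Two unrelated types with no common base class (invented for this sketch).
struct Invoice { double amount; };
struct Sensor  { std::string id; };

// The handle: holds exactly one of the listed types, by value.
using Handle = std::variant<Invoice, Sensor>;

// One overload per contained type; std::visit picks the matching one.
struct Describe {
    std::string operator()(const Invoice&) const { return "invoice"; }
    std::string operator()(const Sensor& s) const { return "sensor " + s.id; }
};

std::string describe(const Handle& h)
{
    return std::visit(Describe{}, h);
}
```

The maintenance downside noted above still applies: adding a new contained type means touching the Handle alias and every visitor.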
See heterogeneous containers.
Why can’t I assign a vector<Apple*> to a vector<Fruit*>? (为什么不能将 vector<Apple*> 赋值给 vector<Fruit*>?)
Because that would open a hole in the type system. For example:
因为这会在类型系统中打开一个漏洞。例如:
class Fruit { /* ... */ };
class Apple : public Fruit { public: void apple_fct(); /* ... */ };
class Orange : public Fruit { /* ... */ }; // Orange doesn't have apple_fct()
vector<Apple*> v; // vector of pointers to Apples
void f(vector<Fruit*>& vf) // innocent Fruit manipulating function
{
vf.push_back(new Orange); // add orange to vector of fruit
}
void h()
{
f(v); // error: cannot pass a vector<Apple*> as a vector<Fruit*>
for (int i=0; i<v.size(); ++i) v[i]->apple_fct();
}
Had the call f(v) been legal, we would have had an Orange pretending to be an Apple.
如果调用 f(v) 合法,我们就会有一个 Orange 假装成 Apple。
An alternative language design decision would have been to allow the unsafe conversion, but rely on dynamic checking. That would have required a run-time check for each access to v’s members, and h() would have had to throw an exception upon encountering the last element of v.
另一种语言设计决策是允许不安全的转换,但依赖动态检查。这就需要每次访问 v 的成员时都进行运行时检查,而 h() 遇到 v 的最后一个元素时必须抛出异常。
The most important thing to remember is this: don’t roll your own from scratch unless there is a compelling reason to do so. In other words, instead of creating your own list or hashtable, use one of the standard class templates such as std::vector or std::list or whatever.
要记住的最重要的事情是:不要自己动手,除非有令人信服的理由。换句话说,使用 std::vector 或 std::list 之类的标准类模板,而不是创建自己的链表或哈希表。
Assuming you have a compelling reason to build your own container, here’s how to handle inserting (or accessing, changing, etc.) the elements.
假设你有充分的理由构建自己的容器,下面是如何处理插入(或访问、更改等)元素的方法。
To make the discussion concrete, I’ll discuss how to insert an element into a linked list. This example is just complex enough that it generalizes pretty well to things like vectors, hash tables, binary trees, etc.
为了使讨论具体化,我将讨论如何向链表中插入元素。这个例子足够复杂,可以很好地推广到向量、散列表、二叉树等。
A linked list makes it easy to insert an element before the first or after the last element of the list, but limiting ourselves to these would produce a library that is too weak (a weak library is almost worse than no library). This answer will be a lot to swallow for novice C++’ers, so I’ll give a couple of options. The first option is easiest; the second and third are better.
在链表中,可以很容易地在第一个元素之前或最后一个元素之后插入元素,但只使用这些会让库过于弱(弱的库几乎比没有库更糟糕)。这个答案对于C++新手来说很难理解,所以我将给出几个选择。第一个选择是最简单的;第二个和第三个比较好。
Empower the List with a “current location,” and member functions such as advance(), backup(), atEnd(), atBegin(), getCurrElem(), setCurrElem(Elem), insertElem(Elem), and removeElem(). Although this works in small examples, the notion of a current position makes it difficult to access elements at two or more positions within the list (e.g., “for all pairs x,y do the following…”). 【赋予List一个“当前位置”和成员函数,如advance()、backup()、atEnd()、atBegin()、getCurrElem()、setCurrElem(Elem)、insertElem(Elem)和removeElem()。虽然这在小示例中有效,但当前位置的概念使得访问列表中两个或多个位置的元素变得困难(例如,“对于所有对x,y执行以下…”)。】
Remove the above member functions from List itself, and move them to a separate class, ListPosition. ListPosition would act as a “current position” within a list. This allows multiple positions within the same list. ListPosition would be a friend of class List, so List can hide its innards from the outside world (else the innards of List would have to be publicized via public member functions in List). Note: ListPosition can use operator overloading for things like advance() and backup(), since operator overloading is syntactic sugar for normal member functions.【从List中删除上述成员函数,并将它们移动到单独的类ListPosition中。ListPosition相当于列表中的“当前位置”。这允许在同一个列表中有多个位置。ListPosition是List类的友元,因此List可以向外界隐藏它的内部结构(否则List的内部结构必须通过List中的public成员函数公开)。注意:ListPosition可以对advance()和backup()这样的函数使用运算符重载,因为运算符重载是普通成员函数的语法糖。】
Yes, and you really want to do this, as smart pointers make your life easier and make your code more robust compared to the alternatives.
是的,你确实想这样做,因为与替代方案相比,智能指针使你的生活更轻松,使你的代码更健壮。
Note: forget that std::auto_ptr ever existed. Really. You don’t want to use it, ever, especially in containers. It is broken in too many ways.
注意:请忘记 std::auto_ptr 曾经存在过。真的。你永远都不想使用它,尤其是在容器中。它在很多方面都有缺陷。
Let’s motivate this discussion with an example. This first section shows why you’d want to use smart pointers in the first place - this is what not to do:
让我们用一个例子来激发这个讨论。第一节展示了为什么你需要首先使用智能指针——这是不应该做的:
#include <vector>
class Foo {
public:
// ...blah blah...
};
void foo(std::vector<Foo*>& v) // ← BAD FORM: a vector of dumb pointers to Foo objects
{
v.push_back(new Foo());
// ...
delete v.back(); // you have a leak if this line is skipped
v.pop_back(); // you have a "dangling pointer" if control-flow doesn't reach this line
}
If control flow doesn’t reach either of the last two lines, either because you don’t have it in your code or you do a return or something throws an exception, you will have a leak or a “dangling pointer”; bad news. The destructor of std::vector cleans up whatever allocations were made by the std::vector object itself, but it will not clean up the allocation that you made when you allocated a Foo object, even though you put a pointer to that allocated Foo object into the std::vector object.
如果控制流没有到达最后两行,要么是因为你的代码中没有它,要么是因为执行了 return 或抛出了异常,那么就会出现泄漏或“悬空指针”;坏消息。std::vector 的析构函数会清除 std::vector 对象本身所进行的任何分配,但是它不会清除你为 Foo 对象分配内存时所进行的分配,即使你将指向已分配的 Foo 对象的指针放在了 std::vector 对象中。
That’s why you’d want to use a smart pointer.
这就是为什么你想用智能指针.
Now let’s talk about how to use a smart pointer. There are lots of smart pointers that can be copied and still maintain shared ownership semantics, such as std::shared_ptr and many others. For this example, we will use std::shared_ptr, though you might choose another based on the semantics and performance trade-offs you desire.
现在让我们谈谈如何使用智能指针。有很多智能指针可以在被复制的同时仍然保持共享的所有权语义,比如 std::shared_ptr 等。在这个例子中,我们将使用 std::shared_ptr,不过你也可以根据语义和性能权衡选择另一种。
#include <memory>
typedef std::shared_ptr<Foo> FooPtr; // ← GOOD: using a smart-pointer
void foo(std::vector<FooPtr>& v) // ← GOOD: using a container of smart-pointer
{
// ...
}
This just works safely with all operations. The object is destroyed when the last shared_ptr to it is destroyed or set to point to something else.
这对所有操作都是安全的。当最后一个指向它的 shared_ptr 被销毁或被设置为指向其他对象时,对象就会被销毁。
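The lifetime rule above can be observed with a small sketch. The static counter Foo::alive and the helper add_foo are additions made purely for illustration; they are not part of the FAQ’s Foo class.

```cpp
#include <memory>
#include <vector>

struct Foo {
    static int alive;   // number of Foo objects currently alive (illustration only)
    Foo()  { ++alive; }
    ~Foo() { --alive; }
    Foo(const Foo&) = delete;
    Foo& operator=(const Foo&) = delete;
};
int Foo::alive = 0;

using FooPtr = std::shared_ptr<Foo>;

// Allocate a Foo and hand ownership straight to the container.
void add_foo(std::vector<FooPtr>& v)
{
    v.push_back(std::make_shared<Foo>());
}
```

If a caller copies v.back() into another FooPtr and then clears the vector, the Foo stays alive until that last copy is reset or destroyed; no explicit delete appears anywhere.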
Using a std::unique_ptr in a std::vector is safe, but it has some restrictions. The unique_ptr is a move-only type; it can’t be copied. This move-only restriction then applies to the std::vector containing them as well.
在 std::vector 中使用 std::unique_ptr 是安全的,但有一些限制。unique_ptr 是只允许移动的类型,不能被复制。这个只能移动的限制也适用于包含它们的 std::vector。
void create_foo(std::vector<std::unique_ptr<Foo>> &v)
{
v.emplace_back(std::make_unique<Foo>(/* ... */));
}
If you want to put an element from this vector into another vector, you must move it to the other vector, as only one unique_ptr at a time can point to the same Foo object.
如果你想把这个 vector 中的一个元素放到另一个 vector 中,就必须把它移动到另一个 vector 中,因为同一时刻只有一个 unique_ptr 指向同一个 Foo 对象。
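Moving an element between two such vectors might look like the sketch below. The Foo struct here is a trivial placeholder, and transfer_last is a hypothetical helper name.

```cpp
#include <memory>
#include <utility>
#include <vector>

struct Foo { int id; };  // placeholder type for this sketch

using FooOwner = std::unique_ptr<Foo>;

// Transfer the last element of `from` to the end of `to`.
// Precondition: !from.empty().
void transfer_last(std::vector<FooOwner>& from, std::vector<FooOwner>& to)
{
    to.push_back(std::move(from.back()));  // from.back() is now a null unique_ptr
    from.pop_back();                       // drop the emptied slot
}
```

Attempting to copy instead, for example `std::vector<FooOwner> c = to;`, fails to compile, which is the move-only restriction showing up at the container level.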
There are lots of good articles on this general topic, such as Herb Sutter’s in Dr. Dobbs and many others.
关于这个主题有很多不错的文章,比如Herb Sutter在《多布斯博士》(Dr. Dobbs)和其他很多文章。
They aren’t; they’re among the fastest on the planet.
它们不是,它们是地球上速度最快的。
Probably “compared to what?” is a more useful question (and answer). When people complain about standard-library container performance, we usually find one of three genuine problems (or one of the many myths and red herrings):
可能是“与什么相比?”是一个更有用的问题(和答案)。当人们抱怨标准库容器的性能时,我们通常会发现以下三个真正的问题之一(或者是许多误解和转移注意力的东西之一):
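Copy overhead when a container holds large objects by value
Slow lookups in a map holding a large number of (string,X) pairs
My hand-coded (intrusive) list is much faster than std::list 【我手工编写的(侵入式的)列表比 std::list 快得多】
Before trying to optimize, consider if you have a genuine performance problem. In most of the cases sent to me, the performance problem is theoretical or imaginary: First measure, then optimize only if needed.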
在尝试优化之前,请考虑是否真的存在性能问题。在我收到的大多数案例中,性能问题都是理论上或想象出来的:首先测量,然后只在需要时进行优化。
Let’s look at those problems in turn. Often, a vector<X> is slower than somebody’s specialized My_container because My_container is implemented as a container of pointers to X (brief spoiler: if you want that, you have it too: vector<X*>; more on this in a moment). A vector<X> (no *) holds copies of values, and copies a value when you put it into the container. This is essentially unbeatable for small values, but can be quite unsuitable for huge objects:
让我们依次来看一下这些问题。通常,vector<X> 比某人专用的 My_container 慢,因为 My_container 被实现为指向 X 的指针的容器(简要剧透:如果你想这样,你也可以做到:vector<X*>,稍后会详细介绍)。vector<X>(没有 *)保存值的副本,并且当你将值放入容器时复制它。这对于小的值来说基本上是无敌的,但对于大型对象来说可能非常不合适:
vector<int> vi;
vector<Image> vim;
// ...
int i = 7;
Image im("portrait.jpg"); // initialize image from file
// ...
vi.push_back(i); // put (a copy of) i into vi
vim.push_back(im); // put (a copy of) im into vim
Now, if portrait.jpg is a couple of megabytes and Image has value semantics (i.e., copy assignment and copy construction make copies) then vim.push_back(im) will indeed be expensive. But, as the saying goes, if it hurts so much, just don’t do it.
现在,如果 portrait.jpg 有几兆字节,并且 Image 具有值语义(即,复制赋值和复制构造会产生副本),那么 vim.push_back(im) 确实会很昂贵。但是,就像俗话说的那样,如果真的很疼,那就不要做了。
Move semantics and in-place construction can negate many of these costs if the vector is going to own the object, and you don’t need copies of it elsewhere.
如果vector 要拥有对象,且不需要在其他地方复制对象,那么移动语义和就地构造可以消除这些开销。
vector<Image> vim;
vim.emplace_back("portrait.jpg"); // create image from file in place in the vector
Alternatively, either use a container of handles or a container of pointers. For example, if Image had reference semantics, the code above would incur only the cost of a copy constructor call, which would be trivial compared to most image manipulation operators. If some class, say Image again, does have copy semantics for good reasons, a container of pointers is often a reasonable solution:
或者,使用句柄容器或指针容器。例如,如果 Image 具有引用语义,上面的代码只会产生复制构造函数调用的开销,与大多数图像操作相比,这是微不足道的。如果某些类,比如 Image,有充分的理由具有复制语义,则指针容器通常是一个合理的解决方案:
vector<int> vi;
vector<Image*> vim;
// ...
Image im("portrait.jpg"); // initialize image from file
// ...
vi.push_back(7); // put (a copy of) 7 into vi
vim.push_back(&im); // put (a copy of) &im into vim
Naturally, if you use pointers, you have to think about resource management, but containers of pointers can themselves be effective and cheap resource handles (often, you need a container with a destructor for deleting the “owned” objects), or you can simply use a container of smart pointers.
当然,如果你使用指针,你必须考虑资源管理,但是指针容器本身可以是有效且廉价的资源句柄(通常,你需要一个带有析构函数的容器,用于删除“拥有的”对象),或者你可以简单地使用智能指针的容器。
The second frequently occurring genuine performance problem is the use of a map for a large number of (string,X) pairs. Maps are fine for relatively small containers (say a few hundred or few thousand elements; access to an element of a map of 10000 elements costs about 14 comparisons, roughly log2(10000)), where less-than is cheap, and where no good hash-function can be constructed. If you have lots of strings and a good hash function, use an unordered_map.
第二个经常发生的真正性能问题是用 map 表示大量的 (string, X) 对。map 适用于相对较小的容器(如只有几百或几千个元素;访问包含 10000 个元素的 map 中的元素大约需要 14 次比较,约为 log2(10000)),这种情况下,less-than 是便宜的,且无法构建良好的散列函数。如果你有很多字符串和一个好的散列函数,请使用 unordered_map。
Sometimes, you can speed up things by using (const char*,X) pairs rather than (string,X) pairs, but remember that < doesn’t do lexicographical comparison for C-style strings. Also, if X is large, you may have the copy problem also (solve it in one of the usual ways).
有时,你可以使用 (const char*, X) 对而不是 (string, X) 对来加快速度,但请记住 < 不会对C风格的字符串进行字典序比较。另外,如果 X 很大,你可能也会遇到复制问题(用一种常用的方法解决)。
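Because < on two const char* values compares addresses, a map keyed on C-style strings needs its own comparison. A minimal sketch, assuming a hypothetical comparator CStrLess built on std::strcmp and a made-up lookup_age helper:

```cpp
#include <cstring>
#include <map>

// Orders C-style strings by content rather than by pointer value.
struct CStrLess {
    bool operator()(const char* a, const char* b) const
    {
        return std::strcmp(a, b) < 0;
    }
};

using AgeTable = std::map<const char*, int, CStrLess>;

// Returns the stored age, or -1 if the name is absent.
int lookup_age(const AgeTable& table, const char* name)
{
    auto it = table.find(name);
    return it == table.end() ? -1 : it->second;
}
```

One caveat with this design: the map stores only the pointers, so the character data must outlive the map. String literals satisfy that; temporary buffers do not.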
Intrusive lists can be really fast. However, consider whether you need a list at all: A vector is more compact and is therefore smaller and faster in many cases, even when you do inserts and erases. For example, if you logically have a list of a few integer elements, a vector is significantly faster than a list (any list). Also, intrusive lists cannot hold built-in types directly (an int does not have a link member). So, assume that you really need a list and that you can supply a link field for every element type. The standard-library list by default performs an allocation followed by a copy for each operation inserting an element (and a deallocation for each operation removing an element). For std::list with the default allocator, this can be significant. For small elements where the copy overhead is not significant, consider using an optimized allocator. Use hand-crafted intrusive lists only where a list and the last ounce of performance are needed.
侵入式链表的速度非常快。然而,考虑一下你是否真的需要链表:vector 更紧凑,因此在许多情况下更小、更快,即使在进行插入和删除操作时也是如此。例如,如果逻辑上有一个由几个整数元素组成的链表,那么 vector 明显比链表(任何链表)快得多。此外,侵入式链表不能直接保存内置类型(int 没有链接成员)。因此,假设你确实需要一个链表,并且可以为每种元素类型提供一个 link 字段。在默认情况下,标准库 list 为每个插入元素的操作执行一次分配,然后进行一次复制(并为每个删除元素的操作执行一次回收)。对于具有默认分配器的 std::list 来说,这个开销可能很可观。对于复制开销不大的小元素,可以考虑使用优化的分配器。仅在确实需要链表和最后一点性能的地方使用手工制作的侵入式链表。
People sometimes worry about the cost of std::vector growing incrementally. Many C++ programmers used to worry about that and used reserve() to optimize the growth. After measuring their code and repeatedly having trouble finding the performance benefits of reserve() in real programs, they stopped using it except where it is needed to avoid iterator invalidation (a rare case in most code). Again: measure before you optimize.
人们有时会担心 std::vector 增量增长的开销。许多C++程序员过去常常担心这个问题,并使用 reserve() 来优化增长。在测量了他们的代码并一再难以在实际程序中发现 reserve() 的性能优势后,他们停止使用它,除非需要避免迭代器失效(大多数代码中,这是一种很罕见的情况)。再次强调:优化之前进行测量。
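The iterator-invalidation use of reserve() can be sketched as follows. The function name storage_stays_put is invented for this example; it relies on the guarantee that no reallocation happens while size() stays within the reserved capacity().

```cpp
#include <cstddef>
#include <vector>

// Returns true if appending `count` elements after reserve() left the storage in place.
bool storage_stays_put(std::size_t count)
{
    std::vector<int> v;
    v.reserve(count);                      // capacity() is now at least count
    const int* before = v.data();
    for (std::size_t i = 0; i < count; ++i)
        v.push_back(static_cast<int>(i));  // no reallocation while size() <= capacity()
    return v.data() == before;             // pointers and iterators remained valid
}
```

Without the reserve() call, any of the push_back calls could reallocate and invalidate previously obtained pointers, references, and iterators.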
The cost of std::vector growing incrementally in C++11 can be a lot less than it was in C++98/03 when you are using move-aware types, such as std::string or even std::vector, as when the vector is reallocated, the objects are moved into the new storage instead of copied.
当你使用 std::string 甚至 std::vector 等支持移动的类型时,在C++11中 std::vector 的增量增长开销比在C++98/03中要小得多,因为在重新分配 vector 时,对象会被移动到新的存储空间中,而不是复制。