Return value optimization (RVO) is a compiler optimization that lets the compiler construct a function's return value directly at the call site, in storage owned by the caller. The C++98/03 standard does not require compilers to perform RVO, but most popular C++ compilers implement it, such as the IBM XL C++ compiler, GCC, and Clang. The C++11 standard describes the technique in Section 12.8 under the name "copy elision".
Let's start with an example to see how RVO works. First, we have a class named BigObject. It could be so large that copying it is expensive. For convenience, I define only the constructor, copy constructor, and destructor here.
#include <iostream>
using namespace std;

class BigObject {
public:
    BigObject() {
        cout << "constructor." << endl;
    }
    ~BigObject() {
        cout << "destructor." << endl;
    }
    BigObject(const BigObject&) {
        cout << "copy constructor." << endl;
    }
};
We then define a function named foo that is eligible for RVO and call it from main to see what happens.
BigObject foo() {
    BigObject localObj;
    return localObj;
}
int main() {
    BigObject obj = foo();
}
Some people call this case named return value optimization (NRVO), because foo returns a named local object, localObj. They reserve the term RVO for returning an unnamed temporary such as BigObject(). You don't need to worry about the distinction here, as NRVO is a variant of RVO.
Let's compile this program and run it: xlC a.cpp; ./a.out. (The XL C/C++ compiler used is V13, and the environment is little-endian Linux.) The output looks like this:
constructor. (note: called when the object is constructed in foo)
destructor. (note: called when main exits)
It is amazing, right? There is no copy constructor call here. When copying is expensive, RVO lets the program run much faster.
However, when we modify our code a little, things will change.
BigObject foo(int n) {
    BigObject localObj, anotherLocalObj;
    if (n > 2) {
        return localObj;
    } else {
        return anotherLocalObj;
    }
}
int main() {
    BigObject obj = foo(1);
}
The output will be like this:
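constructor. (note: called on construction in foo)
constructor. (note: called on construction in foo)
copy constructor. (note: called when foo returns to main and the local object is copied into obj)
destructor. (note: called when a local object is destroyed in foo)
destructor. (note: called when a local object is destroyed in foo)
destructor. (note: called when obj is destroyed as main exits)
We can see that the copy constructor is called. Why? It's time to look at the mechanism of RVO.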
This diagram shows a normal function stack frame. If we call the function without RVO, the function allocates the space for the return value in its own frame, and the value is then copied to the caller. The process is demonstrated in the following diagram:
What happens if we call the function with RVO?
RVO uses the parent's stack frame (or an arbitrary block of memory) to hold the return value, which avoids the copy. So, if we add an if-else branch that returns two different local objects, the compiler doesn't know which of them to construct in that return slot.
Before going further, we need to understand what std::move is. Many C++ programmers misunderstand std::move and therefore use it in the wrong situations. Let us look at a sample implementation of std::move.
// C++14-style sample implementation, conceptually inside namespace std
template<typename T>
decltype(auto) move(T&& param)
{
    using ReturnType = remove_reference_t<T>&&;
    return static_cast<ReturnType>(param);
}
In fact, the key step std::move performs is to cast its argument to an rvalue. It tells the compiler that the object is eligible to be moved from, without moving anything itself. A name like "std::rvalue_cast" would therefore describe it better than "std::move".
Moving costs less than copying but more than RVO. Moving does the following two things:
1. Steal all the data
2. Trick the object we steal into forgetting everything
If we want the compiler to be able to move, we can define a move constructor and a move assignment operator. For convenience, I define only the move constructor here.
BigObject(BigObject&&) {
    cout << "move constructor" << endl;
}
Let us modify the code of foo.
BigObject foo(int n) {
    BigObject localObj, anotherLocalObj;
    if (n > 2) {
        return std::move(localObj);
    } else {
        return std::move(anotherLocalObj);
    }
}
Let's compile it and run: xlC -std=c++11 a.cpp; ./a.out. The result is:
constructor.
constructor.
move constructor (note: called when foo returns to main)
destructor.
destructor.
destructor.
The move constructor is called as expected. Note, however, that the moved-from local object still has to be destroyed, so the destructor runs more often than in the RVO case.
Some people conclude that returning std::move(localObj) is always beneficial: if the compiler can do RVO, it does RVO; otherwise it calls the move constructor. But I must say: don't do that!
Let us see what will happen if we do this:
BigObject foo(int n) {
    BigObject localObj;
    return std::move(localObj);
}
int main() {
    auto f = foo(1);
}
Maybe you think that the compiler will do RVO, but you actually get this result:
constructor. (note: called on construction in foo)
move constructor (note: called when returning from foo to main)
destructor. (note: destruction in foo)
destructor. (note: destruction in main)
The compiler doesn't do RVO but calls the move constructor! To explain this, we first need to look at the wording on copy elision in the C++ standard:
in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function’s return value
That is to say, the expression in the return statement must be the name of a local object whose type is the same as the function's return type.
Let's refresh our memory about std::move. std::move is just an rvalue cast. The expression std::move(localObj) is not the name of a local object; it is a function call whose type is BigObject&&, while the function's return type is just BigObject. BigObject is not BigObject&&, and this is why RVO didn't take place just now.
We now change the return type of foo to BigObject&& (f, declared with auto, is still a BigObject):
BigObject&& foo(int n) {
    BigObject localObj;
    return std::move(localObj);
}
int main() {
    auto f = foo(1);
}
Then we compile and run it, and we get output like this: (note: I did not understand this part; what does the result below show?)
constructor.
destructor.
move constructor
destructor.
No copy or move is needed for the return itself, because foo now returns a reference rather than an object. This is not RVO, though: localObj is destroyed when foo returns (the first destructor line), and f is then move-constructed from a dangling reference, which is undefined behavior. (Note: never write this in real code, because it returns a reference to a local object. It is shown here only to illustrate how the types interact.)
To summarize, RVO is a compiler optimization technique, while std::move is just an rvalue cast that tells the compiler the object is eligible to be moved from. Moving costs less than copying but more than RVO, so never apply std::move to local objects that would otherwise be eligible for RVO.
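The following Zhihu answer also explains RVO well.
A kind of function we often write is the named constructor: a function whose return value is an instance of some class. It is essentially a constructor, but because some extra steps may be needed while building the object, it is not written as a constructor. For example:
User create_user(const std::string &username, const std::string &password) {
    User user(username, password);
    validate_and_save_to_db(user);
    return user;
}
void signup(const std::string &username, const std::string &password) {
    auto new_user = create_user(username, password);
    login(new_user);
}
Here create_user is a named constructor.
Some C++ beginners may feel this code is not optimal. Read literally, create_user first creates a user and then, on return, assigns user to new_user. That assignment copies the contents of user, and if user is large (it may well hold a lot of data, such as the string username), this is too slow (copying a string may also require an extra malloc).
So some people will want to use move, introduced in C++11, to optimize this: the user inside create_user is no longer needed after the return, so moving it into new_user should save the large cost of the copy (for example, username would need neither a new allocation nor a character-by-character copy).
In fact, in a simple case like this the compiler is smarter than you. It can construct user directly in new_user, so user is created only once with no copy cost at all; after optimization, user and new_user are the same variable. This optimization is called copy elision. Unfortunately, if you try to optimize by hand with move, the compiler can no longer perform copy elision and has to do what you asked: create user first, then call User's move constructor to create new_user. This certainly costs more than the elided version. That is why clang very "cleverly" gives a warning for the example in the question.
Next, how can we know whether the compiler will apply copy elision to a function we wrote? Could the function's logic be so complex that the compiler cannot optimize it? If so, wouldn't writing return move(a) be faster than a copy?
That reasoning is correct. The compiler is actually rather dumb: once the logic in create_user gets too complex, it may be unable to work out whether one variable can replace two (user and new_user), and then it does not perform copy elision. In that case using move is reasonable.
So when should you move, and when should you rely on copy elision? Mainstream compilers can generally be relied on to perform copy elision in the following two cases: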
1. URVO (Unnamed Return Value Optimization): every execution path of the function returns an unnamed temporary of the same type, for example:
User create_user(const std::string &username, const std::string &password) {
    if (find(username)) return get_user(username);
    else if (validate(username) == false) return create_invalid_user();
    else return User{username, password};
}
Every return here returns a User, and each returned value is an unnamed temporary, so the compiler will perform copy elision.
2. NRVO (Named Return Value Optimization): every path of the function returns the same named variable, for example:
User create_user(const std::string &username, const std::string &password) {
    User user{username, password};
    if (find(username)) {
        user = get_user(username);
        return user;
    } else if (user.is_valid() == false) {
        user = create_invalid_user();
        return user;
    } else {
        return user;
    }
}
Because every path returns the same variable user, the compiler will perform copy elision.
In other cases the compiler may not apply copy elision.
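Combining this with the discussion by 神奇先生 above: when copy elision is possible, do not add std::move() to the return statement. When copy elision does not work, we still add std::move() so that the move constructor is called instead of the copy constructor.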