By Ajay Vijayvargiya |
Elaborating new C++ language features with a clear, sharp, and detailed discussion.
As you might be aware, the C++ language is being updated by the ISO standard. The codename for the new C++ language is C++0x, and many compilers have already introduced some of the features. This tutorial is an attempt to give you an introduction to the new changes in the C++ language. Please note that I am explaining the new features for the Visual C++ 2010 compiler only, although they are applicable for other compilers as well. I do not comment for absolute syntax in other compilers.
This article assumes that you have moderate knowledge of C++, and that you know what type casting is, what const methods are, and what templates are (in a basic sense).
Following is the list of the new C++ language features that I will be discussing. I've put more emphasis on lambdas and R-values, since I do not find any digestible stuff anywhere. For now, I've not used templates or STL for simplicity, but I would probably update this article to add content on templates/STL.
The auto keyword |
For automatic data-type deduction (at compile time), depending on assignment. |
The decltype keyword |
For deducing data-type from expression, or an |
The nullptr keyword |
Null pointer is now promoted, and has been awarded a keyword! |
The static_assert keyword |
For compile time assertions. Useful for templates, and validations that cannot be done using |
Lambda Expressions |
Locally defined functions. Inherits features from function-pointers and class objects (functors). |
Trailing return types |
Mainly useful when a templated function's return type cannot be expressed. |
R-value references |
Move semantics - resource utilization before a temporary object gets destroyed. |
Other language features |
C++ language features that were already included in VC8 and VC9 (VS2005, VS2008) but were not added into the C++ standard. These features are now classified in C++0x. This article explains them (not all), in brief. |
Let's get started!
The auto
keyword now has one more meaning. I assume you do know the original purpose of this keyword. With the revised meaning, you can declare a local variable without specifying the data-type of the variable.
For example:
auto nVariable = 16;
The code above declares the nVariable
variable without specifying its type. With the expression on the right side, the compiler deduces the type of the variable. Thus the above code would be translated by the compiler as:
int nVariable = 16;
As you can also deduce, the assignment of the variable is now mandatory. Thus, you cannot declare the auto variable like:
auto nVariable ;
// Compiler error:
// error C3531: 'nVariable': a symbol whose type contains
// 'auto' must have an initializer
Here, the compiler does not (cannot) know the data type of the variable nResult
. Please note that, with the auto
keyword:
Having said that, no matter how complicated your assignment is, the compiler would still determine the data type. If the compiler cannot deduce the type, it would emit an error. This is not like in Visual Basic or web scripting languages.
auto nVariable1 = 56 + 54; // int deduction
auto nVariable2 = 100 / 3; // int deduction
auto nVariable3 = 100 / 3.0; // double deduction.
// Since compiler takes the whole expression to be double
auto nVariable4 = labs(-127); // long, Since return type of 'labs' is long
Let's get slightly complicated (continuing with the above declared variables):
// nVariable3 was deduced as double.
auto nVariable5 = sqrt(nVariable3);
// Deduced as double, since sqrt takes 3 different datatypes,
// but we passed double, and that overload returns double!
auto nVariable = sqrt(nVariable4);
// Error since sqrt call is ambiguous.
// This error is not related to 'auto' keyword!
Pointer deductions:
auto pVariable6 = &nVariable1; // deduced as 'int*', since nVariable one is int.
auto pVariable = &pVariable6; // int**
auto* pVariable7 = &nVariable1; // int*
Reference deductions:
auto & pVariable = nVariable1; // int &
// Yes, modifying either of these variable with modify the other!
With the new
operator:
auto iArray = new int[10]; // int*
With the const
and volatile
modifiers:
const auto PI = 3.14; // double
volatile auto IsFinished = false; // bool
const auto nStringLen = strlen("CodeProject.com");
Arrays cannot be declared:
auto aArray1[10];
auto aArray2[]={1,2,3,4,5};
// error C3532: the element type of an array
// cannot be a type that contains 'auto'
Cannot be a function argument or return type:
auto ReturnDeduction();
void ArgumentDeduction(auto x);
// C3533: 'auto': a parameter cannot have a type that contains 'auto'
If you need an auto
return type or auto
arguments, you can simply use templates! ;)
You cannot have auto
in a class
or struct
, unless it is a static member:
struct AutoStruct
{
auto Variable = 10;
// error C2864: 'AutoStruct::Variable' : only static const
// integral data members can be initialized within a class
};
You cannot have multiple data-types (types that can be deduced to different types):
auto a=10, b=10.30, s="new";
// error C3538: in a declarator-list 'auto'
// must always deduce to the same type
Likewise, if you initialize variables with different functions, and one or more function returns different data types, the compiler would emit the same error (C3538):
auto nVariable = sqrt(100.0), nVariableX = labs(100);
You can use the auto
keyword at a global level.
This variable may seem to be a boon for you, but can be a curse if misused. For example, a few programmers assume the following declares float
, but actually declares double
.
auto nVariable = 10.5;
Similarly, if a function returns int
, and then you modify the function return type to return short
or double
, the original automatic variable definition would go wary. If you are unlucky, and place only one variable in the auto
declaration, the compiler would just deduce that auto- variable with the new type. And, if you are lucky, and mixed that variable with other variables, the compiler would raise a C3538 (see above).
For example:
int nLength = strlen("The auto keyword.");
would return a 4-byte integer on a 32-bit compilation, but would return an 8-byte int
on a 64-bit compilation. Sure thing, you can use size_t
, instead of int
or __int64
. What if the code is compiled where size_t
isn't defined, or strlen
returns something else? In that case, you can simply use auto
:
auto nLength = strlen("The auto keyword.");
std::vector<std::string> Strings; // string vector
// Push several strings
// Now let's display them
for(std::vector<std::string>::iterator iter =
Strings.begin(); iter != Strings.end(); ++iter)
{ std::cout << *iter << std::endl; }
You know that typing the type std::vector
is cumbersome, error prone, and makes code less readable. Though, the option exists to typedef
the return type somewhere else in the program and use the type name. But for how many iterator types? And, what if the iterator-type is used only once? Thus, we shorten the code as follows:
for(auto iter = Strings.begin(); iter != Strings.end(); ++iter)
{ std::cout << *iter << std::endl; }
If you have been using STL for a while, you know about iterators and constant iterators. For the above example, you might prefer to use const_iterator
, instead of iterator
. Thus, you may want to use:
// Assume 'using namespace std'
for(vector<string>::const_iterator iter =
Strings.begin(); iter != Strings.end(); ++iter)
{ std::cout << *iter << std::endl; }
Thus making the iterator const, so that it cannot modify an element of the vector
. Remember that when you call the begin
method on a const
object, you cannot assign it to a non-const iterator
, and you must assign it to a const_iterator
(const iterator
is not the same as const_iterator
; since I am not writing about STL, please read/experiment about it yourself).
To overcome all these complexities, and to aid the auto
keyword, standard C++ now facilitates the cbegin, cend, crbegin, and crend methods for STL containers. The c prefix means constant. They always return const_iterator
, irrespective of if the object (container) is a const
or not. Old methods returns both types of iterators, depending on the constness of the object. The modified code:
// With cbegin the iterator type is always constant.
for(auto iter = Strings.cbegin(); iter!=Strings.cend(); ++iter) {...}
Another example can be an iterator/constant iterator on:
map<std::vector , string> mstr;
map<vector , string>::const_iterator m_iter = mstr.cbegin();
The iterator assignment code can be shortened as:
auto m_iter = mstr.cbegin();
Just like complex templates, you can use the auto
keyword to assign hard-to-type, error prone function pointers, which might be assigned from other variables/functions. I assume you do understand what I mean by this, thus no example is given.
Refer to the explanation about lambdas below in this article.
Though not related to lambdas, learning lambda expression syntax is required. So I will be discussing this one after lambdas.
This C++ operator gives the type of the expression. For example:
int nVariable1;
...
decltype(nVariable1) nVariable2;
declares nVariable2
of int
type. The compiler knows the type of nVariable1
, and translates decltype(nVariable1)
as int
. The decltype
keyword is not the same as the typeid
keyword. The typeid
operator returns the type_info
structure, and it also requires RTTI to be enabled. Since it returns type-information, and not type itself, you cannot use typeid
like:
typeid(nVariable1) nVariable2;
whereas, decltype
deduces the expression as a type, perfectly at compile time. You cannot get the type-name (like 'int
') with decltype
.
decltype
is typically used in conjunction with the auto
keyword. For instance, you have declared an automatic variable as:
auto xVariable = SomeFunction();
Assume the type of xVariable
(actually, the return type of SomeFunction
) is X. Now, you cannot call (or don't want to call) the same function again. How would you declare another variable of the same type?
Which one of the following suits you?
decltype(SomeFunc) yVar;
decltype(SomeFunc()) yVar;
The first one declares yVar
of type function-pointer, and the second one declares it to be of type X. As you can assess, using the function name is not reliable, since the compiler will not give an error or warning until you use the variable. Also, you must pass the actual number of arguments, and the actual type of arguments of the function/method is overloaded.
The recommended approach is to deduce the type from the variable directly:
decltype(xVariable) yVar;
Furthermore, as you have seen in the auto
discussion, using (typing) the template types is complicated and ugly, and you should use auto
. Likewise, you can/should use decltype
to state the proper type:
decltype(Strings.begin()) string_iterator;
decltype(mstr.begin()->second.get_allocator()) under_alloc;
Unlike the previous examples, where we used auto
to deduce the type from the expression on the right side, we deduce the type without assigning. With decltype
, you need not assign the variable, just declare it - since the type is already known. Note that the function is not being called when you use the Strings.begin()
expression, it is just deducing the type from the expression. Similarly, when you put the expression in decltype
, the expression will not be evaluated. Only the basic syntax check will be performed.
In the second example above, mstr
is a std::map
object, where we are retrieving the iterator, the second
member of that map element, and finally its allocator type. Thus, the deduced type is std::
allocator for string
(see above where mstr
is declared).
Few good, though absurd, examples:
decltype(1/0) Infinite; // No compiler error for "divide by zero" !
// Program does not 'exit', only type of exit(0), is determined.
// Specifying exit without function call would make
// the return type of 'MyExitFunction' different!
decltype(exit(0)) MyExitFunction();
The null-pointer finally got its classification in the form of a keyword! It is mostly the same as the NULL
macro or integer 0. Though I am covering only native C++, it must be mentioned however that the nullptr
keyword can be used in native (unmanaged code) as well as in managed code. If you write mixed mode code in C++, you may use the __nullptr
keyword to explicitly state the native-null-pointer, and nullptr
to express the managed null. Even in a mixed mode program, you'd rarely need to use __nullptr
.
void* pBuffer = nullptr;
...
if ( pBuffer == nullptr )
{ // Do something with null-pointer case }
// With classes
void SomeClass::SomeFunction()
{
if ( this != nullptr)
{ ... }
}
Remember, nullptr
is a keyword, not a type. Thus, you cannot use the sizeof
or decltype
operator on it. Having said that, the NULL
macro and the nullptr
keyword are two different entities. NULL
is just 0, which is nothing but int
.
For example:
void fx(int*){}
void fx(int){}
int main()
{
fx(nullptr); // Calls fx(int*)
fx(NULL); // Calls fx(int)
}
(The /clr compiler option requirement is wrong. MSDN is not updated, as of this writing.)
With the static_assert
keyword, you can verify some condition at compile time. Here is the syntax:
static_assert( expression, message)
The expression has to be a compile time constant-expression. For non-templated static assertions, the compiler immediately evaluates the expression. For a template assertion, the compiler tests the assert for and when the class is instantiated.
If the expression is true, that means your required assertion (requirement) is fulfilled, and the statement does nothing. If the expression is false, the compiler raises an error C2338 with the message you mentioned. For example:
static_assert (10==9 , "Nine is not equal to ten");
Is it obviously not true, thus the compiler raises:
error C2338: Nine is not equal to ten
The more meaningful assertion, which would be raised when the program is not being compiled as 32-bit:
static_assert(sizeof(void *) == 4,
"This code should only be compiled as 32-bit.");
Since the size of any type of pointer is the same as the target platform chosen for compilation.
For earlier compilers, we need to use _STATIC_ASSERT
, which does nothing but declare an array of size-condition. Thus, if the condition is true, it declares an array of size 1; if the condition is false, it declares an array of size 0 - which results in a compiler error. The error is not user friendly.
This is one of the most striking language features added to C++. Useful, interesting, and complicated too! I will start with the absolute basic syntax and examples, to make things clear. Thus, the first few code examples below may not feature the usefulness of lambdas. But, be assured, lambdas are a very powerful, yet code-concise feature of the C++ language!
Before we start, let me brief it first:
The absolute basic lambda:
[]{}; // In some function/code-block, not at global level.
Yes, the above code is perfectly valid (in C++0x only!).
[] is the Lambda-introducer, which tells the compiler that the expression/code followed is lambda. {} is the definition of the lambda, just like any function/method. The lambda defined above does not take any arguments, doesn't return any value, and of course, doesn't do anything.
Let's proceed...
[]{ return 3.14159; }; // Returns double
The lambda coded above does simple work: returns the value of PI. But who is calling this lambda? Where does the return value go? Let's proceed further:
double pi = []{ return 3.14159; }(); // Returns double
Making sense? The return value from the lambda is being stored in a local variable pi
. Also, note in the above example the lambda being called (notice the function-call at the end). Here is a minimal program:
int main()
{
double pi;
pi = []{return 3.14159;}();
std::cout << pi;
}
The parentheses at the end of the lambda are actually making a call to the lambda-function. This lambda doesn't take any arguments, but still, the operator ()
is required at the end so that the compiler would know the call. The pi
lambda may also be implemented like:
pi = [](){return 3.14159;}(); // Notice the first parenthesis.
which is similar to:
pi = [](void){return 3.14159;}(); // Note the 'void'
Though, it is a matter of choice if you put the first parenthesis for the argument-less lambdas, or not. But I prefer you put them. The C++ standard committee wanted to make lambdas less complicated, that's why they (might have) made the lambda-parameter-specification optional for parameter-less lambdas.
Let's move further, where the lambda takes an argument:
bool is_even;
is_even = [](int n) { return n%2==0;}(41);
The first parenthesis, (int n)
, specifies the parameter-specification of the lambda. The second, (41)
, passes the value to the lambda. The body of the lambda tests if the passed number is divisible by 2 or not. We can now implement the max or min lambda, as follows:
int nMax = [](int n1, int n2) {
return (n1>n2) ? (n1) : (n2);
} (56, 11);
int nMin = [](int n1, int n2) {
return (n1<n2) ? (n1) : (n2);
} (984, 658);
Here, instead of declaring and assigning variables separately, I put them in a single line. The lambdas now take two arguments, and return one of the values, which would be assigned to nMin
and nMax
. Likewise, the lambda can take more arguments, and of multiple types also.
A few questions you should have:
int
available?Before I answer them one by one, let me present to you the Lambda Expression Grammar with the following illustration:
You can specify the return type after the ->
operator. For example:
pi = []()->double{ return 3.14159; }();
Remember, as shown in the illustration, if the lambda contains only one statement (i.e., only a return
statement), there is no need to specify the return type. So, in the example above, specifying double
as the return type is optional. It is up to you if you should specify the return type for automatic inferential return types.
An example where specifying the return type is mandatory:
int nAbs = [] (int n1) -> int
{
if(n1<0)
return -n1;
else
return n1;
}(-109);
If you do not specify -> int
, the compiler would raise:
error C3499: a lambda that has been specified to have
a void return type cannot return a value
If the compiler does not see only a return
statement, it infers the lambda as having a void
return type. The return type can be anything:
[]()->int* { }
[]()->std::vector ::const_iterator& {}
[](int x) -> decltype(x) { }; // Deducing type of 'x'
It cannot return arrays. It also cannot have auto
as the return type:
[]()-> float[] {}; // error C2090: function returns array
[]()-> auto {}; // error C3558: 'auto': a lambda return type cannot contain 'auto'
Yes, of course, you can put a return value of lambda into an auto
variable:
auto pi = []{return 3.14159;}();
auto nSum = [](int n1, int n2, int n3)
{ return n1+n2+n3; } (10,20,70);
auto xVal = [](float x)->float
{
float t;
t=x*x/2.0f;
return t;
} (44);
As a last note, if you specify the return-type to a parameter-less lambda, you must use parentheses. The following is an error:
[]->double{return 3.14159;}(); // []()->double{...}
The explanation above is sufficient enough to disclose that a lambda can contain any code a regular function can have. A lambda can contain everything a function/method can have - local variables, static variables, calls to other functions, memory allocation, and other lambdas too! The following code is valid (absurd though!):
[]()
{
static int stat=99;
class TestClass
{
public:
int member;
};
TestClass test;
test.member= labs(-100);
int ptr = [](int n1) -> int*
{
int* p = new int;
*p = n1;
return p;
}(test.member);
delete ptr;
};
Let's define a lambda that determines if a number is even or not. With the auto
keyword, we can store the lambda in a variable. Then we can use the variable (i.e., call the lambda)! I will discuss what the type of the lambda is later. It is as simple as:
auto IsEven = [](int n) -> bool
{
if(n%2 == 0)
return true;
else
return false;
}; // No function call!
As you can infer, the lambda's return type is bool
, and it is taking an argument. And importantly, we are not calling the lambda, just defining it. If you put ()
, with some argument, the type of the variable would have been bool
, not lambda-type! Now the locally defined function (i.e., lambda) can be called after the above statement:
IsEven(20);
if( ! IsEven(45) )
std::cout << "45 is not even";
The definition of IsEven
, as given above, is in the same function in which two calls have been made. What if you want the lambdas to be called from other functions? Well, there are approaches, like storing in some local or class-level variable, passing it to another function (just like a function pointer), and calling from another function. Another mechanism is to store and define the function at global scope. Since I haven't discussed what the type-of-lambda is, we'll use the first approach later. But let's discuss the second approach (global-scope):
Example:
// The return type of lambda is bool
// The lambda is being stored in IsEven, with auto-type
auto IsEven = [](int n) -> bool
{
if(n%2 == 0) return true;
else return false;
}
void AnotherFunction()
{
// Call it!
IsEven (10);
}
int main()
{
AnotherFunction();
IsEven(10);
}
Since the auto
keyword is applicable only for local or global scope, we can use it to store lambdas. We need to know the type so that we can store it in a class variable. Later on.
As mentioned earlier, a lambda can almost do what a regular function can do. So, displaying a value isn't a remote thing that lambda cannot do. It can display a value.
int main()
{
using namespace std;
auto DisplayIfEven= [](int n) -> void
{
if ( n%2 == 0)
std::cout << "Number is even/n";
else
std::cout << "Number is odd/n";
}
cout << "Calling lambda...";
DisplayIfEven(40);
}
One important thing to note is that locally-defined lambdas do not get the namespace resolution from the upper-scope they are being defined in. Thus, the std
namespace inclusion is not available for DisplayIfEven
.
Definitely. Provided the lambda/function name is known at the time of call, as it is required for the function call in a function.
No.
Now I will discuss the thing I kept empty till now: Capture Specification.
Lambdas can be one of the following:
The state defines how variables from a higher scope are captured. I define them into the following categories:
You should understand the four categories are derivations of the following C++ mantras:
Let's play with captures! The capture specification, as mentioned in the above illustration, is given in []. The following syntax is used to specify capture-specification:
var
by value; nothing else, in either mode, is captured.var
by reference; nothing else, in either mode, is captured.Example 1:
int a=10, b=20, c=30;
[a](void) // Capturing ONLY 'a' by value
{
std::cout << "Value of a="<<
a << std::endl;
// cannot modify
a++; // error C3491: 'a': a by-value capture
// cannot be modified in a non-mutable lambda
// Cannot access other variables
std::cout << b << c;
// error C3493: 'b' cannot be implicitly captured
// because no default capture mode has been specified
}();
Example 2:
auto Average = [=]() -> float // '=' says: capture all variables, by value
{
return ( a + b + c ) / 3.0f;
// Cannot modify any of the variables
};
float x = Average();
Example 3:
// With '&' you specify that all variables be captured by REFERENCE
auto ResetAll = [&]()->void
{
// Since it has captured all variables by reference, it can modify them!
a = b = c = 0;
};
ResetAll();
// Values of a,b,c is set to 0;
Putting = specifies by-value. Putting & specifies by-reference. Let's explore more about these. For shortness, now I am not putting lambdas into an auto
variable and then calling them. Instead, I am calling them directly.
Example 4:
// Capture only 'a' and 'b' - by value
int nSum = [a,b] // Did you remember that () is optional for parameterless lambdas!?
{
return a+b;
}();
std::cout << "Sum: " << nSum;
As shown in example 4, we can specify multiple capture specifications in a lambda-introducer (the []
operator). Let's take another example, where the sum of all three (a
, b
, c
) would be stored into the nSum
variable.
Example 5:
// Capture everything by-value, BUT capture 'nSum' by reference.
[=, &nSum]
{
nSum = a+b+c;
}();
In the above example, the capture-all-by-value (i.e., =
operator) specifies the default capture mode, and the &nSum
expression overrides that. Note that the default capture mode, which specifies all-capture, must appear before other captures. Thus, = or & must appear before other specifications. The following causes an error:
// & or = must appear as first (if specified).
[&nSum,=]{}
[a,b,c,&]{} // Logically same as above, but erroneous.
A few more examples:
[&, b]{}; // (1) Capture all by reference, but 'b' by value
[=, &b]{}; // (2) Capture all by value, but 'b' by reference
[b,c, &nSum]; // (3) Capture 'b', 'c' by value,
// 'nSum' by reference. Do no capture anything else.
[=](int a){} // (4) Capture all by value. Hides 'a' - since a is now
// function argument. Bad practise. Compiler doesn't warn!
[&, a,c,nSum]{}; // Same as (2).
[b, &a, &c, &nSum]{} // Same as (1)
[=, &]{} // Not valid!
[&nSum, =]{} // Not valid!
[a,b,c, &]{} // Not valid!
As you can see, there are multiple combinations to capture the same set of variables. We can extend the capture-specification syntax by adding:
var
, which is by value.var
, which is by reference.var1
, var2
by value.var1
, var2
by reference.var1
by value, var2
by reference.Till now, we have seen that we can prevent some variables from being captured, capture by-value const, and capture by reference non-const. Thus, we have covered 1, 2, and 4 of the capture-categories (see above). Capturing const reference is not possible (i.e., [const &a]
). We will now look into the last one - capturing in call-by-value mode.
Just after the parentheses of a parameter-specification, we specify the mutable
keyword. With this keyword, we put all by-value captured variables into call-by-value mode. If you do not put the mutable
keyword, all by-value variables are constant, you cannot modify them in lambda. Putting mutable
enforces the compiler to create copies of all variables which are being captured by-value. You can then modify all by-value captures. There is no method to selectively capture const and non-const by-value. Or simply, you can think of them being passed to the lambda as an argument.
Example:
int x=0,y=0,z=0;
[=]()mutable->void // () are required, when you specify 'mutable'
{
x++;
// Since all variable are captured in call-by-value mode,
// compiler raises warning that y,z are not used!
}();
// the value of x remain zero
After the lambda-call, the value of 'x
' remains zero. Since only a copy of x is modified, not the reference. It is also interesting to know that the compiler raises a warning only for y
and z
, and not the previously defined (a
, b
, c
...) variables. However, it doesn't complain if you use previously defined variables. Smart compiler - I cannot speak more on it!
Function-pointers do not maintain state. Lambdas do. With by-reference captures, lambdas can maintain their state between calls. Functions cannot. Function pointers are not type safe, they are prone to errors, we must mangle with calling-conventions, and requires complicated syntax.
Function-objects do maintain states very well. But even for a small routine, you must write a class, place some variables into it, and overload the ()
operator. Importantly, you must do this outside the function block so that the other function, which is supposed to call the operator()
for this class, must know it. This breaks the flow of the code.
Lambdas are actually classes. You can store them in a function class object. This class, for lambdas, is defined in the std::tr1 namespace. Let's look at an example:
#include<functional>
....
std::tr1::function (int)>s IsEven = [](int n)->bool { return n%2 == 0;};
...
IsEven(23);
The tr1 namespace is for Technical Report 1, which is used by the C++0x committee members. Please search by yourself for more information.
expresses the templates parameter for the function
class, which says: the function returns a bool
and takes an argument int
. Depending on the lambda being put into the function object, you must typecast it properly; otherwise, the compiler would emit an error or warning for type-mismatch. But, as you can see, using the auto
keyword is much more convenient.
There are cases, however, where you must use a function
- when you need to pass lambdas across function calls. For example:
using namespace std::tr1;
void TakeLambda(function (int)> lambda)
// Cannot use 'auto' in function argument
{
// call it!
lambda(32);
}
// Somewhere in program ...
TakeLambda(DisplayIfEven); // See code above for 'DisplayIfEven'
The DisplayIfEven
lambda (or function!) takes int
, and returns nothing. The function
class is used the same way as the argument in TakeLambda
. Further, it calls the lambda, which eventually calls the DisplayIfEven
lambda.
I have simplified TakeLamba
, which should have been (shown incrementally):
// Reference, should not copy 'function' object
void TakeLambda(function< void(int) > & lambda);
// Const reference, should not modify function object
void TakeLambda(const function< void(int) > & lambda);
// Fully qualified name
void TakeLambda(const std::tr1::function< void(int) > & lambda);
Lambdas are very useful for many STL functions - functions that require function-pointers or function-objects (with operator()
overloaded). In short, lambdas are useful for those routines that demand callback functions. Initially, I will not cover the STL functions, but explain the usability of lambdas in a simpler and understandable form. The non-STL examples may be superfluous and nonsense, but quite capable of clearing out this topic.
For example, the following function requires a function to be passed. It will call the passed function. The function-pointer, function-object, or the lambda should be of a type that returns void
and takes an int
as the sole argument.
void CallbackSomething(int nNumber, function (int)> callback_function)
{
// Call the specified 'function'
callback_function(nNumber);
}
Here, I call the CallbackSomething
function in three different ways:
// Function
void IsEven(int n)
{
std::cout << ((n%2 == 0) ? "Yes" : "No");
}
// Class with operator () overloaded
class Callback
{
public:
void operator()(int n)
{
if(n<10)
std::cout << "Less than 10";
else
std::cout << "More than 10";
}
};
int main()
{
// Passing function pointer
CallbackSomething(10, IsEven);
// Passing function-object
CallbackSomething(23, Callback());
// Another way..
Callback obj;
CallbackSomething(44, obj);
// Locally defined lambda!
CallbackSomething(59, [](int n) { std::cout << "Half: " << n/2;} );
}
Okay! Now I want that the Callback
class can display if a number is greater than some N number (instead of a constant 10). We can do it this way:
class Callback
{
/*const*/ int Predicate;
public:
Callback(int nPredicate) : Predicate(nPredicate) {}
void operator()(int n)
{
if( n < Predicate)
std::cout << "Less than " << Predicate;
else
std::cout << "More than " << Predicate;
}
};
In order to make this callable, we just need to construct it with some integer constant. The original CallbackSomething
need not be changed - it still can call a routine with an integer argument! This is how we do it:
// Passing function-object
CallbackSomething(23, Callback(24));
// 24 is argument to Callback CTOR, not to CallbackSomething!
// Another way..
Callback obj(99); // Set 99 are predicate
CallbackSomething(44, obj);
This way, we made the Callback
class capable of maintaining its state. Remember, as long as the object remains, its state remains. Thus, if you pass an obj
object into multiple calls of CallbackSomething
(or any other similar function), it will have the same Predicate (state). As you know, this is not possible with function pointers - unless we introduce another argument to the function. But doing so breaks the entire program structure. If a particular function is demanding a callable function, with a specific type, we need to pass a function of that type only. Function pointers are unable to maintain state, and thus are unusable in these kind of scenarios.
Is this possible with lambdas? As mentioned previously, lambdas can maintain state through capture specification. So, yes, it is possible with lambdas to achieve this stateful functionality. Here is the modified lambda, being stored in an auto
variable:
int Predicate = 40;
// Lambda being stored in 'stateful' variable
auto stateful = [Predicate](int n)
{ if( n < Predicate)
std::cout << "Less than " << Predicate;
else
std::cout << "More than " << Predicate;
};
CallbackSomething(59, stateful ); // More than 40
Predicate=1000;
CallbackSomething(100, stateful); // Predicate NOT changed for lambda!
The stateful
lambda is locally defined in a function, is concise than a function-object, and cleaner than a function pointer. Also, it now has its state. Thus, it will print "More than 40" for the first call, and the same thing for the second call as well.
Note that Predicate
is passed as by-value (non-mutable also), so modifying the original variable will not affect its state in lambda. To reflect the predicate-modification in lambda, we just need to capture this variable by reference. When we change the lambda as follows, the second call will print "Less than 1000".
auto stateful = [&Predicate](int n) // Capturing by Reference
This is similar to adding a method like SetPredicate
in a class that would modify the predicate (state). Please see the VC++ blog, linked below, for a discussion on lambda - class mapping (the blogger calls it mental translation).
The for_each
STL function calls the specified function for each element in the range/collection. Since it uses a template, it can take any type of data-type as its argument. We will use this as an example for lambdas. For simplicity, I would use plain arrays, instead of vectors or lists. For example:
using namespace std;
int Array[10] = {1,2,3,4,5,6,7,8,9,10};
for_each(Array, &Array[10], IsEven);
for_each(Array, Array+10, [](int n){ std::cout << n << std::endl;});
The first call calls the IsEven
function, and the second call calls the lambda, defined within the for_each
function. It calls both of the functions 10 times, since the range contains/specifies 10 elements in it. I need not repeat that the second argument to for_each
is exactly same (oh! but I repeated!).
This was a very simple example, where for_each
and lambdas can be utilized to display values without needing to write a function or class. For sure, a lambda can be extended further to perform extra work - like displaying if a number is prime or not, or calculating the sum (use reference-by-capture), or to modify (say, multiply by 4) the element of a range.
Well, yes! You can do that. For long, I talked about taking captures by references and making modifications, but did not cover modifying the argument itself. The need did not arise till now. To do this, just take the lambda's parameter by reference (or pointer):
// The 'n' is taken as reference (NOT same as capture-by-reference!)
for_each(Array, Array+10, [](int& n){ n *= 4; });
The above for_each
call multiplies each element of Array
by 4.
Like I explained how to utilize lambdas with the for_each
function, you can use it with other transform
, generate
, remove_if
, etc. Lambdas are not just limited to STL algorithms, they can also be efficiently used wherever a function-object is required. You need to make sure it takes the proper number and type of arguments, and check if it needs argument modifications and things like that. Since this article is not about STL or templates, I will not discuss this further.
Yes, quite disappointing and confusing, but true! You cannot use a lambda as an argument to a function that requires a function-pointer. Through sample code, let me first express what I am trying to say:
// Typedef: Function pointer that takes and int
typedef void (*DISPLAY_ROUTINE)(int);
// The function, that takes a function-pointer
void CalculateSum(int a,int b, DISPLAY_ROUTINE pfDisplayRoutine)
{
// Calling the supplied function pointer
pfDisplayRoutine(a+b);
}
CalculateSum
takes a function-pointer of type DISPLAY_ROUTINE
. The following code would work, as we are supplying a function-pointer:
void Print(int x)
{
std::cout << "Sum is: " << x;
}
int main()
{
CalculateSum(500,300, Print);
}
But the following call will not:
CalculateSum (10, 20, [](int n) {std::cout<<"sum is: "<<n;} );
// C2664: 'CalculateSum' : cannot convert parameter 3 from
// '`anonymous-namespace'::' to 'DISPLAY_ROUTINE'
Why? Because lambdas are object-oriented, they are actually classes. The compiler internally generates a class-model for the lambdas. That internally generated class has operator ()
overloaded; and has some data members (inferred via capture-specification and mutable-specification) - those may be const, reference, or normal member variables and classic stuff like that. That class cannot be downgraded to a normal function-pointer.
Well, because of the smart class named std::function
! See (above) that CallbackSomething
is actually taking function
as an argument, not a function-pointer.
As with for_each
- this function doesn't take std::function
, but uses a template instead. It directly calls the passed argument with parenthesis. Carefully understand the simplified implementation:
template Iteartor, class Function>
void for_each(Iteartor first, Iterator, Function func)
// Ignore return type,and other arguments
{
// Assume the following call is in loop
// The 'func' can be normal function, or
// it can be a class object, having () operator overloaded.
func(first);
}
Likewise, other STL functions, like find
, count_if
etc., would work will all three cases: function-pointers, function-objects, and lambdas.
Thus, if you plan to use lambdas in APIs like SetTimer
, EnumFontFamilies
, etc. - unplan it! Even with forceful typecasting the lambda (by taking its address), it won't work. The program will crash at runtime.
Let's start with a simple example. Following is a function whose return type is long
. This is not a lambda, but a function.
auto GetCPUSpeedInHertz() -> long
{
return 1234567890;
}
It returns a long
, as you can see. It uses the new syntax of specifying a return type, known as Trailing Return Type. The keyword auto
on the left is just a placeholder, the actual type is specified after the ->
operator. One more example:
auto GetPI() -> decltype(3.14)
{
return 3.14159;
}
Where the return type is deduced from the expression. Please re-read about the decltype
keyword above, to recollect. For both of the functions given above, you obviously need not use this feature!
Consider the template function:
template FirstType, typename SecondType>
/*UnknonwnReturnType*/ AddThem(FirstType t1, SecondType t2)
{
return t1 + t2;
}
The function adds two arbitrary values, and returns it. Now, if I pass an int
and a double
as the first and second arguments, respectively, what should the return type be? You'd say double
. Does that mean we should make SecondType
the return type of the function?
template FirstType, typename SecondType>
SecondType AddThem(FirstType t1, SecondType t2);
Actually, we cannot. For the obvious reason that this function may be called with any type of argument on the left or right side, and either type can be of a higher magnitude. For example:
AddThem(10.0, 'A');
AddThem("CodeProject", ".com");
AddThem("C++", 'X');
AddThem(vector_object, list_object);
Also, the operator +
(in the function) may call another overloaded function that may return a third type. The solution is to use the following:
template FirstType, typename SecondType>
auto AddThem(FirstType t1, SecondType t2) -> decltype(t1 + t2)
{
return t1 + t2;
}
As mentioned in decltype
's explanation, that type is determined though the expression; the actual type of t1+t2
is determined that way. If the compiler can upgrade the type, it will (like int
, double
upgrades to double
). If type/types are of class(es), and +
leads an overloaded operator to be called, the return type of that overloaded +
operator would be the return type. If types are not native, and no overload could be found, the compiler would raise an error. It is important to note that the type deduction will only take place when you instantiate the template function with some data-types. Before that, the compiler won't bother checking anything (same rules as with normal template functions).
I assume that you know what call-by-value, by reference, and by constant reference mean. I further assume that you know what L-value and R-value mean. Now, let's take an example where R-value references would make sense:
class Simple {};
Simple GetSimple()
{
return Simple();
}
void SetSimple(const Simple&)
{ }
int main()
{
SetSimple( GetSimple() );
}
Here, you can see that the GetSimple
method returns a Simple
object. And, SetSimple
takes a Simple
object by reference. In a call to SetSimple
, we are passing GetSimple
- as you can see, the returned object is about to be destroyed, as soon as SetSimple
returns. Let's extend SetSimple
:
void SetSimple(const Simple& rSimple)
{
// Create a Simple object from Simple.
Simple object (rSimple);
// Use object...
}
Kindly ignore the missing copy constructor, we are using the default copy constructor. For understanding, assume the copy constructor (or a normal constructor) is allocating some amount of memory, say 100 bytes. The destructor is supposed to destroy 100 bytes. Don't get the problem in here? Okay, let me explain. Two objects are being created (one in GetSimple
, one in SetSimple
), and both are allocating 100 bytes of memory. Right? This is like copying a file/folder to another location. But as you can see from the example code, only 'object
' is being utilized. Then, why should we allocate 100 bytes twice? Why cannot we use the 100 bytes allocated by the first Simple
object construction? In earlier versions of C++, there was no simple way, unless we wrote our own memory/object management routines (like the MFC/ATL CString
class does). In C++0x, we can do that. Thus, in short, we'll optimize the routine this way:
null
).By this, we save 100 bytes! Not a big amount, though. But if this feature is used in larger containers like strings, vectors, list, it would save huge amounts of memory, and time, both! Thus, it would improve the overall application performance. Though the problem and solution is available in the form of RVO and NRVO (Named Return Value Optimization), in Visual C++ 2005 and later versions, it is not as efficient and meaningful. For this, we use a new operator introduced: R-value Reference Declarator: &&
. Basic syntax: Type&& identifier. Now we modify the above code, step by step:
void SetSimple(Simple&& rSimple) // R-Value NON-CONST reference
{
// Performing memory assignment here, for simplicity.
Simple object;
object.Memory = rSimple.Memory;
rSimple.Memory = nullptr;
// Use object...
delete []object.Memory;
}
The above code is for moving the content from the old object to the new object. This is very similar to moving a file/folder. Note that the const has been removed, since we also need to reset (detach) the memory originally allocated by the first object. The class and GetSimple
are modified as mentioned below. The class now has a memory-pointer (say, void*
) named Memory
. The default constructor sets it as null
. This member is made public for simplifying the topic.
class Simple
{
public:
void* Memory;
Simple() { Memory = nullptr; }
Simple(int nBytes) { Memory = new char[nBytes]; }
};
Simple GetSimple()
{
Simple sObj(10);
return sObj;
}
What if you make a call like this:
Simple x;
SetSimple(x);
It would result in an error since the compiler cannot convert Simple
to Simple&&
. The variable 'x
' is not temporary, and cannot behave like an R-value reference. For this, we may provide an overloaded SetSimple
function that takes Simple
, Simple&
, or const Simple&
. Thus, you know by now that temporary objects are actually R-value references. With R-values, you achieve what is known as move semantics. Move semantics enables you to write code that transfers resources (such as dynamically allocated memory) from one object to another. To implement move semantics, we need to provide a move constructor, and optionally a move assignment operator (operator =
), to the class.
Let's implement moving the object within the class itself with the help of a move constructor. As you know, a copy constructor would have a signature like:
Simple(const Simple&);
The move-constructor would be very similar - just one more ampersand:
Simple(Simple&&);
But as you can see, the move-constructor is non-const. Like a copy constructor can take a non-const object (thus modify the source!), the move-constructor can also take const; nothing prevents this - but doing so forfeits the whole purpose of writing a move-constructor. Why? If you understood correctly, we are detaching the resource ownership from the original source (the argument of the move-constructor). Before I write more words to confuse you, let's take an example where the so-called move-constructor may be called:
Simple GetSimple()
{
Simple sObj(10);
return sObj;
}
Why? The object sObj
has been created on the stack. The return type is Simple
, which would mean the copy-constructor (if provided; otherwise, the default compiler provided) would be called. Further, the destructor for sObj
would be called. Now, assume the move-constructor is available. In that case, the compiler knows the object is being moved (ownership transfer), and it would call the move-constructor instead of the copy-constructor.
Here is the updated Simple
class implementation:
class Simple
{
// The resource
void* Memory;
public:
Simple() { Memory = nullptr; }
// The MOVE-CONSTRUCTOR
Simple(Simple&& sObj)
{
// Take ownership
Memory = sObj.Memory;
// Detach ownership
sObj.Memory = nullptr;
}
Simple(int nBytes)
{
Memory = new char[nBytes];
}
~Simple()
{
if(Memory != nullptr)
delete []Memory;
}
};
Here is what happens when you call the GetSimple
function:
GetSimple
function, allocates a new Simple
object on stack.Simple
.sObj
to a returnable object.Simple(Simple&&)
) now takes the Memory
content (instead of allocating again, as the copy-constructor would have done). Then it sets the original object's Memory
to be null
. It does not allocate and copy the memory!sObj
is to be destroyed.sObj
is called (~Simple()
), which sees that Memory
is null
- does nothing!Simple
, not Simple&
or Simple*
) causes the copy-constructor or move-constructor to be called. This point is very important to understand!Memory
to null
, so that DTOR
doesn't delete it.Memory!=nullptr
) is not required as per C++ standards, but is mentioned for clarity. On your classes, you must design MC and DTOR
so that they have a similar protocol.We see that we moved the original data to the new object. This way, we saved memory and processing time. As mentioned earlier, this saving is negligent - but when it is employed in larger data structures, and/or when temporary-objects are created and destroyed a lot, the saving is paramount!
Understand the following code:
Simple obj1(40);
Simple obj2, obj3;
obj2 = obj1;
obj3 = GetSimple();
As you know, obj2 = obj1
would call the assignment-operator, there is no ownership transfer. The contents of obj2
are replaced with contents of obj1
. If we don't provide an assignment operator, the compiler would provide the default assignment operator (and would copy byte-by-byte). The object on the right side remains unchanged.
The signature of the user-defined operator can be:
// void return is put for simplicity
void operator=(const Simple&); // Doesn't modify source!
What about the statement: obj3 = GetSimple()
? The object returned by GetSimple
is temporary, as you should clearly know by now. Thus, we can (and we should) utilize what is known as Move- Semantics (we've used the same concept in the move-constructor also!). Here is a simplified move assignment-operator:
void operator=(Simple&&); // Modifies the source, since it is temporary
And here is the modified Simple
class (previous code omitted for brevity). Self assignment is not taken care of:
class Simple
{
...
void operator = (const Simple& sOther)
{
// De-allocate current
delete[] Memory;
// Allocate required amount of memory,
// and copy memory.
// The 'new' and 'memcpy' calls not shown
// for brevity. Another variable in this
// class is also required to hold buffersize.
}
void operator = (Simple&& sOther)
{
// De-allocate current
delete[] Memory;
// Take other's memory contnent
Memory = sOther.Memory;
// Since we have taken the temporary's resource.
// Detach it!
sOther.Memory = nullptr;
}
};
Thus, in the case of the obj3 = GetSimple()
statement, the following things are happening:
GetSimple
function is being called, which returns a temporary object.SetSimple
mentioned above.Just like the move-constructor, the move-assignment operator, and the destructor (in short, all three) must agree with the same protocol for resource allocation/deallocation.
Assume the Simple
class we have been working on is some data container; like string, date/time, array, or anything you prefer. That data container is supposed to allow the following, with the help of operator overloading:
Simple obj1(10), obj2(20), obj3, obj4; // Pardon me for using such names!
obj3 = obj1 + obj2;
obj2 = GetSimple() + obj1;
obj4 = obj2 + obj1 + obj3;
You can simply say, "Provide the plus (+) operator in the class". Okay, we provide an operator+
in our Simple
class:
// This method is INSIDE the class
Simple operator+(const Simple&)
{
Simple sObj;
// Do binary + operation here...
return sObj;
}
Now, as you can see, a temporary object is created and is being returned from operator+
, which would eventually cause a call to the move-assignment operator (for the obj3 = obj1 + obj2
expression), and the resource is preserved - fair enough! (I hope you understand it fully before you move to the next paragraph.) For the next statement (obj2 = GetSimple() + obj1
), the object on the left itself is temporary. Note that in SetSimple
and in the move special-functions, the argument is temporary, not in this. There is no technique (at least in my knowledge) to make this a temporary object. Okay, okay, I am not Bjarne Stroustrup; here is the solution:
class Simple
{
...
// Give access to helper function
friend Simple operator+(Simple&& left, const Simple& right);
};
// Left argument is temporary/r-value reference
Simple operator+(Simple&& left, const Simple& right)
{
// Simulated operator+
// Just shows the resource ownership/detachment
// Does not use 'right'
Simple newObj;
newObj.Memory = left.Memory;
left.Memory = nullptr;
return newObj;
}
Explanation about the code:
operator+
is taking an R-value reference as its left argument; the right side argument is an ordinary constant reference of the object.Simple
object, which has just snatched ownership from a temporary object.What about the last statement (obj4 = obj2 + obj1 + obj3
)?
operator+
is called (for obj2 + obj1
). It returns a new Simple
object, we call it t1
.t1
(the outcome of obj2+obj1
), which is a temporary object, operator+
gets called again (t1+obj3
) - the out-of-class version of operator+
is called, which takes ownership from t1
.operator+
returns another (presumably the binary plus) object. We call the returnable object as t2
.t2
is to be assigned to obj4
, and since it is also a temporary object, the move-assignment operator gets called.Here it goes in a more simplified, non-verbal form (italic is the call being made):
This section lists the C++ features that were not added into the C++0x standard, but were added in VC8/VC9 compilers. They are now part of the C++0x standard.
What is the sizeof
enum? Four bytes? Well, depending on the compiler you choose, the size could vary. Does sizeof
an enum matter? Yes, if you put the enum as a member in a class/struct. The sizeof
class/struct changes, which makes the code less portable. Also, if struct is to be stored in a file or transferred, the problem is further compounded. Strongly typed enums enforce type, thus save from any bug creeping in. They make the code more portable within the software system. The solution is to specify the base-type of an enum:
enum Priority : BYTE // unsigned char
{
VeryLow = 0,
Low,
Medium,
High,
VeryHigh
};
Which results in sizeof(Priority)
to be 1 byte. Likewise, you can have any integral type as the base-type for an enum:
enum ByteUnit : unsigned __int64
{
Byte = 1,
KiloByte = 1024,
MegaByte = 1024L * 1024,
GigaByte = (unsigned __int64)1 << 30,
TeraByte = (unsigned __int64)1 << 40,
PetaByte = (unsigned __int64)1 << 50
};
The sizeof
this enum becomes 8-bytes, since the base-type is unsigned __int64
. If you do not specify the base-type, in this case, the compiler would warn you for putting an out of range value in the enum. Note: the Microsoft C/C++ compiler implements this feature only partially.
External reference: Proposal N2347.
When you declare templates of templates, like in the example below, you need to put extra white-space between the consecutive right angles (greater-than sign):
vector<list > ListVector;
Otherwise, the compiler would emit an error, since >>
is a valid C++ token (bitwise right shift). Other than multiple templates, this operator might appear when you typecast to a template, using the static_cast
operator:
static_cast<vector> (expression);
With the new C++0x standard, you can use consecutive right-angle brackets (more than twice also), like:
vector <vector<vector>> ComplexVector;
External reference: Proposal N1757
I failed to find the exact purpose and meaning of extern templates. Anyway, I am sharing what I discovered with this term. True, that I might be wrong - and expect you to share your knowledge, so that I could update this section. This is what I ascertained:
Explicating point 1 clearly is out of my comprehension - the details available are vague and overlapping, so I am not discussing point (1). For example, you have a class:
template T>
class Number
{
T data;
public:
Number() { data = 0; }
Number(T t) // or (const T& t)
{ data = t; }
T Add(T t) { return data + t; }
T Randomize(T t) { return data % t; }
};
Now, before you actually instantiate the template for some data-type, you want to make sure that the class compiles for that data type. That is, for this example, the type should support the +
and %
operators. Thus, you can specify the template instantiation in advance:
template Number ;
template Number ;
which would raise an error for the second specification, since operator %
is not valid for float
. Likewise, when you specify a template argument that doesn't support the operation the template class might need, the compiler would complain. For this template class, the template-type must support assignment to zero, assignment operator, operator +
, and operator %
. Note that we did not actually instantiate the template-class. This is just like declaring a function, and specifying the argument and return type. The function is externally defined somewhere else.
Though this article is almost complete, there are may be a few glitches, spelling/grammatical mistakes, small mistakes in code etc. Please let me know of them. For the downloadable code - I am wondering if this content requires some code?
This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
If you interest to the source code, you can get them from Code Project.