This is where the real fun begins. So let’s get to it!
Consider putting single statements associated with an if or else in blocks (particularly while you are learning). More experienced C++ developers sometimes disregard this practice in favor of tighter vertical spacing.
If the programmer does not declare a block in the statement portion of an if statement or else statement, the compiler will implicitly declare one. Thus:
if (condition)
    true_statement;
else
    false_statement;
is actually the equivalent of:
if (condition)
{
    true_statement;
}
else
{
    false_statement;
}
A null statement is an expression statement that consists of just a semicolon:
if (x > 10)
    ; // this is a null statement
Be careful not to “terminate” your if statement with a semicolon, otherwise your conditional statement(s) will execute unconditionally (even if they are inside a block).
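As a minimal sketch of this pitfall, the following compares an if statement that was accidentally terminated with a semicolon against the intended version. The function names classifyBuggy and classifyFixed are hypothetical, chosen only for illustration. Many compilers will warn about the empty body in the buggy version.

```cpp
#include <string>

// Buggy: the semicolon after the condition is a null statement, and it
// becomes the entire body of the if statement.
std::string classifyBuggy(int x)
{
    std::string result{ "small" };

    if (x > 10); // oops: the if statement ends here
    {
        result = "large"; // ordinary block, so this always executes
    }

    return result;
}

// Fixed: no stray semicolon, so the block is the body of the if statement.
std::string classifyFixed(int x)
{
    std::string result{ "small" };

    if (x > 10)
    {
        result = "large"; // only executes when x > 10
    }

    return result;
}
```

Calling classifyBuggy(1) yields "large" even though 1 is not greater than 10, while classifyFixed(1) yields "small".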
Prefer switch statements over if-else chains when there is a choice.
Following the conditional expression, we declare a block. Inside the block, we use labels to define all of the values we want to test for equality. There are two kinds of labels.
Place the default case last in the switch block.
A break statement (declared using the break keyword) tells the compiler that we are done executing statements within the switch, and that execution should continue with the statement after the end of the switch block. This allows us to exit a switch statement without exiting the entire function.
Each set of statements underneath a label should end in a break statement or a return statement. This includes the statements underneath the last label in the switch.
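To illustrate, here is a small sketch in which every set of statements under a label ends in either a return or a break. The helpers digitName and legCount (and the animal codes they use) are hypothetical and exist only for this example.

```cpp
#include <string_view>

// Each case ends in a return, which exits both the switch and the function.
std::string_view digitName(int x)
{
    switch (x)
    {
    case 1:
        return "one";
    case 2:
        return "two";
    case 3:
        return "three";
    default:
        return "unknown";
    }
}

// Each case ends in a break, so execution continues after the switch block.
int legCount(char animal)
{
    int legs{ 0 };

    switch (animal)
    {
    case 'b': // bird
        legs = 2;
        break;
    case 'c': // cat
        legs = 4;
        break;
    default:
        legs = 0;
        break; // the statements under the last label get a break too
    }

    return legs; // reached after any of the breaks above
}
```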
#include <iostream>

int main()
{
    switch (2)
    {
    case 1: // Does not match
        std::cout << 1 << '\n'; // Skipped
    case 2: // Match!
        std::cout << 2 << '\n'; // Execution begins here
    case 3:
        std::cout << 3 << '\n'; // This is also executed
    case 4:
        std::cout << 4 << '\n'; // This is also executed
    default:
        std::cout << 5 << '\n'; // This is also executed
    }

    return 0;
}
This is probably not what we wanted! When execution flows from a statement underneath a label into statements underneath a subsequent label, this is called fallthrough.
Once the statements underneath a case or default label have started executing, they will overflow (fallthrough) into subsequent cases. Break or return statements are typically used to prevent this.
Attributes are a modern C++ feature that allows the programmer to provide the compiler with some additional data about the code. To specify an attribute, the attribute name is placed between double square brackets ([[ and ]]). Attributes are not statements – rather, they can be used almost anywhere where they are contextually relevant.
The [[fallthrough]] attribute modifies a null statement to indicate that fallthrough is intentional (and no warnings should be triggered):
#include <iostream>

int main()
{
    switch (2)
    {
    case 1:
        std::cout << 1 << '\n';
        break;
    case 2:
        std::cout << 2 << '\n'; // Execution begins here
        [[fallthrough]]; // intentional fallthrough -- note the semicolon to indicate the null statement
    case 3:
        std::cout << 3 << '\n'; // This is also executed
        break;
    }

    return 0;
}
This program prints:
2
3
And it should not generate any warnings about the fallthrough.
Use the [[fallthrough]] attribute (along with a null statement) to indicate intentional fallthrough.
Remember, execution begins at the first statement after a matching case label. Case labels aren’t statements (they’re labels), so they don’t count.
With if statements, you can only have a single statement after the if-condition, and that statement is considered to be implicitly inside a block:
if (x > 10)
    std::cout << x << " is greater than 10\n"; // this line implicitly considered to be inside a block
However, with switch statements, the statements after labels are all scoped to the switch block. No implicit blocks are created.
You can declare or define (but not initialize) variables inside the switch, both before and after the case labels:
switch (1)
{
    int a; // okay: definition is allowed before the case labels
    int b{ 5 }; // illegal: initialization is not allowed before the case labels
case 1:
    int y; // okay but bad practice: definition is allowed within a case
    y = 4; // okay: assignment is allowed
    break;
case 2:
    int z{ 4 }; // illegal: initialization is not allowed if subsequent cases exist
    y = 5; // okay: y was declared above, so we can use it here too
    break;
case 3:
    break;
}
Although variable y was defined in case 1, it was used in case 2 as well. All statements inside the switch are considered to be part of the same scope. Thus, a variable declared or defined in one case can be used in a later case, even if the case in which the variable is defined is never executed (because the switch jumped over it)!
However, initialization of variables does require the definition to execute at runtime (since the value of the initializer must be determined at that point). Initialization of variables is disallowed in any case that is not the last case (because the initializer could be jumped over, which would leave the variable uninitialized). Initialization is also disallowed before the first case, because those statements will never be executed: there is no way for the switch to reach them.
If a case needs to define and/or initialize a new variable, the best practice is to do so inside an explicit block underneath the case statement:
switch (1)
{
case 1:
{ // note addition of explicit block here
    int x{ 4 }; // okay, variables can be initialized inside a block inside a case
    std::cout << x;
    break;
}
default:
    std::cout << "default case\n";
    break;
}
If defining variables used in a case statement, do so in a block inside the case.
Spaghetti code is code that has a path of execution that resembles a bowl of spaghetti (all tangled and twisted), making it extremely difficult to follow the logic of such code.
Avoid goto statements (unless the alternatives are significantly worse for code readability).
Favor while(true) for intentional infinite loops.
Often, we want a loop to execute a certain number of times. To do this, it is common to use a loop variable, often called a counter.
Loop variables should be of type (signed) int.
Each time a loop executes, it is called an iteration.
Favor while loops over do-while when given an equal choice.
The for statement looks pretty simple in abstract:
for (init-statement; condition; end-expression)
    statement;
Avoid operator!= when doing numeric comparisons in the for-loop condition.
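One way to see the reasoning is a loop whose counter advances by more than 1. In this sketch (countSteps is a hypothetical helper, and step is assumed to be positive), operator< still terminates the loop when the counter jumps past the limit, whereas operator!= would not.

```cpp
// Counts iterations of a loop that advances by step until it reaches limit.
// Assumes step > 0.
int countSteps(int limit, int step)
{
    int iterations{ 0 };

    // With i != limit, a call like countSteps(10, 3) would see i go
    // 0, 3, 6, 9, 12, ... and never equal 10, so the loop would not stop there.
    for (int i{ 0 }; i < limit; i += step)
        ++iterations;

    return iterations;
}
```

For countSteps(10, 3), the loop body runs for i equal to 0, 3, 6, and 9, so the result is 4.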
Defining multiple variables (in the init-statement) and using the comma operator (in the end-expression) is acceptable inside a for statement.
Prefer for loops over while loops when there is an obvious loop variable.
Prefer while loops over for loops when there is no obvious loop variable.
New programmers sometimes have trouble understanding the difference between break and return. A break statement terminates the switch or loop, and execution continues at the first statement beyond the switch or loop. A return statement terminates the entire function that the loop is within, and execution continues at the point where the function was called.
The continue statement provides a convenient way to end the current iteration of a loop without terminating the entire loop.
Continue statements work by causing the current point of execution to jump to the bottom of the current loop.
In the case of a for loop, the end-expression of the for loop still executes after a continue (since this happens after the end of the loop body).
Be careful when using a continue statement with while or do-while loops. These loops typically change the value of variables used in the condition inside the loop body. If use of a continue statement causes these lines to be skipped, then the loop can become infinite!
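The following sketch shows one way to use continue safely in a while loop, next to the same logic written as a for loop. Both function names are hypothetical.

```cpp
// Sums the integers from 0 to 9, skipping multiples of 3, using a while loop.
int sumSkippingMultiplesOfThreeWhile()
{
    int sum{ 0 };
    int count{ 0 };

    while (count < 10)
    {
        if (count % 3 == 0)
        {
            ++count;  // update the loop variable BEFORE continue,
            continue; // otherwise count would never change and the loop would be infinite
        }

        sum += count;
        ++count;
    }

    return sum;
}

// The same logic as a for loop: ++count is the end-expression, so it still
// executes after a continue.
int sumSkippingMultiplesOfThreeFor()
{
    int sum{ 0 };

    for (int count{ 0 }; count < 10; ++count)
    {
        if (count % 3 == 0)
            continue;

        sum += count;
    }

    return sum;
}
```

Both versions add 1, 2, 4, 5, 7, and 8, giving 27.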
Use break and continue when they simplify your loop logic.
Our stance is that early returns are more helpful than harmful, but we recognize that there is a bit of art to the practice.
Use early returns when they simplify your function’s logic.
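As a small illustration, an early return can dispose of a special case up front so that the main logic stays unnested. The function safeDivide is hypothetical, and returning 0 for a zero divisor is an arbitrary choice made for this sketch.

```cpp
// Returns the integer quotient, or 0 if the divisor is zero.
int safeDivide(int numerator, int denominator)
{
    if (denominator == 0)
        return 0; // early return: handle the special case and leave

    return numerator / denominator; // main logic needs no else block
}
```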
One important note about calling std::exit() explicitly: std::exit() does not clean up any local variables (either in the current function, or in functions up the call stack). Because of this, it’s generally better to avoid calling std::exit().
The std::exit() function does not clean up local variables in the current function or up the call stack.
Because std::exit() terminates the program immediately, you may want to manually do some cleanup before terminating. In this context, cleanup means things like closing database or network connections, deallocating any memory you have allocated, writing information to a log file, etc…
In the above example, we called function cleanup() to handle our cleanup tasks. However, remembering to manually call a cleanup function before every call to exit() adds burden to the programmer.
To assist with this, C++ offers the std::atexit() function, which allows you to specify a function that will automatically be called on program termination via std::exit().
A few notes here about std::atexit() and the cleanup function: First, because std::exit() is called implicitly when main() terminates, this will invoke any functions registered by std::atexit() if the program exits that way. Second, the function being registered must take no parameters and have no return value. Finally, you can register multiple cleanup functions using std::atexit() if you want, and they will be called in reverse order of registration (the last one registered will be called first).
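A minimal sketch of registering a cleanup function follows. The names cleanup and registerCleanup are hypothetical; std::atexit itself returns 0 when registration succeeds.

```cpp
#include <cstdlib>  // for std::atexit
#include <iostream>

// The registered function must take no parameters and have no return value.
void cleanup()
{
    std::cout << "cleanup!\n";
}

// Registers cleanup() to be called on normal program termination.
// Returns true if the registration succeeded.
bool registerCleanup()
{
    return std::atexit(cleanup) == 0;
}
```

Calling registerCleanup() once near the start of main() would be enough: a later call to std::exit(), or a normal return from main(), would then invoke cleanup() without any explicit call.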
When should you use a halt? The short answer is “almost never”. Destroying local objects is an important part of C++ (particularly when we get into classes), and none of the above-mentioned functions clean up local variables. Exceptions are a better and safer mechanism for handling error cases.
Only use a halt if there is no safe way to return normally from the main function. If you haven’t disabled exceptions, prefer using exceptions for handling errors safely.
Software testing (also called software validation) is the process of determining whether or not the software actually works as expected.
Write your program in small, well defined units (functions or classes), compile often, and test your code as you go.
Because writing functions to exercise other functions is so common and useful, there are entire frameworks (called unit testing frameworks) that are designed to help simplify the process of writing, maintaining, and executing unit tests. Since these involve third party software, we won’t cover them here, but you should be aware they exist.
Question #1
When should you start testing your code?
As soon as you’ve written a non-trivial function.
The term code coverage is used to describe how much of the source code of a program is executed while testing. There are many different metrics used for code coverage. We’ll cover a few of the more useful and popular ones in the following sections.
The term statement coverage refers to the percentage of statements in your code that have been exercised by your testing routines.
Branch coverage refers to the percentage of branches that have been executed, each possible branch counted separately. An if statement has two branches – a branch that executes when the condition is true, and a branch that executes when the condition is false (even if there is no corresponding else statement to execute). A switch statement can have many branches.
int foo(int x, int y)
{
    int z{ y };
    if (x > y)
    {
        z = x;
    }
    return z;
}
The previous call to foo(1, 0) gave us 100% statement coverage and exercised the use case where x > y, but that only gives us 50% branch coverage. We need one more call, to foo(0, 1), to test the use case where the if statement does not execute.
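The two calls described above could be written as a small test function. This is only a sketch: testFoo is a hypothetical name, and foo() is repeated here so that the example is self-contained.

```cpp
#include <cassert> // for assert

int foo(int x, int y)
{
    int z{ y };
    if (x > y)
    {
        z = x;
    }
    return z;
}

// Exercises both branches of the if statement in foo().
void testFoo()
{
    assert(foo(1, 0) == 1); // x > y: the if body executes
    assert(foo(0, 1) == 1); // x <= y: the if body is skipped
}
```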
Aim for 100% branch coverage of your code.
Loop coverage (informally called the 0, 1, 2 test) says that if you have a loop in your code, you should ensure it works properly when it iterates 0 times, 1 time, and 2 times. If it works correctly for the 2-iteration case, it should work correctly for all iterations greater than 2. These three tests therefore cover all possibilities (since a loop can’t execute a negative number of times).
Consider:
#include <iostream>

void spam(int timesToPrint)
{
    for (int count{ 0 }; count < timesToPrint; ++count)
        std::cout << "Spam! ";
}
To test the loop within this function properly, you should call it three times: spam(0) to test the zero-iteration case, spam(1) to test the one-iteration case, and spam(2) to test the two-iteration case. If spam(2) works, then spam(n) should work, where n > 2.
Use the 0, 1, 2 test to ensure your loops work correctly with different numbers of iterations.
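To make the 0, 1, 2 test checkable in code rather than by eye, one option is a variant of spam() that returns its text instead of printing it. The names spamText and passesZeroOneTwoTest are hypothetical, introduced only for this sketch.

```cpp
#include <string>

// Variant of spam() that builds the text instead of printing it.
std::string spamText(int timesToPrint)
{
    std::string text{};

    for (int count{ 0 }; count < timesToPrint; ++count)
        text += "Spam! ";

    return text;
}

// Runs the zero-, one-, and two-iteration cases and reports whether all pass.
bool passesZeroOneTwoTest()
{
    return spamText(0).empty()
        && spamText(1) == "Spam! "
        && spamText(2) == "Spam! Spam! ";
}
```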
When writing functions that accept parameters, or when accepting user input, consider what happens with different categories of input. In this context, we’re using the term “category” to mean a set of inputs that have similar characteristics.
For example, if I wrote a function to produce the square root of an integer, what values would it make sense to test it with? You’d probably start with some normal value, like 4. But it would also be a good idea to test with 0, and a negative number.
Here are some basic guidelines for category testing:
For integers, make sure you’ve considered how your function handles negative values, zero, and positive values. You should also check for overflow if that’s relevant.
For floating point numbers, make sure you’ve considered how your function handles values that have precision issues (values that are slightly larger or smaller than expected). Good double type values to test with are 0.1 and -0.1 (to test numbers that are slightly larger than expected) and 0.6 and -0.6 (to test numbers that are slightly smaller than expected).
For strings, make sure you’ve considered how your function handles an empty string (just a null terminator), normal valid strings, strings that have whitespace, and strings that are all whitespace.
If your function takes a pointer, don’t forget to test nullptr as well (don’t worry if this doesn’t make sense, we haven’t covered it yet).
Test different categories of input values to make sure your unit handles them properly.
The more arithmetic you do with a floating point number, the more it will accumulate small rounding errors.
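A short sketch of this accumulation: adding 0.1 ten times does not produce exactly 1.0 on typical platforms that use IEEE 754 doubles, so comparisons are usually done with a tolerance. The names sumTenths and closeEnough, and the choice of tolerance, are assumptions made for illustration.

```cpp
#include <cmath> // for std::abs

// Adds 0.1 to itself ten times; each addition can introduce a tiny rounding error.
double sumTenths()
{
    double sum{ 0.0 };

    for (int i{ 0 }; i < 10; ++i)
        sum += 0.1;

    return sum;
}

// Returns true if a and b differ by no more than epsilon.
bool closeEnough(double a, double b, double epsilon)
{
    return std::abs(a - b) <= epsilon;
}
```

Here sumTenths() == 1.0 is expected to be false, while closeEnough(sumTenths(), 1.0, 1e-9) is true.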
A program that handles error cases well is said to be robust.
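We can generally separate input text errors into four types:
Extraction fails.
The user enters more input than expected.
The user enters meaningless input.
The user overflows an input.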
Thus, to make our programs robust, whenever we ask the user for input, we ideally should determine whether each of the above can possibly occur, and if so, write code to handle those cases.
Let’s dig into each of these cases, and how to handle them using std::cin.
Some lessons still pass 32767 to std::cin.ignore(). This is a magic number with no special meaning to std::cin.ignore() and should be avoided. If you see such an occurrence, feel free to point it out.
As you write your programs, consider how users will misuse your program, especially around text input. For each point of text input, consider:
Could extraction fail?
Could the user enter more input than expected?
Could the user enter meaningless input?
Could the user overflow an input?
You can use if statements and boolean logic to test whether input is expected and meaningful.
The following code will clear any extraneous input:
std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
The following code will test for and fix failed extractions or overflow:
if (!std::cin) // has a previous extraction failed or overflowed?
{
    // yep, so let's handle the failure
    std::cin.clear(); // put us back in 'normal' operation mode
    std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n'); // and remove the bad input
}
Finally, use loops to ask the user to re-enter input if the original input was invalid.
Input validation is important and useful, but it also tends to make examples more complicated and harder to follow. Accordingly, in future lessons, we will generally not do any kind of input validation unless it’s relevant to something we’re trying to teach.
An assertion is an expression that will be true unless there is a bug in the program.
Use assertions to document cases that should be logically impossible.
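For instance, a function that should never be handed a non-positive count could document that with an assert. The function averagePerItem is a hypothetical example.

```cpp
#include <cassert> // for assert

// The caller is required to pass a positive count; anything else is a bug.
double averagePerItem(double total, int count)
{
    // If this ever fails, the program stops with a message naming this condition.
    assert(count > 0 && "averagePerItem: count must be positive");

    return total / count;
}
```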
An algorithm is considered to be stateful if it retains some information across calls. Conversely, a stateless algorithm does not store any information (and must be given all the information it needs to work with when it is called). Our plusOne() function is stateful, in that it uses the static variable s_state to store the last number that was generated. When applied to algorithms, the term state refers to the current values held in stateful variables.
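The plusOne() function referred to above is not shown in this section. A minimal sketch of what such a stateful function could look like follows, alongside a stateless counterpart. The starting value of 3 and the name plusOneStateless are assumptions made for illustration.

```cpp
// Stateful: remembers the last number generated between calls.
int plusOne()
{
    static int s_state{ 3 }; // initialized only the first time the function is called

    ++s_state;      // first modify the state
    return s_state; // then use the new state as the next number in the sequence
}

// Stateless: stores nothing, so the caller must supply everything it needs.
int plusOneStateless(int current)
{
    return current + 1;
}
```

Successive calls to plusOne() return 4, then 5, and so on, whereas plusOneStateless(3) returns 4 every time.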
When a PRNG is instantiated, an initial value (or set of values) called a random seed (or seed for short) can be provided to initialize the state of the PRNG. When a PRNG has been initialized with a seed, we say it has been seeded.
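Because the seed determines the initial state, two generators seeded with the same value produce the same sequence. The sketch below checks this for std::mt19937; the function name sameSeedSameSequence is hypothetical.

```cpp
#include <random> // for std::mt19937

// Returns true if two generators seeded with the same value produce the
// same first five numbers.
bool sameSeedSameSequence(unsigned int seed)
{
    std::mt19937 first{ seed };
    std::mt19937 second{ seed };

    for (int i{ 0 }; i < 5; ++i)
    {
        if (first() != second()) // each call advances that generator's state
            return false;
    }

    return true;
}
```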
Probably. For most applications, Mersenne Twister is fine, both in terms of performance and quality.
However, it’s worth noting that by modern PRNG standards, Mersenne Twister is a bit outdated. The biggest issue with Mersenne Twister is that its results can be predicted after seeing 624 generated numbers, making it unsuitable for any application that requires non-predictability.
If you are developing an application that requires the highest quality random results (e.g. a statistical simulation), the fastest results, or one where non-predictability is important (e.g. cryptography), you’ll need to use a 3rd party library.
Popular choices as of the time of writing:
The Xoshiro family and Wyrand for non-cryptographic PRNGs.
The ChaCha family for cryptographic (non-predictable) PRNGs.
Okay, now that your eyes are probably bleeding, that’s enough theory. Let’s discuss how to actually generate random numbers with Mersenne Twister in C++.
Since mt is a variable, you may be wondering what mt() means.
In lesson 4.17 – Introduction to std::string, we showed an example where we called the function name.length(), which invoked the length() function on std::string variable name.
mt() is a concise syntax for calling the member function operator() on mt (written out in full as mt.operator()()), which for these PRNG types has been defined to return the next random result in the sequence. The advantage of using operator() instead of a named function is that we don’t need to remember the function’s name, and the concise syntax is less typing.
#include <iostream>
#include <random> // for std::mt19937
#include <chrono> // for std::chrono

int main()
{
    // Seed our Mersenne Twister using the steady clock
    std::mt19937 mt{ static_cast<unsigned int>(
        std::chrono::steady_clock::now().time_since_epoch().count()
        ) };

    // Create a reusable random number generator that generates uniform numbers between 1 and 6
    std::uniform_int_distribution die6{ 1, 6 }; // for C++14, use std::uniform_int_distribution<> die6{ 1, 6 };

    // Print a bunch of random numbers
    for (int count{ 1 }; count <= 40; ++count)
    {
        std::cout << die6(mt) << '\t'; // generate a roll of the die here

        // If we've printed 10 numbers, start a new row
        if (count % 10 == 0)
            std::cout << '\n';
    }

    return 0;
}
std::chrono::high_resolution_clock is a popular choice instead of std::chrono::steady_clock. std::chrono::high_resolution_clock is the clock that uses the most granular unit of time, but it may use the system clock for the current time, which can be changed or rolled back by users. std::chrono::steady_clock may have a less granular tick time, but is the only clock with a guarantee that users cannot adjust it.
Use std::random_device to seed your PRNGs (unless it’s not implemented properly for your target compiler/architecture).
std::random_device{} creates a value-initialized temporary object of type std::random_device. The () then calls operator() on that temporary object, which returns a randomized value (which we use as an initializer for our Mersenne Twister).
It’s the equivalent of calling the following function, which uses a syntax you should be more familiar with:
unsigned int getRandomDeviceValue()
{
    std::random_device rd{}; // create a value initialized std::random_device object
    return rd(); // return the result of operator() to the caller
}
Using std::random_device{}() allows us to get the same result without creating a named function or named variable, so it’s much more concise.
Only seed a given pseudo-random number generator once, and do not reseed it.
Here’s an example of a common mistake that new programmers make:
#include <iostream>
#include <random>

int getCard()
{
    std::mt19937 mt{ std::random_device{}() }; // this gets created and seeded every time the function is called
    std::uniform_int_distribution card{ 1, 52 };
    return card(mt);
}

int main()
{
    std::cout << getCard();

    return 0;
}
In the getCard() function, the random number generator is being created and seeded every time the function is called. This is inefficient at best, and will likely cause poor random results.
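One possible fix, sketched below, is to make the generator a static local variable so that it is created and seeded only on the first call. The distribution's template argument is written out explicitly so that the sketch does not depend on C++17 class template argument deduction.

```cpp
#include <random> // for std::mt19937 and std::random_device

int getCard()
{
    static std::mt19937 mt{ std::random_device{}() }; // static: created and seeded only once
    std::uniform_int_distribution<int> card{ 1, 52 };

    return card(mt); // each call advances the same generator
}
```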
What happens if we want to use a random number generator in multiple functions? One way is to create (and seed) our PRNG in our main() function, and then pass it everywhere we need it. But that’s a lot of passing for something we may only use sporadically, and in different places.
Although you can create a static local std::mt19937 variable in each function that needs it (static so that it only gets seeded once), it’s overkill to have every function that uses a random number generator define and seed its own local generator. A better option in most cases is to create a global random number generator (inside a namespace!). Remember how we told you to avoid non-const global variables? This is an exception.
#include <iostream>
#include <random> // for std::mt19937 and std::random_device

namespace Random // capital R to avoid conflicts with functions named random()
{
    std::mt19937 mt{ std::random_device{}() };

    int get(int min, int max)
    {
        std::uniform_int_distribution die{ min, max }; // we can create a distribution in any function that needs it
        return die(mt); // and then generate a random number from our global generator
    }
}

int main()
{
    std::cout << Random::get(1, 6) << '\n';
    std::cout << Random::get(1, 10) << '\n';
    std::cout << Random::get(1, 20) << '\n';

    return 0;
}
In the above example, Random::mt is a global variable that can be accessed from any function. We’ve created Random::get() as an easy way to get a random number between min and max. std::uniform_int_distribution is typically cheap to create, so it’s fine to create when we need it.
Halts allow us to terminate our program. Normal termination means the program has exited in an expected way (and the status code will indicate whether it succeeded or not). std::exit() is automatically called at the end of main, or it can be called explicitly to terminate the program. It does some cleanup, but does not clean up any local variables, or unwind the call stack.