C++ has several powerful features available for debugging no matter which platform you use, whether or not you have access to a debugger. The purpose of this article is to enumerate the methods you can use to debug your code, and discuss circumstances for their use.
When finding out about a new feature in a programming language, one's first inclination is often to ignore its drawbacks and try to substitute it for all other features. No design model is perfect for every problem, so this inclination is wasteful and merely leads to poorly designed code in which everything must be made to fit into the "better" design model.
You cannot choose the most suitable model without first considering your circumstances and the relative strengths and weaknesses of several different methods. Assertions, exceptions, logging, return values, etc. all have specific strengths and weaknesses.
I'll list some of my observations on these methods.
Pros: Easy to write tons of code, imposes no execution burden in debug or release builds
Cons: Better skip the country if it doesn't work
This method is more of a non-method - that's why it is called method 0. I thought I'd include it for completeness. If you use this method often, do your clients a favor and seek professional help.
It also gives me a chance to explain the theory of debugging as I see it and some conventions we'll use throughout this article. There are basically two versions of code in C++ - debug and release. The code must be functionally equivalent in both modes. The difference is that in debug mode we favor useful debugging aids over speed, and in release mode we often value speed over debugging. Of course, you can define different levels if need be, but for this article we'll only use two.
Note that debugging is different from cleanup.
Bugs are typically poorly designed code that fails under certain conditions. Debugging is the process of finding and eliminating bugs. There are many causes of bugs, but here is a short list:
Note that when you write code, your code is rarely completely independent. For example, your code typically depends on the standard library. Further, you will rarely code alone (if you intend to be a commercial success), which means interdependencies will exist between your code and that of your team members, and possibly that of your customers (if you code libraries).
Two terms are often used to describe the roles of people or source code as they relate to dependencies: server and client. People who use and depend on your code are said to be clients. When you depend on their code, you are their client. Server is rarely used here, but it means "the person who wrote the code."
Pros: Relatively fast, imposes no overhead in a release build, extremely simple to code.
Cons: Slows the code down a little bit in a debug build, provides no safety in a release build, requires clients to read source code when they debug.
An assertion is a boolean expression that must hold true in order for the program to continue to execute properly. You state an assertion in C++ by using the assert macro (declared in the standard header cassert), and passing it the expression that must be true:
assert( this );
If this is zero, the assert macro stops program execution, displays a message informing you that "assert( this )" failed on such and such a line in your source file, and lets you go from there. If this is nonzero, assert does nothing and your program continues to execute normally.
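Note that the standard assert macro expands to nothing in a release build (one compiled with NDEBUG defined) and so imposes no overhead there. The expression you pass to it is never evaluated, so do NOT use it like this: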
FILE* p = 0;
assert( p = fopen("myfile.txt", "r+") );
...because fopen will not be called in the release version! This is the correct way to do it:
FILE* p = fopen("myfile.txt", "r+");
assert( p );
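The pitfall above can be observed directly. The following is a minimal sketch, assuming the standard assert macro from cassert; the names touch, calls_in_release_style, and calls_in_debug_style are hypothetical and exist only for this illustration. It relies on the fact that the standard header redefines assert according to the current state of NDEBUG each time it is included.

```cpp
#define NDEBUG          // simulate a release build for the next include
#include <cassert>

static int calls = 0;

// Hypothetical function with a side effect: it counts how often it runs.
static bool touch() { ++calls; return true; }

int calls_in_release_style()
{
    calls = 0;
    assert( touch() );  // expands to nothing here, so touch() never runs
    return calls;
}

#undef NDEBUG
#include <cassert>      // re-including redefines assert with checking enabled

int calls_in_debug_style()
{
    calls = 0;
    assert( touch() );  // the expression is evaluated, so touch() runs once
    return calls;
}
```

The first function reports zero calls and the second reports one, which is why any work the program depends on (such as the fopen call) has to happen outside the assert expression.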
These are best used in writing new code, where assumptions must almost always be made. Consider the following function:
void sort(int* const myarray) // an overly simple example
{
for( unsigned int x = 0; x < sizeof(myarray)-1; x++ )
if( myarray[x] < myarray[x+1] ) swap(myarray[x], myarray[x+1]);
}
Count the number of assumptions this function makes. Now take a look at the better version, which makes debugging a bit easier:
void sort_array(int* const myarray, const unsigned int count)
{
assert( myarray );
assert( count > 1 ); // sizeof(myarray) would only give the size of the pointer
for( unsigned int x = 0; x < count-1; x++ )
if( myarray[x] < myarray[x+1] ) swap(myarray[x], myarray[x+1]);
}
You see, that innocent-looking algorithm won't work if:
void blend(const video::memory& source,
video::memory& destination,
const float colors[3])
{
// The algorithm used is: B = A * alpha
const unsigned int width = source.width();
const unsigned int height = source.height();
const unsigned int depth = source.depth();
const unsigned int pitch = source.pitch();
switch( depth )
{
case 15:
// ...
break;
case 16:
// ...
break;
case 24:
{
unsigned int offset = 0;
unsigned int index = 0;
for( unsigned int y = 0; y < height; y++ )
{
offset = y * pitch;
for( unsigned int x = 0; x < width; x++ )
{
index = (x * 3) + offset;
destination[index + 0] = source[index + 0] * colors[0];
destination[index + 1] = source[index + 1] * colors[1];
destination[index + 2] = source[index + 2] * colors[2];
}
}
} break;
case 32:
// ...
break;
}
}
Do you realize the number of assumptions that function makes in the name of optimization? Let's try listing them:
assert( source.locked() and destination.locked() );
assert( source.width() == destination.width() );
assert( source.height() == destination.height() );
assert( source.depth() == destination.depth() );
assert( source.pitch() == destination.pitch() );
assert( source.depth() == 15 or source.depth() == 16
or source.depth() == 24 or source.depth() == 32 );
Typically, the more you optimize in low-level code such as that, the more assumptions you make. My function requires that the source and destination video memory be locked (so multiple blend functions can be called with a single lock()/unlock()), and that they both have the same width, height, depth, and pitch. Placing these assertions at the top of the function will prevent some programmer in the future from wondering why my function doesn't work or causes access violations - all requirements are now stated clearly at the top of the function.
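As a smaller, self-contained illustration of the same idea, here is a minimal sketch. The image struct and the darken function are hypothetical stand-ins for the video::memory type and blend function, whose definitions are not shown in this article; the point is only that every requirement is stated once, at the top, where a client reading the source will find it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the article's video::memory type.
struct image
{
    std::size_t width;
    std::size_t height;
    std::vector<unsigned char> pixels; // one byte per pixel, width * height entries
};

// Scales every pixel of source by alpha and stores the result in destination.
void darken(const image& source, image& destination, const float alpha)
{
    // Requirements, stated where every caller can see them.
    assert( alpha >= 0.0f && alpha <= 1.0f );
    assert( source.width == destination.width );
    assert( source.height == destination.height );
    assert( source.pixels.size() == source.width * source.height );
    assert( destination.pixels.size() == source.pixels.size() );

    // With the requirements checked, the loop needs no further tests.
    for( std::size_t i = 0; i < source.pixels.size(); ++i )
        destination.pixels[i] = static_cast<unsigned char>(source.pixels[i] * alpha);
}
```

A caller who passes two images of different sizes is stopped at the failing assertion in a debug build, rather than being left to track down an out-of-range write later.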
However, you have to be careful which version of assert you use. If you use the ANSI C assert (defined in
Also, you don't have to check every parameter and condition - some things are painfully obvious to a good programmer. Sometimes comments might be better because they do not increase the size of the final build.
Good code should not require a plethora of assertions at the top of each function. If you find that you are writing a class and you have to place assertions in each member function to test the state, etc. then it is probably better to split the class up into other classes.
Getting into the habit of sprinkling assertions throughout your code has the following benefits:
Pros: Automatic cleanup and elegant shutdown, opportunity to continue if handled, works in both debug and release builds
Cons: Relatively slow
Basically, you use the throw keyword to throw data up to some unknown function caller and the stack continues to unwind (think of it like the program is reversing itself) until someone catches the data. You use the try keyword to enclose code that you'd like to catch exceptions from. See:
void some_function() { throw 5; } // some function that throws an exception
int main(int argc, char* argv[])
{
try // just letting the compiler know we want to catch any exceptions from this code
{
some_function();
}
catch(const int x) // if the type matches the data thrown, we execute this code
{
// do something about the exception...
}
}
If no try block anywhere up the call chain catches the exception, it propagates past the main function and your program is terminated (the runtime calls std::terminate). You don't have to place try blocks everywhere - only where you can catch an exception and can recover from it. If you can only recover from it partially, you can rethrow the original exception (by the empty throw statement "throw;") and the stack will continue unwinding until it hits the next matching catch block.
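Partial recovery followed by a rethrow can be sketched as below. This is only an illustration under stated assumptions: the events vector, the guard struct, and the functions fail, partial_recovery, and run are all hypothetical names introduced here to make the order of operations visible.

```cpp
#include <string>
#include <vector>

// Hypothetical record of what happened, in order.
std::vector<std::string> events;

// An object whose destructor runs automatically while the stack unwinds.
struct guard
{
    ~guard() { events.push_back("guard destroyed"); }
};

void fail()
{
    guard g;    // cleaned up automatically when the exception leaves this scope
    throw 5;
}

void partial_recovery()
{
    try
    {
        fail();
    }
    catch(const int)
    {
        events.push_back("partial cleanup"); // do what we can here...
        throw;                               // ...then rethrow the original exception
    }
}

int run()
{
    try
    {
        partial_recovery();
    }
    catch(const int x)  // the rethrown exception arrives here unchanged
    {
        events.push_back("handled");
        return x;
    }
    return 0;
}
```

Calling run() records "guard destroyed", then "partial cleanup", then "handled", and returns the original value 5. The destructor of g runs before the first handler is entered, which is the automatic cleanup that exceptions provide.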
Exceptions are best used in key places in debug and release builds to track exceptional conditions. If used properly, they provide automatic cleanup and then either force the application to quit or put itself back into a valid state. Thus exceptions are perfect for release code, because they provide everything the end user wants in a well-behaved program that encounters unexpected errors. Correctly used, they provide the following benefits:
Because of their overhead, it is generally a bad idea to use exceptions for normal flow control; other control structures will be faster and more efficient.
Pros: Fast when used with built-in types and/or constants, allow a change in the client's logic and possible cleanup
Cons: Error-handling isn't mandatory, values could be confusing
Basically, we either return valid data back to the caller, or a value to indicate an error:
const int divide(const int dividend, const int divisor)
{
if( divisor == 0 ) return( 0 ); // avoid "integer divide-by-zero" error
return( dividend/divisor );
}
A value of zero indicates that the function failed. Unfortunately, in this example you can also get a return value of zero by passing in zero for the dividend (which is perfectly valid), so the caller has no idea whether this function returned an error. This function is nonsensical, but it illustrates the problem of using return values for error handling. It's hard or impossible to choose error values for all functions.
Return values are best used in conditions when there is a grey area between an error and a simple change in logic. For example, a function might return a set of bit flags, some of which might be considered erroneous by one client, and not by the other. Return values are great for conditional logic.
A function trusting a function caller to notice an error condition is like a lifeguard trusting other swimmers to notice someone who is drowning.
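One common way around the ambiguity in the divide example is to keep the status and the data apart: return only success or failure, and pass the result back through a reference parameter. The following is a minimal sketch of that idea; try_divide is a hypothetical name, not something defined elsewhere in this article.

```cpp
#include <climits>

// Returns true on success and stores the quotient in result.
// Returns false, leaving result untouched, when the division cannot be performed.
bool try_divide(const int dividend, const int divisor, int& result)
{
    if( divisor == 0 ) return false;                            // division by zero
    if( dividend == INT_MIN && divisor == -1 ) return false;    // quotient would overflow
    result = dividend / divisor;
    return true;
}
```

A quotient of zero is now unambiguous, because the caller tests the returned bool rather than the value. The caller can still ignore that bool, though, so the lifeguard problem remains.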
Sometimes you do not have access to a debugger, and logging errors to a file can be quite helpful in debugging. Declare a global log file (or use std::clog) and output text to it when an error occurs. It also helps to output the file name and line number so you can tell where the error occurred: __FILE__ and __LINE__ are predefined macros that expand to the current file name and line number.
You can also use log files to record messages other than errors, such as the maximum number of matrices in use or some other such data that you can't access with a debugger. Or you could output the data that caused your function to fail, etc. std::fstream is great for this purpose. If you are really clever, you could figure out some way to make your assertions and exceptions log their messages to a file. :)
This provides the following benefits:
Of course, it does have some overhead so you'll have to decide whether that is offset by the benefits in your situation.
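The file-and-line logging described above might be sketched like this. The helper format_log_line and the macro LOG_ERROR are hypothetical names chosen for this illustration, and the "file(line): message" layout is just one possible convention.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Builds one log line of the form "file(line): message".
std::string format_log_line(const char* file, const int line, const std::string& message)
{
    std::ostringstream text;
    text << file << '(' << line << "): " << message;
    return text.str();
}

// A macro is used so that __FILE__ and __LINE__ expand at the place
// where the error is reported, not inside the helper function.
#define LOG_ERROR(message) \
    (std::clog << format_log_line(__FILE__, __LINE__, (message)) << std::endl)
```

Writing LOG_ERROR("matrix stack overflow"); then sends the message to std::clog, prefixed with the location of that statement. Redirecting std::clog to a file, or replacing it with a global std::ofstream, gives you the log file.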
I had intended to finish this article here, but I wish to show how valuable a mixed approach to debugging can be. The easiest way to do this is by creating a simple class, preferably one that must work with non-C++ code. A file class will do nicely. We'll use C's fopen() and related functions for simplicity and portability.
We need to meet the following requirements:
Here it is:
#include <cstdio>
#include <cassert>
#include <ciso646>
#include <string>
class file
{
public:
// Exceptions
struct exception {};
struct not_found : public exception {};
struct end : public exception {};
// Constants
enum modes { relative, absolute };
file(const std::string& filename, const std::string& parameters);
~file();
void seek(const unsigned int position, const enum modes = relative);
void read(void* const data, const unsigned int size);
void write(const void* const data, const unsigned int size);
void flush();
// Stack only!
template <typename T> void read(T& data) { read(&data, sizeof(data)); }
template <typename T> void write(const T& data) { write(&data, sizeof(data)); }
private:
FILE* pointer;
file(const file& other) {}
file& operator = (const file& other) { return( *this ); }
};
file::file(const std::string& filename, const std::string& parameters)
: pointer(0)
{
assert( not filename.empty() );
assert( not parameters.empty() );
pointer = fopen(filename.c_str(), parameters.c_str());
if( not pointer ) throw not_found();
}
file::~file()
{
int n = fclose(pointer);
assert( not n );
}
void file::seek(const unsigned int position, const enum file::modes mode)
{
int n = fseek(pointer, position, (mode == relative) ? SEEK_CUR : SEEK_SET);
assert( not n );
}
void file::read(void* const data, const unsigned int size)
{
size_t s = fread(data, size, 1, pointer);
if( s != 1 and feof(pointer) ) throw end();
assert( s == 1 );
}
void file::write(const void* const data, const unsigned int size)
{
size_t s = fwrite(data, size, 1, pointer);
assert( s == 1 );
}
void file::flush()
{
int n = fflush(pointer);
assert( not n );
}
int main(int argc, char* argv[])
{
file myfile("myfile.txt", "w+");
int x = 5, y = 10, z = 20;
float f = 1.5f, g = 29.4f, h = 0.0129f;
char c = 'I';
myfile.write(x);
myfile.write(y);
myfile.write(z);
myfile.write(f);
myfile.write(g);
myfile.write(h);
myfile.write(c);
return 0;
}
If you compile this under Windows, make sure the project type is set to "Win32 Console App." What benefits does this class provide to its clients?
Note that the exceptions are thrown at two points which are vital to the file pointer's state:
Note that all variables used to hold return values for evaluation by assertions should be removed by any optimizing compiler in a release build. The assertions are not evaluated in a release build. This ensures that the program runs with very little overhead. If you look at the source code for the member functions, each one reduces to basically the underlying function call in a release build.
The exceptions are, however, present in a release build, and this is especially good because they may affect the program's ability to function and recover properly.
We don't need to use assert( this ) or assert( pointer ) in the member functions because file objects can only ever be created in a valid state. If operator new doesn't allocate a file object correctly, the constructor/destructor is never called and operator new throws an exception. If fopen doesn't return a valid file pointer, we throw a file::not_found exception and the file destructor is never called, nor are any of the member functions used. So we never have to worry about having an invalid this pointer or invalid file pointer in our member functions.
(If the user tries to call a member function using a null pointer, this will probably be null and accessing the member functions will probably cause access violations. Programmers should have experience with that though.)
If we were to put the member functions into the header files and use the inline keyword, the compiler should be able to inline those functions in the release build, eliminating the function call overhead and associated temporaries, and making the class almost as fast in a release build as if we had simply used straight C code. :)
Judicious use of assertions can make your code easier to debug without decreasing the speed or size of the final (release) build.
The techniques illustrated with the file class can be used for most legacy code that exposes only handles or pointers to its internal objects.
Extension of the file class's functionality is an educational exercise left to the reader. Try adding a copy constructor and assignment operator, and test the class under different conditions. Note which assertions become invalid when you modify the class, and which existing assertions help you to catch errors with new code. With your changes, is it possible to put the file object in an invalid state?
It is best to use assertions when debugging conditions that will almost certainly cause the code to fail anyway further down the function, to use exceptions as panic buttons for release code, and to use return values for what they were intended - passing valid data back to the calling function. Use logs for data that you can't or don't want to check when the program is running.
Fortunately, all of these suggestions are good ol' portable C++. Unfortunately, not everyone writes in C++ :) and there are also times when clients should not need to browse your source code. You must choose which methods to use based on your circumstances, and more importantly, those of your client(s).
The sad part is that return values are often the only communication of errors between C++ code and non-C++ code. It is quite a pain to check every return value (and you should). Assertions serve to alleviate some of that pain.