Reposted from: click to open link
Introduction
I am often asked whether programming with templates is hard or easy. The answer I usually give is: "It is easy to use templates, but it is hard to write them". Just take a look at some of the template libraries that we use in our everyday programming, like STL, ATL, WTL, and some libraries from Boost, and you will see what I mean. Those libraries are great examples of the principle "simple interface - complex implementation".
I started using templates five years ago when I discovered MFC template containers, and until last year I had no need to develop them myself. When I finally got to the point that I needed to develop some template classes, the first thing that hit me was the fact that the "traditional" way of organizing source code (declarations in *.h files, and definitions in *.cpp files) does not work with templates. It took me some time to understand why this is the case, and how to work around this problem.
This article is aimed at developers who understand templates well enough to use them, but are not very experienced at developing them. Here, I will cover only template classes and not template functions, but the principles are the same in both cases.
The Problem Described
To illustrate the problem, we will use an example. Suppose we have a template class array (nothing to do with boost::array template class) in a file array.h.
// array.h
template <typename T, int SIZE>
class array
{
    T data_[SIZE];
    array (const array& other);
    const array& operator = (const array& other);
public:
    array() {}
    T& operator[](int i) {return data_[i];}
    const T& get_elem (int i) const {return data_[i];}
    void set_elem(int i, const T& value) {data_[i] = value;}
    operator T*() {return data_;}
};
We also have a file main.cpp that contains the code that uses array:
// main.cpp
#include "array.h"
int main(void)
{
    array<int, 50> intArray;
    intArray.set_elem(0, 2);
    int firstElem = intArray.get_elem(0);
    int* begin = intArray;
}
This compiles fine, and does exactly what we want: first we make an array of 50 integers, then set the first element to be 2, read the first element, and finally take the pointer to the beginning of the array.
Now, what happens if we try to organize the code in a more traditional way? Let's split the code in array.h and see what happens. Now we have two files: array.h and array.cpp (main.cpp remains unchanged).
// array.h
template <typename T, int SIZE>
class array
{
    T data_[SIZE];
    array (const array& other);
    const array& operator = (const array& other);
public:
    array() {}
    T& operator[](int i);
    const T& get_elem (int i) const;
    void set_elem(int i, const T& value);
    operator T*();
};
// array.cpp
#include "array.h"
template<typename T, int SIZE>
T& array<T, SIZE>::operator [](int i)
{
    return data_[i];
}
template<typename T, int SIZE>
const T& array<T, SIZE>::get_elem(int i) const
{
    return data_[i];
}
template<typename T, int SIZE>
void array<T, SIZE>::set_elem(int i, const T& value)
{
    data_[i] = value;
}
template<typename T, int SIZE>
array<T, SIZE>::operator T*()
{
    return data_;
}
Try to compile this, and you will get three linker errors. The questions are:
1. Why are these errors reported in the first place?
2. Why are there only three linker errors? We have four member functions in array.cpp.
To answer these questions, we need to dig a little deeper into the details of the template instantiation process.
Template Instantiation
One of the mistakes programmers commonly make when they work with template classes is to treat them as types. The term parameterized types, which is often used for template classes, certainly leads us to think this way. Well, template classes are not types; they are just what the name suggests: templates. There are several important concepts to understand about the relation between template classes and types:
The compiler uses template classes to create types by substituting template parameters, and this process is called instantiation.
The type that is created from a template class is called a specialization.
Template instantiation happens on demand, which means that the compiler creates the specialization when it finds a use of it in the code (this place is called the point of instantiation).
To create a specialization, the compiler needs to "see" not only the declaration of the template at the point of instantiation, but also the definition.
Template instantiation is lazy, which means that only the definitions of the member functions that are actually used are instantiated.
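To make the difference between a template and the types generated from it more concrete, here is a minimal sketch. It uses a trimmed copy of the array template; the size() member is an assumption added purely for illustration, and std::is_same with static_assert requires C++11 or later.

```cpp
#include <cassert>
#include <type_traits>

// Trimmed copy of the article's array; size() is added for illustration only.
template <typename T, int SIZE>
class array
{
    T data_[SIZE];
public:
    int size() const { return SIZE; }
};

// Each distinct list of template arguments denotes its own, unrelated type.
static_assert(!std::is_same<array<int, 50>, array<int, 10> >::value,
              "different SIZE, different type");
static_assert(std::is_same<array<int, 50>, array<int, 25 * 2> >::value,
              "equal arguments, same type");

// "array a;" would not compile: array alone is a template, not a type.
// Defining an object needs the complete type, so the compiler implicitly
// instantiates array<int, 50> because of this function.
int size_of_fifty()
{
    array<int, 50> a;
    assert(a.size() == 50);
    return a.size();
}
```

Calling size_of_fifty() from main is enough to try it out. Note that array<int, 50> and array<int, 10> share no code and no conversion between them unless we write one.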
If we go back to our example, array is a template, and array<int, 50> is a template specialization - a type. The process of creating array<int, 50> from array is instantiation. The point of instantiation is in the file main.cpp. If we organize the code in the "traditional" way, the compiler will see the declaration of the template (array.h), but not the definitions of its member functions (array.cpp). Therefore, the compiler can instantiate the class array<int, 50> itself, but it cannot generate the code for the member functions that main.cpp calls. However, it will not report an error: it will assume that these functions are defined in some other compilation unit, and leave it to the linker to resolve them.
Now, what happens with the other compilation unit (array.cpp)? The compiler will parse the template definition and check it for syntax correctness, but it will not generate the code for the member functions. How could it? In order to generate the code, the compiler needs to know the template arguments - it needs a type, not a template.
Therefore, the linker will find the member function definitions for array<int, 50> neither in main.cpp nor in array.cpp, and it will report an error for every member function that is used but never defined.
OK. That answers question 1. But what about question 2? We have four member functions defined in array.cpp, and only three error messages reported by the linker. The answer lies in the concept of lazy instantiation. In main.cpp we don't use operator[], so the compiler never even tried to instantiate its definition.
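Lazy instantiation can be observed directly. The following is a minimal sketch, assuming a hypothetical wrapper named holder that is not part of the article's example: one of its member functions is invalid for std::string, yet the code compiles as long as that member is never called for std::string.

```cpp
#include <cassert>
#include <string>

// Hypothetical wrapper, used only to illustrate lazy instantiation.
template <typename T>
class holder
{
    T value_;
public:
    explicit holder(const T& v) : value_(v) {}
    const T& get() const { return value_; }
    // The body is only valid for types that support unary minus.
    T negated() const { return -value_; }
};

// std::string has no unary minus, yet this compiles: negated() is never
// called for holder<std::string>, so its definition is never instantiated.
std::string held_text()
{
    holder<std::string> h("abc");
    assert(h.get().size() == 3);
    return h.get();
}

// For int, negated() is used, so its definition is instantiated here.
int held_negated()
{
    holder<int> h(21);
    return h.negated();
}
```

Adding a call to h.negated() inside held_text() would turn the unused, invalid member into a compile error, because only then does the compiler try to instantiate its body.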
Solutions
Now that we understand what the problem is, it would be nice to offer some solutions. Here they are:
1. Make the template definition visible to the compiler at the point of instantiation.
2. Instantiate the types you need explicitly in a separate compilation unit, so that the linker can find them.
3. Use the keyword export.
The first two are often grouped under the inclusion model, while the third is sometimes referred to as the separation model.
The first solution really means that we need to include not only the template declarations, but also the definitions, in every translation unit in which we use the templates. In our example, this means that we either use the first version of array.h, with all member functions defined inline, or include array.cpp in main.cpp. In that case, the compiler will see both the declaration and the definition of all member functions of array, and it will be able to instantiate array<int, 50>. The drawback of this approach is that our compilation units can become huge, which can increase build and link times significantly.
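One common way to apply the first solution while still keeping declarations and definitions apart is to move the definitions into a separate file that the header itself includes. The sketch below shows that layout collapsed into a single listing so that it compiles on its own; the file name array.inl is a hypothetical choice, and the bounds checks with assert are an addition for illustration.

```cpp
#include <cassert>

// ---- array.h: declarations ----
#ifndef ARRAY_H
#define ARRAY_H

template <typename T, int SIZE>
class array
{
    T data_[SIZE];
public:
    const T& get_elem(int i) const;
    void set_elem(int i, const T& value);
};

// In a real project the definitions below would live in their own file
// (hypothetical name: array.inl) and be pulled in right here with
//     #include "array.inl"

// ---- array.inl: definitions, visible to everyone who includes array.h ----
template <typename T, int SIZE>
const T& array<T, SIZE>::get_elem(int i) const
{
    assert(i >= 0 && i < SIZE);
    return data_[i];
}

template <typename T, int SIZE>
void array<T, SIZE>::set_elem(int i, const T& value)
{
    assert(i >= 0 && i < SIZE);
    data_[i] = value;
}

#endif // ARRAY_H

// ---- main.cpp: only needs #include "array.h" ----
int first_elem_roundtrip(int value)
{
    array<int, 50> a;
    a.set_elem(0, value);
    return a.get_elem(0);
}
```

The header stays readable because the class body contains only declarations, but every translation unit still compiles the definitions it uses, so the build time cost described above remains.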
Now for the second solution. We can explicitly instantiate the template for the types we need. It is best to keep all explicit instantiation directives in a separate compilation unit. In our example, we can add a new file, templateinstantiations.cpp:
// templateinstantiations.cpp
#include "array.cpp"
template class array <int, 50>; // explicit instantiation
The member functions of array<int, 50> will be generated not in main.cpp but in templateinstantiations.cpp, and the linker will find their definitions. With this approach, we don't have huge headers, and hence the build time will drop. Also, the header files will be "cleaner" and more readable. However, we lose the benefits of lazy instantiation here (explicit instantiation generates the code for all member functions), and it can become tricky to maintain templateinstantiations.cpp in big projects.
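The loss of lazy instantiation can be softened, because C++ also allows explicit instantiation of individual member functions. The sketch below collapses the three files into one listing so that it compiles on its own; array<double, 10> is an extra specialization chosen only for illustration, and the bounds checks with assert are likewise an addition.

```cpp
#include <cassert>

// ---- array.h: declarations only ----
template <typename T, int SIZE>
class array
{
    T data_[SIZE];
public:
    const T& get_elem(int i) const;
    void set_elem(int i, const T& value);
    operator T*();
};

// ---- array.cpp: definitions, not visible to main.cpp ----
template <typename T, int SIZE>
const T& array<T, SIZE>::get_elem(int i) const
{
    assert(i >= 0 && i < SIZE);
    return data_[i];
}

template <typename T, int SIZE>
void array<T, SIZE>::set_elem(int i, const T& value)
{
    assert(i >= 0 && i < SIZE);
    data_[i] = value;
}

template <typename T, int SIZE>
array<T, SIZE>::operator T*()
{
    return data_;
}

// ---- templateinstantiations.cpp: would #include "array.cpp" ----
// Whole-class form: generates code for every member of array<int, 50>.
template class array<int, 50>;

// Member-wise form: generates code only for the listed members of
// array<double, 10>; its operator double*() is left uninstantiated.
template const double& array<double, 10>::get_elem(int) const;
template void array<double, 10>::set_elem(int, const double&);

// ---- main.cpp: sees only array.h ----
int store_and_load_int(int value)
{
    array<int, 50> a;
    a.set_elem(0, value);
    return a.get_elem(0);
}

double store_and_load_double(double value)
{
    array<double, 10> a;
    a.set_elem(0, value);
    return a.get_elem(0);
}
```

In the real multi-file layout, using a member that was not explicitly instantiated (for example the conversion operator of array<double, 10>) from main.cpp would bring back the linker error, so the list of instantiations still has to be maintained by hand.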
The third solution is to mark the template definitions with the keyword export, and the compiler will take care of the rest. When I read about export in Stroustrup's book, I was very enthusiastic about it. It took me several minutes to find out that it was not implemented in VC 6.0, and a little longer to find out that no compiler supported this keyword at all (the first compiler to support it was released in late 2002). Since then, I have read more about export and learned that it hardly solves any of the problems encountered with the inclusion model. Exported templates were later removed from the language in C++11, so this option is not available in current standard C++. For more information about the issues with this keyword, I recommend the articles by Herb Sutter.
Conclusion
In order to develop template libraries, we need to understand that template classes are not "ordinary types" and that we need to think differently when working with them. The purpose of this article was not to scare developers who want to do some template programming. On the contrary, I hope it will help them avoid some of the mistakes that people commonly make when they start developing templates.