Allocators are one of the most mysterious parts of the C++ Standard library. Allocators are rarely used explicitly; the Standard doesn't make it clear when they should ever be used. Today's allocators are substantially different from those in the original STL proposal, and there were two other designs in between — all of which relied on language features that, until recently, were available on few compilers. The Standard appears to make promises about allocator functionality with one hand and then take those promises away with the other.
This column will discuss what you can use allocators for and how you can define your own. I'm only going to discuss allocators as defined by the C++ Standard: bringing in pre-Standard designs, or workarounds for deficient compilers, would just add to the confusion.
Allocators in the C++ Standard come in two pieces: a set of generic requirements, specified in 20.1.5 (Table 32), and the class std::allocator, specified in 20.4.1. We call a class an allocator if it conforms to the requirements of Table 32. The std::allocator class conforms to those requirements, so it is an allocator. It is the only predefined allocator class in the standard library.
Every C++ programmer already knows about dynamic memory allocation: you write new X to allocate memory and create a new object of type X, and you write delete p to destroy the object that p points to and return its memory. You might reasonably think that allocators have something to do with new and delete — but they don't. (The Standard describes ::operator new as an "allocation function," but, confusingly, that's not the same as an allocator.)
The most important fact about allocators is that they were intended for one purpose only: encapsulating the low-level details of STL containers' memory management. You shouldn't invoke allocator member functions in your own code, unless you're writing an STL container yourself. You shouldn't try to use allocators to implement operator new[]; that's not what they're for. If you aren't sure whether you need to use allocators, then you don't.
An allocator is a class with member functions allocate and deallocate, the rough equivalents of malloc and free. It also has helper functions for manipulating the memory that it allocated and typedefs that describe how to refer to the memory — names for pointer and reference types. If an STL container allocates all of its memory through a user-provided allocator (which the predefined STL containers all do; each of them has a template parameter that defaults to std::allocator), you can control its memory management by providing your own allocator.
This flexibility is limited: a container still decides for itself how much memory it's going to ask for and how the memory will be used. You get to control which low-level functions a container calls when it asks for more memory, but you can't use allocators to make a vector act like a deque. Sometimes, though, even this limited flexibility is useful. If you have a special fast_allocator that allocates and deallocates memory quickly, for example (perhaps by giving up thread safety, or by using a small local heap), you can make the standard list class use it by writing std::list<T, fast_allocator<T> > instead of plain std::list<T>.
If this seems esoteric to you, you're right. There is no reason to use allocators in normal code.
This already shows you something about allocators: they're templates. Allocators, like containers, have value types, and an allocator's value type must match the value type of the container it's used with. This can sometimes get ugly: map's value type is fairly complicated, so a map with an explicit allocator involves expressions like std::map<K, V, std::less<K>, fast_allocator<std::pair<const K, V> > >.
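To make declarations of this kind concrete, here is a minimal sketch. The name simple_allocator is hypothetical: it's a stand-in for something like fast_allocator that merely forwards to ::operator new, and it exists only so that the container declarations compile. The typedef names and container_demo are likewise made up for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <map>
#include <new>
#include <string>
#include <utility>

// Hypothetical stand-in for fast_allocator; it just forwards to ::operator new.
template <class T>
class simple_allocator {
public:
    typedef T value_type;
    typedef T* pointer;
    typedef const T* const_pointer;
    typedef T& reference;
    typedef const T& const_reference;
    typedef std::size_t size_type;
    typedef std::ptrdiff_t difference_type;
    template <class U> struct rebind { typedef simple_allocator<U> other; };

    simple_allocator() {}
    template <class U> simple_allocator(const simple_allocator<U>&) {}

    pointer address(reference x) const { return &x; }
    const_pointer address(const_reference x) const { return &x; }
    pointer allocate(size_type n, const void* = 0) {
        return static_cast<pointer>(::operator new(n * sizeof(T)));
    }
    void deallocate(pointer p, size_type) { ::operator delete(p); }
    size_type max_size() const { return static_cast<size_type>(-1) / sizeof(T); }
    void construct(pointer p, const T& v) { new (static_cast<void*>(p)) T(v); }
    void destroy(pointer p) { p->~T(); }
};
template <class T, class U>
bool operator==(const simple_allocator<T>&, const simple_allocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const simple_allocator<T>&, const simple_allocator<U>&) { return false; }

// For list, the allocator's value type is simply the element type.
typedef std::list<int, simple_allocator<int> > int_list;

// For map, the value type is pair<const Key, Value>, not Value.
typedef std::map<std::string, int, std::less<std::string>,
                 simple_allocator<std::pair<const std::string, int> > > word_count;

int container_demo() {
    int_list numbers;
    numbers.push_back(3);
    numbers.push_back(4);
    word_count counts;
    counts["allocator"] = 5;
    return numbers.front() + numbers.back() + counts["allocator"];
}
```

Hiding the long type names behind typedefs, as here, keeps the rest of the code readable.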
Let's start with a simple example. According to the C++ Standard, std::allocator is built on top of ::operator new. If you're using an automatic tool to trace memory usage, it's often more convenient to have something a bit simpler than std::allocator. We can use malloc instead of ::operator new, and we can leave out the complicated performance optimizations that you'll find in a good implementation of std::allocator. We'll call this simple allocator malloc_allocator.
Since the memory management in malloc_allocator is simple, we can focus on the boilerplate that's common to all STL allocators. First, some types: an allocator is a class template, and an instance of that template allocates memory specifically for objects of some type T. We provide a series of typedefs that describe how to refer to objects of that type: value_type for T itself, and others for the various flavors of pointers and references.
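A sketch of those typedefs, assuming ordinary pointers and references throughout (only the nested types are shown here, not the member functions):

```cpp
#include <cstddef>

template <class T>
class malloc_allocator {
public:
    typedef T                 value_type;       // the type being allocated
    typedef value_type*       pointer;
    typedef const value_type* const_pointer;
    typedef value_type&       reference;
    typedef const value_type& const_reference;
    typedef std::size_t       size_type;        // counts of objects
    typedef std::ptrdiff_t    difference_type;  // distance between two pointers
};
```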
It's no accident that these types are so similar to those in an STL container: a container class usually gets those types directly from its allocator. Why so many typedefs? You might think that pointer is superfluous: it's just value_type*. Most of the time that's true, but you might occasionally want to define an unconventional allocator where pointer is some pointer-like class, or where it's some nonstandard vendor-specific type like value_type __far*; allocators are a standard hook for nonstandard extensions. Unusual pointer types are also the reason for the address member function, which in malloc_allocator is just an alternate spelling for operator&:
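For an allocator that uses ordinary pointers, address might be sketched as follows. Only the members under discussion are shown, together with the typedefs they depend on.

```cpp
template <class T>
class malloc_allocator {
public:
    typedef T        value_type;
    typedef T*       pointer;
    typedef const T* const_pointer;
    typedef T&       reference;
    typedef const T& const_reference;

    // With ordinary pointers, address is nothing more than operator&.
    pointer address(reference x) const { return &x; }
    const_pointer address(const_reference x) const { return &x; }
};
```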
Now we can get to the real work: allocate and deallocate. They're straightforward, but they don't look quite like malloc and free. We pass two arguments to allocate: the number of objects that we're allocating space for (max_size returns the largest request that might succeed), and, optionally, an address that can be used as a locality hint. A simple allocator like malloc_allocator makes no use of that hint, but an allocator designed for high performance might. The return value is a pointer to a block of memory that's large enough for n objects of type value_type and that has the correct alignment for that type. We also pass two arguments to deallocate: a pointer, of course, but also an element count. A container has to keep track of sizes on its own; the size arguments to allocate and deallocate must match. Again, this extra argument exists for reasons of performance, and again, malloc_allocator doesn't use it.
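As a sketch, the three functions might look like this in malloc_allocator. Only the members under discussion are shown. The check against max_size and the handling of a zero-element request are my own choices for the sketch, not something the requirements spell out.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

template <class T>
class malloc_allocator {
public:
    typedef T           value_type;
    typedef T*          pointer;
    typedef std::size_t size_type;

    // n is a count of objects, not bytes. The hint is accepted and ignored.
    pointer allocate(size_type n, const void* /*hint*/ = 0) {
        if (n > max_size()) throw std::bad_alloc();   // n * sizeof(T) would overflow
        void* p = std::malloc(n ? n * sizeof(T) : 1); // avoid malloc(0)
        if (!p) throw std::bad_alloc();
        return static_cast<pointer>(p);
    }

    // The count must match the one passed to allocate; malloc doesn't need it.
    void deallocate(pointer p, size_type /*n*/) { std::free(p); }

    // Largest object count for which the byte count fits in size_type.
    size_type max_size() const { return static_cast<size_type>(-1) / sizeof(T); }
};
```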
The allocate and deallocate member functions deal with uninitialized memory; they don't construct or destroy objects. An expression like a.allocate(1) is more like malloc(sizeof(int)) than like new int. Before using the memory you get from allocate, you have to create some objects in that memory; before returning that memory with deallocate, you have to destroy those objects. C++ provides a mechanism for creating an object at a specific memory location: placement new. If you write new(p) T(a, b) then you are invoking T's constructor to create a new object, just as if you had written new T(a, b) or T t(a, b). The difference is that when you write new(p) T(a, b) you're specifying the location where that object is constructed: the address where p points. (Naturally, p has to point to a large enough region of memory, and it has to point to raw memory; you can't construct two different objects at the same address.) You can also call an object's destructor, without releasing any memory, by writing p->~T(). These features are rarely used, because usually memory allocation and initialization go together: it's inconvenient and dangerous to work with pointers to uninitialized memory. One of the few places where you need such low-level techniques is if you're writing a container class, so allocators decouple allocation from initialization. The member function construct performs placement new, and the member function destroy invokes the destructor.
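Here is a sketch of construct and destroy, again showing only the members under discussion. The Tracked class and lifecycle_demo are hypothetical helpers that I've added to make the separation between raw memory and object lifetime visible; they aren't part of any allocator.

```cpp
#include <cstdlib>
#include <new>

template <class T>
class malloc_allocator {
public:
    typedef T  value_type;
    typedef T* pointer;

    // Placement new: copy-construct a T in raw memory at p.
    void construct(pointer p, const T& value) { new (static_cast<void*>(p)) T(value); }
    // Run the destructor without releasing the memory.
    void destroy(pointer p) { p->~T(); }
};

// Hypothetical helper that counts how many objects are currently alive.
struct Tracked {
    static int live;
    int value;
    explicit Tracked(int v) : value(v) { ++live; }
    Tracked(const Tracked& other) : value(other.value) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

int lifecycle_demo() {
    malloc_allocator<Tracked> a;
    void* raw = std::malloc(sizeof(Tracked));   // raw memory, no object yet
    if (!raw) return -1;
    Tracked* p = static_cast<Tracked*>(raw);
    a.construct(p, Tracked(42));                // the temporary is gone after this line
    int seen = (Tracked::live == 1) ? p->value : -1;
    a.destroy(p);                               // object gone, memory still ours
    std::free(raw);
    return (Tracked::live == 0) ? seen : -1;
}
```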
(Why do allocators have those member functions, when containers could use placement new directly? One reason is to hide the somewhat awkward syntax, and another is that if you're writing a more complicated allocator you might want construct and destroy to have some side effects besides object construction and destruction. An allocator might, for example, maintain a log of all currently active objects.)
None of these member functions is static, so the first thing a container has to do before using an allocator is create allocator objects, and that means we should define some constructors. We don't need an assignment operator, though: once a container creates its allocator, the allocator isn't ever supposed to be changed. The allocator requirements in Table 32 don't include assignment. Just to be on the safe side, to make sure nobody uses an assignment operator accidentally, we'll disable the one that would otherwise be generated automatically.
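A sketch of the constructors and the disabled assignment operator, with everything else left out. The member template constructor, which builds an allocator for one value type from an allocator for another, is an assumption on my part about what "some constructors" should include; containers that allocate more than one kind of object rely on it.

```cpp
template <class T>
class malloc_allocator {
public:
    typedef T value_type;

    malloc_allocator() {}
    malloc_allocator(const malloc_allocator&) {}
    // Conversion from an allocator for a different value type.
    template <class U>
    malloc_allocator(const malloc_allocator<U>&) {}
    ~malloc_allocator() {}

private:
    // Disabled: declared private and never defined, so any accidental
    // use of assignment fails at compile time or at link time.
    void operator=(const malloc_allocator&);
};
```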
None of these constructors actually does anything, because this allocator doesn't have any member variables to initialize. For the same reason, any two malloc_allocator objects are interchangeable; if a1 and a2 are both of type malloc_allocator
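Interchangeability is what an allocator's comparison operators report. For a stateless allocator, a sketch of the comparison (with the class itself trimmed to a stub) is about as simple as code gets:

```cpp
template <class T>
class malloc_allocator {
public:
    typedef T value_type;
};

// Every malloc_allocator<T> is as good as any other, so they all compare equal.
template <class T>
inline bool operator==(const malloc_allocator<T>&, const malloc_allocator<T>&) {
    return true;
}

template <class T>
inline bool operator!=(const malloc_allocator<T>&, const malloc_allocator<T>&) {
    return false;
}
```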
Would you ever want to have an allocator where different objects weren't interchangeable? Certainly, but simple and useful examples are hard to come by. One obvious possibility is memory pools. It's common for large C programs to allocate memory from several different places ("pools"), instead of directly doing everything through malloc. This has several benefits, one of which is that it only takes a single function call to reclaim all of the memory associated with a particular phase of the program. A program that uses memory pools might define utility functions like mempool_Alloc and mempool_Free, where mempool_Alloc(n, p) allocates n bytes from pool p. It's easy to write a mempool_allocator that fits into such a framework: each mempool_allocator object would have a member variable to specify which pool it's associated with, and mempool_allocator::allocate would invoke mempool_Alloc to get memory from the appropriate pool [1].
Finally, we get to the one tricky part of defining an allocator: mapping between different types. The problem is that an allocator class, like malloc_allocator
What this really means is that an allocator class can't ever just be a single class; it has to be a family of related classes, each with its own value type. An allocator class must always have a rebind member, because that's what makes it possible to go from one class in that family to another. If you have an allocator class A1, the corresponding allocator class for a different value type is typename A1::template rebind
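As a sketch of how a container would use rebind: a linked list is handed an allocator for T, but what it actually needs to allocate is nodes. All of the names here (tiny_allocator, list_node, node_allocator_for, same_type, rebind_demo) are hypothetical and exist only for illustration; tiny_allocator is stripped down to the one member that matters.

```cpp
// A skeletal allocator: just value_type and rebind.
template <class T>
struct tiny_allocator {
    typedef T value_type;
    template <class U> struct rebind { typedef tiny_allocator<U> other; };
};

// What a list really allocates: a node, not a bare T.
template <class T>
struct list_node {
    T value;
    list_node* next;
};

// Given the user's allocator type, find the matching allocator for nodes.
// Alloc is a template parameter here, so both typename and template are needed.
template <class T, class Alloc>
struct node_allocator_for {
    typedef typename Alloc::template rebind<list_node<T> >::other type;
};

// Small helper for checking that two types are the same.
template <class A, class B> struct same_type       { static const bool value = false; };
template <class A>          struct same_type<A, A> { static const bool value = true; };

bool rebind_demo() {
    typedef node_allocator_for<int, tiny_allocator<int> >::type node_alloc;
    return same_type<node_alloc, tiny_allocator<list_node<int> > >::value;
}
```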
Finally, one last detail: what do we do about void? Sometimes a container has to refer to void pointers (again, we'll see more about that in the next section), and the rebind mechanism almost gives us what we need, but not quite. It doesn't work, because we would need to write something like malloc_allocator
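One way to handle void is an explicit specialization that leaves out everything that would require a reference to void. In this sketch the primary template is trimmed to its typedefs and rebind, and void_pointer_demo is a hypothetical helper that only shows the types in use.

```cpp
// Primary template, trimmed. Instantiating it with T = void would fail,
// because "typedef T& reference" would try to form void&.
template <class T>
class malloc_allocator {
public:
    typedef T        value_type;
    typedef T*       pointer;
    typedef const T* const_pointer;
    typedef T&       reference;
    typedef const T& const_reference;
    template <class U> struct rebind { typedef malloc_allocator<U> other; };
};

// The specialization for void: pointer types and rebind, nothing else.
template <>
class malloc_allocator<void> {
public:
    typedef void        value_type;
    typedef void*       pointer;
    typedef const void* const_pointer;
    template <class U> struct rebind { typedef malloc_allocator<U> other; };
};

bool void_pointer_demo() {
    int x = 0;
    malloc_allocator<void>::pointer p = &x;   // a plain void*
    // rebind gets us from the void allocator back to a typed one.
    malloc_allocator<void>::rebind<int>::other::pointer q = static_cast<int*>(p);
    return q == &x;
}
```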
That's it! The complete source code for malloc_allocator is shown in Listing 1.
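The listing itself isn't reproduced in this text. What follows is a sketch assembled from the pieces described above, not the original Listing 1; details such as the max_size check and the handling of zero-element requests are my own assumptions. The function listing_demo is a hypothetical usage example.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

template <class T> class malloc_allocator;

// void gets its own specialization, since there is no such thing as void&.
template <>
class malloc_allocator<void> {
public:
    typedef void        value_type;
    typedef void*       pointer;
    typedef const void* const_pointer;
    template <class U> struct rebind { typedef malloc_allocator<U> other; };
};

template <class T>
class malloc_allocator {
public:
    typedef T                 value_type;
    typedef value_type*       pointer;
    typedef const value_type* const_pointer;
    typedef value_type&       reference;
    typedef const value_type& const_reference;
    typedef std::size_t       size_type;
    typedef std::ptrdiff_t    difference_type;
    template <class U> struct rebind { typedef malloc_allocator<U> other; };

    malloc_allocator() {}
    malloc_allocator(const malloc_allocator&) {}
    template <class U> malloc_allocator(const malloc_allocator<U>&) {}
    ~malloc_allocator() {}

    pointer address(reference x) const { return &x; }
    const_pointer address(const_reference x) const { return &x; }

    pointer allocate(size_type n, const void* = 0) {
        if (n > max_size()) throw std::bad_alloc();
        void* p = std::malloc(n ? n * sizeof(T) : 1);
        if (!p) throw std::bad_alloc();
        return static_cast<pointer>(p);
    }
    void deallocate(pointer p, size_type) { std::free(p); }
    size_type max_size() const { return static_cast<size_type>(-1) / sizeof(T); }

    void construct(pointer p, const value_type& x) { new (static_cast<void*>(p)) T(x); }
    void destroy(pointer p) { p->~T(); }

private:
    void operator=(const malloc_allocator&);  // disabled: declared, never defined
};

template <class T>
inline bool operator==(const malloc_allocator<T>&, const malloc_allocator<T>&) { return true; }
template <class T>
inline bool operator!=(const malloc_allocator<T>&, const malloc_allocator<T>&) { return false; }

// Usage: the allocator is the container's second template argument.
int listing_demo() {
    std::vector<int, malloc_allocator<int> > v;
    for (int i = 1; i <= 4; ++i) v.push_back(i);
    return v[0] + v[1] + v[2] + v[3];
}
```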
And finally, in case you think that this is far too much effort for far too small a benefit, a reminder: just because you can write a container class that uses allocators doesn't mean that you have to, or that you should. Sometimes you might want to write a container class that relies on a specific allocation strategy, whether it's something as ambitious as a disk-based B-tree container or as simple as the block class that I describe in my book. Even if you do want to write a container class that uses allocators, you don't have to support alternate pointer types. You can write a container where you require that any user-supplied allocator uses ordinary pointers, and document that restriction. Not everything has to be fully general.
If you want to write a simple allocator like malloc_allocator, you should have no difficulty. (Provided that you're using a reasonably modern compiler, that is.) If you have more ambitious plans, however — a memory pool allocator or an allocator with nonstandard pointer types for distributed computing — the situation is less satisfactory.
If you want to use some alternative pointer-like type, what operations does it have to support? Must it have a special null value, and, if so, how is that value written? Can you use casts? How can you convert between pointer-like objects and ordinary pointers? Do you have to worry about pointer operations throwing exceptions? I made some assumptions in the last section; the C++ Standard doesn't say whether those assumptions are right or wrong. These details are left to individual standard library implementations, and it's even legal for an implementation to ignore alternative pointer types altogether. The C++ Standard also leaves a few unanswered questions about what happens when different instances of an allocator aren't interchangeable.
Fortunately, the situation isn't quite as dire as the words in the Standard (20.1.5, paragraphs 4-5) might make it seem. The Standard left some questions unanswered because, at the time it was written, the C++ standardization committee wasn't able to agree on the answers; the necessary experience with allocators did not exist. Everyone involved in writing this section of the Standard considered it to be a temporary patch, and the vagueness will definitely be removed in a future revision.
For the moment, it's best to stay away from alternative pointer types if you're concerned about portability, but, if you're willing to accept a few limitations, you can safely use allocators like mempool_allocator where the differences between individual objects are important. All major standard library implementations now support such allocators in some way, and the differences between implementations are minor.
Just as the containers take allocator types as template parameters, so the containers' constructors take allocator objects as arguments. A container makes a copy of that argument and uses the copy for all of its memory management; once it is initialized in the constructor, the container's allocator is never changed.
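Here is a sketch of passing an allocator object to a container's constructor, using a pool-style allocator whose objects are not interchangeable. The names mempool_Alloc, mempool_Free, and mempool_allocator come from the discussion of memory pools earlier in this column, but everything about them here is an assumption: the mempool struct is a toy that only counts bytes, both functions simply wrap malloc and free, and the three-argument signature of mempool_Free is invented for the sketch.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Toy pool: it only records how many bytes are outstanding.
struct mempool {
    std::size_t bytes_in_use;
    mempool() : bytes_in_use(0) {}
};
inline void* mempool_Alloc(std::size_t n, mempool* p) {
    void* mem = std::malloc(n ? n : 1);
    if (mem) p->bytes_in_use += n;
    return mem;
}
inline void mempool_Free(void* mem, std::size_t n, mempool* p) {
    p->bytes_in_use -= n;
    std::free(mem);
}

template <class T>
class mempool_allocator {
public:
    typedef T value_type;
    typedef T* pointer;
    typedef const T* const_pointer;
    typedef T& reference;
    typedef const T& const_reference;
    typedef std::size_t size_type;
    typedef std::ptrdiff_t difference_type;
    template <class U> struct rebind { typedef mempool_allocator<U> other; };

    explicit mempool_allocator(mempool* p) : pool_(p) {}
    template <class U>
    mempool_allocator(const mempool_allocator<U>& other) : pool_(other.pool()) {}

    pointer allocate(size_type n, const void* = 0) {
        void* mem = mempool_Alloc(n * sizeof(T), pool_);
        if (!mem) throw std::bad_alloc();
        return static_cast<pointer>(mem);
    }
    void deallocate(pointer p, size_type n) { mempool_Free(p, n * sizeof(T), pool_); }
    size_type max_size() const { return static_cast<size_type>(-1) / sizeof(T); }
    void construct(pointer p, const T& v) { new (static_cast<void*>(p)) T(v); }
    void destroy(pointer p) { p->~T(); }

    mempool* pool() const { return pool_; }
private:
    mempool* pool_;   // which pool this particular allocator object draws from
};

// Two allocators are interchangeable only if they share a pool.
template <class T, class U>
bool operator==(const mempool_allocator<T>& a, const mempool_allocator<U>& b) {
    return a.pool() == b.pool();
}
template <class T, class U>
bool operator!=(const mempool_allocator<T>& a, const mempool_allocator<U>& b) {
    return !(a == b);
}

// Returns the bytes the pool has handed out while the vector is still alive.
std::size_t pool_demo(mempool& pool) {
    typedef mempool_allocator<int> alloc_type;
    alloc_type alloc(&pool);
    std::vector<int, alloc_type> v(alloc);   // the container copies alloc
    v.push_back(1);
    v.push_back(2);
    return pool.bytes_in_use;
}
```

Note that the allocator is created as a named object before it's passed to the vector. Writing the temporary inline, as in v(alloc_type(&pool)), would be parsed as a function declaration.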
The only question is what happens when you perform an operation that requires two containers to cooperate on memory management. There are exactly two such operations in the standard library: swap (all containers) and std::list::splice. In principle, an implementation could handle them in several different ways:
If you just stay away from swap and splice whenever two containers might be using different allocators, you'll be safe. In practice, I haven't found this to be a serious restriction: you need tight discipline to use a feature like memory pools safely, and you probably won't want indiscriminate mixing between containers with different allocators.
Partly because of unfamiliarity and partly because of the unsatisfactory state of the C++ Standard's requirements, most uses of allocators today are simple. As the C++ community becomes more familiar with allocators, and as the Standard is clarified, we can expect more sophisticated uses to emerge.
[1] You can see an example of a pool allocator in the open source SGI Pro64™ compiler, http://oss.sgi.com/projects/Pro64/.
[2] Why the funny template keyword in that expression? It's an annoying little technicality; like typename, it helps the compiler resolve a parsing ambiguity. The problem is that when A is a template parameter, and the compiler sees an expression like A::B