Small-Object Allocation Techniques Revisited

/*
 * References:
 * [1]. Alexandrescu, Andrei. "Modern C++ Design: Generic Programming and Design
 * Patterns Applied". Copyright (c) 2001. Addison-Wesley.
 *
 * The source code studied in this article is taken from SmallObj.h & SmallObj.cpp in the Loki library.
 */

 

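#include <boost/progress.hpp>
#include <iostream>
#include <string>
#include <vector>

using namespace boost;
using namespace std;

struct SmallObj {
    char c;
};

int main()
{
    progress_timer t;  // also prints the elapsed time when it is destroyed
    
    SmallObj *s = new SmallObj();   //case 1.
    vector<char> vc;  //case 2.
    vector<string> vs; //case 3.
    vector<int> *vi = new vector<int>(1); //case 4.
    
    std::cout << t.elapsed() << std::endl;

    // Release the heap objects once the measurement has been printed.
    delete vi;
    delete s;
}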

I wrote a small test using Boost's progress_timer to see roughly how long C++'s default new takes.

Machine configuration: 2.26 GHz Intel Core 2 Duo, 4 GB 1067 MHz DDR3, Mac OS X 10.7.3.


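1. Running case 1 alone takes 3 microseconds on average; this is the cost of dynamically allocating memory for one small object.

2. Running cases 2 and 3 together takes between 3 and 4 microseconds; here memory for small objects goes through std::alloc. In this situation case 3 would normally get its memory from the segregated free list already set up for case 2, so its cost should be very low. (Keep in mind that a default-constructed vector typically allocates no element storage, so these two cases mostly measure construction overhead.)

3. Running case 4 alone takes between 4 and 6 microseconds.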
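A single new sits close to the timer's resolution, so the figures above are noisy. A more robust measurement would time many allocations and report the average. The following is a minimal sketch of that idea, assuming a C++11 compiler with std::chrono; the helper name averageNewDeleteNanos is made up for illustration and is not part of the original test.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

struct SmallObj { char c; };

// Hypothetical helper: performs `count` allocations followed by `count`
// deallocations and returns the average cost per new/delete pair in nanoseconds.
double averageNewDeleteNanos(std::size_t count)
{
    assert(count > 0);
    std::vector<SmallObj*> objs(count, nullptr);

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < count; ++i) objs[i] = new SmallObj();
    for (std::size_t i = 0; i < count; ++i) delete objs[i];
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(count);
}
```

Calling averageNewDeleteNanos(1000000) a few times and taking the smallest result gives a steadier number than timing one allocation, although an optimizing compiler may still remove part of the work.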


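According to [1], the system's default free-store allocator is slow, roughly an order of magnitude slower than the small-object allocator written for Loki. Let's look at how Loki's Small-Object Allocator is implemented and which optimization techniques it uses.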


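Loki's small-object allocator has four layers. At the bottom is the Chunk struct. Each Chunk object contains and manages one large piece of memory, which itself holds an integral number of fixed-size blocks. A Chunk carries the bookkeeping needed to allocate and return blocks; when no blocks are left in the chunk, allocation fails and returns zero.


The second layer is FixedAllocator, which uses chunks as building blocks. Its main job is to satisfy allocation requests whose accumulated total exceeds the capacity of a single chunk. It does so by collecting chunks in a vector. If a new request arrives and every chunk in the vector is full, FixedAllocator creates a new chunk, appends it to the vector, and serves the request from it.


The third layer, SmallObjAllocator, provides general-purpose allocation and deallocation functions. It owns several FixedAllocator objects, each responsible for one specific object size. Depending on the number of bytes requested, SmallObjAllocator dispatches the request to one of its FixedAllocators. If the requested size is too large, it forwards the request to the system-provided ::operator new.


The last layer is SmallObject, which wraps SmallObjAllocator to offer well-encapsulated allocation services to C++ classes. SmallObject overloads operator new and operator delete and passes the work on to a SmallObjAllocator object.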

 

Chunk

Chunk is defined as follows:

struct Chunk
{
    void Init(std::size_t blockSize, unsigned char blocks);
    void* Allocate(std::size_t blockSize);
    void Deallocate(void* p, std::size_t blockSize);
    void Reset(std::size_t blockSize, unsigned char blocks);
    void Release();
    unsigned char* pData_;
    unsigned char
        firstAvailableBlock_,
        blocksAvailable_;
};

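pData_ points to the managed memory itself. Besides that, a chunk stores the following integer values:

  • firstAvailableBlock_, the index of the first available block in the chunk
  • blocksAvailable_, the total number of available blocks in the chunk

Because firstAvailableBlock_ and blocksAvailable_ are both of type unsigned char, a chunk cannot hold more than 255 blocks on a machine with 8-bit chars. When a block is unused, its first byte stores the index of the next unused block. Since firstAvailableBlock_ already holds the index of the first free block, we get a free list made of the available blocks themselves, at no extra memory cost.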

void FixedAllocator::Chunk::Init(std::size_t blockSize, unsigned char blocks)
{
    assert(blockSize > 0);
    assert(blocks > 0);
    // Overflow check
    assert((blockSize * blocks) / blockSize == blocks);
    
    pData_ = new unsigned char[blockSize * blocks];
    Reset(blockSize, blocks);
}

void FixedAllocator::Chunk::Reset(std::size_t blockSize, unsigned char blocks)
{
    assert(blockSize > 0);
    assert(blocks > 0);
    // Overflow check
    assert((blockSize * blocks) / blockSize == blocks);

    firstAvailableBlock_ = 0;
    blocksAvailable_ = blocks;

    unsigned char i = 0;
    unsigned char* p = pData_;
    for (; i != blocks; p += blockSize)
    {
        *p = ++i;
    }
}

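Two techniques here deserve some thought:

1. What is the benefit of recording an index in the first byte of each free block and combining it with firstAvailableBlock_ to form a free list?

2. Why is the number of blocks a chunk manages limited to the upper bound of unsigned char (255)?



Answers:

1. The free list is embedded in the data structure itself, so it needs no extra memory, and it gives an efficient way to find a free block in the chunk.

2. Suppose Chunk were turned into a class template, as follows: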

template<typename T>
struct Chunk {
   void Init(std::size_t blockSize, T blocks);
   void Release();
   void * Allocate(std::size_t blockSize);
   void Deallocate(void *p, std::size_t blockSize);
   unsigned char *pData_;
   T firstAvailableBlock_, blocksAvailable_;
};

template<typename T>
void Chunk<T>::Init(std::size_t blockSize, T blocks){
   pData_ = new unsigned char[blockSize * blocks];
   firstAvailableBlock_ = 0;
   blocksAvailable_ = blocks;
   T i = 0;
   unsigned char *p = pData_;  // cursor over the blocks; only the first byte of each block is written
   for (; i != blocks; p+=blockSize)
      *p = ++i;
}

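With this change, if T is unsigned short the maximum is 65535, but then we can no longer allocate blocks smaller than sizeof(unsigned short). That is not a big problem: smaller requests can be rounded up to sizeof(unsigned short) before allocation, at the cost of some internal fragmentation.



The other problem is alignment. Suppose the block size is 5 bytes: what type should the block index be, unsigned short or unsigned int? Casting a pointer to such a 5-byte block to unsigned int* and dereferencing it leads to undefined behavior. Even reading 4 of the 5 bytes in a specific byte order and assembling them into an unsigned int takes some effort, and the overhead may well cancel out the efficiency the design was meant to gain.



So capping the number of blocks at 255, the upper bound of unsigned char, is a sensible choice. First, chunks do not get that large; second, a char is 1 byte in size and has no alignment problem, and the raw memory pointer itself is already a pointer to unsigned char.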
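To make the trade-off concrete, here is a rough sketch of what a chunk with a wider index would additionally need. The names WideIndex, paddedBlockSize, loadIndex and storeIndex are hypothetical and do not come from Loki; std::memcpy is used because it is a portable way to read or write a multi-byte value at an address that may not be suitably aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

typedef unsigned short WideIndex;  // hypothetical wider index type

// Every free block must be able to hold an index, so small requests are
// rounded up. The extra bytes are internal fragmentation.
std::size_t paddedBlockSize(std::size_t requested)
{
    assert(requested > 0);
    return requested < sizeof(WideIndex) ? sizeof(WideIndex) : requested;
}

// Reads the index stored at the start of a block without assuming alignment.
WideIndex loadIndex(const unsigned char* block)
{
    WideIndex value;
    std::memcpy(&value, block, sizeof value);
    return value;
}

// Writes the index at the start of a block without assuming alignment.
void storeIndex(unsigned char* block, WideIndex value)
{
    std::memcpy(block, &value, sizeof value);
}
```

With an unsigned char index neither helper is needed: the index is a single byte that can be read directly through pData_.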

 

The allocation function Allocate() takes the block designated by firstAvailableBlock_ and then updates firstAvailableBlock_ so that it refers to the next available block.

void* FixedAllocator::Chunk::Allocate(std::size_t blockSize)
{
    if (!blocksAvailable_) return 0;
    
    assert((firstAvailableBlock_ * blockSize) / blockSize == 
        firstAvailableBlock_);

    unsigned char* pResult =
        pData_ + (firstAvailableBlock_ * blockSize);
    firstAvailableBlock_ = *pResult;
    --blocksAvailable_;
    
    return pResult;
}

Note that this Allocate is very cheap because no search is needed.



The deallocation function Deallocate:

void FixedAllocator::Chunk::Deallocate(void* p, std::size_t blockSize)
{
    assert(p >= pData_);

    unsigned char* toRelease = static_cast<unsigned char*>(p);
    // Alignment check
    assert((toRelease - pData_) % blockSize == 0);

    *toRelease = firstAvailableBlock_;
    firstAvailableBlock_ = static_cast<unsigned char>(
        (toRelease - pData_) / blockSize);
    // Truncation check
    assert(firstAvailableBlock_ == (toRelease - pData_) / blockSize);

    ++blocksAvailable_;
}

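When reading Allocate and Deallocate together, keep these points in mind:

1. pData_ always points to the start of the memory owned by the chunk.

2. firstAvailableBlock_ always records the index of the first available block.

3. Most importantly, even after many allocations and deallocations, the first byte of every free block holds the index of the next free block.

4. One request yields exactly one block; several blocks cannot be allocated at once.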
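The four points above can be tried out with a stripped-down chunk. This is a sketch under simplifying assumptions, not Loki's code: the class name MiniChunk is made up, the block size is stored in the object instead of being passed on each call, and a std::vector owns the memory (C++11 is assumed for data() and nullptr).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for Loki's Chunk (illustrative names only).
class MiniChunk {
public:
    MiniChunk(std::size_t blockSize, unsigned char blocks)
        : blockSize_(blockSize), data_(blockSize * blocks),
          firstAvailable_(0), available_(blocks)
    {
        assert(blockSize > 0 && blocks > 0);
        // Block i stores i + 1 in its first byte: the initial free list.
        unsigned char i = 0;
        for (unsigned char* p = data_.data(); i != blocks; p += blockSize)
            *p = ++i;
    }

    void* allocate()
    {
        if (available_ == 0) return nullptr;
        unsigned char* result = data_.data() + firstAvailable_ * blockSize_;
        firstAvailable_ = *result;   // pop the head of the free list
        --available_;
        return result;
    }

    void deallocate(void* p)
    {
        unsigned char* toRelease = static_cast<unsigned char*>(p);
        assert(toRelease >= data_.data());
        const std::size_t offset =
            static_cast<std::size_t>(toRelease - data_.data());
        assert(offset % blockSize_ == 0);
        *toRelease = firstAvailable_;  // push the block onto the free list
        firstAvailable_ = static_cast<unsigned char>(offset / blockSize_);
        ++available_;
    }

    unsigned char available() const { return available_; }

private:
    std::size_t blockSize_;
    std::vector<unsigned char> data_;
    unsigned char firstAvailable_;
    unsigned char available_;
};
```

Because a returned block is pushed onto the head of the list, the most recently freed block is the next one handed out, which also tends to be friendly to the cache.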

 

FixedAllocator

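The second layer of the small-object allocator is FixedAllocator. It allocates and returns blocks of one specific size, and the total amount it can hand out is not limited by the capacity of a single chunk, because FixedAllocator keeps the chunks it creates in a vector. Any request is served from a suitable chunk; if all chunks are full, a new chunk is created and added to the vector. Here is the definition of FixedAllocator: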

class FixedAllocator{
private:
   void DoDeallocate(void *p);
   bool MakeNewChunk(void);
   Chunk * VicinityFind(void *p) const;
   FixedAllocator(const FixedAllocator&); //copy ctor, but not implemented;
   FixedAllocator& operator=(const FixedAllocator&);// copy assignment, not implemented;
   
   typedef std::vector<Chunk> Chunks;
   typedef Chunks::iterator ChunkIter;
   typedef Chunks::const_iterator ChunkCIter;

   static unsigned char MinObjectsPerChunk_;
   // fewest # of objects managed by a chunk;
   static unsigned char MaxObjectsPerChunk_;
   // maximum # of objects managed by a chunk;

   std::size_t blockSize_;  // FixedAllocator manages only chunks whose blocks are blockSize_ bytes.
   unsigned char numBlocks_; // # of blocks in each chunk.

   Chunks chunks_;  // Container of Chunks
   Chunk *allocChunk_; //ptr to chunk used for last/next allocation.
   Chunk *deallocChunk_; // ptr to chunk used for last/next deallocation.
   Chunk *emptyChunk_; // ptr to the only empty chunk if there is one, else null...

public:
    FixedAllocator();
   ~FixedAllocator();
    void Initialize(std::size_t blockSize, std::size_t pageSize);
    void *Allocate(void);
    bool Deallocate(void *p, Chunk *hint);
    inline std::size_t BlockSize() const { return blockSize_; }
    bool TrimEmptyChunk(void); // releases the memory used by the empty chunk.
    bool TrimChunkList(void); // releases unused spots from the chunk list.
    std::size_t CountEmptyChunks(void) const; // returns count of empty Chunks held by this allocator.
    bool IsCorrupt(void) const;
    const Chunk * HasBlock(void *p) const;
    inline Chunk * HasBlock(void *p){
        return const_cast<Chunk *>(
             const_cast< const FixedAllocator *>( this )->HasBlock(p) );
    }
};

 

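FixedAllocator::Allocate()

Here allocChunk_ points to the chunk used by the most recent allocation. Every allocation request first checks that chunk; if it still has free space, the request is served from it. Otherwise the allocator falls back to the cached empty chunk, if there is one, or to a linear search. In either case allocChunk_ is updated to point to the chunk just found or just added, which speeds up the next allocation. Below is the implementation of FixedAllocator::Allocate(void) from loki_0.1.7: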

void * FixedAllocator::Allocate( void )
{

    if ( ( NULL == allocChunk_ ) || allocChunk_->IsFilled() )
    // initial state, or the chunk used by the last allocation is full
    {
        if ( NULL != emptyChunk_ ) // a completely empty chunk is cached: allocate from it
        {
            allocChunk_ = emptyChunk_;
            emptyChunk_ = NULL;
        }
        else // otherwise walk chunks_ looking for a chunk with free blocks
        {
            for ( ChunkIter i( chunks_.begin() ); ; ++i )
            {
                if ( chunks_.end() == i ) 
             // none found: create a new chunk and append it to chunks_,
             // which also updates allocChunk_ and deallocChunk_
                {
                    if ( !MakeNewChunk() )
                        return NULL;
                    break;
                }
                if ( !i->IsFilled() ) // first fit: this chunk still has a free block, so use it
                {
                    allocChunk_ = &*i;
                    break;
                }
            }
        }
    }
    else if ( allocChunk_ == emptyChunk_)
    // allocChunk_ is the cached empty chunk and is about to lose a block,
    // so it is no longer empty: clear emptyChunk_.
        emptyChunk_ = NULL;

    void * place = allocChunk_->Allocate( blockSize_ );

    return place;
}
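The lookup strategy in Allocate (try the cached chunk first, then first fit, then grow) can be reproduced in a much smaller form. The sketch below is an illustration only: TinyChunk and TinyFixedAllocator are made-up names, deallocation and the emptyChunk_ cache are left out, and the cached chunk is remembered by index rather than by pointer because push_back may relocate the vector's elements. It assumes C++11, where relocating a TinyChunk moves its buffer, so blocks already handed out stay valid.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal chunk: a fixed number of blocks with the free list embedded in them.
struct TinyChunk {
    std::vector<unsigned char> data;
    unsigned char first, available;

    TinyChunk(std::size_t blockSize, unsigned char blocks)
        : data(blockSize * blocks), first(0), available(blocks)
    {
        for (unsigned char i = 0; i != blocks; ++i)
            data[i * blockSize] = static_cast<unsigned char>(i + 1);
    }
    bool isFilled() const { return available == 0; }
    void* allocate(std::size_t blockSize)
    {
        assert(!isFilled());
        unsigned char* p = data.data() + first * blockSize;
        first = *p;
        --available;
        return p;
    }
};

class TinyFixedAllocator {
public:
    TinyFixedAllocator(std::size_t blockSize, unsigned char blocksPerChunk)
        : blockSize_(blockSize), blocks_(blocksPerChunk), allocIndex_(0)
    {
        assert(blockSize > 0 && blocksPerChunk > 0);
    }

    void* allocate()
    {
        // Fast path: the chunk used last time still has room.
        if (chunks_.empty() || chunks_[allocIndex_].isFilled()) {
            std::size_t i = 0;
            while (i != chunks_.size() && chunks_[i].isFilled()) ++i;  // first fit
            if (i == chunks_.size())
                chunks_.push_back(TinyChunk(blockSize_, blocks_));     // grow
            allocIndex_ = i;  // remember the chunk for the next request
        }
        return chunks_[allocIndex_].allocate(blockSize_);
    }

    std::size_t chunkCount() const { return chunks_.size(); }

private:
    std::size_t blockSize_;
    unsigned char blocks_;
    std::vector<TinyChunk> chunks_;
    std::size_t allocIndex_;
};
```

As long as the cached chunk has room, an allocation costs one comparison plus the constant-time chunk allocation; the linear scan only runs when that chunk fills up.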

 

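FixedAllocator::Deallocate(void *p, Chunk *hint)

Deallocation is trickier because we do not know which chunk the returned block belongs to. We could of course walk through chunks_ and check whether the pointer falls between pData_ and pData_ + blockSize_ * numBlocks_, then deallocate within the chunk that was found. That, however, costs linear time for every deallocation. Loki therefore adds an optimization: a member pointer deallocChunk_ that points to the chunk used by the last deallocation. Every deallocation checks the chunk deallocChunk_ points to first; if it is the wrong one, a linear search follows, the block is returned to the chunk that was found, and deallocChunk_ is updated.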
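The ownership test described in the paragraph above (does p fall inside the memory of a given chunk?) can be sketched as a half-open range check. The function name ownsAddress is hypothetical. std::less is used instead of the built-in < because comparing pointers into different arrays with < is not guaranteed to give a meaningful result, whereas std::less provides a total order over pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Returns true if p lies inside the half-open range [begin, begin + length).
bool ownsAddress(const unsigned char* begin, std::size_t length, const void* p)
{
    assert(begin != 0);
    const unsigned char* q = static_cast<const unsigned char*>(p);
    std::less<const unsigned char*> before;  // total order, even across arrays
    return !before(q, begin) && before(q, begin + length);
}
```

For a chunk, begin would be pData_ and length would be numBlocks_ * blockSize_, which is the chunkLength value that VicinityFind computes.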

 

bool FixedAllocator::Deallocate( void * p, Chunk * hint )
{
    Chunk * foundChunk = ( NULL == hint ) ? VicinityFind( p ) : hint;
    if ( NULL == foundChunk )
        return false;

    deallocChunk_ = foundChunk;
    DoDeallocate(p);

    return true;
}

Chunk * FixedAllocator::VicinityFind( void * p ) const
{
    if ( chunks_.empty() ) return NULL;
    assert(deallocChunk_);

    const std::size_t chunkLength = numBlocks_ * blockSize_;
    Chunk * lo = deallocChunk_;
    Chunk * hi = deallocChunk_ + 1;
    const Chunk * loBound = &chunks_.front();
    const Chunk * hiBound = &chunks_.back() + 1;

    // Special case: deallocChunk_ is the last in the array
    if (hi == hiBound) hi = NULL;  // boundary case

    for (;;)
    {
        if (lo)
        {
            if ( lo->HasBlock( p, chunkLength ) ) return lo;
            if ( lo == loBound )
            {
                lo = NULL;
                if ( NULL == hi ) break;
            }
            else --lo;
        }

        if (hi)
        {
            if ( hi->HasBlock( p, chunkLength ) ) return hi;
            if ( ++hi == hiBound )
            {
                hi = NULL;
                if ( NULL == lo ) break;
            }
        }
    }

    return NULL;
}

void FixedAllocator::DoDeallocate(void* p)
{
    // Show that deallocChunk_ really owns the block at address p.
    assert( deallocChunk_->HasBlock( p, numBlocks_ * blockSize_ ) );
    // Either of the next two assertions may fail if somebody tries to
    // delete the same block twice.
    assert( emptyChunk_ != deallocChunk_ );
    assert( !deallocChunk_->HasAvailable( numBlocks_ ) );
    // prove either emptyChunk_ points nowhere, or points to a truly empty Chunk.
    assert( ( NULL == emptyChunk_ ) || ( emptyChunk_->HasAvailable( numBlocks_ ) ) );

    // call into the chunk, will adjust the inner list but won't release memory
    deallocChunk_->Deallocate(p, blockSize_);

    if ( deallocChunk_->HasAvailable( numBlocks_ ) ) // deallocChunk_ is now a completely empty chunk
    {
        assert( emptyChunk_ != deallocChunk_ );
        // deallocChunk_ is empty, but a Chunk is only released if there are 2
        // empty chunks.  Since emptyChunk_ may only point to a previously
        // cleared Chunk, if it points to something else besides deallocChunk_,
        // then FixedAllocator currently has 2 empty Chunks.
        if ( NULL != emptyChunk_ )
        {
            // If last Chunk is empty, just change what deallocChunk_
            // points to, and release the last.  Otherwise, swap an empty
            // Chunk with the last, and then release it.
            Chunk * lastChunk = &chunks_.back();
            if ( lastChunk == deallocChunk_ )
                deallocChunk_ = emptyChunk_;
            else if ( lastChunk != emptyChunk_ )
                std::swap( *emptyChunk_, *lastChunk );
            assert( lastChunk->HasAvailable( numBlocks_ ) );
            lastChunk->Release();
            chunks_.pop_back();
            if ( ( allocChunk_ == lastChunk ) || allocChunk_->IsFilled() ) 
                allocChunk_ = deallocChunk_;
        }
        emptyChunk_ = deallocChunk_;
    }

    // prove either emptyChunk_ points nowhere, or points to a truly empty Chunk.
    assert( ( NULL == emptyChunk_ ) || ( emptyChunk_->HasAvailable( numBlocks_ ) ) );
}
 

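Note the VicinityFind function: it uses two cursors, lo and hi (I do not understand why they are raw pointers rather than being wrapped in an iterator). On each iteration lo steps toward the front of the vector and hi steps toward the back. (It has a slight divide-and-conquer feel, although it is really a linear search that expands outward from the chunk used by the last deallocation.)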
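The outward search can be isolated from the chunk details. The sketch below works on indices and a caller-supplied predicate instead of Chunk pointers; vicinityFind and probeOrder are illustrative names, and C++11 lambdas are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Searches indices [0, count) outward from `start`, alternating between the
// lower and the higher side. Returns the first index for which matches(i)
// is true, or `count` if there is none.
template <typename Pred>
std::size_t vicinityFind(std::size_t count, std::size_t start, Pred matches)
{
    if (count == 0) return count;
    assert(start < count);
    std::size_t lo = start;       // walks toward index 0
    std::size_t hi = start + 1;   // walks toward count
    bool loActive = true;
    bool hiActive = (hi != count);
    while (loActive || hiActive) {
        if (loActive) {
            if (matches(lo)) return lo;
            if (lo == 0) loActive = false; else --lo;
        }
        if (hiActive) {
            if (matches(hi)) return hi;
            if (++hi == count) hiActive = false;
        }
    }
    return count;
}

// Records the order in which indices are probed when nothing matches.
std::vector<std::size_t> probeOrder(std::size_t count, std::size_t start)
{
    std::vector<std::size_t> order;
    vicinityFind(count, start, [&order](std::size_t i) {
        order.push_back(i);
        return false;
    });
    return order;
}
```

With five chunks and a starting index of 2, the probe order is 2, 3, 1, 4, 0. Blocks that were allocated close together in time tend to be freed close together as well, so the owning chunk is usually near the last one used.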

 

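In DoDeallocate, when the chunk deallocChunk_ points to has become empty and emptyChunk_ is also set, FixedAllocator now holds two empty chunks (the asserts rule out deallocChunk_ and emptyChunk_ pointing to the same chunk), so one of them has to be released. From the statements lastChunk->Release(); chunks_.pop_back(); we can tell that the release always happens on the last chunk in the chunks_ vector (lastChunk). Before that, the code adjusts things according to these two cases:

1. If deallocChunk_ points to the last chunk, deallocChunk_ is redirected to the chunk emptyChunk_ pointed to.

2. If lastChunk is not the chunk emptyChunk_ points to, the two chunks are swapped (the Chunk objects themselves, not the pointers).



The reason for this is to avoid a boundary condition (allocChunk_ and deallocChunk_ both point to the same chunk, and that chunk has no free space left). Following the usual Allocate() logic, a new chunk is created, added to chunks_, and the allocation is served from it. Now suppose the block that was just allocated is returned right away. deallocChunk_ locates the new chunk, completes the deallocation, finds the chunk empty, and releases it. If this pattern repeats, the performance cost is considerable. That is why a chunk is released only when two empty chunks are found.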
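The cost of the naive policy can be shown with a small counting simulation. This is a deliberately simplified model rather than Loki's logic: chunks are represented only by their number of used blocks, the function name chunksCreated is made up, and the "keep one empty chunk" policy never has to release anything because this workload never produces two empty chunks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Starts with one completely full chunk, then allocates and immediately frees
// one block `rounds` times. Returns how many chunks were created in total.
std::size_t chunksCreated(std::size_t blocksPerChunk, std::size_t rounds,
                          bool keepOneEmpty)
{
    assert(blocksPerChunk > 0);
    std::vector<std::size_t> used(1, blocksPerChunk);  // used blocks per chunk
    std::size_t created = 1;
    for (std::size_t r = 0; r != rounds; ++r) {
        // Allocate: first chunk with room, else create a new chunk.
        std::size_t i = 0;
        while (i != used.size() && used[i] == blocksPerChunk) ++i;
        if (i == used.size()) { used.push_back(0); ++created; }
        ++used[i];
        // Free the same block again.
        --used[i];
        if (used[i] == 0 && !keepOneEmpty)
            used.erase(used.begin() + static_cast<std::ptrdiff_t>(i));
    }
    return created;
}
```

Releasing a chunk as soon as it becomes empty creates a new chunk on every round, while keeping a single empty chunk in reserve creates one extra chunk in total.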

 

SmallObjAllocator

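The third layer of Loki's allocator is SmallObjAllocator, which can allocate objects of any size. SmallObjAllocator aggregates N FixedAllocators. When it receives an allocation request, it hands the request to the best-matching FixedAllocator, or forwards it to the default ::operator new if the size is too large. SmallObjAllocator is declared as follows: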

class LOKI_EXPORT SmallObjAllocator
    {
    protected:
        SmallObjAllocator( std::size_t pageSize, std::size_t maxObjectSize,
            std::size_t objectAlignSize );

        ~SmallObjAllocator( void );

    public:       
        void * Allocate( std::size_t size, bool doThrow );
        void Deallocate( void * p, std::size_t size );
                // takes an extra parameter: the size of the block being returned
        void Deallocate( void * p );
        inline std::size_t GetMaxObjectSize() const
        { return maxSmallObjectSize_; }

        /// Returns # of bytes between allocation boundaries.
        inline std::size_t GetAlignment() const { return objectAlignSize_; }

        bool TrimExcessMemory( void );
        bool IsCorrupt( void ) const;

    private:
        SmallObjAllocator( void );
        SmallObjAllocator( const SmallObjAllocator & );
        SmallObjAllocator & operator = ( const SmallObjAllocator & );
        Loki::FixedAllocator * pool_; //array instead of vector. (modification)

        /// Largest object size supported by allocators; larger requests are forwarded to the default ::operator new.
        const std::size_t maxSmallObjectSize_;

        /// Size of alignment boundaries.
        const std::size_t objectAlignSize_;
    };

  void * SmallObjAllocator::Allocate( std::size_t numBytes, bool doThrow )

void * SmallObjAllocator::Allocate( std::size_t numBytes, bool doThrow )
{
    if ( numBytes > GetMaxObjectSize() )
        return DefaultAllocator( numBytes, doThrow );

    assert( NULL != pool_ );
    if ( 0 == numBytes ) numBytes = 1;
    const std::size_t index = GetOffset( numBytes, GetAlignment() ) - 1;
    const std::size_t allocCount = GetOffset( GetMaxObjectSize(), GetAlignment() );
    (void) allocCount;
    assert( index < allocCount );

    FixedAllocator & allocator = pool_[ index ];
    assert( allocator.BlockSize() >= numBytes );
    assert( allocator.BlockSize() < numBytes + GetAlignment() );
    void * place = allocator.Allocate();

    if ( ( NULL == place ) && TrimExcessMemory() )
        place = allocator.Allocate();

    if ( ( NULL == place ) && doThrow )
    {
#ifdef _MSC_VER
        throw std::bad_alloc( "could not allocate small object" );
#else
        // GCC did not like a literal string passed to std::bad_alloc.
        // so just throw the default-constructed exception.
        throw std::bad_alloc();
#endif
    }
    return place;
}

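pool_ is an array of FixedAllocators. To find the FixedAllocator responsible for a given block size simply and efficiently, the array is indexed by size: GetOffset( numBytes, GetAlignment() ) - 1 selects the slot, so each slot serves one size class (the two asserts on BlockSize() show that the block size exceeds numBytes by less than one alignment unit). But things are never perfect, are they? An application might only ever create objects of 4 bytes and 32 bytes and nothing else, yet pool_ still has to set up the FixedAllocators for all the other sizes.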
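The index computation can be written out as follows. GetOffset itself is not shown in this article, so the ceiling division below is an assumption that is consistent with the asserts in Allocate; the function names sizeClassCount, poolIndex and slotBlockSize are made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Ceiling division: number of alignment units needed to hold numBytes.
// Assumed to correspond to what GetOffset computes.
std::size_t sizeClassCount(std::size_t numBytes, std::size_t alignment)
{
    assert(alignment > 0);
    return (numBytes + alignment - 1) / alignment;
}

// Slot in the pool that serves a request of numBytes.
std::size_t poolIndex(std::size_t numBytes, std::size_t alignment)
{
    if (numBytes == 0) numBytes = 1;  // same adjustment as in Allocate
    return sizeClassCount(numBytes, alignment) - 1;
}

// Block size served by a slot, assuming slot i handles (i + 1) * alignment bytes.
std::size_t slotBlockSize(std::size_t index, std::size_t alignment)
{
    return (index + 1) * alignment;
}
```

With an alignment of 4 and a maximum small-object size of 64, this gives 16 slots; a 5-byte request lands in slot 1, which serves 8-byte blocks. An application that only uses 4-byte and 32-byte objects touches slots 0 and 7 and leaves the other 14 idle.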

void SmallObjAllocator::Deallocate( void * p, std::size_t numBytes )

void SmallObjAllocator::Deallocate( void * p, std::size_t numBytes )
{
    if ( NULL == p ) return;
    if ( numBytes > GetMaxObjectSize() )
    {
        DefaultDeallocator( p );
        return;
    }
    assert( NULL != pool_ );
    if ( 0 == numBytes ) numBytes = 1;
    const std::size_t index = GetOffset( numBytes, GetAlignment() ) - 1;
    const std::size_t allocCount = GetOffset( GetMaxObjectSize(), GetAlignment() );
    (void) allocCount;
    assert( index < allocCount );
    FixedAllocator & allocator = pool_[ index ];
    assert( allocator.BlockSize() >= numBytes );
    assert( allocator.BlockSize() < numBytes + GetAlignment() );
    const bool found = allocator.Deallocate( p, NULL );
    (void) found;
    assert( found );
}

void SmallObjAllocator::Deallocate( void * p )
// Only the pointer to the returned block is available here, so every FixedAllocator has to be visited to find the one responsible for it.
{
    if ( NULL == p ) return;
    assert( NULL != pool_ );
    FixedAllocator * pAllocator = NULL;
    const std::size_t allocCount = GetOffset( GetMaxObjectSize(), GetAlignment() );
    Chunk * chunk = NULL;

    for ( std::size_t ii = 0; ii < allocCount; ++ii )
    {
        chunk = pool_[ ii ].HasBlock( p );
        if ( NULL != chunk )
        {
            pAllocator = &pool_[ ii ];
            break;
        }
    }
    if ( NULL == pAllocator )
    {
        DefaultDeallocator( p );
        return;
    }

    assert( NULL != chunk );
    const bool found = pAllocator->Deallocate( p, chunk );
    (void) found;
    assert( found );
}

SmallObjAllocator is just a thin wrapper around FixedAllocator.

SmallObject

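The fourth layer wraps the functionality of the third layer in a form that is more convenient to use. SmallObject provides class-level operator new and operator delete that take the place of the default ::operator new and ::operator delete. This way, whenever an object of a SmallObject-derived class is created, the overloaded operators send the allocation request through SmallObjAllocator to the underlying FixedAllocator. The definition is as follows: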

template
    <
        template <class, class> class ThreadingModel = LOKI_DEFAULT_THREADING_NO_OBJ_LEVEL,
        std::size_t chunkSize = LOKI_DEFAULT_CHUNK_SIZE,
        std::size_t maxSmallObjectSize = LOKI_MAX_SMALL_OBJECT_SIZE,
        std::size_t objectAlignSize = LOKI_DEFAULT_OBJECT_ALIGNMENT,
        template <class> class LifetimePolicy = LOKI_DEFAULT_SMALLOBJ_LIFETIME,
        class MutexPolicy = LOKI_DEFAULT_MUTEX
    >
    class SmallObject : public SmallObjectBase< ThreadingModel, chunkSize,
            maxSmallObjectSize, objectAlignSize, LifetimePolicy, MutexPolicy >
    {

    public:
        virtual ~SmallObject() {}
    protected:
        inline SmallObject( void ) {}

    private:
        /// Copy-constructor is not implemented.
        SmallObject( const SmallObject & );
        /// Copy-assignment operator is not implemented.
        SmallObject & operator = ( const SmallObject & );
    }; // end class SmallObject

template
    <
        template <class, class> class ThreadingModel,
        std::size_t chunkSize,
        std::size_t maxSmallObjectSize,
        std::size_t objectAlignSize,
        template <class> class LifetimePolicy,
        class MutexPolicy
    >
    class SmallObjectBase
    {

    public:        
     
        typedef AllocatorSingleton< ThreadingModel, chunkSize,
            maxSmallObjectSize, objectAlignSize, LifetimePolicy > ObjAllocatorSingleton;
    
    private:
        typedef ThreadingModel< ObjAllocatorSingleton, MutexPolicy > MyThreadingModel;
        typedef typename ObjAllocatorSingleton::MyAllocatorSingleton MyAllocatorSingleton;
        
    public:

        static void * operator new ( std::size_t size, const std::nothrow_t & ) throw ()
        {
            typename MyThreadingModel::Lock lock;
            (void)lock; // get rid of warning
            return MyAllocatorSingleton::Instance().Allocate( size, false );
        }

        /// Placement single-object new merely calls global placement new.
        inline static void * operator new ( std::size_t size, void * place )
        {
            return ::operator new( size, place );
        }

        static void operator delete ( void * p, std::size_t size ) throw ()
        {
            typename MyThreadingModel::Lock lock;
            (void)lock; // get rid of warning
            MyAllocatorSingleton::Instance().Deallocate( p, size );
        }

        static void operator delete ( void * p, const std::nothrow_t & ) throw()
        {
            typename MyThreadingModel::Lock lock;
            (void)lock; // get rid of warning
            MyAllocatorSingleton::Instance().Deallocate( p );
        }

        /// Placement single-object delete merely calls global placement delete.
        inline static void operator delete ( void * p, void * place )
        {
            ::operator delete ( p, place );
        }

    protected:
        inline SmallObjectBase( void ) {}
        inline SmallObjectBase( const SmallObjectBase & ) {}
        inline SmallObjectBase & operator = ( const SmallObjectBase & )
        { return *this; }
        inline ~SmallObjectBase() {}
    }; // end class SmallObjectBase

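In this implementation the author, Andrei, relies on another C++ compiler feature. The overloaded operator delete(void *p, size_t size) needs the size of the object being destroyed; before deleting an object, the compiler generates code that determines that size, and this is where the size argument of SmallObject's operator delete comes from. For the size to be right when a derived object is deleted through a base pointer, SmallObject provides a virtual destructor, and every class derived from SmallObject inherits it.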
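The size-passing behavior can be observed in isolation. In the sketch below, Base and Derived are stand-ins invented for the demonstration, and the global g_lastDeletedSize only records what the compiler passes; Loki's real operator delete forwards the size to the allocator singleton instead.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

static std::size_t g_lastDeletedSize = 0;  // size seen by operator delete

struct Base {
    virtual ~Base() {}

    static void* operator new(std::size_t size) { return ::operator new(size); }

    // Class-level sized delete: the compiler supplies `size`.
    static void operator delete(void* p, std::size_t size)
    {
        assert(size >= sizeof(Base));
        g_lastDeletedSize = size;
        ::operator delete(p);
    }
};

struct Derived : Base {
    char payload[64];
};

// Deletes a Derived object through a Base pointer and reports the size
// that reached operator delete.
std::size_t deleteThroughBase()
{
    Base* b = new Derived;
    delete b;  // the virtual destructor makes the dynamic type's size available
    return g_lastDeletedSize;
}
```

Because the destructor is virtual, the size that arrives is sizeof(Derived) rather than sizeof(Base), which is what lets the allocator return the block to the FixedAllocator of the right size class.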

 

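Another point is that the whole program needs only one SmallObjAllocator, so SmallObject uses the Singleton pattern when wrapping the third layer. (Singleton is outside the scope of this article and is not described here.)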

 

Summary:

       I am writing a test program to compare performance; to be continued...
