OGRE Memory Allocation Policies

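This article covers the memory allocators in the OGRE rendering engine. The preface lists the OGRE version, build environment, and tools; the rest describes the design and implementation of OGRE's memory allocation and allocation policies.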

1. Preface

The OGRE source version analyzed is shown in the following figure:

[Figure: 2013-12-30-fig1]

Operating system: Windows 7 x64

Build tool: CMake

Compiler: Visual Studio 2008

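The build process generates an OgreBuildSettings.h file that records user-defined settings. As a simple example, the macros below build the source with D3D9 and OpenGL rendering support and with the BSP, OCTREE, PCZ, PFX, and CG plugins enabled. The build also uses little-endian byte order (that macro is not part of the excerpt).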

#define OGRE_BUILD_RENDERSYSTEM_D3D9
/* #undef OGRE_BUILD_RENDERSYSTEM_D3D11 */
#define OGRE_BUILD_RENDERSYSTEM_GL
/* #undef OGRE_BUILD_RENDERSYSTEM_GLES */
/* #undef OGRE_BUILD_RENDERSYSTEM_GLES2 */
#define OGRE_BUILD_PLUGIN_BSP
#define OGRE_BUILD_PLUGIN_OCTREE
#define OGRE_BUILD_PLUGIN_PCZ
#define OGRE_BUILD_PLUGIN_PFX
#define OGRE_BUILD_PLUGIN_CG
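
To make the role of these macros concrete, here is a minimal sketch of how code can branch on them at compile time. The two `#define` lines stand in for the generated OgreBuildSettings.h, and `enabledRenderSystems` is a hypothetical helper written for this illustration, not an OGRE function.

```cpp
#include <string>
#include <vector>

// Assumption: these lines stand in for the generated OgreBuildSettings.h.
#define OGRE_BUILD_RENDERSYSTEM_D3D9
#define OGRE_BUILD_RENDERSYSTEM_GL
/* #undef OGRE_BUILD_RENDERSYSTEM_GLES */

// Hypothetical helper: lists the render systems compiled into this build.
inline std::vector<std::string> enabledRenderSystems()
{
    std::vector<std::string> names;
#ifdef OGRE_BUILD_RENDERSYSTEM_D3D9
    names.push_back("D3D9");
#endif
#ifdef OGRE_BUILD_RENDERSYSTEM_GL
    names.push_back("GL");
#endif
#ifdef OGRE_BUILD_RENDERSYSTEM_GLES
    names.push_back("GLES");   // skipped: the macro is not defined above
#endif
    return names;
}
```

Because the checks happen in the preprocessor, code for a disabled render system is never compiled at all.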

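OGRE's configuration, including the choice of allocator, whether multithreading support is enabled, and the system- and compiler-specific macro definitions, generally lives in the following header files.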

OgreBuildSettings.h
OgreConfig.h
OgrePlatform.h
OgreStdHeaders.h
OgreStableHeaders.h
OgrePrerequisites.h

2. Memory Allocation in OGRE

2.1 Main Files

The files in the OGRE rendering engine related to memory management are:

OgreAlignedAllocator.h
OgreAlignedAllocator.cpp
OgreMemoryAllocatedObject.h
OgreMemoryAllocatorConfig.h
OgreMemoryNedAlloc.h        // nedmalloc allocator, without memory pools
OgreMemoryNedAlloc.cpp
OgreMemoryNedPooling.h      // nedmalloc allocator with memory pools; OGRE's default allocator
OgreMemoryNedPooling.cpp
OgreMemoryStdAlloc.h        // the system's built-in allocator
OgreMemorySTLAllocator.h
OgreMemoryTracker.h         // tracks allocations and deallocations, e.g. to detect memory leaks
OgreMemoryTracker.cpp

2.2 OGRE's Three Memory Allocation Methods and Their Implementations

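OGRE defines four allocator options, of which three are implemented. STD is the system's built-in allocator. NED and NEDPOOLING both use the nedmalloc allocator (for an introduction to memory allocators, see the article 《内存分配器浅谈》, "A Brief Look at Memory Allocators"). USER stands for a user-defined allocator and is left empty in the source.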

// define the memory allocator configuration to use
#define OGRE_MEMORY_ALLOCATOR_STD 1
#define OGRE_MEMORY_ALLOCATOR_NED 2
#define OGRE_MEMORY_ALLOCATOR_USER 3
#define OGRE_MEMORY_ALLOCATOR_NEDPOOLING 4

OGRE selects the allocator type through the macro OGRE_MEMORY_ALLOCATOR. Its default value is 4, which selects nedmalloc with memory pooling.
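
The selection mechanism can be sketched as a preprocessor switch that binds one policy class to a common name. This is only an illustration of the pattern: `SketchStdPolicy`, `SketchOtherPolicy`, and `SelectedPolicy` are hypothetical names, and the sketch sets the macro to STD (rather than OGRE's default of 4) so that it builds without nedmalloc.

```cpp
#include <cstddef>
#include <cstdlib>

#define OGRE_MEMORY_ALLOCATOR_STD 1
#define OGRE_MEMORY_ALLOCATOR_NED 2
#define OGRE_MEMORY_ALLOCATOR_USER 3
#define OGRE_MEMORY_ALLOCATOR_NEDPOOLING 4

// Assumption: STD is chosen here so the sketch needs no third-party code.
#define OGRE_MEMORY_ALLOCATOR OGRE_MEMORY_ALLOCATOR_STD

// Hypothetical stand-in for the STD policy.
struct SketchStdPolicy
{
    static void* allocateBytes(std::size_t n) { return std::malloc(n); }
    static void deallocateBytes(void* p) { std::free(p); }
    static int id() { return OGRE_MEMORY_ALLOCATOR_STD; }
};

// Hypothetical stand-in for every other option; same interface.
struct SketchOtherPolicy
{
    static void* allocateBytes(std::size_t n) { return std::malloc(n); }
    static void deallocateBytes(void* p) { std::free(p); }
    static int id() { return 0; }
};

// The rest of the code only ever refers to SelectedPolicy.
#if OGRE_MEMORY_ALLOCATOR == OGRE_MEMORY_ALLOCATOR_STD
typedef SketchStdPolicy SelectedPolicy;
#else
typedef SketchOtherPolicy SelectedPolicy;
#endif
```

Since every policy exposes the same static functions, switching allocators is a matter of changing one macro value and rebuilding.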

Each of the three allocators must implement a class with the same interface but a different name. The basic structure is:

class _OgreExport AllocateNamePolicy
{
public :
     static inline void * allocateBytes( size_t count,
         const char * file = 0, int line = 0, const char * func = 0);
     static inline void deallocateBytes( void * ptr);
     /// Get the maximum size of a single allocation
     static inline size_t getMaxAllocationSize();
 
private :
     // No instantiation
     AllocateNamePolicy ();
};
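
As an illustration of this interface, the following sketch shows what a user-supplied policy could look like. It is not part of OGRE (the USER option is empty in the source); `CountingAllocPolicy` and `liveCount` are hypothetical names, and the policy simply forwards to malloc/free while counting live allocations.

```cpp
#include <cstddef>
#include <cstdlib>
#include <limits>

// Hypothetical user-defined policy; the name is not part of OGRE.
class CountingAllocPolicy
{
public:
    // The debug parameters are accepted but ignored in this sketch.
    static void* allocateBytes(std::size_t count,
        const char* = 0, int = 0, const char* = 0)
    {
        void* ptr = std::malloc(count);
        if (ptr) ++liveCount();
        return ptr;
    }

    static void deallocateBytes(void* ptr)
    {
        if (ptr) --liveCount();
        std::free(ptr);
    }

    /// Get the maximum size of a single allocation
    static std::size_t getMaxAllocationSize()
    {
        return std::numeric_limits<std::size_t>::max();
    }

    // Number of allocations that have not been released yet.
    static long& liveCount()
    {
        static long count = 0;
        return count;
    }

private:
    CountingAllocPolicy(); // no instantiation, mirroring the interface above
};
```

Any class with these three static functions can be plugged in wherever a policy is expected; the extra `liveCount` accessor is only there to make the sketch observable.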

With the STD allocator, the policy classes are StdAllocPolicy and StdAlignedAllocPolicy, implemented in OgreAlignedAllocator.h, OgreAlignedAllocator.cpp, and OgreMemoryStdAlloc.h.

With NED, they are NedAllocPolicy and NedAlignedAllocPolicy, implemented in OgreMemoryNedAlloc.h and OgreMemoryNedAlloc.cpp.

With NEDPOOLING, they are NedPoolingPolicy and NedPoolingAlignedPolicy, implemented in OgreMemoryNedPooling.h and OgreMemoryNedPooling.cpp.

2.3 OGRE's Memory Allocation Policy

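A memory allocation policy here refers to a design pattern that applies a custom memory allocator to the allocation and release of object memory.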

The following uses the system's built-in allocator to illustrate OGRE's allocation policy; in this case OGRE_MEMORY_ALLOCATOR is 1.

Allocated memory falls into two categories: unaligned and aligned. For a detailed explanation of memory alignment, see references [2] and [3].

OGRE's code for unaligned allocation is shown below. The parts between #if OGRE_MEMORY_TRACKER and #endif belong to the memory tracker and do not affect allocation or release.

class _OgreExport StdAllocPolicy
{
public :
     static inline void * allocateBytes( size_t count,
#if OGRE_MEMORY_TRACKER
         const char * file = 0, int line = 0, const char * func = 0
#else
         const char *  = 0, int  = 0, const char * = 0
#endif
         )
     {
         void * ptr = malloc (count);
#if OGRE_MEMORY_TRACKER
         // this alloc policy doesn't do pools
         MemoryTracker::get()._recordAlloc(ptr, count, 0, file, line, func);
#endif
         return ptr;
     }
     static inline void deallocateBytes( void * ptr)
     {
#if OGRE_MEMORY_TRACKER
         MemoryTracker::get()._recordDealloc(ptr);
#endif
         free (ptr);
     }
 
     /// Get the maximum size of a single allocation
     static inline size_t getMaxAllocationSize()
     {
         return std::numeric_limits< size_t >::max();
     }
private :
     // no instantiation
     StdAllocPolicy()
     { }
};
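
To show what the tracker hooks in this class are for, here is a much reduced, single-threaded sketch of the idea. It is not OGRE's MemoryTracker, which also records the pool, file, line, and function; `SketchTracker`, `trackedAlloc`, and `trackedFree` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>

// Hypothetical tracker: remembers the size of every live allocation.
class SketchTracker
{
public:
    static SketchTracker& get()
    {
        static SketchTracker instance;
        return instance;
    }

    void recordAlloc(void* ptr, std::size_t bytes) { mLive[ptr] = bytes; }
    void recordDealloc(void* ptr) { mLive.erase(ptr); }

    // Total bytes allocated but not yet released; nonzero at exit means a leak.
    std::size_t outstandingBytes() const
    {
        std::size_t total = 0;
        for (std::map<void*, std::size_t>::const_iterator it = mLive.begin();
             it != mLive.end(); ++it)
            total += it->second;
        return total;
    }

private:
    SketchTracker() {}
    std::map<void*, std::size_t> mLive;
};

// Same call order as StdAllocPolicy: record after malloc, record before free.
inline void* trackedAlloc(std::size_t count)
{
    void* ptr = std::malloc(count);
    SketchTracker::get().recordAlloc(ptr, count);
    return ptr;
}

inline void trackedFree(void* ptr)
{
    SketchTracker::get().recordDealloc(ptr);
    std::free(ptr);
}
```

The tracker only observes the pointer and size, which is why removing the `#if OGRE_MEMORY_TRACKER` sections leaves the allocation behavior unchanged.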

The code for aligned allocation is as follows:

template < size_t Alignment = 0>
class StdAlignedAllocPolicy
{
public :
     // compile-time check alignment is available.
     typedef int IsValidAlignment
         [Alignment <= 128 && ((Alignment & (Alignment-1)) == 0) ? +1 : -1];
 
     static inline void * allocateBytes( size_t count,
#if OGRE_MEMORY_TRACKER
         const char * file = 0, int line = 0, const char * func = 0
#else
         const char *  = 0, int  = 0, const char * = 0
#endif
         )
     {
         void * ptr = Alignment ? AlignedMemory::allocate(count, Alignment)
             : AlignedMemory::allocate(count);
#if OGRE_MEMORY_TRACKER
         // this alloc policy doesn't do pools
         MemoryTracker::get()._recordAlloc(ptr, count, 0, file, line, func);
#endif
         return ptr;
     }
 
     static inline void deallocateBytes( void * ptr)
     {
#if OGRE_MEMORY_TRACKER
         MemoryTracker::get()._recordDealloc(ptr);
#endif
         AlignedMemory::deallocate(ptr);
     }
 
     /// Get the maximum size of a single allocation
     static inline size_t getMaxAllocationSize()
     {
         return std::numeric_limits< size_t >::max();
     }
private :
     // No instantiation
     StdAlignedAllocPolicy()
     { }
};
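
The IsValidAlignment typedef above is a compile-time check: when the condition is false the array size becomes -1 and compilation fails. As a sketch, and assuming a C++11 compiler (which Visual Studio 2008 is not), the same condition can be written with static_assert. `AlignmentCheck` and `isValidAlignment` are hypothetical names used only here.

```cpp
#include <cstddef>

// Hypothetical C++11 restatement of the IsValidAlignment typedef trick.
template <std::size_t Alignment>
struct AlignmentCheck
{
    static_assert(Alignment <= 128 && (Alignment & (Alignment - 1)) == 0,
                  "Alignment must be 0 or a power of two no larger than 128");
    static const bool value = true;
};

// The same predicate as a constexpr function, usable at run time as well.
constexpr bool isValidAlignment(std::size_t a)
{
    return a <= 128 && (a & (a - 1)) == 0;
}
```

Note that 0 passes the check, which is what lets the template default `Alignment = 0` mean "use the platform default alignment".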

Aligned allocation on top of the system allocator is implemented by the class AlignedMemory, found in OgreAlignedAllocator.h and OgreAlignedAllocator.cpp. The main code is:

/** Class to provide aligned memory allocate functionality.
@remarks
     All SIMD processing are friendly with aligned memory, and some SIMD routines
     are designed for working with aligned memory only. If the data are intended to
     use SIMD processing, it's need to be aligned for better performance boost.
     In additional, most time cache boundary aligned data also lead to better
     performance even if didn't used SIMD processing. So this class provides a couple
     of functions for allocate aligned memory.
@par
     Anyways, in general, you don't need to use this class directly, Ogre internally
     will take care with most SIMD and cache friendly optimisation if possible.
@par
     This isn't a "one-step" optimisation, there are a lot of underlying work to
     achieve performance boost. If you didn't know what are you doing or what there
     are going, just ignore this class.
@note
     This class intended to use by advanced user only.
*/
class _OgreExport AlignedMemory
{
public :
     /** Allocate memory with given alignment.
         @param
             size The size of memory need to allocate.
         @param
             alignment The alignment of result pointer, must be power of two
             and in range [1, 128].
         @return
             The allocated memory pointer.
         @par
             On failure, exception will be throw.
     */
     static void * allocate( size_t size, size_t alignment);
 
     /** Allocate memory with default platform dependent alignment.
         @remarks
             The default alignment depend on target machine, this function
             guarantee aligned memory according with SIMD processing and
             cache boundary friendly.
         @param
             size The size of memory need to allocate.
         @return
             The allocated memory pointer.
         @par
             On failure, exception will be throw.
     */
     static void * allocate( size_t size);
 
     /** Deallocate memory that allocated by this class.
         @param
             p Pointer to the memory allocated by this class or <b>NULL</b> pointer.
         @par
             On <b>NULL</b> pointer, nothing happen.
     */
     static void deallocate( void * p);
};
void * AlignedMemory::allocate( size_t size, size_t alignment)
{
     assert (0 < alignment && alignment <= 128 && Bitwise::isPO2(alignment));
 
     unsigned char * p = new unsigned char [size + alignment];
     size_t offset = alignment - ( size_t (p) & (alignment-1));
 
     unsigned char * result = p + offset;
     result[-1] = (unsigned char )offset;
 
     return result;
}
//---------------------------------------------------------------------
void * AlignedMemory::allocate( size_t size)
{
     return allocate(size, OGRE_SIMD_ALIGNMENT);
}
//---------------------------------------------------------------------
void AlignedMemory::deallocate( void * p)
{
     if (p)
     {
         unsigned char * mem = (unsigned char *)p;
         mem = mem - mem[-1];
         delete [] mem;
     }
}

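Looking more closely at aligned allocation: the function void* AlignedMemory::allocate(size_t size, size_t alignment) allocates memory, where alignment is the alignment in bytes and must be a power of two in the range [1, 128].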

Take 4-byte alignment and a 100-byte request as an example, that is, size = 100 and alignment = 4. The code works through as follows:

     assert (0 < alignment && alignment <= 128 && Bitwise::isPO2(alignment));
         // alignment = 4 satisfies the condition, so execution continues
     unsigned char * p = new unsigned char [size + alignment];
         // 104 bytes are requested; suppose the returned address is p = 0x006b8f33
     size_t offset = alignment - ( size_t (p) & (alignment-1));
         // here offset = 4 - 3 = 1; whatever the value of p, offset lies in [1, 4]
     unsigned char * result = p + offset;
         // result = 0x006b8f34, which is 4-byte aligned
     result[-1] = (unsigned char )offset;
         // store offset in the byte at 0x006b8f33, because releasing the memory needs the original address 0x006b8f33
     return result;
         // return the result; the algorithm ends
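
The offset arithmetic in this walk-through can be checked in isolation. The sketch below applies the same formula to a plain integer address; `alignmentOffset` is a hypothetical helper, not an OGRE function.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: the arithmetic of AlignedMemory::allocate applied to
// an integer address instead of a real pointer.
inline std::size_t alignmentOffset(std::size_t address, std::size_t alignment)
{
    // alignment must be a nonzero power of two for the mask to be valid
    assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
    return alignment - (address & (alignment - 1));
}
```

For the address 0x006b8f33 and alignment 4 this gives 1, matching the walk-through. An address that is already aligned, such as 0x006b8f34, gives 4 rather than 0, so there is always at least one byte in front of the returned pointer in which to store the offset.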

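With the aligned allocation understood, the deallocation code is easy to follow: deallocate reads the offset stored in the byte just before the aligned pointer, subtracts it to recover the address originally returned by new[], and passes that address to delete[].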
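
The allocate/release round trip can be sketched as two free functions. They mirror the logic of AlignedMemory but are not OGRE code: `sketchAlignedAlloc` and `sketchAlignedFree` are hypothetical names, and the sketch assumes a C++11 compiler for `std::uintptr_t`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical counterpart of AlignedMemory::allocate.
inline void* sketchAlignedAlloc(std::size_t size, std::size_t alignment)
{
    assert(0 < alignment && alignment <= 128 &&
           (alignment & (alignment - 1)) == 0);

    // Over-allocate so that an aligned address always exists inside the block.
    unsigned char* raw = new unsigned char[size + alignment];
    std::size_t offset =
        alignment - (reinterpret_cast<std::uintptr_t>(raw) & (alignment - 1));

    unsigned char* result = raw + offset;
    // offset is in [1, 128], so it fits in the one byte before result
    result[-1] = static_cast<unsigned char>(offset);
    return result;
}

// Hypothetical counterpart of AlignedMemory::deallocate.
inline void sketchAlignedFree(void* p)
{
    if (p)
    {
        unsigned char* mem = static_cast<unsigned char*>(p);
        mem -= mem[-1];   // step back to the address returned by new[]
        delete[] mem;
    }
}
```

The upper bound of 128 on the alignment is what makes a single byte enough to hold the offset.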

OGRE uses a class template to overload new, new[], delete, and delete[], as the code below shows. Note in particular placement new and placement delete; for these special forms of new and delete in C++, see the explanation in [1].

/** Superclass for all objects that wish to use custom memory allocators
     when their new / delete operators are called.
     Requires a template parameter identifying the memory allocator policy
     to use (e.g. see StdAllocPolicy).
*/
template < class Alloc>
class _OgreExport AllocatedObject
{
public :
     explicit AllocatedObject()
     { }
 
     ~AllocatedObject()
     { }
 
     /// operator new, with debug line info
     void * operator new ( size_t sz, const char * file, int line, const char * func)
     {
         return Alloc::allocateBytes(sz, file, line, func);
     }
 
     void * operator new ( size_t sz)
     {
         return Alloc::allocateBytes(sz);
     }
 
     /// placement operator new
     void * operator new ( size_t sz, void * ptr)
     {
         ( void ) sz;
         return ptr;
     }
 
     /// array operator new, with debug line info
     void * operator new [] ( size_t sz, const char * file, int line, const char * func )
     {
         return Alloc::allocateBytes(sz, file, line, func);
     }
 
     void * operator new [] ( size_t sz )
     {
         return Alloc::allocateBytes(sz);
     }
 
     void operator delete ( void * ptr )
     {
         Alloc::deallocateBytes(ptr);
     }
 
     // Corresponding operator for placement delete (second param same as the first)
     void operator delete ( void * ptr, void * )
     {
         Alloc::deallocateBytes(ptr);
     }
 
     // only called if there is an exception in corresponding 'new'
     void operator delete ( void * ptr, const char * , int , const char *  )
     {
         Alloc::deallocateBytes(ptr);
     }
 
     void operator delete [] ( void * ptr )
     {
         Alloc::deallocateBytes(ptr);
     }
 
     void operator delete [] ( void * ptr, const char * , int , const char *  )
     {
         Alloc::deallocateBytes(ptr);
     }
};
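
Since placement new is singled out here, a short standalone sketch may help. It uses the standard placement form from `<new>` rather than OGRE's class; `Point` and `placementDemo` are hypothetical names, and `alignas` assumes a C++11 compiler.

```cpp
#include <new>      // declares the standard placement operator new

// Hypothetical type used only for this illustration.
struct Point
{
    int x, y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

// Constructs a Point inside a caller-owned buffer and returns x + y.
inline int placementDemo()
{
    alignas(Point) unsigned char buffer[sizeof(Point)];

    Point* p = new (buffer) Point(3, 4);  // constructs in buffer, allocates nothing
    int sum = p->x + p->y;

    p->~Point();   // destroy explicitly; the buffer itself is never deleted
    return sum;
}
```

Declaring any operator new inside a class hides the global forms for that class, which is why AllocatedObject has to redeclare the placement version if derived objects are to be constructed in existing memory.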

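That covers each individual piece of OGRE's memory allocation; the next step is to connect them. For example, with STD as the allocator, allocating with new and delete calls the overloaded operator new and operator delete, as in the snippet below.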

AllocatedObject<StdAllocPolicy>* p1 = new AllocatedObject<StdAllocPolicy>;
AllocatedObject< StdAlignedAllocPolicy<4> >* p2 = new AllocatedObject< StdAlignedAllocPolicy<4> >;
delete p1;
delete p2;

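To have a class of your own allocate through one of the allocators defined above, derive it from AllocatedObject. An example makes this clear: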

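#include <cstdio>   // printf
#include "Ogre.h"   // assumed include; provides AllocatedObject and, when OGRE_MEMORY_ALLOCATOR is 1, StdAllocPolicy
using namespace Ogre;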
namespace Ogre
{
     class Node: public AllocatedObject<StdAllocPolicy>
     {
     public :
         Node()
         {
             printf ( "Node Constructor\n" );
         }
         ~Node()
         {
             printf ( "Node Destructor\n" );
         }
     };
}
 
int main( )
{
     Node* p = new Node;
     delete p;
     return 0;
}

The output is:

Node Constructor
Node Destructor

The code executes along the following path:

void * AllocatedObject::operator new ( size_t sz);
void * StdAllocPolicy::allocateBytes( size_t count, const char * file = 0, int line = 0, const char * func = 0);
Node::Node();
Node::~Node();
void AllocatedObject::operator delete ( void * ptr );
void StdAllocPolicy::deallocateBytes( void * ptr);
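
The order of these calls can be verified with a small tracing sketch. A new expression allocates first and then runs the constructor, while a delete expression runs the destructor first and only then calls operator delete. All names below (`callLog`, `TracingPolicy`, `TracedObject`, `TracedNode`, `traceNewDelete`) are hypothetical and stand in for the OGRE classes.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical call log shared by the sketch types below.
inline std::vector<std::string>& callLog()
{
    static std::vector<std::string> log;
    return log;
}

// Stand-in for StdAllocPolicy that records each call.
struct TracingPolicy
{
    static void* allocateBytes(std::size_t n)
    {
        callLog().push_back("allocateBytes");
        return std::malloc(n);
    }
    static void deallocateBytes(void* p)
    {
        callLog().push_back("deallocateBytes");
        std::free(p);
    }
};

// Stand-in for AllocatedObject with only the plain new/delete pair.
template <class Alloc>
struct TracedObject
{
    void* operator new(std::size_t sz)
    {
        callLog().push_back("operator new");
        return Alloc::allocateBytes(sz);
    }
    void operator delete(void* p)
    {
        callLog().push_back("operator delete");
        Alloc::deallocateBytes(p);
    }
};

struct TracedNode : TracedObject<TracingPolicy>
{
    TracedNode() { callLog().push_back("constructor"); }
    ~TracedNode() { callLog().push_back("destructor"); }
};

// Runs one new/delete pair and returns the recorded call order.
inline std::vector<std::string> traceNewDelete()
{
    callLog().clear();
    TracedNode* p = new TracedNode;
    delete p;
    return callLog();
}
```

The returned log lists operator new, allocateBytes, constructor, destructor, operator delete, and deallocateBytes, in that order.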

The allocation policy is summarized in the figure below, where Alloc is any class that implements the following three interfaces.

static inline void * allocateBytes( size_t count,
         const char * file = 0, int line = 0, const char * func = 0);
     static inline
