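This article examines the memory allocators in the OGRE rendering engine. It first lists the OGRE version, build environment, and tools, then walks through the design and implementation of OGRE's memory allocators and allocation policies.
OGRE source version analyzed:
Operating system: Windows 7 x64
Build tool: CMake
Compiler: Visual Studio 2008
The build produces a file named OgreBuildSettings.h that records the user-selected options. As a simple example, the macros below build OGRE with support for the D3D9 and OpenGL render systems and with the BSP, OCTREE, PCZ, PFX, and CG plugins; the build uses little-endian byte order.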
    #define OGRE_BUILD_RENDERSYSTEM_D3D9
    /* #undef OGRE_BUILD_RENDERSYSTEM_D3D11 */
    #define OGRE_BUILD_RENDERSYSTEM_GL
    /* #undef OGRE_BUILD_RENDERSYSTEM_GLES */
    /* #undef OGRE_BUILD_RENDERSYSTEM_GLES2 */
    #define OGRE_BUILD_PLUGIN_BSP
    #define OGRE_BUILD_PLUGIN_OCTREE
    #define OGRE_BUILD_PLUGIN_PCZ
    #define OGRE_BUILD_PLUGIN_PFX
    #define OGRE_BUILD_PLUGIN_CG
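OGRE's configuration, including the choice of allocator, whether multithreading support is enabled, and various platform- and compiler-specific macros, is generally kept in the following headers:
OgreBuildSettings.h OgreConfig.h OgrePlatform.h OgreStdHeaders.h OgreStableHeaders.h OgrePrerequisites.h
The files in the OGRE rendering engine that relate to memory management are:
OgreAlignedAllocator.h
OgreAlignedAllocator.cpp
OgreMemoryAllocatedObject.h
OgreMemoryAllocatorConfig.h
OgreMemoryNedAlloc.h // uses the nedmalloc allocator, without memory pooling
OgreMemoryNedAlloc.cpp
OgreMemoryNedPooling.h // uses the nedmalloc allocator with memory pooling; this is OGRE's default allocator
OgreMemoryNedPooling.cpp
OgreMemoryStdAlloc.h // the system's own allocator
OgreMemorySTLAllocator.h
OgreMemoryTracker.h // tracks allocations and deallocations, for example to record memory leaks
OgreMemoryTracker.cpp
OGRE defines four memory allocation methods. STD is the system's own allocator. NED and NEDPOOLING both use the nedmalloc allocator (for an introduction to memory allocators, see the article 《内存分配器浅谈》, "A Brief Look at Memory Allocators"). USER is a user-defined allocator, which is left empty in the source.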
    // define the memory allocator configuration to use
    #define OGRE_MEMORY_ALLOCATOR_STD 1
    #define OGRE_MEMORY_ALLOCATOR_NED 2
    #define OGRE_MEMORY_ALLOCATOR_USER 3
    #define OGRE_MEMORY_ALLOCATOR_NEDPOOLING 4
OGRE selects the allocator type through the macro OGRE_MEMORY_ALLOCATOR. By default its value is 4, which selects the pooled nedmalloc allocator.
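Each of the three implemented allocators (STD, NED, and NEDPOOLING) must provide a class with the same interface but a different name. The basic structure is as follows: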
    class _OgreExport AllocateNamePolicy
    {
    public:
        static inline void* allocateBytes(size_t count,
            const char* file = 0, int line = 0, const char* func = 0);
        static inline void deallocateBytes(void* ptr);
        /// Get the maximum size of a single allocation
        static inline size_t getMaxAllocationSize();
    private:
        // No instantiation
        AllocateNamePolicy();
    };
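With the STD allocator, the policy classes are StdAllocPolicy and StdAlignedAllocPolicy, implemented in OgreAlignedAllocator.h, OgreAlignedAllocator.cpp, and OgreMemoryStdAlloc.h.
With NED, they are NedAllocPolicy and NedAlignedAllocPolicy, implemented in OgreMemoryNedAlloc.h and OgreMemoryNedAlloc.cpp.
With NEDPOOLING, they are NedPoolingPolicy and NedPoolingAlignedPolicy, implemented in OgreMemoryNedPooling.h and OgreMemoryNedPooling.cpp.
A memory allocation policy is a design pattern in which a custom memory allocator is applied to the allocation and release of an object's memory.
Next, the system's own allocator is used to illustrate OGRE's allocation policy; in this case OGRE_MEMORY_ALLOCATOR has the value 1.
Allocated memory falls into two classes: unaligned and aligned. For a detailed explanation of memory alignment, see articles [2] and [3].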
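OGRE's code for unaligned allocation is shown below. The code between #if OGRE_MEMORY_TRACKER and #endif belongs to the memory tracker and does not affect how memory is allocated or released.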
    class _OgreExport StdAllocPolicy
    {
    public:
        static inline void* allocateBytes(size_t count,
    #if OGRE_MEMORY_TRACKER
            const char* file = 0, int line = 0, const char* func = 0
    #else
            const char* = 0, int = 0, const char* = 0
    #endif
            )
        {
            void* ptr = malloc(count);
    #if OGRE_MEMORY_TRACKER
            // this alloc policy doesn't do pools
            MemoryTracker::get()._recordAlloc(ptr, count, 0, file, line, func);
    #endif
            return ptr;
        }

        static inline void deallocateBytes(void* ptr)
        {
    #if OGRE_MEMORY_TRACKER
            MemoryTracker::get()._recordDealloc(ptr);
    #endif
            free(ptr);
        }

        /// Get the maximum size of a single allocation
        static inline size_t getMaxAllocationSize()
        {
            return std::numeric_limits<size_t>::max();
        }
    private:
        // no instantiation
        StdAllocPolicy()
        { }
    };
The code for aligned allocation is shown below:
    template <size_t Alignment = 0>
    class StdAlignedAllocPolicy
    {
    public:
        // compile-time check alignment is available.
        typedef int IsValidAlignment
            [Alignment <= 128 && ((Alignment & (Alignment-1)) == 0) ? +1 : -1];

        static inline void* allocateBytes(size_t count,
    #if OGRE_MEMORY_TRACKER
            const char* file = 0, int line = 0, const char* func = 0
    #else
            const char* = 0, int = 0, const char* = 0
    #endif
            )
        {
            void* ptr = Alignment ? AlignedMemory::allocate(count, Alignment)
                : AlignedMemory::allocate(count);
    #if OGRE_MEMORY_TRACKER
            // this alloc policy doesn't do pools
            MemoryTracker::get()._recordAlloc(ptr, count, 0, file, line, func);
    #endif
            return ptr;
        }

        static inline void deallocateBytes(void* ptr)
        {
    #if OGRE_MEMORY_TRACKER
            MemoryTracker::get()._recordDealloc(ptr);
    #endif
            AlignedMemory::deallocate(ptr);
        }

        /// Get the maximum size of a single allocation
        static inline size_t getMaxAllocationSize()
        {
            return std::numeric_limits<size_t>::max();
        }
    private:
        // No instantiation
        StdAlignedAllocPolicy()
        { }
    };
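To allocate aligned memory with the system's own allocator, OGRE uses the class AlignedMemory, implemented in OgreAlignedAllocator.h and OgreAlignedAllocator.cpp. The main code is shown below: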
    /** Class to provide aligned memory allocate functionality.
    @remarks
        All SIMD processing are friendly with aligned memory, and some SIMD routines
        are designed for working with aligned memory only. If the data are intended to
        use SIMD processing, it's need to be aligned for better performance boost.
        In additional, most time cache boundary aligned data also lead to better
        performance even if didn't used SIMD processing. So this class provides a couple
        of functions for allocate aligned memory.
    @par
        Anyways, in general, you don't need to use this class directly, Ogre internally
        will take care with most SIMD and cache friendly optimisation if possible.
    @par
        This isn't a "one-step" optimisation, there are a lot of underlying work to
        achieve performance boost. If you didn't know what are you doing or what there
        are going, just ignore this class.
    @note
        This class intended to use by advanced user only.
    */
    class _OgreExport AlignedMemory
    {
    public:
        /** Allocate memory with given alignment.
        @param
            size The size of memory need to allocate.
        @param
            alignment The alignment of result pointer, must be power of two
            and in range [1, 128].
        @return
            The allocated memory pointer.
        @par
            On failure, exception will be throw.
        */
        static void* allocate(size_t size, size_t alignment);

        /** Allocate memory with default platform dependent alignment.
        @remarks
            The default alignment depend on target machine, this function
            guarantee aligned memory according with SIMD processing and
            cache boundary friendly.
        @param
            size The size of memory need to allocate.
        @return
            The allocated memory pointer.
        @par
            On failure, exception will be throw.
        */
        static void* allocate(size_t size);

        /** Deallocate memory that allocated by this class.
        @param
            p Pointer to the memory allocated by this class or <b>NULL</b> pointer.
        @par
            On <b>NULL</b> pointer, nothing happen.
        */
        static void deallocate(void* p);
    };

    void* AlignedMemory::allocate(size_t size, size_t alignment)
    {
        assert(0 < alignment && alignment <= 128 && Bitwise::isPO2(alignment));

        unsigned char* p = new unsigned char[size + alignment];
        size_t offset = alignment - (size_t(p) & (alignment-1));

        unsigned char* result = p + offset;
        result[-1] = (unsigned char)offset;

        return result;
    }
    //---------------------------------------------------------------------
    void* AlignedMemory::allocate(size_t size)
    {
        return allocate(size, OGRE_SIMD_ALIGNMENT);
    }
    //---------------------------------------------------------------------
    void AlignedMemory::deallocate(void* p)
    {
        if (p)
        {
            unsigned char* mem = (unsigned char*)p;
            mem = mem - mem[-1];
            delete [] mem;
        }
    }
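Looking at aligned allocation in more detail: the function void* AlignedMemory::allocate(size_t size, size_t alignment) allocates the memory, and alignment specifies the alignment in bytes, which must be a power of two in the range [1, 128].
Take 4-byte alignment and a 100-byte allocation as an example, so that size = 100 and alignment = 4. The code is analyzed step by step below: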
    assert(0 < alignment && alignment <= 128 && Bitwise::isPO2(alignment));
    // alignment = 4 satisfies the condition, so execution continues
    unsigned char* p = new unsigned char[size + alignment];
    // suppose the returned address is p = 0x006b8f33
    size_t offset = alignment - (size_t(p) & (alignment-1));
    // here offset = 1; whatever the value of p, offset lies in the range [1, 4]
    unsigned char* result = p + offset;
    // the returned pointer result = 0x006b8f34 is 4-byte aligned
    result[-1] = (unsigned char)offset;
    // store offset in the byte at address 0x006b8f33 (result - 1), because
    // releasing the memory requires the original allocation address 0x006b8f33
    return result;
    // return the result; the algorithm ends
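Once the aligned allocation is understood, the code that releases the memory is easy to follow: deallocate reads the offset stored in the byte just before the aligned pointer, subtracts it to recover the original address, and passes that address to delete[].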
OGRE uses a class template to overload new, new[], delete, and delete[], as the code below shows. Note in particular placement new and placement delete; for these special forms of new and delete in C++, see the explanation in [1].
    /** Superclass for all objects that wish to use custom memory allocators
        when their new / delete operators are called.
        Requires a template parameter identifying the memory allocator policy
        to use (e.g. see StdAllocPolicy).
    */
    template <class Alloc>
    class _OgreExport AllocatedObject
    {
    public:
        explicit AllocatedObject()
        { }

        ~AllocatedObject()
        { }

        /// operator new, with debug line info
        void* operator new(size_t sz, const char* file, int line, const char* func)
        {
            return Alloc::allocateBytes(sz, file, line, func);
        }

        void* operator new(size_t sz)
        {
            return Alloc::allocateBytes(sz);
        }

        /// placement operator new
        void* operator new(size_t sz, void* ptr)
        {
            (void) sz;
            return ptr;
        }

        /// array operator new, with debug line info
        void* operator new[] (size_t sz, const char* file, int line, const char* func)
        {
            return Alloc::allocateBytes(sz, file, line, func);
        }

        void* operator new[] (size_t sz)
        {
            return Alloc::allocateBytes(sz);
        }

        void operator delete(void* ptr)
        {
            Alloc::deallocateBytes(ptr);
        }

        // Corresponding operator for placement delete (second param same as the first)
        void operator delete(void* ptr, void*)
        {
            Alloc::deallocateBytes(ptr);
        }

        // only called if there is an exception in corresponding 'new'
        void operator delete(void* ptr, const char*, int, const char*)
        {
            Alloc::deallocateBytes(ptr);
        }

        void operator delete[] (void* ptr)
        {
            Alloc::deallocateBytes(ptr);
        }

        void operator delete[] (void* ptr, const char*, int, const char*)
        {
            Alloc::deallocateBytes(ptr);
        }
    };
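That covers the individual pieces of OGRE's memory allocation; the next step is to connect them. For example, with STD as the allocator, allocating and releasing with new and delete invokes the overloaded operator new and operator delete, as the following snippet shows.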
    AllocatedObject<StdAllocPolicy>* p1 = new AllocatedObject<StdAllocPolicy>;
    AllocatedObject< StdAlignedAllocPolicy<4> >* p2 = new AllocatedObject< StdAlignedAllocPolicy<4> >;
    delete p1;
    delete p2;
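To have a class allocate its memory through one of the allocators defined above, derive it from AllocatedObject. An example makes this clear: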
    #include <cstdio>
    // Also requires the OGRE headers that declare AllocatedObject and StdAllocPolicy.

    using namespace Ogre;

    namespace Ogre
    {
        class Node : public AllocatedObject<StdAllocPolicy>
        {
        public:
            Node()
            {
                printf("Node Constructor\n");
            }

            ~Node()
            {
                printf("Node Destructor\n");
            }
        };
    }

    int main()
    {
        Node* p = new Node;
        delete p;
        return 0;
    }
The output is:
Node Constructor
Node Destructor
The code executes along the following path:
    void* AllocatedObject<StdAllocPolicy>::operator new(size_t sz);
    void* StdAllocPolicy::allocateBytes(size_t count, const char* file = 0, int line = 0, const char* func = 0);
    Node::Node();
    // a delete expression runs the destructor first, then releases the memory
    Node::~Node();
    void AllocatedObject<StdAllocPolicy>::operator delete(void* ptr);
    void StdAllocPolicy::deallocateBytes(void* ptr);
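To summarize the memory allocation policy, as the figure below shows, Alloc is any class that implements the following three interface functions.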
    static inline void* allocateBytes(size_t count,
        const char* file = 0, int line = 0, const char* func = 0);
    static inline