Improving Memory Management for Large Objects in .NET

As documented elsewhere, .NET memory management consists of two heaps: the Small Object Heap (SOH) for most situations, and the Large Object Heap (LOH) for objects of 85,000 bytes (about 83 KB) or larger.

The SOH is garbage collected and compacted in clever ways I won't go into here.

The LOH, on the other hand, is garbage collected but, by default, not compacted. If you're working with large objects, then listen up! Since the LOH is not compacted, you'll find that either:

   A. you're running an x86 program and you get OutOfMemory exceptions, or

   B. you're running x64, and memory usage becomes unacceptably high. Welcome to the world of a fragmented heap, just like the good old C++ days.
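
To see which heap an allocation lands on, here is a minimal sketch. LohProbe is a hypothetical name, not part of the attached library. It relies on the fact that the runtime reports LOH objects as generation 2, while a freshly allocated small object normally starts in generation 0.

```csharp
using System;

// Hypothetical helper for illustration; not part of the article's class library.
public static class LohProbe
{
    // Allocates a byte array of the given length and returns the GC generation
    // the runtime reports for it. Arrays at or above the LOH threshold are
    // reported as generation 2 right away.
    public static int GenerationOfNewArray(int length)
    {
        byte[] data = new byte[length];
        return GC.GetGeneration(data);
    }
}
```

Calling LohProbe.GenerationOfNewArray(8 * 1024) should normally report 0, while LohProbe.GenerationOfNewArray(100 * 1024) reports 2 because the array went straight to the LOH.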

So where does this leave us? Let's code our way out of this jam!

If you work with streams much, you know that MemoryStream uses an internal byte array and wraps it in Stream clothing. Memory shenanigans made easy! However, if a MemoryStream's buffer gets big, that buffer ends up on the LOH, and you're in trouble. So let's improve on things.

Our first attempt was to create a Stream-like class that used a MemoryStream up to 64 KB, then switched to a FileStream with a temp file after that. Sadly, disk I/O killed the throughput of our application. And just letting MemoryStreams grow and grow caused unacceptably high memory usage.

So let's make a new Stream-derived class, and instead of one internal byte array, let's go with a list of byte arrays, none large enough to end up on the LOH. Simple enough. And let's keep a global ConcurrentQueue of these little byte arrays for our own buffer recycling scheme.

MemoryStreams are really useful for their dual role of stream and buffer so you can do things like...
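
The idea of a list of small byte arrays backed by a recycling queue can be sketched roughly as follows. This is not the attached BcMemoryStream; ChunkPool, ChunkedBuffer, and the 8 KB / 1,000 constants are assumptions made up for illustration, and the Stream plumbing is left out to keep it short.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical global pool of small byte arrays (not the article's BcMemoryMgr).
public static class ChunkPool
{
    public const int ChunkSize = 8 * 1024;  // assumed size, well below the LOH threshold
    public const int MaxPooled = 1000;      // assumed cap on recycled chunks
    private static readonly ConcurrentQueue<byte[]> Free = new ConcurrentQueue<byte[]>();

    public static byte[] Rent()
    {
        return Free.TryDequeue(out byte[] chunk) ? chunk : new byte[ChunkSize];
    }

    public static void Return(byte[] chunk)
    {
        // The cap is approximate under contention; extra chunks are simply
        // dropped and left to normal garbage collection.
        if (Free.Count < MaxPooled) Free.Enqueue(chunk);
    }
}

// Hypothetical growable buffer built from pooled chunks instead of one big array.
public sealed class ChunkedBuffer
{
    private readonly List<byte[]> _chunks = new List<byte[]>();
    public long Length { get; private set; }

    // Appends count bytes from source, renting new chunks as the buffer grows.
    public void Append(byte[] source, int offset, int count)
    {
        while (count > 0)
        {
            int chunkIndex = (int)(Length / ChunkPool.ChunkSize);
            int chunkOffset = (int)(Length % ChunkPool.ChunkSize);
            if (chunkIndex == _chunks.Count) _chunks.Add(ChunkPool.Rent());
            int n = Math.Min(count, ChunkPool.ChunkSize - chunkOffset);
            Buffer.BlockCopy(source, offset, _chunks[chunkIndex], chunkOffset, n);
            offset += n; count -= n; Length += n;
        }
    }

    // Copies up to count bytes starting at position into dest; returns bytes copied.
    public int Read(long position, byte[] dest, int offset, int count)
    {
        int total = 0;
        while (count > 0 && position < Length)
        {
            int chunkIndex = (int)(position / ChunkPool.ChunkSize);
            int chunkOffset = (int)(position % ChunkPool.ChunkSize);
            int n = (int)Math.Min(Math.Min(count, ChunkPool.ChunkSize - chunkOffset), Length - position);
            Buffer.BlockCopy(_chunks[chunkIndex], chunkOffset, dest, offset, n);
            offset += n; count -= n; position += n; total += n;
        }
        return total;
    }

    // Hands every chunk back to the pool and empties the buffer.
    public void Reset()
    {
        foreach (byte[] chunk in _chunks) ChunkPool.Return(chunk);
        _chunks.Clear();
        Length = 0;
    }
}
```

Recycled chunks are not zeroed here; Length bounds every read, so stale bytes are never exposed.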

string str = Encoding.UTF8.GetString(memStream.GetBuffer(), 0, (int)memStream.Length);

So let's also work with MemoryStreams, and let's keep another ConcurrentQueue of these streams for our own recycling scheme. When a stream to recycle is too big, let's chop it down before enqueuing it. As long as the streams stay under 8X of little buffer size, we just let it ride, and the folks requesting streams get something with Capacity between the little buffer size and 8X the little buffer size. If a stream ends up on the LOH, we pull it back to the SOH when it gets recycled.

Finally, for x86 apps, you should have a timer run garbage collection with LOH defragmentation like so:
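
A rough sketch of that stream recycling rule, under stated assumptions: StreamPool and its constants are hypothetical, not the attached BcMemoryMgr. It leans on two documented MemoryStream behaviors: SetLength(0) empties the stream, and assigning a smaller Capacity makes an expandable stream allocate a new, smaller internal buffer.

```csharp
using System.Collections.Concurrent;
using System.IO;

// Hypothetical stream recycler for illustration; the article's BcMemoryMgr may differ.
public static class StreamPool
{
    public const int BufferSize = 8 * 1024;          // assumed little buffer size
    public const int MaxCapacity = 8 * BufferSize;   // 64 KB, still a small object
    public const int MaxPooled = 1000;               // assumed cap on recycled streams
    private static readonly ConcurrentQueue<MemoryStream> Free = new ConcurrentQueue<MemoryStream>();

    public static MemoryStream Rent()
    {
        return Free.TryDequeue(out MemoryStream s) ? s : new MemoryStream(BufferSize);
    }

    public static void Return(MemoryStream stream)
    {
        stream.SetLength(0);                 // empties the stream; Position drops to 0 too
        if (stream.Capacity > MaxCapacity)
            stream.Capacity = BufferSize;    // chop down: swaps in a fresh small buffer
        if (Free.Count < MaxPooled) Free.Enqueue(stream);
    }
}
```

A stream that grew past 64 KB drops its oversized buffer on Return, so the old array becomes garbage and the recycled stream is back to a small one.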

GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect();

We run this once a minute for one of our x86 applications. It only takes a few milliseconds to run, and, in conjunction with buffer and stream recycling, memory usage stays manageable.
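
One way to wire that up is with a System.Threading.Timer. This is a minimal sketch; LohCompactor is a hypothetical name, and the interval is whatever suits your application.

```csharp
using System;
using System.Runtime;
using System.Threading;

// Hypothetical wrapper around the two lines above; not from the article's library.
public static class LohCompactor
{
    // Requests a one-time LOH compaction, then forces a blocking collection.
    public static void CompactNow()
    {
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
    }

    // Starts a timer that compacts at the given interval. Keep a reference to
    // the returned Timer for as long as it should run, and dispose it to stop.
    public static Timer Start(TimeSpan interval)
    {
        return new Timer(_ => CompactNow(), null, interval, interval);
    }
}
```

For the once-a-minute schedule, something like LohCompactor.Start(TimeSpan.FromMinutes(1)) at startup would do, with the returned Timer stored in a static field so it is not collected.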

Check out the attached class library for the Stream-derived class, BcMemoryStream, and the overall memory manager, BcMemoryMgr. There's a third class, an IDisposable class called BcMemoryStreamUse, which manages recycling of MemoryStreams.

You just tell BcMemoryMgr how big you want the little buffers to be and how many buffers and MemoryStreams to recycle. A buffer size of 8 KB is good because 8 X 8 KB -> 64 KB max size for MemoryStreams, which is under the LOH threshold of about 85,000 bytes. You have to make peace with the max number of objects to keep in the recycling queues. For an x86 program and 8 KB buffers and heavy MemoryStream use, you might not want to allow a worst case of 8 X 8 KB X 10,000 -> 640 MB to get locked up in this system. With a maximum queue length of 1,000, you're only committing to a max of 64 MB, which seems a low price to pay for buffer and stream recycling. To be clear, the memory is not pre-allocated; the max count is just how large the recycling ConcurrentQueues can get before buffers and streams are let go for normal garbage collection.
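
The worst-case arithmetic is easy to check in code. PoolBudget is a hypothetical helper, not part of the library; it just multiplies buffer size, buffers per stream, and queue length.

```csharp
// Hypothetical helper for sizing the recycling queues.
public static class PoolBudget
{
    // Worst-case bytes held by a stream queue when every recycled stream
    // sits at its maximum allowed capacity.
    public static long WorstCaseBytes(int bufferSizeBytes, int buffersPerStream, int maxQueueLength)
    {
        return (long)bufferSizeBytes * buffersPerStream * maxQueueLength;
    }
}
```

PoolBudget.WorstCaseBytes(8 * 1024, 8, 10000) gives 655,360,000 bytes, which is the "640 MB" figure if you count 1 MB as 1,000 KB (625 MB at 1,024 KB per MB). With a queue length of 1,000 it comes to 65,536,000 bytes.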

Let's look at each class in detail.

Let's start with BcMemoryMgr. It's a small static class. It has ConcurrentQueues for recycling buffers and streams, the buffer size, and the max queue length.

Looking at the member functions, you Init with the buffer size and max queue length, and it calls a SelfTest routine that tests the class library. If the tests don't pass, the code doesn't run... poor man's unit testing. Note that you can specify a zero max queue length, in which case you'll get no recycling, just normal garbage collection. There are buffer functions AllocBuffer and FreeBuffer... anybody remember malloc and free? There are stream functions AllocStream and FreeStream. FreeStream chops the stream down if it's too big before enqueuing it for reuse.

BcMemoryStream is Stream-derived and implements pretty much the same interface as MemoryStream. One notable exception is that you cannot set the Capacity property. Instead, there is a Reset function you can call to free all buffers in the class, returning the Capacity to zero. The fun code is in Read and Write; Buffer.BlockCopy came in handy. There are extension routines for working with strings. These were handy when writing BcMemoryMgr's SelfTest routine.

BcMemoryStreamUse is a small IDisposable class that uses BcMemoryMgr to allocate a MemoryStream in its constructor and free it in its Dispose method. Stream recycling is fun and easy!
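
The rent-in-constructor, return-in-Dispose pattern looks roughly like this. PooledStreamUse is a hypothetical stand-in with its own tiny, uncapped queue so the sketch is self-contained; the real BcMemoryStreamUse goes through BcMemoryMgr and may differ in its details.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;

// Hypothetical stand-in for illustration; not the article's BcMemoryStreamUse.
public sealed class PooledStreamUse : IDisposable
{
    // Assumed private queue; a real manager would also cap its length.
    private static readonly ConcurrentQueue<MemoryStream> Free = new ConcurrentQueue<MemoryStream>();

    // The rented stream; null once this wrapper has been disposed.
    public MemoryStream Value { get; private set; }

    public PooledStreamUse()
    {
        Value = Free.TryDequeue(out MemoryStream s) ? s : new MemoryStream();
    }

    public void Dispose()
    {
        if (Value == null) return;   // tolerate a second Dispose call
        Value.SetLength(0);          // hand back an empty stream
        Free.Enqueue(Value);
        Value = null;
    }
}
```

With a using block, as in using (var use = new PooledStreamUse()) { ... use.Value ... }, the stream goes back to the queue even if the body throws.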

BcMemoryBufferUser is similar to BcMemoryStream, just for byte arrays. Buffer recycling is fun and easy!

BcMemoryTest puts BcMemoryStream and MemoryStream to the test, hashing all files in a directory and its subdirectories. For each file, it starts with a FileStream, then CopyTo's into either MemoryStream or BcMemoryStream, then does the hashing. In our tests, we use a leafy and varied test directory with lots of built binaries. In total run time, BcMemoryStream was about 25% faster than MemoryStream. So not only did we solve the LOH problem, we get a better end result. Hooray!

Finally, there's a little config file addition that is absolutely necessary for server applications:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
    ...
    <runtime>
        <gcServer enabled="true"/>
    </runtime>
</configuration>
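
To confirm at runtime that the setting took effect, you can read GCSettings.IsServerGC. A small sketch, with GcModeReport as a hypothetical name:

```csharp
using System;
using System.Runtime;

// Hypothetical diagnostic helper; not part of the article's class library.
public static class GcModeReport
{
    // Builds a one-line summary of the GC mode the process is actually using.
    public static string Describe()
    {
        return "Server GC: " + GCSettings.IsServerGC + ", latency mode: " + GCSettings.LatencyMode;
    }
}
```

Logging GcModeReport.Describe() at startup is a cheap way to catch a deployment where the config file was not picked up.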
