SQLOS's memory manager and SQL Server's Buffer Pool

from:http://blogs.msdn.com/b/slavao/archive/2005/02/11/371063.aspx

SQLOS's memory manager consists of several components: memory nodes, memory clerks, memory caches, and memory objects. Fig. 1 depicts the memory manager components and their relationships:

 

              ----------------
              | MemoryObject |
              ----------------
                     |
                     V
             -----------------
             | PageAllocator |
             -----------------
               /           \
              V             V
   ---------------     ---------------
   | MemoryClerk |     |   Caches    |
   ---------------     ---------------
               \           /
                V         V
              --------------
              | MemoryNode |
              --------------

                  Fig. 1

 

 

Memory Node

A memory node is not exposed to memory manager clients; it is an internal SQLOS object. The major goal of a memory node is to provide locality of allocation. It consists of several memory allocators, of three major types. The first type is a set of page allocators. The second is a virtual allocator that leverages the Windows VirtualAlloc APIs. The third is a shared memory allocator that is fully based on the Windows file mapping APIs.

 

The page allocators are the most commonly used allocators in the SQLOS memory manager. They are called page allocators because they allocate memory in multiples of the SQLOS page. The size of a page is 8KB, the same as the size of a database page in SQL Server. As you will learn further on, this is not a coincidence.

 

There are four different types of page allocators: single page allocator, multi page allocator, large page allocator, and reserved page allocator. The single page allocator can only provide one page at a time. The multi page allocator, as you might guess, provides a set of pages at a time. The large page allocator can be used to allocate large pages. SQLOS and SQL Server use large pages to minimize TLB misses when accessing hot data structures. Currently large pages are supported only on IA64 or x64 hardware with at least 8GB of RAM. The size of a large page on IA64 is 16MB. The last type, the reserved page allocator, can be used to allocate a special set of pages reserved for emergencies, i.e. when SQLOS is low on memory. Fig. 2 shows the memory node and its allocators.
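
To make "allocates in multiples of the SQLOS page" concrete, here is a minimal Python sketch of the rounding involved. It is only an illustration, not SQLOS code: the function name pages_needed is hypothetical, and the only fact taken from the text is the 8KB page size.

```python
PAGE_SIZE = 8 * 1024  # SQLOS page size stated in the text: 8 KB


def pages_needed(nbytes: int) -> int:
    """Round a request up to a whole number of 8 KB pages (hypothetical helper)."""
    if nbytes <= 0:
        raise ValueError("request size must be positive")
    # Ceiling division: any partial page still costs a full page.
    return -(-nbytes // PAGE_SIZE)


def bytes_reserved(nbytes: int) -> int:
    """Bytes actually handed out once the request is rounded to pages."""
    return pages_needed(nbytes) * PAGE_SIZE
```

Under this sketch a 1-byte request and an 8192-byte request both cost one page, while 8193 bytes costs two, which is the case where a multi page allocator rather than the single page allocator would be involved.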

 

 ------------------------       ---------------       ---------------------------
 | Large Page Allocator | <---- | Memory Node | ----> | Reserved Page Allocator |
 ------------------------       ---------------       ---------------------------
                               /       |       \
                              /        |        \
                             V         V         V
 ---------------------   -------------------------   ------------------------
 | VM & SM Allocator |   | Single Page Allocator |   | Multi Page Allocator |
 ---------------------   -------------------------   ------------------------

                                  Fig. 2

 

At this point SQL Server doesn't have a DMV (dynamic management view) that would dump the set of all memory nodes and information about their allocators. DBCC MEMORYSTATUS, discussed further on, comes pretty close, but it dumps information about CPU nodes, not about memory nodes. You might remember that CPU nodes are a proper subset of memory nodes. It means that the information presented by DBCC MEMORYSTATUS is sufficient to understand memory distribution on the system.

 

Memory Clerks

Memory nodes are hidden from memory manager users. If a client of the memory manager needs to allocate memory, it first creates a memory clerk. There are four types of memory clerks: generic, cache store, user store, and object store. The latter three are a bit convoluted: along with memory clerk functionality, they provide data caching.

 

One can think of a memory clerk as a bag of statistics. It supports the same types of allocators as memory nodes, and it also enables large memory consumers to hook into the memory brokerage infrastructure. (I will describe this infrastructure in one of the next posts.) There are several global memory clerks provided by SQLOS. SQLOS's medium and large memory consumers are encouraged to use their own clerks so that one can understand memory consumption by component. The memory clerk infrastructure enables us to track and control the amount of memory consumed by a memory component. Each CPU node has a list of memory clerks that we can safely walk at runtime. SQL Server implements the sys.dm_os_memory_clerks DMV to dump clerk information. In addition, combined clerk information can be derived from DBCC MEMORYSTATUS.
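
The "bag of statistics" idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the SQLOS implementation: the class MemoryClerk, its methods, and the component names used in the usage note are all hypothetical. It only shows why giving each component its own clerk makes per-component consumption easy to report.

```python
from collections import defaultdict


class MemoryClerk:
    """Hypothetical clerk: it only counts what its owning component allocates."""

    def __init__(self, name):
        self.name = name
        # KB currently outstanding, keyed by allocator type (e.g. "single_page").
        self.kb_by_allocator = defaultdict(int)

    def record_alloc(self, allocator, kb):
        self.kb_by_allocator[allocator] += kb

    def record_free(self, allocator, kb):
        if kb > self.kb_by_allocator[allocator]:
            raise ValueError("freeing more than was allocated")
        self.kb_by_allocator[allocator] -= kb

    def total_kb(self):
        return sum(self.kb_by_allocator.values())


def report(clerks):
    """Walk the clerk list and return (name, total KB), largest consumer first."""
    rows = [(clerk.name, clerk.total_kb()) for clerk in clerks]
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

For example, a clerk named "plan_cache" that records 16KB of single page and 64KB of virtual allocations, then frees 32KB of the latter, reports 48KB. Walking a list of such clerks is loosely analogous to what a clerk DMV exposes.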

 

Memory objects

A SQLOS memory object is a heap. A memory object requires a memory clerk to allocate its memory. We support three types of memory objects. A variable memory object is a regular heap. An incremental memory object is a mark/shrink heap. This allocation policy is very handy during compilation and execution. Usually both of these processes happen in two phases: the first phase grows memory usage and the second shrinks it. If the process is isolated, we don't have to call any destructors when freeing memory, which significantly improves performance. The last type of memory object is fixed size. As you can guess, components can use this policy when they need to allocate objects of a given size.
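
A mark/shrink heap is easy to picture with a small sketch. The Python class below is a generic bump allocator written for illustration only. IncrementalHeap and its method names are hypothetical and say nothing about how SQLOS lays out its incremental memory objects. It does show the two phases described above: allocations only grow the heap, and a single shrink call releases everything after a mark without visiting individual objects.

```python
class IncrementalHeap:
    """Toy mark/shrink heap: allocation bumps an offset, shrink rewinds it."""

    def __init__(self, capacity):
        self.buf = bytearray(capacity)
        self.top = 0  # offset of the first free byte

    def alloc(self, nbytes):
        if nbytes <= 0 or self.top + nbytes > len(self.buf):
            raise MemoryError("bad size or heap exhausted")
        start = self.top
        self.top += nbytes
        return memoryview(self.buf)[start:self.top]

    def mark(self):
        """Remember the current high-water position."""
        return self.top

    def shrink(self, mark):
        """Release everything allocated after `mark` in one step."""
        if not 0 <= mark <= self.top:
            raise ValueError("invalid mark")
        self.top = mark
```

A compilation-like phase would call mark(), allocate freely, and then call shrink() with that mark. No per-object cleanup runs, which is the performance benefit the text points to.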

 

The payload for a given memory object is 8KB, exactly the same as the SQLOS page size. It also means that a memory object can be created from a memory clerk leveraging the single page allocator. (This is yet another very important point! Keep it in mind until I cover SQL Server's Buffer Pool.) SQL Server exposes a DMV to dump all memory objects in its process: sys.dm_os_memory_objects.

 

Notice that both the memory clerk and memory object DMVs expose a page allocator column. I also depicted the page allocator in Fig. 1. Under the hood, a memory object uses its memory clerk's page allocator interface to allocate pages. This is useful to know when you want to join the memory clerk and memory object DMVs.
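
The join itself can be sketched in Python over rows that have already been fetched from the two DMVs. Everything here is an assumption made for illustration: the rows are plain dictionaries, the key name page_allocator_address stands in for the page allocator column mentioned above, and the sample values in the usage note are invented rather than real server output.

```python
def join_on_page_allocator(clerk_rows, object_rows):
    """Pair each memory object row with the clerk that shares its page allocator."""
    clerks_by_allocator = {
        row["page_allocator_address"]: row for row in clerk_rows
    }
    joined = []
    for obj in object_rows:
        clerk = clerks_by_allocator.get(obj["page_allocator_address"])
        if clerk is not None:  # skip objects whose clerk is not in the input
            joined.append((clerk["name"], obj["type"]))
    return joined
```

Given one clerk row with address 0x10 and two object rows, only one of which carries 0x10, the function returns a single pair. That mirrors the point of the paragraph: the page allocator is the link between a memory object and its clerk.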

 

So far I have described how SQLOS's memory manager is structured inside. Now it is time to start talking about how all this fits into SQL Server.

 

Buffer Pool

Now we get to the point where life becomes very interesting. In this part all the pieces that I have covered so far, including memory management, should start falling into place.

 

Remember that SQL Server has two memory settings that you can control using sp_configure: max server memory and min server memory. I am not sure if you know, but these two settings really control the size of the Buffer Pool. They do not control the overall amount of physical memory consumed by SQL Server. In reality we can't control the amount of memory consumed by SQL Server, because there could be external components loaded into the server's process.

 

When SQL Server starts, during initialization, Buffer Pool first decides how much VAS (virtual address space) it needs to reserve for its own use. It bases this decision on the amount of physical memory (RAM) present on the box. If the amount of physical memory is equal to or larger than the amount of VAS it can use (remember that VAS is a limited resource, especially on x86), it will leave 256MB of VAS for external components, plus the number of threads SQL Server is configured to use multiplied by 512KB. You might remember that 512KB is SQL Server's thread stack size. In the default configuration with physical memory larger than 2GB, Buffer Pool will leave 256MB + 256 * 512KB = 384MB of VAS space. Some people call this region MemToLeave, but in reality that is incorrect. SQL Server might end up using this part of VAS itself, and I will show you how that can happen later on. You might also remember the -g parameter that some people recommend using when SQL Server starts outputting "Can't Reserve Virtual Address Space" errors. The first 256MB is exactly what the -g parameter controls. If you specify -g 512MB, the amount of VAS that BP won't use is 512MB + 256 * 512KB = 640MB. There is no point in specifying -g 256MB; this input parameter is the same as the default value.
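
The arithmetic in this paragraph can be checked with a short Python sketch. The function name and parameter names are hypothetical. The constants (256MB default, 256 threads, 512KB stack) are the ones quoted in the text for the default x86 configuration.

```python
THREAD_STACK_KB = 512  # SQL Server thread stack size given in the text


def vas_left_outside_bp_mb(g_mb=256, worker_threads=256):
    """VAS (in MB) that Buffer Pool does not reserve for itself.

    g_mb           -- value of the -g startup parameter (default 256MB)
    worker_threads -- number of threads SQL Server is configured to use
    """
    thread_stacks_mb = worker_threads * THREAD_STACK_KB // 1024
    return g_mb + thread_stacks_mb
```

With the defaults this gives 256 + 128 = 384MB, and with -g 512 it gives 640MB, matching the two figures above.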

 

Once BP decides the amount of VAS it will use, it reserves all of it right away. To observe this behavior you can monitor SQL Server's virtual bytes from perfmon (Performance Monitor), or you can use the vasummary view I talked about in my previous posts. In the normal case Buffer Pool can't get this much memory in one chunk, so if you take a closer look at SQL Server's VAS you will see several large regions reserved. This behavior is very different from many other servers that you might have seen. Some people report it as a VAS leak in SQL Server. In reality this behavior is by design.

 

Buffer Pool commits pages on demand. Depending on internal memory requirements and external memory state, it calculates its target: the amount of memory it thinks it should commit before it gets into memory pressure. To keep the system out of paging, the target is constantly recalculated. Target memory can't exceed max memory, which represents the max server memory setting. Even if you set min server memory equal to max server memory, Buffer Pool will only commit its memory on demand. You can observe this behavior by monitoring the corresponding profiler event.

 

The size of a SQL Server database page is 8KB. Buffer Pool is a cache of data pages. Consequently, Buffer Pool operates on pages of 8KB in size. It commits and decommits memory blocks at 8KB granularity only. If external components decide to borrow memory from Buffer Pool, they can only get blocks of 8KB in size. These blocks are not contiguous in memory. Interesting, right? It means that Buffer Pool can be used as the underlying memory manager for SQL Server components, as long as they allocate buffers of 8KB. (Pages allocated from BP in this way are sometimes referred to as stolen.)

 

Here is where SQLOS and Buffer Pool meet. See Fig. 3.

 

             --------------
             | MemoryNode |
             --------------
                   |
                   V
        -------------------------
        | Single Page Allocator |
        -------------------------
                   |
                   V
            ---------------
            | Buffer Pool |
            ---------------

                 Fig. 3

 

SQLOS's memory manager can be dynamically configured to use a specific single page allocator. This is exactly what SQL Server does: during startup it configures Buffer Pool to be SQLOS's single page allocator. From that point on, all dynamic single page allocations are provided by Buffer Pool. For example, remember that a memory object's payload is 8KB. When a component creates a memory object, the allocation is served by SQLOS's single page allocator, which is BP.

 

When describing the memory manager I mentioned that every large component has its own memory clerk. It means that Buffer Pool has its own memory clerk as well. How is it possible that BP leverages a SQLOS memory clerk while SQLOS's memory manager relies on BP? This is a common chicken-and-egg problem that you can often observe in operating systems. The key here is that Buffer Pool never uses any type of page allocator from SQLOS. It only leverages SQLOS's virtual and AWE (Address Windowing Extensions) interfaces.

 

            ---------------
            | Buffer Pool |
            ---------------
                   |
                   V
       -------------------------
       | Memory Clerk (VM/AWE) |
       -------------------------
                   |
                   V
            --------------
            | MemoryNode |
            --------------

                 Fig. 4

 

All of SQL Server's components are optimized for 8KB allocations so that they can allocate memory through SQLOS's single page allocator and consequently through Buffer Pool. However, there are cases when a component requires larger buffers. When that happens, the allocation is satisfied either by the memory node's multi page allocator or by the virtual allocator. As you might guess, that memory is allocated outside of Buffer Pool. This is exactly why I don't like the term MemToLeave: SQL Server does allocate memory out of that area!
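
The routing described here, with Buffer Pool plugged in as the single page allocator and larger requests going elsewhere, can be modeled with a short Python sketch. All class and method names below are hypothetical stand-ins rather than SQLOS interfaces, and the virtual allocator path is left out for brevity.

```python
PAGE_SIZE = 8 * 1024  # 8 KB page, as in the text


class BufferPool:
    """Stand-in for BP: hands out one 8 KB page at a time, counts stolen pages."""

    def __init__(self):
        self.stolen_pages = 0

    def alloc_page(self):
        self.stolen_pages += 1
        return bytearray(PAGE_SIZE)


class MultiPageAllocator:
    """Stand-in for the memory node's multi page allocator (outside BP)."""

    def __init__(self):
        self.pages = 0

    def alloc_pages(self, count):
        self.pages += count
        return bytearray(count * PAGE_SIZE)


class MemoryManager:
    """Routes requests: up to one page via the pluggable single page allocator."""

    def __init__(self, multi_page):
        self.single_page = None  # configured at "startup", see below
        self.multi_page = multi_page

    def set_single_page_allocator(self, allocator):
        self.single_page = allocator

    def alloc(self, nbytes):
        if nbytes <= 0:
            raise ValueError("request size must be positive")
        if nbytes <= PAGE_SIZE:
            if self.single_page is None:
                raise RuntimeError("single page allocator not configured")
            return self.single_page.alloc_page()
        pages = -(-nbytes // PAGE_SIZE)  # ceiling division
        return self.multi_page.alloc_pages(pages)
```

After calling set_single_page_allocator with a BufferPool instance, a 100-byte request comes back as one page counted against the pool, while a 20000-byte request is served as three pages by the multi page allocator and never touches the pool.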

 

Buffer Pool and AWE mechanism

When describing the SQLOS memory manager and Buffer Pool, the discussion would be incomplete without a description of how AWE fits into all of this. It is really important to understand how Buffer Pool allocates its memory when SQL Server is configured to use AWE mechanisms. First, please remember that BP leverages SQLOS's memory clerk interfaces to allocate both VAS and physical pages through AWE. Second, there are several differences that you need to keep in mind. BP reserves VAS in 4MB chunks instead of a "single" large region. This enables SQL Server to release VAS when the process is under VAS pressure. (We didn't have all the bits and pieces to do this when the server is not configured to use AWE mechanisms.) BP then allocates all of its memory using the AWE mechanism on demand. This is a very big difference between SQL Server 2000 and Yukon (SQL Server 2005): in SQL Server 2000, BP would allocate all of its memory right away when using the AWE mechanism.

 

Buffer Pool is the preferred memory allocator for the whole server. In AWE mode it allocates its memory leveraging the AWE mechanism. It means that all allocations made through SQLOS's single page allocator will come from pages allocated through AWE. This is what many people really miss. Let me make the point again: when the server is configured for AWE mode, most of its allocations are made through the AWE mechanism. This is exactly the reason why you won't see private bytes and memory usage growing for SQL Server in this mode.

 

Since data pages use relative addressing, i.e. they are self-contained, Buffer Pool can map and unmap them into and out of the process's VAS. Other components could have done the same if they were not relying on the actual allocation address. Unfortunately, there are no components right now other than BP that can take advantage of the AWE mechanism.

 

 

 

Future posts

I haven't completed the discussion of SQLOS memory management yet. There is still much to talk about. In my next posts I will cover SQLOS caches and the handling of memory pressure. It is also really important to look at DBCC MEMORYSTATUS and related DMVs.
