The slab mechanism in detail (4): slab initialization

2.4 Slab initialization:

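Initialization is described last because, without understanding how slab works and is used, and without understanding caches, reading the initialization code on its own wastes a lot of time and effort.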

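Having read this far, you can probably guess roughly how slab initialization works. In start_kernel, mm_init calls kmem_cache_init to initialize slab. The goal is to create caches of various sizes, especially the two "rule" caches sized for the array cache structure and for struct kmem_list3, so that every later cache creation can simply allocate those structures with kmalloc. Note that the very first cache, the one sized for struct kmem_cache itself, is likewise implemented through a global variable (cache_cache).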

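Two terms are worth noting: general caches and dedicated caches. General caches are the ones created during initialization: the caches in the 20 size classes ranging from 32 to 4194304 bytes, including the classes that hold the array cache structure and struct kmem_list3. Other modules use them directly through kmalloc, which matches each request to the appropriate cache; kmalloc never creates a cache, the caller just passes the length of memory wanted. A dedicated cache is one that the requester creates itself, for a new object size, by calling kmem_cache_create; there is nothing special about it.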
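To make the "matching" done by kmalloc concrete, here is a minimal user-space sketch of a size-class lookup. It is not the kernel's code: the table contents are an assumption (only the bounds 32 and 4194304 and the count of 20 come from the text; the 96 and 192 entries are a guess), and `toy_size_classes` and `toy_find_size_class` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size-class table: 20 classes from 32 to 4194304 bytes.
 * Only the bounds and the count come from the text; the 96 and 192
 * entries are an assumption made for illustration. */
static const size_t toy_size_classes[] = {
    32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096,
    8192, 16384, 32768, 65536, 131072, 262144, 524288,
    1048576, 2097152, 4194304
};

#define TOY_NUM_CLASSES (sizeof(toy_size_classes) / sizeof(toy_size_classes[0]))

static_assert(TOY_NUM_CLASSES == 20, "the text describes 20 size classes");

/* Return the index of the smallest class that can hold `size` bytes,
 * or -1 when the request is larger than the biggest class. */
static int toy_find_size_class(size_t size)
{
    for (size_t i = 0; i < TOY_NUM_CLASSES; i++) {
        if (size <= toy_size_classes[i])
            return (int)i;
    }
    return -1;
}
```

A request for 100 bytes, for instance, lands in the 128-byte class in this sketch, so the caller never has to create a cache of its own; a dedicated cache only pays off when many objects of one exact size are needed.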

void __init kmem_cache_init(void)

{

         size_t left_over;

         struct cache_sizes *sizes;

         struct cache_names *names;

         int i;

         int order;

         int node;

 

         if (num_possible_nodes() == 1)

                   use_alien_caches = 0;

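    /* Initialize the three slab lists for every node.
       The global static variable cache_cache manages the kmem_cache
       structures of all caches; in other words, during initialization a
       cache is set up whose slabs hold the kmem_cache of every other cache. */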

         for (i = 0; i < NUM_INIT_LISTS; i++) {

                   kmem_list3_init(&initkmem_list3[i]);

                   if (i < MAX_NUMNODES)

                            cache_cache.nodelists[i] = NULL;

         }

    /* Associate the initkmem_list3 entries starting at index CACHE_CACHE (which is 0) with cache_cache */

         set_up_list3s(&cache_cache, CACHE_CACHE);

 

         /*

          * Fragmentation resistance on low memory - only use bigger

          * page orders on machines with more than 32MB of memory.

          */

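         /* totalram_pages is the total number of physical memory pages actually
            present in the system; only when memory exceeds 32 MB may caches
            use higher page orders for their slabs */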

         if (totalram_pages > (32 << 20) >> PAGE_SHIFT)

                   slab_break_gfp_order = BREAK_GFP_ORDER_HI;

 

         /* Bootstrap is tricky, because several objects are allocated

          * from caches that do not exist yet:

          * 1) initialize the cache_cache cache: it contains the struct

          *    kmem_cache structures of all caches, except cache_cache itself:

          *    cache_cache is statically allocated.

          *    Initially an __init data area is used for the head array and the

          *    kmem_list3 structures, it's replaced with a kmalloc allocated

          *    array at the end of the bootstrap.

          * 2) Create the first kmalloc cache.

          *    The struct kmem_cache for the new cache is allocated normally.

          *    An __init data area is used for the head array.

          * 3) Create the remaining kmalloc caches, with minimally sized

          *    head arrays.

          * 4) Replace the __init data head arrays for cache_cache and the first

          *    kmalloc cache with kmalloc allocated arrays.

          * 5) Replace the __init data for kmem_list3 for cache_cache and

          *    the other cache's with kmalloc allocated memory.

          * 6) Resize the head arrays of the kmalloc caches to their final sizes.

          */

 

         node = numa_node_id();

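/* Step 1: set up cache_cache, the cache that holds struct kmem_cache objects, and
   the cache list cache_chain. Only the data structures are initialized here; no
   objects are created until the first allocation. The global cache_chain is the
   head of the kernel's list of slab caches. */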

         /* 1) create the cache_cache */

    /* Initialize cache_chain, the global list that holds all slab caches */

         INIT_LIST_HEAD(&cache_chain);

    /* Add cache_cache to the slab cache list headed by cache_chain */

         list_add(&cache_cache.next, &cache_chain);

 

    /* Set the basic colouring unit to the cache line size, given here as 32 bytes (L1_CACHE_BYTES); the value is architecture-dependent */

         cache_cache.colour_off = cache_line_size();

    /* Initialize cache_cache's local cache. kmalloc cannot be used here either, so the statically allocated global initarray_cache is used */

         cache_cache.array[smp_processor_id()] = &initarray_cache.cache;

    /* Initialize the slab lists, using the global initkmem_list3 */

         cache_cache.nodelists[node] = &initkmem_list3[CACHE_CACHE + node];

 

         /*

          * struct kmem_cache size depends on nr_node_ids, which

          * can be less than MAX_NUMNODES.

          */

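         /* buffer_size holds the size of the objects in a slab. To get the real
            size of struct kmem_cache, take the offset of nodelists (everything
            before that array) and add one struct kmem_list3 pointer per node.
            nr_node_ids is the number of memory nodes (1 on UMA). */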

         cache_cache.buffer_size = offsetof(struct kmem_cache, nodelists) +

                                      nr_node_ids * sizeof(struct kmem_list3 *);

#if DEBUG

         cache_cache.obj_size = cache_cache.buffer_size;

#endif

    /* Align buffer_size to the cache line size */

         cache_cache.buffer_size = ALIGN(cache_cache.buffer_size,

                                               cache_line_size());

    /* Compute the reciprocal of the object size, used to compute an object's index within its slab */

         cache_cache.reciprocal_buffer_size =

                   reciprocal_value(cache_cache.buffer_size);

 

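    /* Compute the number of objects per slab for cache_cache. A non-zero num means
       a slab of this order can hold at least one struct kmem_cache object, so stop
       there. cache_line_size() is 32 in this configuration. */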

         for (order = 0; order < MAX_ORDER; order++) {

                   cache_estimate(order, cache_cache.buffer_size,

                            cache_line_size(), 0, &left_over, &cache_cache.num);

                   if (cache_cache.num)

                            break;

         }

         BUG_ON(!cache_cache.num);

    /* gfporder: each slab of this cache spans 2^gfporder pages */

         cache_cache.gfporder = order;

    /* Number of available colours: the left-over space measured in units of colour_off */

         cache_cache.colour = left_over / cache_cache.colour_off;

    /* Size of the slab management object */

         cache_cache.slab_size = ALIGN(cache_cache.num * sizeof(kmem_bufctl_t) +

                                           sizeof(struct slab), cache_line_size());

 

         /* 2+3) create the kmalloc caches */

    /* malloc_sizes holds the object size of each general cache class */

         sizes = malloc_sizes;

    /* cache_names holds the name of each general cache class */

         names = cache_names;

 

         /*

          * Initialize the caches that provide memory for the array cache and the

          * kmem_list3 structures first.  Without this, further allocations will

          * bug.

          */

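    /* First create the general caches used by struct array_cache and struct kmem_list3;
       all later initialization depends on them. INDEX_AC is the index, in the kmalloc
       size table, of the class that fits struct arraycache_init (the local cache object),
       i.e. which general cache size class it belongs to. The cache of that class is
       created here for use by local caches. */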

         sizes[INDEX_AC].cs_cachep = kmem_cache_create(names[INDEX_AC].name,

                                               sizes[INDEX_AC].cs_size,

                                               ARCH_KMALLOC_MINALIGN,

                                               ARCH_KMALLOC_FLAGS|SLAB_PANIC,

                                               NULL);

    /* If struct kmem_list3 and struct arraycache_init map to different kmalloc size
       indexes, i.e. fall into different size classes, create a separate cache for
       struct kmem_list3; otherwise the two share one cache */

         if (INDEX_AC != INDEX_L3) {

                   sizes[INDEX_L3].cs_cachep =

                            kmem_cache_create(names[INDEX_L3].name,

                                     sizes[INDEX_L3].cs_size,

                                     ARCH_KMALLOC_MINALIGN,

                                     ARCH_KMALLOC_FLAGS|SLAB_PANIC,

                                     NULL);

         }

    /* With these two general caches created, the slab early-init stage is over; until this point, off-slab slab management was not allowed */

         slab_early_init = 0;

   

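    /* Loop over the size classes (2^0 to 2^12, 13 levels) and create the general caches; each class should get two caches, one DMA and one regular (no DMA cache on ARM here, where CONFIG_ZONE_DMA is not set) */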

         while (sizes->cs_size != ULONG_MAX) {

                   /*

                    * For performance, all the general caches are L1 aligned.

                    * This should be particularly beneficial on SMP boxes, as it

                    * eliminates "false sharing".

                    * Note for systems short on memory removing the alignment will

                    * allow tighter packing of the smaller caches.

                    */

                   if (!sizes->cs_cachep) {

                            sizes->cs_cachep = kmem_cache_create(names->name,

                                               sizes->cs_size,

                                               ARCH_KMALLOC_MINALIGN,

                                               ARCH_KMALLOC_FLAGS|SLAB_PANIC,

                                               NULL);

                   }

#ifdef CONFIG_ZONE_DMA

                   sizes->cs_dmacachep = kmem_cache_create(

                                               names->name_dma,

                                               sizes->cs_size,

                                               ARCH_KMALLOC_MINALIGN,

                                               ARCH_KMALLOC_FLAGS|SLAB_CACHE_DMA|

                                                        SLAB_PANIC,

                                               NULL);

#endif

                   sizes++;

                   names++;

         }

         /* 4) Replace the bootstrap head arrays */

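    /* Step 4: replace the statically allocated globals with kmalloc'ed objects. So far
       two static local caches have been used: cache_cache's local cache points at
       initarray_cache.cache, and the local cache of malloc_sizes[INDEX_AC].cs_cachep
       points at initarray_generic.cache (see setup_cpu_cache). Both are replaced here. */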

         {

                   struct array_cache *ptr;

        /* Allocate space for cache_cache's local cache */

                   ptr = kmalloc(sizeof(struct arraycache_init), GFP_NOWAIT);

 

                   BUG_ON(cpu_cache_get(&cache_cache) != &initarray_cache.cache);

        /* Copy cache_cache's original local cache, i.e. initarray_cache, to the new location */

                   memcpy(ptr, cpu_cache_get(&cache_cache),

                          sizeof(struct arraycache_init));

                   /*

                    * Do not assume that spinlocks can be initialized via memcpy:

                    */

                   spin_lock_init(&ptr->lock);

        /* Point cache_cache's local cache at the new location */

                   cache_cache.array[smp_processor_id()] = ptr;

       

       

        /* Allocate space for the local cache of malloc_sizes[INDEX_AC].cs_cachep */

                   ptr = kmalloc(sizeof(struct arraycache_init), GFP_NOWAIT);

 

                   BUG_ON(cpu_cache_get(malloc_sizes[INDEX_AC].cs_cachep)

                          != &initarray_generic.cache);

        /* Copy the original local cache to the newly allocated location */

                   memcpy(ptr, cpu_cache_get(malloc_sizes[INDEX_AC].cs_cachep),

                          sizeof(struct arraycache_init));

                   /*

                    * Do not assume that spinlocks can be initialized via memcpy:

                    */

                   spin_lock_init(&ptr->lock);

 

                   malloc_sizes[INDEX_AC].cs_cachep->array[smp_processor_id()] =

                       ptr;

         }

         /* 5) Replace the bootstrap kmem_list3's */

    /* Step 5: as in step 4, replace the statically allocated three slab lists with kmalloc'ed memory */

         {

                   int nid;

       

                   for_each_online_node(nid) {

            /* Copy the three slab lists of the struct kmem_cache cache (cache_cache) */

                            init_list(&cache_cache, &initkmem_list3[CACHE_CACHE + nid], nid);

            /* Copy the three slab lists of the struct arraycache_init cache */

                            init_list(malloc_sizes[INDEX_AC].cs_cachep,

                                       &initkmem_list3[SIZE_AC + nid], nid);

            /* Copy the three slab lists of the struct kmem_list3 cache */

                            if (INDEX_AC != INDEX_L3) {

                                     init_list(malloc_sizes[INDEX_L3].cs_cachep,

                                                 &initkmem_list3[SIZE_L3 + nid], nid);

                            }

                   }

         }

         g_cpucache_up = EARLY;

}
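The sizing arithmetic that kmem_cache_init applies to cache_cache (object size from offsetof plus one pointer per node, cache-line alignment, objects per slab, left-over bytes turned into colours) can be followed more easily in a small user-space model. The sketch below is a simplification under stated assumptions, not the kernel's cache_estimate: `struct toy_cache`, `toy_buffer_size`, `toy_estimate` and `struct toy_geometry` are hypothetical names, `hdr_size` stands in for sizeof(struct slab), `idx_size` for sizeof(kmem_bufctl_t), and only the on-slab layout (management data stored in the same pages as the objects) is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to a multiple of a; a must be a power of two. */
#define ALIGN_UP(x, a) (((x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/* Hypothetical stand-in for struct kmem_cache: fixed fields first,
 * then one pointer per node at the very end. */
struct toy_cache {
    unsigned int num;
    unsigned int gfporder;
    void *nodelists[8];
};

struct toy_geometry {
    size_t num;        /* objects that fit in one slab */
    size_t left_over;  /* bytes not used by objects or management data */
    size_t colour;     /* number of distinct colour offsets */
};

/* Object size: fields before nodelists plus one pointer per node,
 * rounded up to the cache line size. */
static size_t toy_buffer_size(size_t nr_nodes, size_t line)
{
    size_t raw = offsetof(struct toy_cache, nodelists) +
                 nr_nodes * sizeof(void *);
    return ALIGN_UP(raw, line);
}

/* Simplified on-slab estimate: a header plus one index entry per object
 * share the slab with the objects themselves. */
static struct toy_geometry toy_estimate(size_t slab_bytes, size_t obj_size,
                                        size_t line, size_t hdr_size,
                                        size_t idx_size)
{
    struct toy_geometry g = { 0, 0, 0 };
    size_t n, mgmt;

    assert(line != 0 && (line & (line - 1)) == 0);
    assert(obj_size != 0);
    if (slab_bytes <= hdr_size)
        return g;

    /* First guess ignores alignment of the management area... */
    n = (slab_bytes - hdr_size) / (obj_size + idx_size);
    /* ...then back off until the aligned layout really fits. */
    while (n > 0 &&
           ALIGN_UP(hdr_size + n * idx_size, line) + n * obj_size > slab_bytes)
        n--;
    if (n == 0)
        return g;   /* caller would retry with a higher page order */

    mgmt = ALIGN_UP(hdr_size + n * idx_size, line);
    g.num = n;
    g.left_over = slab_bytes - mgmt - n * obj_size;
    g.colour = g.left_over / line;
    return g;
}
```

With a 4096-byte slab, 128-byte objects, a 32-byte line, a 32-byte header and 4-byte index entries, the model yields 30 objects, 96 left-over bytes and 3 colours. A result of `num == 0` corresponds to the case where the loop in kmem_cache_init moves on to the next order.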

Now let us walk through slab initialization. The comment in the source lists six steps, which fall into three stages:

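1. Initialize the global variable cache_cache to create the first cache, which provides the "rule" for creating every other cache. All caches hang off the list cache_chain, and cache_cache is the first node on that list. Once a "rule" cache sized for struct kmem_cache exists, kmem_cache objects can be allocated from slab, which lays the foundation for creating the other "rule" caches.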

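2. Next, create the caches in the 20 size classes from 32 to 4194304 bytes, including the ones for struct arraycache_init and struct kmem_list3; these are the so-called general caches. Note how the global variable g_cpucache_up, which tracks initialization progress, changes during this stage: NONE -> PARTIAL_AC -> PARTIAL_L3, as described in detail earlier.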

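3. Use kmalloc to allocate replacements for the structures that global variables had been standing in for, namely the struct arraycache_init and struct kmem_list3 instances (initarray_cache and initkmem_list3 respectively). At this point slab initialization is essentially done: other modules can obtain memory through kmalloc, and g_cpucache_up is set to EARLY.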
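The replacement trick in stage 3 (start on a static object, then copy it into dynamically allocated memory and swap the pointer) can be sketched in user space as follows. This is only an illustration: malloc stands in for kmalloc, the lock re-initialization done by the kernel is left out, and `struct toy_array_cache`, `toy_boot_array`, `toy_bootstrap` and `toy_replace_boot_array` are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

#define TOY_BOOT_ENTRIES 1

/* Hypothetical stand-in for the per-CPU array cache. */
struct toy_array_cache {
    unsigned int avail;              /* entries currently cached */
    unsigned int limit;              /* capacity of entry[] */
    void *entry[TOY_BOOT_ENTRIES];
};

/* Hypothetical stand-in for a cache descriptor. */
struct toy_cache {
    struct toy_array_cache *array;
};

/* Static bootstrap storage, usable before any allocator works. */
static struct toy_array_cache toy_boot_array = {
    .avail = 0,
    .limit = TOY_BOOT_ENTRIES,
};

/* Early boot: point the cache at the static array. */
static void toy_bootstrap(struct toy_cache *c)
{
    c->array = &toy_boot_array;
}

/* Once allocation works: copy the current contents into heap memory and
 * switch the pointer. Returns 0 on success, -1 if allocation failed. */
static int toy_replace_boot_array(struct toy_cache *c)
{
    struct toy_array_cache *ptr = malloc(sizeof(*ptr));

    if (ptr == NULL)
        return -1;
    memcpy(ptr, c->array, sizeof(*ptr));  /* keep avail, limit, entries */
    c->array = ptr;                       /* the static copy is now unused */
    return 0;
}
```

The copy matters because the bootstrap array may already hold cached objects by the time it is replaced; swapping the pointer without copying would lose them.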

Later in start_kernel, kmem_cache_init_late is called; it sets g_cpucache_up to FULL, which completes slab initialization.
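The progression of g_cpucache_up can be read as a small state machine that decides where a new cache's bookkeeping structures come from. The sketch below is a simplified model written from memory of kernels of this era, not a copy of setup_cpu_cache; every `toy_` and `TOY_` name is hypothetical, and the exact rules should be checked against the source of the kernel version at hand.

```c
/* Hypothetical mirror of the initialization states named in the text. */
enum toy_state {
    TOY_NONE,        /* nothing usable yet */
    TOY_PARTIAL_AC,  /* array-cache size class exists */
    TOY_PARTIAL_L3,  /* list3 size class exists as well */
    TOY_EARLY,       /* kmem_cache_init finished */
    TOY_FULL         /* kmem_cache_init_late finished */
};

enum toy_source { TOY_SRC_STATIC, TOY_SRC_KMALLOC };

/* Assumed rule: only the very first kmalloc cache needs a static array. */
static enum toy_source toy_array_source(enum toy_state s)
{
    return s == TOY_NONE ? TOY_SRC_STATIC : TOY_SRC_KMALLOC;
}

/* Assumed rule: list3 structures stay static until their class exists. */
static enum toy_source toy_list3_source(enum toy_state s)
{
    return s < TOY_PARTIAL_L3 ? TOY_SRC_STATIC : TOY_SRC_KMALLOC;
}

/* State after creating one more cache during bootstrap. When the array
 * cache and list3 share a size class, PARTIAL_AC is skipped. */
static enum toy_state toy_after_create(enum toy_state s, int same_class)
{
    if (s == TOY_NONE)
        return same_class ? TOY_PARTIAL_L3 : TOY_PARTIAL_AC;
    if (s == TOY_PARTIAL_AC)
        return TOY_PARTIAL_L3;
    return s;  /* EARLY and FULL are set elsewhere in initialization */
}
```

Read this way, the states exist to break the chicken-and-egg problem: each one records which size classes already exist, and therefore which structures can come from kmalloc instead of static storage.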

 

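To summarize: the slab source is on the harder side, but it matters, because a huge number of kernel modules use it, and understanding it better goes a long way toward understanding the kernel. What makes slab effective is, first, that it provides a mechanism for allocating small chunks of memory, and second, that the physical memory it obtains is not handed straight back to the buddy system but stays resident inside slab. That residency helps code make efficient use of the hardware cache, and it is one of slab's key characteristics.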
