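The kernel defines three memory zones: ZONE_DMA, ZONE_NORMAL and ZONE_HIGHMEM. On e500, ZONE_NORMAL ends up with no memory at all, so all low memory (low_memory) is assigned to ZONE_DMA.
The zone boundaries are stored in max_zone_pfns, which is defined as follows: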
static unsigned long max_zone_pfns[MAX_NR_ZONES] = {
	[0 ... MAX_NR_ZONES - 1] = ~0UL
};
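Each zone has a corresponding max_zone_pfns entry that sets the upper page frame number (PFN) limit of that zone and, at the same time, serves as the lower limit of the next higher zone. The boundaries of each zone are computed in paging_init: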
/*
 * paging_init() sets up the page tables - in fact we've already done this.
 */
void __init paging_init(void)
{
	unsigned long long total_ram = memblock_phys_mem_size();
	phys_addr_t top_of_ram = memblock_end_of_DRAM();
	enum zone_type top_zone;
#ifdef CONFIG_PPC32
	unsigned long v = __fix_to_virt(__end_of_fixed_addresses - 1);
	unsigned long end = __fix_to_virt(FIX_HOLE);
	for (; v < end; v += PAGE_SIZE)
		map_page(v, 0, 0); /* XXX gross */
#endif
#ifdef CONFIG_HIGHMEM
	map_page(PKMAP_BASE, 0, 0); /* XXX gross */
	pkmap_page_table = virt_to_kpte(PKMAP_BASE);
	kmap_pte = virt_to_kpte(__fix_to_virt(FIX_KMAP_BEGIN));
	kmap_prot = PAGE_KERNEL;
#endif /* CONFIG_HIGHMEM */
	printk(KERN_DEBUG "Top of RAM: 0x%llx, Total RAM: 0x%llx\n",
	       (unsigned long long)top_of_ram, total_ram);
	printk(KERN_DEBUG "Memory hole size: %ldMB\n",
	       (long int)((top_of_ram - total_ram) >> 20));
#ifdef CONFIG_HIGHMEM
	top_zone = ZONE_HIGHMEM;
	limit_zone_pfn(ZONE_NORMAL, lowmem_end_addr >> PAGE_SHIFT);
#else
	top_zone = ZONE_NORMAL;
#endif
	limit_zone_pfn(top_zone, top_of_ram >> PAGE_SHIFT);
	zone_limits_final = true;
	free_area_init_nodes(max_zone_pfns);
	mark_nonram_nosave();
}
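Start at the end of the function. With CONFIG_HIGHMEM enabled, limit_zone_pfn is called twice, first for ZONE_NORMAL and then for ZONE_HIGHMEM, to cap their page frame numbers. Note that ZONE_DMA gets no separate, lower limit of its own. Because limit_zone_pfn also lowers the limit of every zone below the one it is given, ZONE_DMA ends up with the same upper limit as ZONE_NORMAL, so ZONE_DMA takes all of low memory and ZONE_NORMAL is empty.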
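The effect of the two calls can be reproduced in user space. The following is a minimal sketch, not the kernel code itself: limit_zone_pfn is re-implemented here with the "lower this zone and every zone below it" behavior, zone_pages and example_layout are hypothetical helpers, and the board (1 GB of RAM, 768 MB of low memory, 4 KB pages) is an assumed example.

```c
#include <assert.h>

/* Zone indices in the order used when ZONE_DMA and ZONE_HIGHMEM are both configured. */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };

/* Every limit starts at "no limit", like the ~0UL initializer in the kernel. */
unsigned long max_zone_pfns[MAX_NR_ZONES] = { ~0UL, ~0UL, ~0UL };

/* Lower the limit of `zone` and of every zone below it. */
void limit_zone_pfn(enum zone_type zone, unsigned long pfn_limit)
{
	assert(zone < MAX_NR_ZONES);
	for (int i = zone; i >= 0; i--)
		if (max_zone_pfns[i] > pfn_limit)
			max_zone_pfns[i] = pfn_limit;
}

/* Hypothetical helper: a zone spans from the previous zone's limit to its own. */
unsigned long zone_pages(enum zone_type zone)
{
	unsigned long start = (zone == ZONE_DMA) ? 0 : max_zone_pfns[zone - 1];

	return max_zone_pfns[zone] - start;
}

/* Assumed board: lowmem_end_addr = 0x30000000, top_of_ram = 0x40000000, PAGE_SHIFT = 12. */
void example_layout(void)
{
	limit_zone_pfn(ZONE_NORMAL, 0x30000000UL >> 12);
	limit_zone_pfn(ZONE_HIGHMEM, 0x40000000UL >> 12);
}
```

After example_layout(), ZONE_DMA and ZONE_NORMAL share the limit 0x30000, so ZONE_NORMAL spans zero pages while ZONE_HIGHMEM covers the remaining 0x10000 pages.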
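The ZONE_HIGHMEM limit uses top_of_ram, the highest RAM address, which is queried from memblock. The upper bound of physical memory is established when memory nodes are added to memblock and by some later adjustments.
The ZONE_NORMAL limit uses lowmem_end_addr, the end of low memory, whose value is determined by the following calculation.
In MMU_init: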
lowmem_end_addr = memstart_addr + total_lowmem;
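Its value depends directly on total_lowmem. The assignment is performed twice, once before and once after adjust_total_lowmem. adjust_total_lowmem computes __max_low_memory, which may further restrict total_lowmem. In effect, lowmem_end_addr is determined by the smaller of total_lowmem (before it is possibly overridden by __max_low_memory) and __max_low_memory, and that smaller value is written back to total_lowmem as its new value. Initially, total_lowmem is simply the size of RAM.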
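The "smaller of the two" rule can be written out as a short sketch. The function names and the phys_addr_t typedef below are hypothetical and only mirror the description above; they are not the kernel's MMU_init code.

```c
#include <assert.h>

typedef unsigned long long phys_addr_t; /* assumption: 64-bit physical addresses */

/* Low memory is the smaller of the RAM size and the low-memory cap. */
unsigned long clamp_total_lowmem(unsigned long total_memory,
				 unsigned long max_low_memory)
{
	return total_memory < max_low_memory ? total_memory : max_low_memory;
}

/* lowmem_end_addr = memstart_addr + total_lowmem, using the clamped value. */
phys_addr_t compute_lowmem_end_addr(phys_addr_t memstart_addr,
				    unsigned long total_memory,
				    unsigned long max_low_memory)
{
	unsigned long total_lowmem = clamp_total_lowmem(total_memory, max_low_memory);

	/* Low memory can never be larger than the RAM that is present. */
	assert(total_lowmem <= total_memory);
	return memstart_addr + total_lowmem;
}
```

For example, with memstart_addr = 0, 1 GB of RAM and a cap of 0x30000000, lowmem_end_addr is 0x30000000; with only 512 MB of RAM the cap has no effect and the result is 0x20000000.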
__max_low_memory is obtained as follows:
#define CONFIG_LOWMEM_SIZE 0x30000000
#define MAX_LOW_MEM CONFIG_LOWMEM_SIZE
/* max amount of low RAM to map in */
unsigned long __max_low_memory = MAX_LOW_MEM;
Here CONFIG_LOWMEM_SIZE is the size defined in the kernel .config.
/*
 * vmalloc=size forces the vmalloc area to be exactly 'size'
 * bytes. This can be used to increase (or decrease) the low
 * memory area. Thus this can be also used to decrease (or increase)
 * low memory area.
 */
static int __init early_vmalloc(char *arg)
{
	unsigned long vmalloc_reserve = memparse(arg, NULL);
	PRT("vmalloc_reserve = 0x%lx", vmalloc_reserve);
	if (vmalloc_reserve < SZ_16M) {
		vmalloc_reserve = SZ_16M;
		PRT("vmalloc area too small, limiting to %luMB\n",
		    vmalloc_reserve >> 20);
	}
	if (vmalloc_reserve > VMALLOC_RESERVE_MAX) {
		vmalloc_reserve = VMALLOC_RESERVE_MAX;
		PRT("vmalloc area is too big, limiting to %luMB\n",
		    vmalloc_reserve >> 20);
	}
	/* low memory aligned 16M*/
	PRT("__max_low_memory = 0x%lx", __max_low_memory);
	__max_low_memory = __pa(IOREMAP_TOP) - VMALLOC_OFFSET - vmalloc_reserve;
	__max_low_memory &= ~(SZ_16M - 1);
	PRT("__max_low_memory = 0x%lx", __max_low_memory);
	return 0;
}
early_param("vmalloc", early_vmalloc);
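early_vmalloc parses the vmalloc parameter to obtain vmalloc_reserve. The size is bounded: the minimum is 16 MB, and the maximum is the 1 GB kernel virtual address space minus one VMALLOC_OFFSET and 64 MB. The 64 MB is presumably the amount of low memory that the system is guaranteed to keep.
Next, __max_low_memory is computed directly and aligned down to a 16 MB boundary: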
__max_low_memory = __pa(IOREMAP_TOP) - VMALLOC_OFFSET - vmalloc_reserve;
__max_low_memory &= ~(SZ_16M - 1);
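The clamping and the mask arithmetic are easy to check with concrete numbers. The sketch below uses hypothetical helper names, and the values for __pa(IOREMAP_TOP) and VMALLOC_OFFSET in the usage note are assumed for illustration; the real ones depend on the platform configuration.

```c
#include <assert.h>

#define SZ_16M 0x01000000UL

/* Round `value` down to a multiple of `align`, which must be a power of two. */
unsigned long align_down(unsigned long value, unsigned long align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return value & ~(align - 1);
}

/* Clamp the requested vmalloc size into [min, max]. */
unsigned long clamp_vmalloc_reserve(unsigned long requested,
				    unsigned long min, unsigned long max)
{
	if (requested < min)
		return min;
	if (requested > max)
		return max;
	return requested;
}

/* Low-memory cap = __pa(IOREMAP_TOP) - VMALLOC_OFFSET - vmalloc_reserve, 16 MB aligned. */
unsigned long max_low_memory_for(unsigned long ioremap_top_pa,
				 unsigned long vmalloc_offset,
				 unsigned long vmalloc_reserve)
{
	return align_down(ioremap_top_pa - vmalloc_offset - vmalloc_reserve, SZ_16M);
}
```

Assuming __pa(IOREMAP_TOP) = 0x3e000000 and VMALLOC_OFFSET = 0x01000000, a request of vmalloc=200M (0x0c800000) gives 0x30800000 before alignment and 0x30000000 after the mask is applied. A request of 8 MB would first be raised to the 16 MB minimum.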
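The smaller of __max_low_memory and total_lowmem is then mapped through the map_mem_in_cams interface, which creates the TLB entries, and the amount of RAM that could actually be mapped is assigned back to __max_low_memory. If the TLB entries can only map a smaller amount of RAM, __max_low_memory is reduced further at this point.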
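Why a limited number of TLB entries can shrink the result is easier to see with a simplified model. The sketch below is not the kernel's map_mem_in_cams; it assumes that each entry maps a power-of-4 sized, naturally aligned block, that the base addresses are aligned well enough to be ignored, and that the entry count and maximum entry size are passed in as hypothetical parameters.

```c
#include <assert.h>

/*
 * Largest power-of-4 size in bytes that fits into `ram`, is naturally
 * aligned at offset `addr`, and does not exceed `max_cam_size`
 * (itself assumed to be a power of 4). Returns 0 if nothing fits.
 */
unsigned long cam_size_for(unsigned long ram, unsigned long addr,
			   unsigned long max_cam_size)
{
	for (unsigned long size = max_cam_size; size >= 4096; size >>= 2)
		if (size <= ram && (addr & (size - 1)) == 0)
			return size;
	return 0;
}

/* Greedily map `ram` bytes with at most `max_cam_idx` entries; return bytes mapped. */
unsigned long map_mem_in_cams_sketch(unsigned long ram, int max_cam_idx,
				     unsigned long max_cam_size)
{
	unsigned long mapped = 0;

	assert(max_cam_idx > 0);
	for (int i = 0; i < max_cam_idx && ram > 0; i++) {
		unsigned long size = cam_size_for(ram, mapped, max_cam_size);

		if (size == 0)
			break;
		mapped += size;
		ram -= size;
	}
	return mapped;
}
```

With three entries of at most 256 MB each, 768 MB (0x30000000) is mapped completely as 256 + 256 + 256 MB. A request for 704 MB (0x2c000000) is covered only as 256 + 256 + 64 MB, so the returned value is 576 MB (0x24000000), which is the kind of further reduction described above.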
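The three steps above (the CONFIG_LOWMEM_SIZE default, the vmalloc parameter, and the TLB mapping limit) restrict the value one after another. If bootargs does not set the vmalloc parameter, the second step is skipped.
Returning to the beginning: the final value of __max_low_memory, together with total_lowmem, determines the final value of total_lowmem in MMU_init, and hence lowmem_end_addr, which is total_lowmem offset by memstart_addr (0 in this case). This in turn determines the memory layout of ZONE_DMA and ZONE_NORMAL.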