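Linux memory management: e820map

During boot, the kernel uses three memory allocators in turn: early_res, bootmem, and the zone allocator. Once a later allocator is up, the earlier one is no longer used.
early_res is the first memory allocator the kernel uses.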

arch/x86/kernel/e820.c:

/*
 * Handle the memory map.
 * The functions here do the job until bootmem takes over.



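How the kernel obtains memory information
1. In real mode, use the BIOS interrupt service to obtain the physical memory map and store it in the e820_map field of boot_params.
2. Copy the real-mode boot_params into the protected-mode boot_params.
3. Record reserved memory ranges in early_res (at most 20 entries, MAX_EARLY_RES).
4. Copy e820_map from boot_params into e820.
5. Initialize bootmem, marking all memory as in use.
6. bootmem takes the usable memory from the e820 map and registers it as free.
7. bootmem marks every range recorded in early_res as in use.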

 

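1. Obtaining the BIOS-provided physical memory information in real mode
boot/header.S->main->detect_memory->detect_memory_e820
This calls BIOS interrupt 0x15 (function 0xE820) to obtain the physical memory map.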
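
The query loop in detect_memory_e820 can be modeled roughly as below. This is a userland sketch, not the kernel's code: fake_bios_e820 and fake_table are hypothetical stand-ins for the firmware, which returns one range per call plus a continuation value (EBX in the real interface) that becomes 0 after the last entry. The 20-byte entry layout and the limit of 128 entries (E820MAX) follow the x86 definitions; the table contents are sample values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define E820MAX       128   /* capacity of the e820_map array in boot_params */
#define E820_RAM      1
#define E820_RESERVED 2

/* One BIOS e820 record: 64-bit base, 64-bit length, 32-bit type (20 bytes). */
struct e820entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
} __attribute__((packed));

/* Hypothetical firmware table used only by this sketch. */
static const struct e820entry fake_table[] = {
    { 0x0000000000000000ULL, 0x000000000009f800ULL, E820_RAM },
    { 0x000000000009f800ULL, 0x0000000000000800ULL, E820_RESERVED },
    { 0x0000000000100000ULL, 0x000000007fdf0000ULL, E820_RAM },
};

/*
 * Hypothetical stand-in for INT 0x15, AX=0xE820.
 * *next plays the role of the continuation value: the caller passes 0 on
 * the first call, and the "BIOS" sets it back to 0 after the last entry.
 * Returns 0 on success, -1 on error.
 */
static int fake_bios_e820(uint32_t *next, struct e820entry *out)
{
    size_t n = sizeof(fake_table) / sizeof(fake_table[0]);

    if (*next >= n)
        return -1;
    *out = fake_table[*next];
    *next = (*next + 1 < n) ? *next + 1 : 0;
    return 0;
}

/* Collect entries until the continuation value is 0 or the map is full. */
static int detect_memory_sketch(struct e820entry *map, int max)
{
    uint32_t next = 0;
    int count = 0;

    do {
        struct e820entry buf;

        if (fake_bios_e820(&next, &buf) != 0)
            break;
        map[count++] = buf;
    } while (next != 0 && count < max);

    return count;
}
```

The returned count corresponds to the number of valid entries stored alongside the map in boot_params.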


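2. Copying the real-mode boot_params into the protected-mode boot_params
How the real-mode data reaches the kernel:
a. Switch to protected mode: boot/header.S->main.c->pmjump.S
b. Decompress the kernel: boot/compressed/head_32.S
c. Copy the parameters: kernel/head_32.S
When control jumps to the compressed kernel, the boot_params pointer is passed in esi. esi is preserved across decompression, so the same pointer reaches startup_32, the entry point of the decompressed kernel.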
arch/x86/kernel/head_32.S:

/*
 * Copy bootup parameters out of the way.
 * Note: %esi still has the pointer to the real-mode data.
 * With the kexec as boot loader, parameter segment might be loaded beyond
 * kernel image and might not even be addressable by early boot page tables.
 * (kexec on panic case). Hence copy out the parameters before initializing
 * page tables.
 */
        movl $pa(boot_params),%edi
        movl $(PARAM_SIZE/4),%ecx
        cld
        rep
        movsl


 

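3. Reserving memory ranges
Before bootmem is up, the kernel records reserved memory ranges in early_res; when bootmem starts, these ranges are handed over to it.
The main early_res reservations are the kernel code and data, the page tables, and BOOTMAP (the bitmap used by bootmem).
arch/x86/kernel/head32.c:
        reserve_early(__pa_symbol(&_text), __pa_symbol(&__bss_stop), "TEXT DATA BSS");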


4. e820 takes the e820map data from boot_params
start_kernel->setup_arch->setup_memory_map

dmesg | less
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
 BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000ca000 - 00000000000cc000 (reserved)
 BIOS-e820: 00000000000dc000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000007fef0000 (usable)
 BIOS-e820: 000000007fef0000 - 000000007feff000 (ACPI data)
 BIOS-e820: 000000007feff000 - 000000007ff00000 (ACPI NVS)
 BIOS-e820: 000000007ff00000 - 0000000080000000 (usable)
 BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000fffe0000 - 0000000100000000 (reserved)
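
As a worked example of reading this listing, the sketch below adds up the ranges marked usable. The table is transcribed from the dmesg output above, with each line treated as a half-open range [start, end); the struct and function names are made up for the illustration. The type numbers (1 = usable RAM, 2 = reserved, 3 = ACPI data, 4 = ACPI NVS) are the standard e820 values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { E820_RAM = 1, E820_RESERVED = 2, E820_ACPI = 3, E820_NVS = 4 };

struct range {
    uint64_t start, end;   /* half-open: [start, end) */
    int type;
};

/* The BIOS-e820 lines from the dmesg output, one entry per line. */
static const struct range bios_map[] = {
    { 0x0000000000000000ULL, 0x000000000009f800ULL, E820_RAM },
    { 0x000000000009f800ULL, 0x00000000000a0000ULL, E820_RESERVED },
    { 0x00000000000ca000ULL, 0x00000000000cc000ULL, E820_RESERVED },
    { 0x00000000000dc000ULL, 0x0000000000100000ULL, E820_RESERVED },
    { 0x0000000000100000ULL, 0x000000007fef0000ULL, E820_RAM },
    { 0x000000007fef0000ULL, 0x000000007feff000ULL, E820_ACPI },
    { 0x000000007feff000ULL, 0x000000007ff00000ULL, E820_NVS },
    { 0x000000007ff00000ULL, 0x0000000080000000ULL, E820_RAM },
    { 0x00000000e0000000ULL, 0x00000000f0000000ULL, E820_RESERVED },
    { 0x00000000fec00000ULL, 0x00000000fec10000ULL, E820_RESERVED },
    { 0x00000000fee00000ULL, 0x00000000fee01000ULL, E820_RESERVED },
    { 0x00000000fffe0000ULL, 0x0000000100000000ULL, E820_RESERVED },
};

#define BIOS_MAP_LEN (sizeof(bios_map) / sizeof(bios_map[0]))

/* Sum the sizes of all entries whose type is usable RAM. */
static uint64_t usable_bytes(const struct range *map, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++)
        if (map[i].type == E820_RAM)
            total += map[i].end - map[i].start;
    return total;
}
```

For this machine the three usable ranges add up to 0x9f800 + 0x7fdf0000 + 0x100000 = 0x7ff8f800 bytes, a little under 2 GiB.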


 

5. bootmem initialization
start_kernel->setup_arch->initmem_init

6. bootmem registers the usable memory
a. Put the usable memory from e820 into active_regions (start_kernel->setup_arch->initmem_init->e820_register_active_regions)
b. Free the usable memory in active_regions into bootmem (start_kernel->setup_arch->initmem_init->setup_bootmem_allocator->setup_node_bootmem->free_bootmem_with_active_regions)

7. bootmem keeps the early reservations reserved
start_kernel->setup_arch->initmem_init->setup_bootmem_allocator->setup_node_bootmem->early_res_to_bootmem

dmesg | less
  mapped low ram: 0 - 375fe000
  low ram: 0 - 375fe000
  node 0 low ram: 00000000 - 375fe000
  node 0 bootmap 00014000 - 0001aec0
(9 early reservations) ==> bootmem [0000000000 - 00375fe000]
  #0 [0000000000 - 0000001000]   BIOS data page ==> [0000000000 - 0000001000]
  #1 [0000001000 - 0000002000]    EX TRAMPOLINE ==> [0000001000 - 0000002000]
  #2 [0000006000 - 0000007000]       TRAMPOLINE ==> [0000006000 - 0000007000]
  #3 [0000400000 - 0000befe50]    TEXT DATA BSS ==> [0000400000 - 0000befe50]
  #4 [000009f800 - 0000100000]    BIOS reserved ==> [000009f800 - 0000100000]
  #5 [0000bf0000 - 0000bf81c8]              BRK ==> [0000bf0000 - 0000bf81c8]
  #6 [0000010000 - 0000014000]          PGTABLE ==> [0000010000 - 0000014000]
  #7 [0000bf9000 - 0001941182]      NEW RAMDISK ==> [0000bf9000 - 0001941182]
  #8 [0000014000 - 000001b000]          BOOTMAP ==> [0000014000 - 000001b000]
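
Steps 5 to 7 can be pictured as operations on a bitmap with one bit per page frame: set every bit (all memory in use), clear the bits of usable ranges, then set the bits of the early reservations again. The code below is a minimal userland sketch of that idea, not the kernel's bootmem code; all function names are hypothetical. It rounds frees inward and reservations outward so that a partially covered page is never handed out. It also reproduces the bitmap size from the log: low ram ends at 0x375fe000, which is 0x375fe pages, so the map needs 0x375fe / 8 rounded up = 0x6ec0 bytes, matching "node 0 bootmap 00014000 - 0001aec0". Early reservation #8 shows the same area rounded up to whole pages, 0x14000 - 0x1b000.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Bytes needed for a one-bit-per-page map, rounded up to sizeof(long). */
static unsigned long bootmap_bytes_sketch(unsigned long pages)
{
    unsigned long bytes = (pages + 7) / 8;
    unsigned long align = sizeof(long);

    return (bytes + align - 1) / align * align;
}

/* Set (used = 1) or clear (used = 0) the bits for page frames [first, last). */
static void mark_pages(uint8_t *map, unsigned long first, unsigned long last,
                       int used)
{
    for (unsigned long pfn = first; pfn < last; pfn++) {
        if (used)
            map[pfn / 8] |= (uint8_t)(1u << (pfn % 8));
        else
            map[pfn / 8] &= (uint8_t)~(1u << (pfn % 8));
    }
}

/* Free only the pages that lie completely inside [start, end). */
static void free_range_sketch(uint8_t *map, uint64_t start, uint64_t end)
{
    unsigned long first = (unsigned long)((start + PAGE_SIZE - 1) >> PAGE_SHIFT);
    unsigned long last = (unsigned long)(end >> PAGE_SHIFT);

    mark_pages(map, first, last, 0);
}

/* Reserve every page that [start, end) touches, even partially. */
static void reserve_range_sketch(uint8_t *map, uint64_t start, uint64_t end)
{
    unsigned long first = (unsigned long)(start >> PAGE_SHIFT);
    unsigned long last = (unsigned long)((end + PAGE_SIZE - 1) >> PAGE_SHIFT);

    mark_pages(map, first, last, 1);
}

static int page_is_free(const uint8_t *map, unsigned long pfn)
{
    return !(map[pfn / 8] & (1u << (pfn % 8)));
}
```

A caller would memset the map to 0xff first (step 5), call free_range_sketch for each usable e820 range (step 6), and finally call reserve_range_sketch for each early_res entry (step 7).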


 

=======================================================

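e820map: the physical memory map provided by the BIOS
I. Handling overlaps in e820
sanitize_e820_map resolves overlapping entries in the e820 map: an overlapping region is assigned to the entry with the highest priority (the highest-numbered type) and removed from the lower-priority entries. Entries may be merged or split in the process.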

/*
 * Sanitize the BIOS e820 map.
 *
 * Some e820 responses include overlapping entries. The following
 * replaces the original e820 map with a new one, removing overlaps,
 * and resolving conflicting memory types in favor of highest
 * numbered type.
 *
 * The input parameter biosmap points to an array of 'struct
 * e820entry' which on entry has elements in the range [0, *pnr_map)
 * valid, and which has space for up to max_nr_map entries.
 * On return, the resulting sanitized e820 map entries will be in
 * overwritten in the same location, starting at biosmap.
 *
 * The integer pointed to by pnr_map must be valid on entry (the
 * current number of valid entries located at biosmap) and will
 * be updated on return, with the new number of valid entries
 * (something no more than max_nr_map.)
 *
 * The return value from sanitize_e820_map() is zero if it
 * successfully 'sanitized' the map entries passed in, and is -1
 * if it did nothing, which can happen if either of (1) it was
 * only passed one map entry, or (2) any of the input map entries
 * were invalid (start + size < start, meaning that the size was
 * so big the described memory range wrapped around through zero.)
 *
 *      Visually we're performing the following
 *      (1,2,3,4 = memory types)...
 *
 *      Sample memory map (w/overlaps):
 *         ____22__________________
 *         ______________________4_
 *         ____1111________________
 *         _44_____________________
 *         11111111________________
 *         ____________________33__
 *         ___________44___________
 *         __________33333_________
 *         ______________22________
 *         ___________________2222_
 *         _________111111111______
 *         _____________________11_
 *         _________________4______
 *
 *      Sanitized equivalent (no overlap):
 *         1_______________________
 *         _44_____________________
 *         ___1____________________
 *         ____22__________________
 *         ______11________________
 *         _________1______________
 *         __________3_____________
 *         ___________44___________
 *         _____________33_________
 *         _______________2________
 *         ________________1_______
 *         _________________4______
 *         ___________________2____
 *         ____________________33__
 *         ______________________4_
 */

int __init sanitize_e820_map(struct e820entry *biosmap, int max_nr_map,
                             u32 *pnr_map)
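
To make the rule in the comment concrete, here is a simplified userland sketch that produces the same kind of result. It is not the kernel's algorithm: instead of walking a sorted list of change points it collects every range boundary, sorts them, and for each elementary interval between two neighbouring boundaries picks the highest-numbered type among the entries covering it, merging neighbours of equal type. The names sanitize_sketch, struct range and MAX_IN are made up for the illustration, and ranges are written as [start, end) rather than address plus size.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_IN 64   /* arbitrary input limit chosen for this sketch */

struct range {
    uint64_t start, end;   /* half-open: [start, end) */
    uint32_t type;         /* 1..4 as in e820; 0 is treated as "no entry" */
};

static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;

    return (x > y) - (x < y);
}

/*
 * Write a non-overlapping equivalent of in[0..n_in) to out[].
 * Returns the number of output entries, or -1 on invalid input
 * or if out[] is too small.
 */
static int sanitize_sketch(const struct range *in, int n_in,
                           struct range *out, int max_out)
{
    uint64_t pts[2 * MAX_IN];
    int n_pts = 0, n_out = 0;

    if (n_in > MAX_IN)
        return -1;

    /* 1. Collect the boundaries of every non-empty input range. */
    for (int i = 0; i < n_in; i++) {
        if (in[i].end < in[i].start)
            return -1;              /* wrapped range: refuse */
        if (in[i].end == in[i].start)
            continue;               /* empty range: ignore */
        pts[n_pts++] = in[i].start;
        pts[n_pts++] = in[i].end;
    }
    qsort(pts, (size_t)n_pts, sizeof(pts[0]), cmp_u64);

    /* 2. Resolve each elementary interval [lo, hi) on its own. */
    for (int k = 0; k + 1 < n_pts; k++) {
        uint64_t lo = pts[k], hi = pts[k + 1];
        uint32_t type = 0;

        if (lo == hi)
            continue;               /* duplicate boundary */

        /* Highest-numbered type among the entries covering [lo, hi). */
        for (int i = 0; i < n_in; i++)
            if (in[i].start <= lo && hi <= in[i].end && in[i].type > type)
                type = in[i].type;
        if (type == 0)
            continue;               /* a hole: nothing covers it */

        /* 3. Extend the previous output entry if it is adjacent and same type. */
        if (n_out > 0 && out[n_out - 1].type == type &&
            out[n_out - 1].end == lo) {
            out[n_out - 1].end = hi;
            continue;
        }
        if (n_out == max_out)
            return -1;
        out[n_out].start = lo;
        out[n_out].end = hi;
        out[n_out].type = type;
        n_out++;
    }
    return n_out;
}
```

For example, the inputs [0, 100) type 1 and [50, 150) type 2 come out as [0, 50) type 1 and [50, 150) type 2: the overlap goes to the higher type and the lower-type entry is shortened.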



 

=======================================================

early_res: early reserved memory ranges

I. The early_res data structure
arch/x86/kernel/e820.c

/*
 * Early reserved memory areas.
 */
#define MAX_EARLY_RES 20

struct early_res {
        u64 start, end;
        char name[16];
        char overlap_ok;
};
static struct early_res early_res[MAX_EARLY_RES] __initdata = {
        { 0, PAGE_SIZE, "BIOS data page" },     /* BIOS data page */
        {}
};


 

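start: start address of the reserved range
end: end address of the reserved range (exclusive); 0 marks the end of the array
name: name of the reserved range
overlap_ok: whether later reservations may overlap this range

II. Finding an overlapping range: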
arch/x86/kernel/e820.c

static int __init find_overlapped_early(u64 start, u64 end)
{
        int i;
        struct early_res *r;

        for (i = 0; i < MAX_EARLY_RES && early_res[i].end; i++) {
                r = &early_res[i];
                if (end > r->start && start < r->end)
                        break;
        }

        return i;
}


 

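Scans the reserved-range array. If an entry overlaps [start, end), the function returns that entry's index; otherwise it returns the index of the terminating entry (the first free slot, or MAX_EARLY_RES if the array is full).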
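
The three possible return values are easy to confuse, so here is a small userland model of the same scan. The table size MAX_RES, struct res and the function name are chosen for the illustration; the loop and the overlap test are the ones shown above. Note that ranges are half-open, so two ranges that merely touch do not overlap.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_RES 4   /* small table size chosen for this sketch */

struct res {
    uint64_t start, end;   /* half-open: [start, end); end == 0 ends the table */
};

/*
 * Returns:
 *   the index of the first entry overlapping [start, end)  (tab[i].end != 0),
 *   or the index of the first free slot                    (tab[i].end == 0),
 *   or MAX_RES when the table is full and nothing overlaps.
 */
static int find_overlapped_sketch(const struct res *tab,
                                  uint64_t start, uint64_t end)
{
    int i;

    for (i = 0; i < MAX_RES && tab[i].end; i++) {
        if (end > tab[i].start && start < tab[i].end)
            break;
    }
    return i;
}
```

A caller tells the cases apart by first comparing the result with the table size and then looking at the end field of the returned slot, which is what __reserve_early does.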

III. Reserving memory ranges
1. Reserving a range

static void __init __reserve_early(u64 start, u64 end, char *name,
                                                int overlap_ok)
{
        int i;
        struct early_res *r;

        i = find_overlapped_early(start, end);
        if (i >= MAX_EARLY_RES)
                panic("Too many early reservations");
        r = &early_res[i];
        if (r->end)
                panic("Overlapping early reservations "
                      "%llx-%llx %s to %llx-%llx %s\n",
                      start, end - 1, name?name:"", r->start,
                      r->end - 1, r->name);
        r->start = start;
        r->end = end;
        r->overlap_ok = overlap_ok;
        if (name)
                strncpy(r->name, name, sizeof(r->name) - 1);
}


 

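Appends a new range at the end of the early_res array. The kernel panics if the range overlaps an existing one or if the array already holds MAX_EARLY_RES entries.

2. Dropping the overlapping part of overlap_ok ranges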

/*
 * Split any existing ranges that:
 *  1) are marked 'overlap_ok', and
 *  2) overlap with the stated range [start, end)
 * into whatever portion (if any) of the existing range is entirely
 * below or entirely above the stated range.  Drop the portion
 * of the existing range that overlaps with the stated range,
 * which will allow the caller of this routine to then add that
 * stated range without conflicting with any existing range.
 */
static void __init drop_overlaps_that_are_ok(u64 start, u64 end)
{
        int i;
        struct early_res *r;
        u64 lower_start, lower_end;
        u64 upper_start, upper_end;
        char name[16];

        for (i = 0; i < MAX_EARLY_RES && early_res[i].end; i++) {
                r = &early_res[i];

                /* Continue past non-overlapping ranges */
                if (end <= r->start || start >= r->end)
                        continue;

                /*
                 * Leave non-ok overlaps as is; let caller
                 * panic "Overlapping early reservations"
                 * when it hits this overlap.
                 */
                if (!r->overlap_ok)
                        return;

                /*
                 * We have an ok overlap.  We will drop it from the early
                 * reservation map, and add back in any non-overlapping
                 * portions (lower or upper) as separate, overlap_ok,
                 * non-overlapping ranges.
                 */

                /* 1. Note any non-overlapping (lower or upper) ranges. */
                strncpy(name, r->name, sizeof(name) - 1);

                lower_start = lower_end = 0;
                upper_start = upper_end = 0;
                if (r->start < start) {
                        lower_start = r->start;
                        lower_end = start;
                }
                if (r->end > end) {
                        upper_start = end;
                        upper_end = r->end;
                }

                /* 2. Drop the original ok overlapping range */
                drop_range(i);

                i--;            /* resume for-loop on copied down entry */

                /* 3. Add back in any non-overlapping ranges. */
                if (lower_end)
                        reserve_early_overlap_ok(lower_start, lower_end, name);
                if (upper_end)
                        reserve_early_overlap_ok(upper_start, upper_end, name);
        }
}


 

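a. Compute the non-overlapping lower and upper parts of the existing range.
b. Drop the original range.
c. Add the non-overlapping lower and upper parts back as overlap_ok reservations.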
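
Step a is plain interval arithmetic. The sketch below isolates it in a userland helper so the three cases (a piece left on both sides, on one side only, or none) can be checked directly. struct split and split_around are names invented for this illustration, and the helper assumes the two ranges really overlap, as the kernel loop has already checked at that point. An empty part is reported as 0 - 0, mirroring how the kernel code tests lower_end and upper_end.

```c
#include <assert.h>
#include <stdint.h>

struct split {
    uint64_t lower_start, lower_end;   /* part of the old range below the new one */
    uint64_t upper_start, upper_end;   /* part of the old range above the new one */
};

/*
 * Given an existing range [r_start, r_end) and a new range [start, end)
 * that overlaps it, compute what remains of the existing range.
 */
static struct split split_around(uint64_t r_start, uint64_t r_end,
                                 uint64_t start, uint64_t end)
{
    struct split s = { 0, 0, 0, 0 };

    if (r_start < start) {          /* something is left below */
        s.lower_start = r_start;
        s.lower_end = start;
    }
    if (r_end > end) {              /* something is left above */
        s.upper_start = end;
        s.upper_end = r_end;
    }
    return s;
}
```

For an existing range 0x1000 - 0x5000 and a new range 0x2000 - 0x3000, the remainders are 0x1000 - 0x2000 and 0x3000 - 0x5000; both go back into the table as overlap_ok entries under the original name.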

3. Reserving an overlap_ok range

/*
 * A few early reservtations come here.
 *
 * The 'overlap_ok' in the name of this routine does -not- mean it
 * is ok for these reservations to overlap an earlier reservation.
 * Rather it means that it is ok for subsequent reservations to
 * overlap this one.
 *
 * Use this entry point to reserve early ranges when you are doing
 * so out of "Paranoia", reserving perhaps more memory than you need,
 * just in case, and don't mind a subsequent overlapping reservation
 * that is known to be needed.
 *
 * The drop_overlaps_that_are_ok() call here isn't really needed.
 * It would be needed if we had two colliding 'overlap_ok'
 * reservations, so that the second such would not panic on the
 * overlap with the first.  We don't have any such as of this
 * writing, but might as well tolerate such if it happens in
 * the future.
 */
void __init reserve_early_overlap_ok(u64 start, u64 end, char *name)
{
        drop_overlaps_that_are_ok(start, end);
        __reserve_early(start, end, name, 1);
}


 

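First drops the overlapping parts of any existing overlap_ok ranges, then reserves the range as overlap_ok. Here overlap_ok means that later reservations may overlap this range, not that this range may overlap earlier ones.

4. Reserving a non-overlappable range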

/*
 * Most early reservations come here.
 *
 * We first have drop_overlaps_that_are_ok() drop any pre-existing
 * 'overlap_ok' ranges, so that we can then reserve this memory
 * range without risk of panic'ing on an overlapping overlap_ok
 * early reservation.
 */
void __init reserve_early(u64 start, u64 end, char *name)
{
        if (start >= end)
                return;

        drop_overlaps_that_are_ok(start, end);
        __reserve_early(start, end, name, 0);
}


 

First drops the overlapping parts of any existing overlap_ok ranges, then reserves the range as non-overlappable.

IV. Releasing a range

/*
 * Drop the i-th range from the early reservation map,
 * by copying any higher ranges down one over it, and
 * clearing what had been the last slot.
 */
static void __init drop_range(int i)
{
        int j;

        for (j = i + 1; j < MAX_EARLY_RES && early_res[j].end; j++)
                ;

        memmove(&early_res[i], &early_res[i + 1],
               (j - 1 - i) * sizeof(struct early_res));

        early_res[j - 1].end = 0;
}



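Drops the i-th range by moving every entry after it down one slot (sizeof(struct early_res) bytes each) and clearing the end field of what had been the last slot.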
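
The index arithmetic is easiest to follow on a tiny table. The userland model below uses the same loop and memmove as drop_range, with a four-slot table and names (MAX_RES, struct res, drop_range_sketch) chosen for the illustration. It also covers the case where the table is completely full, so the scan stops at the table size rather than at a terminating entry.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_RES 4   /* small table size chosen for this sketch */

struct res {
    uint64_t start, end;   /* end == 0 marks the first unused slot */
};

/* Remove entry i by shifting the entries after it down by one slot. */
static void drop_range_sketch(struct res *tab, int i)
{
    int j;

    /* Find j, the index one past the last used entry. */
    for (j = i + 1; j < MAX_RES && tab[j].end; j++)
        ;

    /* Entries i+1 .. j-1 move down to i .. j-2. */
    memmove(&tab[i], &tab[i + 1], (size_t)(j - 1 - i) * sizeof(struct res));

    /* The old last slot is now a duplicate; mark it unused. */
    tab[j - 1].end = 0;
}
```

Only the end field of the vacated slot is cleared. That is enough, because every scan of the table stops at the first entry whose end is 0.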
