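The mmap code is here. This is the last lab I finished: the networking part overlaps with my computer networks course lab, so I skipped it. All I can say is, that's a wrap! Thanks to xv6 for keeping me company this semester.
Since this is the last lab, I won't dwell on reference material. Just one pointer:
My blog post "OS实验xv6 6.S081 开坑" lists some useful references that you may want to consult.
As usual, here is the lab handout first.
This lab has two main tasks:
mmap
munmap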
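What is mmap? Running man mmap in a Linux shell gives the answer:
mmap maps the file behind the file descriptor fd into the process's virtual memory. The mapping starts at addr; if addr is 0, the kernel should choose the address itself. length is the length of the mapping. prot gives the access permissions of the mapped memory, and this lab only requires PROT_READ and PROT_WRITE. flags gives the mapping type; there are many types, but this lab only needs two: MAP_SHARED and MAP_PRIVATE. Put simply, with MAP_SHARED modifications to the mapped memory are written back to the file, while with MAP_PRIVATE they are not.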
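To make the MAP_SHARED / MAP_PRIVATE difference concrete, here is a small user-space sketch for Linux (not xv6). It assumes a writable /tmp, and the helper names first_byte_after_store and demo_shared_vs_private are my own. It relies on Linux keeping a shared mapping coherent with ordinary file reads; strictly portable code would call msync before reading the file back.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Write "aaaa" to a temporary file, map it with the given flags,
 * overwrite the first byte through the mapping, unmap, and return
 * the first byte now stored in the file (or -1 on error).
 */
static int first_byte_after_store(int flags)
{
    char path[] = "/tmp/mmap_demo_XXXXXX";
    int fd = mkstemp(path);             /* opened read/write */
    if (fd < 0)
        return -1;
    unlink(path);                       /* the file lives until fd is closed */
    if (write(fd, "aaaa", 4) != 4) { close(fd); return -1; }

    char *p = mmap(NULL, 4, PROT_READ | PROT_WRITE, flags, fd, 0);
    if (p == MAP_FAILED) { close(fd); return -1; }
    p[0] = 'Z';                         /* store through the mapping */
    if (munmap(p, 4) != 0) { close(fd); return -1; }

    char c = 0;
    ssize_t n = pread(fd, &c, 1, 0);    /* read back from the file itself */
    close(fd);
    return n == 1 ? c : -1;
}

/* A shared store reaches the file; a private store does not. */
static void demo_shared_vs_private(void)
{
    assert(first_byte_after_store(MAP_SHARED) == 'Z');
    assert(first_byte_after_store(MAP_PRIVATE) == 'a');
}
```

Calling demo_shared_vs_private() from a main function is enough to try it out.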
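How do we implement mmap? Start from the Hints:
Below, we go through what these Hints mean one by one:
Add system calls for mmap and munmap, and add the related flags such as PROT_READ;
mmap returns the start address of the mapping in the virtual address space, but that address is not yet page-mapped (mappages). In other words, mmap itself does no memory mapping, so a page fault occurs when the user process touches the address. We need to catch that fault in trap.c, lazily allocate a page for it, and fill it with the file's contents; only then is the file really mapped into memory;
Keep track of every region that mmap maps. This requires a VMA data structure that records the region's start address, length, permissions, associated file, and so on;
To implement mmap, find a free region in the user process's address space for the mapping, which calls for a careful design of VMA and proc;
Use readi to read the file's contents into the corresponding memory address;
To implement munmap, free the pages with uvmunmap and check the flags: if the mapping is MAP_SHARED, the memory contents must also be written back to the file;
Modify the exit function so that it releases all mmap regions the way munmap does;
Modify fork so that the child has the same mapped regions as the parent.
The syscall plumbing comes first, and I won't go over it again. Let's start with the design of VMA, which is modeled on the VMA in Linux: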
/**
 *
 * Define a structure corresponding to the VMA (virtual memory area) described in Lecture 15,
 * recording the address, length, permissions, file, etc. for a virtual memory range created by mmap
 *
 * RETRIEVED FROM:
 * https://sites.google.com/site/knsathyawiki/example-page/chapter-15-the-process-address-space#TOC-Virtual-Memory-Areas
 */
struct VMA
{
  /** Whether this VMA slot is free: 1 = free, 0 = in use */
  int vm_valid;
  uint64 vm_start;
  uint64 vm_end;
  /** Flags */
  int vm_flags;
  /** Page permissions: writable? readable? */
  int vm_prot;
  /** The mapped file */
  struct file* vm_file;
  /** File descriptor */
  int vm_fd;
};
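Since every later step has to answer "which VMA does this address belong to?", here is a stand-alone sketch of that lookup. It treats a region as the half-open range [start, end) and skips free slots. The struct vma_range and the function find_vma are simplified stand-ins of my own, not the kernel structure above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct VMA. */
struct vma_range {
    int in_use;            /* 1 = describes a live mapping */
    uint64_t start, end;   /* half-open range [start, end) */
};

/* Return the in-use entry containing va, or NULL if there is none. */
static struct vma_range *find_vma(struct vma_range *v, int n, uint64_t va)
{
    for (int i = 0; i < n; i++)
        if (v[i].in_use && v[i].start <= va && va < v[i].end)
            return &v[i];
    return NULL;
}

static void demo_find_vma(void)
{
    struct vma_range t[3] = {
        {0, 0, 0},              /* free slot, zero-filled */
        {1, 0x3000, 0x5000},
        {1, 0x5000, 0x6000},
    };
    assert(find_vma(t, 3, 0x3000) == &t[1]);
    assert(find_vma(t, 3, 0x4fff) == &t[1]);
    assert(find_vma(t, 3, 0x5000) == &t[2]);  /* end is exclusive */
    assert(find_vma(t, 3, 0x6000) == NULL);
    assert(find_vma(t, 3, 0) == NULL);        /* a free slot never matches */
}
```

Testing the in_use flag matters: a zero-filled free slot has start == end == 0, so an inclusive comparison without the flag would wrongly match address 0.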
Next, we modify proc so that it can manage VMAs. Here each process manages at most NVMA=100 VMAs. The code is as follows:
/** Implementation of MMAP */
/** Array of VMAs managed by this process */
struct VMA vmas[NVMA];
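Good. Now the mapping scheme itself needs some careful design:
How to find an unused region in the process’s address space in which to map the file?
Let's go back to the memory layout of a user process. The heap region is the memory available to us, so we can let mapped memory start just below the trapframe and grow downward.
How do we record how far down it has grown? We need a pointer to the highest VMA address that is still available. In the final scheme, mapped regions are stacked downward from just below the trapframe, and this pointer marks the start of the lowest region handed out so far.
With the design settled, the implementation goes quickly. First add fields to proc:
/** Highest virtual address still available for mmap; regions are handed out from the top down */
uint64 current_maxva;
/** Index of the VMA slot most recently handed out by mmap (used by sys_mmap, sys_munmap and fork below) */
int current_imaxvma;
Next, initialize the added proc fields in allocproc. Note that in xv6 the trampoline and the trapframe each occupy one page, so the initial current_maxva is PGROUNDDOWN(MAXVA - 2 * PGSIZE):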
#define VMASTART PGROUNDDOWN(MAXVA - 2 * PGSIZE)
static struct proc*
allocproc(void)
{
  ......
  /** Implementation of MMAP */
  for (int i = NVMA - 1; i >= 0; i--)
  {
    p->vmas[i].vm_valid = 1;
  }
  p->current_maxva = VMASTART;
  return p;
}
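As a sanity check on VMASTART, the arithmetic can be worked through in user space. The constants below are adapted from xv6's kernel/riscv.h and kernel/memlayout.h (I switched the types to uint64_t so the snippet compiles anywhere); check_vmastart is a name of my own.

```c
#include <assert.h>
#include <stdint.h>

/* Adapted from xv6's kernel/riscv.h and kernel/memlayout.h. */
#define PGSIZE 4096
#define PGROUNDDOWN(a) ((a) & ~((uint64_t)PGSIZE - 1))
#define MAXVA (1ULL << (9 + 9 + 9 + 12 - 1))
#define TRAMPOLINE (MAXVA - PGSIZE)
#define TRAPFRAME (TRAMPOLINE - PGSIZE)

#define VMASTART PGROUNDDOWN(MAXVA - 2 * PGSIZE)

static void check_vmastart(void)
{
    assert(MAXVA == 0x4000000000ULL);
    /* Two pages (trampoline + trapframe) below the top of the address space. */
    assert(VMASTART == 0x3fffffe000ULL);
    /* VMASTART equals the trapframe's first byte, so it only works as an exclusive end. */
    assert(VMASTART == TRAPFRAME);
}
```

The value 0x3fffffe000 is the same current_max that appears in the xv6.out log near the end of this post.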
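Now for mmap itself. The basic idea: find the first free VMA, then fill in its fields from current_maxva and the arguments. Note that vm_start must be lower than vm_end: later, in trap.c, each faulting page gets a physical page from kalloc and is filled with the file data at offset va - vm_start, so the file's contents are laid out upward from vm_start.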
uint64
sys_mmap(void){
  uint64 addr;
  int length;
  int prot;
  int flags;
  int fd;
  struct file* f;
  int offset;
  if(argaddr(0, &addr) < 0 || argint(1, &length) < 0 ||
     argint(2, &prot) < 0 || argint(3, &flags) < 0 ||
     argfd(4, &fd, &f) < 0 || argint(5, &offset) < 0){
    return -1;
  }
  /** A file that is not writable cannot back a writable MAP_SHARED mapping */
  if(!f->writable && (prot & PROT_WRITE) && (flags & MAP_SHARED)){
    return -1;
  }
  /**
   *
   * you can assume that addr will always be zero,
   * meaning that the kernel should decide the virtual address at which to map the file
   *
   * offset is the starting point in the file at which to map
   */
  struct proc* p;
  p = myproc();
  struct VMA* vma = 0;
  /** Scan from the highest index down for the first free VMA */
  for (int i = NVMA - 1; i >= 0; i--)
  {
    if(p->vmas[i].vm_valid){
      vma = &p->vmas[i];
      /** Remember i as the current imaxvma */
      p->current_imaxvma = i;
      break;
    }
  }
  /**
   * 1. VMA: START sits at the bottom
   *    Reason: pages are filled in upward from vm_start, so start must be the low end
   *    ->current_maxva  0x001xx END
   *                     ............
   *                     0x000xx START
   *
   * ----------------------------------
   * 2. Update current_maxva
   *                     0x001xx END
   *                     ............
   *    ->current_maxva  0x000xx START
   * ----------------------------------
   */
  if(vma){
    /** Remember to use uint64 here; otherwise the value is sign-extended */
    printf("sys_mmap(): %p, length: %d\n",p->current_maxva, length);
    uint64 vm_end = PGROUNDDOWN(p->current_maxva);
    uint64 vm_start = PGROUNDDOWN(p->current_maxva - length);
    printf("vm_start(): %p, vm_end: %p\n",vm_start, vm_end);
    vma->vm_valid = 0;
    vma->vm_fd = fd;
    vma->vm_file = f;
    vma->vm_flags = flags;
    vma->vm_prot = prot;
    vma->vm_end = vm_end;
    vma->vm_start = vm_start;
    /**
     * mmap should increase the file's reference count
     * so that the structure doesn't disappear when the file is closed (hint: see filedup).
     * filedup takes the file table lock, unlike a bare ref++.
     */
    filedup(f);
    p->current_maxva = vm_start;
  }
  else
  {
    return -1;
  }
  return vma->vm_start;
}
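The placement arithmetic in sys_mmap is easy to get wrong by a page, so here is a small user-space sketch of just that step. struct region and place_below are hypothetical names of my own; the starting value is the VMASTART used in this post.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE 4096ULL
#define PGROUNDDOWN(a) ((a) & ~(PGSIZE - 1))

/* Hypothetical stand-in for the two address fields of a VMA. */
struct region { uint64_t start, end; };

/*
 * Carve a region of `length` bytes out of the space just below *top
 * and move *top down to the region's start. The region is the
 * half-open range [start, end).
 */
static struct region place_below(uint64_t *top, uint64_t length)
{
    struct region r;
    r.end = PGROUNDDOWN(*top);
    r.start = PGROUNDDOWN(r.end - length);
    *top = r.start;
    return r;
}

static void demo_place_below(void)
{
    uint64_t top = 0x3fffffe000ULL;               /* VMASTART */
    struct region a = place_below(&top, 2 * PGSIZE);
    assert(a.end == 0x3fffffe000ULL && a.start == 0x3fffffc000ULL);

    /* A length that is not a page multiple still consumes a whole page. */
    struct region b = place_below(&top, 100);
    assert(b.end == 0x3fffffc000ULL && b.start == 0x3fffffb000ULL);
    assert(top == 0x3fffffb000ULL);
}
```

Rounding the start down is what keeps every region page-aligned, at the cost of wasting part of a page when length is not a multiple of PGSIZE.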
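Now on to trap.c. There are four key points: first, catch the page fault; second, lazily allocate a page; third, find the VMA for the faulting virtual address; fourth, use the file saved in the VMA and readi to read the file's contents into the page. The code is as follows: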
if(r_scause() == 13 || r_scause() == 15){
  /**
   * Implement lazy allocation for mmap
   *
   * REASON: mmap itself should not allocate physical memory or read the file
   */
  struct proc* p = myproc();
  uint64 va = PGROUNDDOWN(r_stval());
  printf("MAXVA: %p, va: %p, current_max: %p\n",MAXVA, va, p->current_maxva);
  /** Find the in-use VMA whose range [vm_start, vm_end) contains va */
  struct VMA* vma = 0;
  /** Start at NVMA - 1: index NVMA would read past the end of the array */
  for (int i = NVMA - 1; i >= 0; i--)
  {
    if(!p->vmas[i].vm_valid &&
       p->vmas[i].vm_start <= va && va < p->vmas[i].vm_end){
      vma = &p->vmas[i];
      break;
    }
  }
  if(vma == 0){
    printf("usertrap(): not find vma \n");
    p->killed = 1;
    goto end;
  }
  /** The lookup already guarantees va < vm_end, so no separate bounds check is needed */
  char* mem = (char *)kalloc();
  if(mem == 0){
    printf("usertrap(): no mem left\n");
    p->killed = 1;
    goto end;
  }
  printf("walk va %p result : %d \n",va, walkaddr(p->pagetable, va));
  memset(mem, 0, PGSIZE);
  /**
   * Don't forget to set the permissions correctly on the page.
   * PROT_* and PTE_* use different bit positions, so translate them
   * instead of OR-ing vm_prot into the PTE flags (needs fcntl.h in trap.c).
   */
  int perm = PTE_U;
  if(vma->vm_prot & PROT_READ)
    perm |= PTE_R;
  if(vma->vm_prot & PROT_WRITE)
    perm |= PTE_W;
  if(vma->vm_prot & PROT_EXEC)
    perm |= PTE_X;
  if(mappages(p->pagetable, va, PGSIZE, (uint64)mem, perm) < 0){
    printf("usertrap(): cannot map\n");
    kfree(mem);
    p->killed = 1;
    goto end;
  }
  /** Use readi to load the file into the page; the file offset is va - vma->vm_start */
  struct file* f = vma->vm_file;
  int offset = va - vma->vm_start;
  ilock(f->ip);
  // 1 means the destination is a user virtual address
  readi(f->ip, 1, va, offset, PGSIZE);
  iunlock(f->ip);
}
else
{
printf("usertrap(): unexpected scause %p (%s) pid=%d\n", r_scause(), scause_desc(r_scause()), p->pid);
printf(" sepc=%p stval=%p\n", r_sepc(), r_stval());
p->killed = 1;
}
}
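One detail in the fault handler deserves a closer look: the PROT_* bits from mmap do not line up with the RISC-V PTE_* bits, so OR-ing prot straight into the page-table flags sets the wrong permissions. The sketch below uses the bit values as I recall them from the lab's kernel/fcntl.h and from kernel/riscv.h; please check them against your own tree. prot_to_pte is a helper name of my own.

```c
#include <assert.h>

/* From the mmap lab's kernel/fcntl.h (as I recall; verify in your tree). */
#define PROT_READ  0x1
#define PROT_WRITE 0x2
#define PROT_EXEC  0x4
/* From xv6's kernel/riscv.h. */
#define PTE_V (1L << 0)
#define PTE_R (1L << 1)
#define PTE_W (1L << 2)
#define PTE_X (1L << 3)
#define PTE_U (1L << 4)

/* Translate mmap protection bits into user-page PTE permission bits. */
static long prot_to_pte(int prot)
{
    long perm = PTE_U;
    if (prot & PROT_READ)
        perm |= PTE_R;
    if (prot & PROT_WRITE)
        perm |= PTE_W;
    if (prot & PROT_EXEC)
        perm |= PTE_X;
    return perm;
}

static void demo_prot_to_pte(void)
{
    assert(prot_to_pte(PROT_READ) == (PTE_R | PTE_U));
    assert(prot_to_pte(PROT_READ | PROT_WRITE) == (PTE_R | PTE_W | PTE_U));
    /* With these values PROT_READ lands on the valid bit and PROT_WRITE on the read bit. */
    assert(PROT_READ == PTE_V && PROT_WRITE == PTE_R);
    /* Each PROT_* bit sits one position below its PTE_* bit, so a shift also works. */
    for (int prot = 0; prot < 8; prot++)
        assert(prot_to_pte(prot) == (((long)prot << 1) | PTE_U));
}
```

The shift form is shorter, but the explicit tests keep working even if the constants ever change.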
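Good, mmap is done; next comes munmap. munmap releases space obtained through mmap. When the mapping's flags say MAP_SHARED we have to write the contents back to the file; otherwise we don't (at least in this lab). There is one more problem to consider: fragmentation. Suppose VMA[1] is released but current_maxva cannot move. Then, under our algorithm, later mmap calls will never reuse that freed region.
To solve this, we compact after each release by moving current_maxva back up when possible.
The implementation is as follows: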
uint64
sys_munmap(void){
  uint64 addr;
  int length;
  if(argaddr(0, &addr) < 0 || argint(1, &length) < 0){
    return -1;
  }
  printf("### sys_munmap: \n");
  printf("addr: %p, length:%d, current:%p\n", addr, length, myproc()->current_maxva);
  struct proc* p = myproc();
  for (int i = NVMA - 1; i >= 0; i--)
  {
    /** Only in-use VMAs count; the range is [vm_start, vm_end) */
    if(!p->vmas[i].vm_valid &&
       p->vmas[i].vm_start <= addr && addr < p->vmas[i].vm_end){
      struct VMA* vma = &p->vmas[i];
      /**
       * 1. If an unmapped page has been modified and the file is mapped MAP_SHARED,
       *    write the page back to the file. Look at filewrite for inspiration.
       *
       * 2. However, mmaptest does not check that non-dirty pages are not written back;
       *    thus you can get away with writing pages back without looking at D bits.
       *
       * Notes on this implementation:
       * 1. We assume unmapping always begins at the start of a VMA, i.e.
       *    p->vmas[i].vm_start == addr, which is why vma->vm_start += length works below.
       * 2. Any MAP_SHARED region is written back, without looking at the dirty (D) bit.
       * 3. current_maxva is compacted based on current_imaxvma; this is a compromise
       *    so that the pointer does not keep growing downward forever.
       */
      /** Only unmap if the first page has actually been faulted in */
      if(walkaddr(p->pagetable, vma->vm_start)){
        if(vma->vm_flags & MAP_SHARED){
          printf("sys_munmap(): write back \n");
          /** Write back to the file; filewrite writes at the file's current offset */
          filewrite(vma->vm_file, vma->vm_start, length);
        }
        uvmunmap(p->pagetable, vma->vm_start, length ,1);
      }
      vma->vm_start += length;
      printf("vma_start: %p, vma_end: %p\n", vma->vm_start, vma->vm_end);
      if(vma->vm_start == vma->vm_end){
        /** Drop the reference taken in mmap; fileclose also releases the file at ref 0 */
        fileclose(vma->vm_file);
        /** Mark the slot free again */
        vma->vm_valid = 1;
      }
      /** Shrink */
      int j;
      /** Compact p->current_maxva */
      for (j = p->current_imaxvma; j < NVMA; j++)
      {
        if(!p->vmas[j].vm_valid){
          p->current_maxva = p->vmas[j].vm_start;
          p->current_imaxvma = j;
          break;
        }
      }
      if(j == NVMA){
        p->current_maxva = VMASTART;
      }
      return 0;
    }
  }
  return -1;
}
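The code above only handles unmapping from the start of a region. As a sketch of how the bookkeeping could also cover unmapping at the end of a region, here is a user-space version of the range arithmetic alone. struct range and trim_range are hypothetical names of my own, and this is not what the kernel code in this post does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a VMA's half-open range [start, end). */
struct range { uint64_t start, end; };

/*
 * Remove [addr, addr + len) from r. Only a prefix, a suffix, or the
 * whole range may be removed; punching a hole in the middle is rejected.
 * Returns 0 on success and -1 on a rejected request.
 */
static int trim_range(struct range *r, uint64_t addr, uint64_t len)
{
    uint64_t lo = addr, hi = addr + len;
    if (len == 0 || lo < r->start || hi > r->end)
        return -1;
    if (lo == r->start)
        r->start = hi;          /* prefix, or the whole range */
    else if (hi == r->end)
        r->end = lo;            /* suffix */
    else
        return -1;              /* hole in the middle */
    return 0;
}

static void demo_trim_range(void)
{
    struct range r = {0x1000, 0x5000};
    assert(trim_range(&r, 0x1000, 0x1000) == 0 && r.start == 0x2000);
    assert(trim_range(&r, 0x4000, 0x1000) == 0 && r.end == 0x4000);
    assert(trim_range(&r, 0x3000, 0x800) == -1);   /* would leave a hole */
    assert(trim_range(&r, 0x2000, 0x2000) == 0);
    assert(r.start == r.end);                      /* empty: the slot can be freed */
}
```

When start == end after a trim, the region is empty, which is the point where the kernel code releases the file reference and frees the slot.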
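Next, modify exit. It is basically the same as munmap, but because it releases every VMA and the memory mapped for it, compaction is not a concern. The code is as follows: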
void
exit(int status)
{
  struct proc *p = myproc();
  if(p == initproc)
    panic("init exiting");
  // Close all open files.
  for(int fd = 0; fd < NOFILE; fd++){
    if(p->ofile[fd]){
      struct file *f = p->ofile[fd];
      fileclose(f);
      p->ofile[fd] = 0;
    }
  }
  /**
   * Unmap all the mapped regions,
   * reclaiming from the lowest index up
   */
  struct VMA* vma;
  for (int i = 0; i < NVMA; i++)
  {
    if(!p->vmas[i].vm_valid){
      vma = &p->vmas[i];
      vma->vm_valid = 1;
      int totsz = vma->vm_end - vma->vm_start;
      if(walkaddr(p->pagetable, vma->vm_start)){
        if(vma->vm_flags & MAP_SHARED){
          printf("exit(): write back \n");
          filewrite(vma->vm_file, vma->vm_start, totsz);
        }
        uvmunmap(p->pagetable, vma->vm_start, totsz,1);
      }
      /** The whole region is gone, so drop the reference taken in mmap */
      vma->vm_start = vma->vm_end;
      fileclose(vma->vm_file);
    }
  }
  p->current_maxva = VMASTART;
  .......
}
Finally, have fork copy the fields we added to proc:
int
fork(void)
{
  int i, pid;
  struct proc *np;
  struct proc *p = myproc();
  // Allocate process.
  if((np = allocproc()) == 0){
    return -1;
  }
  // Copy user memory from parent to child.
  if(uvmcopy(p->pagetable, np->pagetable, p->sz) < 0){
    freeproc(np);
    release(&np->lock);
    return -1;
  }
  np->sz = p->sz;
  np->parent = p;
  /** Modify fork to ensure that the child has the same mapped regions as the parent. */
  np->current_maxva = p->current_maxva;
  np->current_imaxvma = p->current_imaxvma;
  for (int i = NVMA - 1; i >= 0; i--)
  {
    /** Struct assignment copies every field of the VMA */
    np->vmas[i] = p->vmas[i];
    /** Only in-use VMAs hold a file reference; a freed slot may still carry a stale vm_file */
    if(!p->vmas[i].vm_valid)
      filedup(p->vmas[i].vm_file);
  }
  ......
}
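The reference counting in fork is the subtle part: the child gets its own reference for each live mapping, and freed slots must not be counted. Here is a minimal user-space sketch of that rule; struct fake_file, struct slot and copy_slots are hypothetical stand-ins of my own, not xv6 types.

```c
#include <assert.h>

/* Hypothetical stand-ins for struct file and struct VMA. */
struct fake_file { int ref; };
struct slot {
    int in_use;               /* 1 = live mapping */
    struct fake_file *file;   /* may be stale when in_use == 0 */
};

/* Copy the parent's slots to the child, taking one extra reference per live mapping. */
static void copy_slots(struct slot *dst, const struct slot *src, int n)
{
    for (int i = 0; i < n; i++) {
        dst[i] = src[i];              /* struct assignment copies all fields */
        if (src[i].in_use)
            src[i].file->ref++;       /* the kernel code would call filedup here */
    }
}

static void demo_copy_slots(void)
{
    struct fake_file live = {1}, stale = {1};
    struct slot parent[2] = { {1, &live}, {0, &stale} };
    struct slot child[2];

    copy_slots(child, parent, 2);
    assert(live.ref == 2);            /* parent and child each hold a reference */
    assert(stale.ref == 1);           /* a freed slot takes no reference */
    assert(child[0].file == &live && child[0].in_use == 1);
    assert(child[1].in_use == 0);
}
```

Counting a stale pointer would leak a reference, so the file would never be released even after both processes exit.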
usertests finished suspiciously fast this time, which made me uneasy, so let's take a look at xv6.out:
xv6 kernel is booting
virtio disk init 0
init: starting sh
$ usertests
usertests starting
test reparent2: OK
test pgbug: OK
test sbrkbugs: usertrap(): unexpected scause 0x000000000000000c (instruction page fault) pid=3207
sepc=0x00000000000044b0 stval=0x00000000000044b0
usertrap(): unexpected scause 0x000000000000000c (instruction page fault) pid=3208
sepc=0x00000000000044b0 stval=0x00000000000044b0
OK
test badarg: OK
test reparent: OK
test twochildren: OK
test forkfork: OK
test forkforkfork: OK
test argptest: OK
test createdelete: OK
test linkunlink: OK
test linktest: OK
test unlinkread: OK
test concreate: OK
test subdir: OK
test fourfiles: OK
test sharedfd: OK
test exectest: OK
test bigargtest: OK
test bigwrite: OK
test bsstest: OK
test sbrkbasic: OK
test sbrkmuch: OK
test kernmem: MAXVA: 0x0000004000000000, va: 0x0000000080000000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x000000008000c000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x0000000080018000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x0000000080024000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x0000000080030000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x000000008003d000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x0000000080049000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x0000000080055000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x0000000080061000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x000000008006d000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x000000008007a000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x0000000080086000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x0000000080092000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x000000008009e000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x00000000800aa000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x00000000800b7000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x00000000800c3000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x00000000800cf000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x00000000800db000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x00000000800e7000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x00000000800f4000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x0000000080100000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x000000008010c000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x0000000080118000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x0000000080124000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x0000000080131000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x000000008013d000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x0000000080149000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x0000000080155000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x0000000080162000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x000000008016e000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x000000008017a000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x0000000080186000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x0000000080192000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x000000008019f000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x00000000801ab000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x00000000801b7000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x00000000801c3000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x00000000801cf000, current_max: 0x0000003fffffe000
usertrap(): not find vma
MAXVA: 0x0000004000000000, va: 0x00000000801dc000, current_max: 0x0000003fffffe000
usertrap(): not find vma
OK
test sbrkfail: MAXVA: 0x0000004000000000, va: 0x0000000000010000, current_max: 0x0000003fffffe000
usertrap(): not find vma
OK
test sbrkarg: OK
test validatetest: OK
test stacktest: MAXVA: 0x0000004000000000, va: 0x000000000000d000, current_max: 0x0000003fffffe000
usertrap(): not find vma
OK
test opentest: OK
test writetest: OK
test writebig: OK
test createtest: OK
test openiput: OK
test exitiput: OK
test iput: OK
test mem: OK
test pipe1: OK
test preempt: kill... wait... OK
test exitwait: OK
test rmdot: OK
test fourteen: OK
test bigfile: OK
test dirfile: OK
test iref: OK
test forktest: OK
test bigdir: OK
ALL TESTS PASSED
$ qemu-system-riscv64: terminating on signal 15 from pid 12749 (make)
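Looks like there are no problems! The "not find vma" lines appear to come from tests such as kernmem and stacktest that touch invalid addresses on purpose; the handler kills those processes, and the tests still report OK. OK, off we go!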