The first question one might ask when encountering an operating system is: why have one at all? The answer is that an operating system must fulfill three requirements: multiplexing, isolation, and interaction.
monolithic kernel: the entire operating system runs as a single program in supervisor mode, so any part of the kernel can use any hardware feature, but a bug anywhere in the kernel can bring the whole system down.
microkernel: only a minimal core runs in supervisor mode; most operating-system services run as ordinary user-space processes. xv6, like most Unix descendants, is a monolithic kernel.
Each process has its own kernel stack (p->kstack). The kernel stack is separate from (and protected from) user code, so that the kernel can execute even if a process has wrecked its user stack.

A bootloader loads the xv6 kernel into memory.
In machine mode, the CPU begins executing xv6 at _entry (kernel/entry.S:7). The RISC-V starts with the paging hardware disabled: virtual addresses map directly to physical addresses. The loader places the xv6 kernel at physical address 0x80000000 rather than 0x0, because the address range 0x0:0x80000000 contains I/O devices.
The instructions at _entry set up a stack (stack0) so that xv6 can run C code. xv6 loads sp with stack0 + 4096, the top of the stack, because the stack on RISC-V grows down. Now that the kernel has a stack, _entry calls into C code at start (kernel/start.c:21).
Before jumping into supervisor mode (mret returns from machine mode to supervisor mode), the function start does the following: a. sets the mstatus register so that the previous privilege mode is supervisor; b. sets the mepc register to the address of main; c. disables virtual address translation in supervisor mode by writing 0 into the satp register; d. delegates all interrupts and exceptions to supervisor mode. After programming the clock hardware to generate timer interrupts, start "returns" to supervisor mode by calling mret, which sets the PC to main.
main then creates the first process by calling userinit (kernel/proc.c:226). Once the kernel has completed exec, it returns to user space in the /init process. init (user/init.c:15) creates a new console device file if needed and then opens it as file descriptors 0, 1, and 2. Then it starts a shell on the console. The system is up.
Add a system call to xv6 that returns the amount of free memory available.
The previous lab already covered the basic mechanism behind system calls, so here we just put it into practice. Let's call this syscall lsmem. The code changes are below:
kernel/syscall.h
:
#define SYS_lsmem 22
kernel/syscall.c
:
// something...
extern uint64 sys_lsmem(void);
static uint64 (*syscalls[])(void) = {
  [SYS_fork]    sys_fork,
  [SYS_exit]    sys_exit,
  [SYS_wait]    sys_wait,
  [SYS_pipe]    sys_pipe,
  [SYS_read]    sys_read,
  [SYS_kill]    sys_kill,
  [SYS_exec]    sys_exec,
  [SYS_fstat]   sys_fstat,
  [SYS_chdir]   sys_chdir,
  [SYS_dup]     sys_dup,
  [SYS_getpid]  sys_getpid,
  [SYS_sbrk]    sys_sbrk,
  [SYS_sleep]   sys_sleep,
  [SYS_uptime]  sys_uptime,
  [SYS_open]    sys_open,
  [SYS_write]   sys_write,
  [SYS_mknod]   sys_mknod,
  [SYS_unlink]  sys_unlink,
  [SYS_link]    sys_link,
  [SYS_mkdir]   sys_mkdir,
  [SYS_close]   sys_close,
  [SYS_lsmem]   sys_lsmem,
};
// something
kernel/sysproc.c
:
uint64
sys_lsmem(void)
{
  return (uint64) kfmstat();
}
kernel/kalloc.c
:
// kernel free mem stat: sum the sizes of the pages on the free list.
int
kfmstat(void)
{
  int res = 0;
  struct run *r;

  // Hold kmem.lock for the whole traversal: the free list can change
  // under us if another CPU allocates or frees a page while we walk it.
  acquire(&kmem.lock);
  for(r = kmem.freelist; r; r = r->next)
    res += PGSIZE;
  release(&kmem.lock);

  return res;
}
In kernel/defs.h (kfmstat is a kernel function), declare int kfmstat(void);.
In user/usys.pl, add entry("lsmem");.
In user/user.h, declare int lsmem(void);.
Result (this also requires a small test command under user/, something like lsmem.c, that invokes the lsmem syscall):
available memory: 133386240B, 130260KB
That is about 127MB, while qemu is configured to emulate 128MB:
qemu-system-riscv64 -machine virt -bios none -kernel kernel/kernel -m 128M -smp 3 -nographic -drive file=fs.img,if=none,format=raw,id=x0 -device virtio-blk-device,drive=x0,bus=virtio-mmio-bus.0
so the number is essentially accurate (the missing megabyte is memory that never reaches the free list, such as the kernel image itself).
This part was already analyzed in the earlier util lab report, so it is not repeated here.
When executing a syscall, the kernel sometimes needs to fetch arguments from user space. In xv6 this is done through kernel functions such as argint, argaddr, and argfd (all found under the kernel/ directory), which rely on argraw to read the corresponding argument from the saved registers.
In some cases the kernel reads or writes user memory directly through a pointer passed in from user space. This raises safety problems: the user-supplied pointer may be buggy or malicious, and, crucially, the xv6 kernel page table is not shared with the user page table, so the kernel cannot use plain load/store instructions to access data through a user virtual address.

The kernel therefore implements safe functions for moving data from user space into kernel space. fetchstr is one example (kernel/syscall.c:25); it does its work by calling copyinstr.
int
fetchstr(uint64 addr, char *buf, int max)
{
  struct proc *p = myproc();
  int err = copyinstr(p->pagetable, buf, addr, max);
  if(err < 0)
    return err;
  return strlen(buf);
}
// Copy a null-terminated string from user to kernel.
// Copy bytes to dst from virtual address srcva in a given page table,
// until a '\0', or max.
// Return 0 on success, -1 on error.
int
copyinstr(pagetable_t pagetable, char *dst, uint64 srcva, uint64 max)
{
  uint64 n, va0, pa0;
  int got_null = 0;

  while(got_null == 0 && max > 0){
    va0 = PGROUNDDOWN(srcva);        // #define PGROUNDDOWN(a) (((a)) & ~(PGSIZE-1))
    pa0 = walkaddr(pagetable, va0);  // look up a virtual address, return the physical address
    if(pa0 == 0)
      return -1;
    n = PGSIZE - (srcva - va0);
    if(n > max)
      n = max;

    char *p = (char *) (pa0 + (srcva - va0));
    while(n > 0){
      if(*p == '\0'){
        *dst = '\0';
        got_null = 1;
        break;
      } else {
        *dst = *p;
      }
      --n;
      --max;
      p++;
      dst++;
    }

    srcva = va0 + PGSIZE;
  }
  if(got_null){
    return 0;
  } else {
    return -1;
  }
}
One remaining question: xv6 obtains the pa for a va by finding the base address of the containing page and then adding the offset within the page, so intuitively one could write a helper such as uint64 va2pa(uint64 va) that returns the pa directly. Implementing it is not hard: after PTE2PA, just OR in bits [11:0] of the va. The reason copyinstr does not translate once and then copy max bytes in one go is that contiguous virtual addresses do not imply contiguous physical addresses: two pages that are adjacent in va may well be non-adjacent in pa, so the translation must be redone for every page.