Project repository: Lab 4
Conflicts came up during the merge and had to be resolved by hand.
This kernel supports multiple cores: the BSP copies AP bootstrap code to a low physical address and uses it to bring up the other CPUs.
The LAPIC is used for three things: reading the APIC ID so code can tell which CPU it is running on (see cpunum()), sending the STARTUP IPI from the BSP to the APs (see lapic_startap()), and later driving the built-in timer to generate clock interrupts.
Exercise 1. Implement mmio_map_region in kern/pmap.c. To see how this is used, look at the beginning of lapic_init in kern/lapic.c. You’ll have to do the next exercise, too, before the tests for mmio_map_region will run.
Implementation:
void *
mmio_map_region(physaddr_t pa, size_t size)
{
// Where to start the next region. Initially, this is the
// beginning of the MMIO region. Because this is static, its
// value will be preserved between calls to mmio_map_region
// (just like nextfree in boot_alloc).
static uintptr_t base = MMIOBASE;
// Reserve size bytes of virtual memory starting at base and
// map physical pages [pa,pa+size) to virtual addresses
// [base,base+size). Since this is device memory and not
// regular DRAM, you'll have to tell the CPU that it isn't
// safe to cache access to this memory. Luckily, the page
// tables provide bits for this purpose; simply create the
// mapping with PTE_PCD|PTE_PWT (cache-disable and
// write-through) in addition to PTE_W. (If you're interested
// in more details on this, see section 10.5 of IA32 volume
// 3A.)
//
// Be sure to round size up to a multiple of PGSIZE and to
// handle if this reservation would overflow MMIOLIM (it's
// okay to simply panic if this happens).
//
// Hint: The staff solution uses boot_map_region.
//
// Your code here:
size_t the_size = ROUNDUP(size, PGSIZE);
// The comment above asks us to handle overflow past MMIOLIM; panic is fine.
if (base + the_size > MMIOLIM || base + the_size < base)
panic("mmio_map_region: reservation overflows MMIOLIM");
boot_map_region(kern_pgdir, base, the_size, pa, PTE_PCD|PTE_PWT|PTE_W);
void *ret = (void *) base;
base += the_size;
return ret;
}
Exercise 2. Read boot_aps() and mp_main() in kern/init.c, and the assembly code in kern/mpentry.S. Make sure you understand the control flow transfer during the bootstrap of APs. Then modify your implementation of page_init() in kern/pmap.c to avoid adding the page at MPENTRY_PADDR to the free list, so that we can safely copy and run AP bootstrap code at that physical address. Your code should pass the updated check_page_free_list() test (but might fail the updated check_kern_pgdir() test, which we will fix soon).
Implementation:
void
page_init(void)
{
// LAB 4:
// Change your code to mark the physical page at MPENTRY_PADDR
// as in use
// The example code here marks all physical pages as free.
// However this is not truly the case. What memory is free?
// 1) Mark physical page 0 as in use.
// This way we preserve the real-mode IDT and BIOS structures
// in case we ever need them. (Currently we don't, but...)
// 2) The rest of base memory, [PGSIZE, npages_basemem * PGSIZE)
// is free.
// 3) Then comes the IO hole [IOPHYSMEM, EXTPHYSMEM), which must
// never be allocated.
// 4) Then extended memory [EXTPHYSMEM, ...).
// Some of it is in use, some is free. Where is the kernel
// in physical memory? Which pages are already in use for
// page tables and other data structures?
//
// Change the code to reflect this.
// NB: DO NOT actually touch the physical memory corresponding to
// free pages!
size_t i;
pages[0].pp_ref = 1;
pages[0].pp_link = NULL;
for (i=1; i<(MPENTRY_PADDR/PGSIZE); i++)
{
pages[i].pp_ref = 0;
pages[i].pp_link = page_free_list;
page_free_list = &pages[i];
}
extern unsigned char mpentry_start[],mpentry_end[];
size_t mp_size = ROUNDUP(mpentry_end-mpentry_start,PGSIZE);
for (; i<((MPENTRY_PADDR+mp_size)/PGSIZE); i++)
{
pages[i].pp_ref = 1;
pages[i].pp_link = NULL;
}
for (; i<npages_basemem; i++)
{
pages[i].pp_ref = 0;
pages[i].pp_link = page_free_list;
page_free_list = &pages[i];
}
for (i=IOPHYSMEM/PGSIZE; i<EXTPHYSMEM/PGSIZE; i++)
{
pages[i].pp_ref = 1;
pages[i].pp_link = NULL;
}
for (; i<PADDR(boot_alloc(0))/PGSIZE; i++)
{
pages[i].pp_ref = 1;
pages[i].pp_link = NULL;
}
for (; i<npages; i++)
{
pages[i].pp_ref = 0;
pages[i].pp_link = page_free_list;
page_free_list = &pages[i];
}
}
Question
1.Compare kern/mpentry.S side by side with boot/boot.S. Bearing in mind that kern/mpentry.S is compiled and linked to run above KERNBASE just like everything else in the kernel, what is the purpose of macro MPBOOTPHYS? Why is it necessary in kern/mpentry.S but not in boot/boot.S? In other words, what could go wrong if it were omitted in kern/mpentry.S?
Hint: recall the differences between the link address and the load address that we have discussed in Lab 1.
The APs start executing in real mode, where only real physical addresses work. But kern/mpentry.S is compiled and linked, like everything else in the kernel, to run above KERNBASE, so its symbols are high link addresses. The MPBOOTPHYS macro translates such a symbol into the physical address inside the copy at MPENTRY_PADDR where the code actually runs. boot/boot.S needs no such macro because it is linked at the very address it is loaded at. If MPBOOTPHYS were omitted, the AP would jump to addresses above KERNBASE, which are meaningless in real mode before paging and segmentation are set up.
The following state must be private to each CPU: a per-CPU kernel stack, a per-CPU TSS and TSS descriptor, a per-CPU current-environment pointer (curenv), and per-CPU system registers.
Exercise 3. Modify mem_init_mp() (in kern/pmap.c) to map per-CPU stacks starting at KSTACKTOP, as shown in inc/memlayout.h. The size of each stack is KSTKSIZE bytes plus KSTKGAP bytes of unmapped guard pages. Your code should pass the new check in check_kern_pgdir().
Solution:
static void
mem_init_mp(void)
{
// Map per-CPU stacks starting at KSTACKTOP, for up to 'NCPU' CPUs.
//
// For CPU i, use the physical memory that 'percpu_kstacks[i]' refers
// to as its kernel stack. CPU i's kernel stack grows down from virtual
// address kstacktop_i = KSTACKTOP - i * (KSTKSIZE + KSTKGAP), and is
// divided into two pieces, just like the single stack you set up in
// mem_init:
// * [kstacktop_i - KSTKSIZE, kstacktop_i)
// -- backed by physical memory
// * [kstacktop_i - (KSTKSIZE + KSTKGAP), kstacktop_i - KSTKSIZE)
// -- not backed; so if the kernel overflows its stack,
// it will fault rather than overwrite another CPU's stack.
// Known as a "guard page".
// Permissions: kernel RW, user NONE
//
// LAB 4: Your code here:
//
uintptr_t now_base = KSTACKTOP-KSTKSIZE-KSTKGAP;
int i;
for (i=0; i<NCPU; i++)
{
if (now_base <= MMIOLIM)
panic("mem_init_mp:out of range\n");
boot_map_region(kern_pgdir,
now_base+KSTKGAP,
KSTKSIZE,
PADDR(percpu_kstacks[i]),
PTE_W|PTE_P);
now_base -= (KSTKSIZE+KSTKGAP);
}
}
Exercise 4. The code in trap_init_percpu() (kern/trap.c) initializes the TSS and TSS descriptor for the BSP. It worked in Lab 3, but is incorrect when running on other CPUs. Change the code so that it can work on all CPUs. (Note: your new code should not use the global ts variable any more.)
Solution:
void
trap_init_percpu(void)
{
// The example code here sets up the Task State Segment (TSS) and
// the TSS descriptor for CPU 0. But it is incorrect if we are
// running on other CPUs because each CPU has its own kernel stack.
// Fix the code so that it works for all CPUs.
//
// Hints:
// - The macro "thiscpu" always refers to the current CPU's
// struct CpuInfo;
// - The ID of the current CPU is given by cpunum() or
// thiscpu->cpu_id;
// - Use "thiscpu->cpu_ts" as the TSS for the current CPU,
// rather than the global "ts" variable;
// - Use gdt[(GD_TSS0 >> 3) + i] for CPU i's TSS descriptor;
// - You mapped the per-CPU kernel stacks in mem_init_mp()
// - Initialize cpu_ts.ts_iomb to prevent unauthorized environments
// from doing IO (0 is not the correct value!)
//
// ltr sets a 'busy' flag in the TSS selector, so if you
// accidentally load the same TSS on more than one CPU, you'll
// get a triple fault. If you set up an individual CPU's TSS
// wrong, you may not get a fault until you try to return from
// user space on that CPU.
//
// LAB 4: Your code here:
thiscpu->cpu_ts.ts_esp0 = KSTACKTOP - (KSTKSIZE+KSTKGAP)*cpunum();
thiscpu->cpu_ts.ts_ss0 = GD_KD;
thiscpu->cpu_ts.ts_iomb = sizeof(struct Taskstate);
gdt[(GD_TSS0>>3)+cpunum()] = SEG16(STS_T32A,(uint32_t)(&(thiscpu->cpu_ts)),
sizeof(struct Taskstate) -1 , 0);
gdt[(GD_TSS0>>3)+cpunum()].sd_s = 0;
//ltr(GD_TSS0 + cpunum());
int num = cpunum();
ltr(GD_TSS0+(num<<3));
//the following line also works:
//ltr(GD_TSS0+sizeof(struct Segdesc)*num);
lidt(&idt_pd);
/*
// Setup a TSS so that we get the right stack
// when we trap to the kernel.
ts.ts_esp0 = KSTACKTOP;
ts.ts_ss0 = GD_KD;
ts.ts_iomb = sizeof(struct Taskstate);
// Initialize the TSS slot of the gdt.
gdt[GD_TSS0 >> 3] = SEG16(STS_T32A, (uint32_t) (&ts),
sizeof(struct Taskstate) - 1, 0);
gdt[GD_TSS0 >> 3].sd_s = 0;
// Load the TSS selector (like other segment selectors, the
// bottom three bits are special; we leave them 0)
ltr(GD_TSS0);
// Load the IDT
lidt(&idt_pd);
*/
}
Note: I originally wrote thiscpu->cpu_ts.ts_esp0 = KSTACKTOP - (KSTKSIZE-KSTKGAP)*cpunum(); here. The stride between per-CPU stacks is KSTKSIZE + KSTKGAP, so it must be a plus sign. I only discovered the mistake much later in the lab, a lesson learned in blood and tears; recording it here as a warning to myself.
Exercise 5. Apply the big kernel lock as described above, by calling lock_kernel() and unlock_kernel() at the proper locations.
Solution:
//i386_init()
// Your code here:
lock_kernel();
// Starting non-boot CPUs
boot_aps();
//---------------------
//mp_main()
// Your code here:
lock_kernel();
sched_yield();
//---------------------
//trap()
// LAB 4: Your code here.
lock_kernel();
assert(curenv);
//---------------------
//env_run()
lcr3(PADDR(curenv->env_pgdir));
unlock_kernel();
env_pop_tf(&(curenv->env_tf));
Question
2.It seems that using the big kernel lock guarantees that only one CPU can run the kernel code at a time. Why do we still need separate kernel stacks for each CPU? Describe a scenario in which using a shared kernel stack will go wrong, even with the protection of the big kernel lock.
Consider: the hardware pushes the trap frame onto the kernel stack the moment an interrupt fires, before the handler has any chance to take the big kernel lock. With a single shared kernel stack, if CPU1 and CPU2 trap at the same time, both push onto the same stack and clobber each other's frames. Separate per-CPU stacks make these pre-lock pushes safe.
Exercise 6. Implement round-robin scheduling in sched_yield() as described above. Don’t forget to modify syscall() to dispatch sys_yield().
Solution (adapted from someone else's):
void
sched_yield(void)
{
struct Env *idle;
// Implement simple round-robin scheduling.
//
// Search through 'envs' for an ENV_RUNNABLE environment in
// circular fashion starting just after the env this CPU was
// last running. Switch to the first such environment found.
//
// If no envs are runnable, but the environment previously
// running on this CPU is still ENV_RUNNING, it's okay to
// choose that environment.
//
// Never choose an environment that's currently running on
// another CPU (env_status == ENV_RUNNING). If there are
// no runnable environments, simply drop through to the code
// below to halt the cpu.
// LAB 4: Your code here.
struct Env *now = thiscpu->cpu_env;
int32_t startid = (now) ? ENVX(now->env_id): 0;
int32_t nextid;
size_t i;
// If no environment is running on this CPU, start the search from index 0
for(i = 0; i < NENV; i++) {
nextid = (startid+i)%NENV;
if(envs[nextid].env_status == ENV_RUNNABLE) {
env_run(&envs[nextid]);
return;
}
}
// Went all the way around without finding a runnable environment
if(envs[startid].env_status == ENV_RUNNING && envs[startid].env_cpunum == cpunum()) {
env_run(&envs[startid]);
}
// sched_halt never returns
sched_halt();
}
Question
3.In your implementation of env_run() you should have called lcr3(). Before and after the call to lcr3(), your code makes references (at least it should) to the variable e, the argument to env_run. Upon loading the %cr3 register, the addressing context used by the MMU is instantly changed. But a virtual address (namely e) has meaning relative to a given address context–the address context specifies the physical address to which the virtual address maps. Why can the pointer e be dereferenced both before and after the addressing switch?
In the function
static int
env_setup_vm(struct Env *e)
{
int i;
struct PageInfo *p = NULL;
// Allocate a page for the page directory
if (!(p = page_alloc(ALLOC_ZERO)))
return -E_NO_MEM;
// Now, set e->env_pgdir and initialize the page directory.
//
// Hint:
// - The VA space of all envs is identical above UTOP
// (except at UVPT, which we've set below).
// See inc/memlayout.h for permissions and layout.
// Can you use kern_pgdir as a template? Hint: Yes.
// (Make sure you got the permissions right in Lab 2.)
// - The initial VA below UTOP is empty.
// - You do not need to make any more calls to page_alloc.
// - Note: In general, pp_ref is not maintained for
// physical pages mapped only above UTOP, but env_pgdir
// is an exception -- you need to increment env_pgdir's
// pp_ref for env_free to work correctly.
// - The functions in kern/pmap.h are handy.
// LAB 3: Your code here.
p->pp_ref++;
e->env_pgdir = (pde_t *) page2kva(p); // use the freshly allocated page as the page directory
memcpy(e->env_pgdir, kern_pgdir, PGSIZE); // inherit the kernel's page-directory entries
// UVPT maps the env's own page table read-only.
// Permissions: kernel R, user R
e->env_pgdir[PDX(UVPT)] = PADDR(e->env_pgdir) | PTE_P | PTE_U;
return 0;
}
every environment copies the kernel's mappings (everything above UTOP) into its own page directory. The kernel virtual address e therefore maps to the same physical address under both kern_pgdir and e->env_pgdir, so dereferencing e works both before and after the lcr3().
Question
4.Whenever the kernel switches from one environment to another, it must ensure the old environment’s registers are saved so they can be restored properly later. Why? Where does this happen?
When an interrupt occurs, trapentry.S has already built a Trapframe on the kernel stack, saving the user-mode register state. Then, in trap():
if ((tf->tf_cs & 3) == 3) {
// Trapped from user mode.
// Acquire the big kernel lock before doing any
// serious kernel work.
// LAB 4: Your code here.
lock_kernel();
assert(curenv);
// Garbage collect if current environment is a zombie
if (curenv->env_status == ENV_DYING) {
env_free(curenv);
curenv = NULL;
sched_yield();
}
// Copy trap frame (which is currently on the stack)
// into 'curenv->env_tf', so that running the environment
// will restart at the trap point.
curenv->env_tf = *tf;
// The trapframe on the stack should be ignored from here on.
tf = &curenv->env_tf;
}
The saved register values are copied into the current environment's env_tf; when the kernel later leaves kernel mode, env_run() restores user state from env_tf.
Next up: implementing Unix's fork().
Exercise 7. Implement the system calls described above in kern/syscall.c and make sure syscall() calls them. You will need to use various functions in kern/pmap.c and kern/env.c, particularly envid2env(). For now, whenever you call envid2env(), pass 1 in the checkperm parameter. Be sure you check for any invalid system call arguments, returning -E_INVAL in that case. Test your JOS kernel with user/dumbfork and make sure it works before proceeding.
sys_exofork:
static envid_t
sys_exofork(void)
{
// Create the new environment with env_alloc(), from kern/env.c.
// It should be left as env_alloc created it, except that
// status is set to ENV_NOT_RUNNABLE, and the register set is copied
// from the current environment -- but tweaked so sys_exofork
// will appear to return 0.
// LAB 4: Your code here.
struct Env* child;
int result = env_alloc(&child,curenv->env_id);
if (result < 0)
return result;
child->env_status = ENV_NOT_RUNNABLE;
child->env_tf = curenv->env_tf;
child->env_tf.tf_regs.reg_eax = 0;
return child->env_id;
//panic("sys_exofork not implemented");
}
sys_env_set_status:
static int
sys_env_set_status(envid_t envid, int status)
{
// Hint: Use the 'envid2env' function from kern/env.c to translate an
// envid to a struct Env.
// You should set envid2env's third argument to 1, which will
// check whether the current environment has permission to set
// envid's status.
// LAB 4: Your code here.
struct Env *e;
int result = envid2env(envid,&e,1);
if (result < 0)
return result;
// Bad user input must not panic the kernel: return -E_INVAL, don't assert
if (status != ENV_RUNNABLE && status != ENV_NOT_RUNNABLE)
return -E_INVAL;
e->env_status = status;
return 0;
//panic("sys_env_set_status not implemented");
}
sys_page_alloc:
static int
sys_page_alloc(envid_t envid, void *va, int perm)
{
// Hint: This function is a wrapper around page_alloc() and
// page_insert() from kern/pmap.c.
// Most of the new code you write should be to check the
// parameters for correctness.
// If page_insert() fails, remember to free the page you
// allocated!
// LAB 4: Your code here.
struct PageInfo *p;
struct Env *e;
if ((uintptr_t)va >= UTOP || ((uintptr_t)va % PGSIZE))
return -E_INVAL;
// perm must include PTE_U|PTE_P and set nothing outside PTE_SYSCALL
if ((perm & (PTE_U|PTE_P)) != (PTE_U|PTE_P))
return -E_INVAL;
if (perm & ~PTE_SYSCALL)
return -E_INVAL;
int result = envid2env(envid,&e,1);
if (result < 0)
return -E_BAD_ENV;
// allocate only after the env check, so the page cannot leak on -E_BAD_ENV
p = page_alloc(ALLOC_ZERO);
if (p == NULL)
return -E_NO_MEM;
result = page_insert(e->env_pgdir,p,va,perm);
if (result < 0)
{
page_free(p);
return -E_NO_MEM;
}
return 0;
//panic("sys_page_alloc not implemented");
}
sys_page_map:
static int
sys_page_map(envid_t srcenvid, void *srcva,
envid_t dstenvid, void *dstva, int perm)
{
// Hint: This function is a wrapper around page_lookup() and
// page_insert() from kern/pmap.c.
// Again, most of the new code you write should be to check the
// parameters for correctness.
// Use the third argument to page_lookup() to
// check the current permissions on the page.
// LAB 4: Your code here.
struct Env *src,*dst;
struct PageInfo *srcpp;
pte_t *pte;
if ((perm & (PTE_U|PTE_P)) != (PTE_U|PTE_P) || (perm & ~PTE_SYSCALL))
return -E_INVAL;
if (envid2env(srcenvid,&src,1)<0
|| envid2env(dstenvid,&dst,1)<0)
return -E_BAD_ENV;
if ((uintptr_t)srcva>= UTOP || (uintptr_t)dstva >= UTOP)
return -E_INVAL;
if (((uintptr_t)srcva % PGSIZE) || ((uintptr_t)dstva % PGSIZE))
return -E_INVAL;
if (!(srcpp = page_lookup(src->env_pgdir,srcva,&pte)))
return -E_INVAL;
if (((*pte)&PTE_W) == 0 && (perm & PTE_W))
return -E_INVAL;
if (page_insert(dst->env_pgdir,srcpp,dstva,perm) < 0)
return -E_NO_MEM;
return 0;
//panic("sys_page_map not implemented");
}
sys_page_unmap:
static int
sys_page_unmap(envid_t envid, void *va)
{
// Hint: This function is a wrapper around page_remove().
// LAB 4: Your code here.
struct Env *e;
if (envid2env(envid,&e,true) < 0)
return -E_BAD_ENV;
if ((uintptr_t)va >= UTOP)
return -E_INVAL;
page_remove(e->env_pgdir,va);
return 0;
//panic("sys_page_unmap not implemented");
}
Register them in syscall():
int32_t
syscall(uint32_t syscallno, uint32_t a1, uint32_t a2, uint32_t a3, uint32_t a4, uint32_t a5)
{
// Call the function corresponding to the 'syscallno' parameter.
// Return any appropriate return value.
// LAB 3: Your code here.
// panic("syscall not implemented");
int32_t ret = 0;
switch (syscallno) {
case SYS_cputs:
sys_cputs((const char*)a1,a2);
break;
case SYS_cgetc:
ret = sys_cgetc();
break;
case SYS_env_destroy:
ret = sys_env_destroy(a1);
break;
case SYS_getenvid:
ret = sys_getenvid();
break;
case SYS_yield:
sys_yield();
break;
case SYS_exofork:
ret = (int32_t)sys_exofork();
break;
case SYS_env_set_status:
ret = sys_env_set_status(a1,a2);
break;
case SYS_page_alloc:
ret = sys_page_alloc(a1,(void*)a2,a3);
break;
case SYS_page_map:
ret = sys_page_map(a1,(void*)a2,a3,(void*)a4,a5);
break;
case SYS_page_unmap:
ret = sys_page_unmap(a1,(void*)a2);
break;
default:
return -E_INVAL;
}
return ret;
}
Running make grade gives:
dumbfork: OK (0.8s)
Part A score: 5/5
With that, Part A is complete.
END.