在kern/env.c里面能够看到这仨全局变量.
用意在于:
Once JOS gets up and running, the envs pointer points to an array of Env structures representing all the environments in the system. In our design, the JOS kernel will support a maximum of NENV simultaneously active environments, although there will typically be far fewer running environments at any given time. (NENV is a constant #define 'd in inc/env.h.) Once it is allocated, the envs array will contain a single instance of the Env data structure for each of the NENV possible environments.
The JOS kernel keeps all of the inactive Env structures on the env_free_list . This design allows easy
allocation and deallocation of environments, as they merely have to be added to or removed from the
free list.
The kernel uses the curenv symbol to keep track of the currently executing environment at any given time. During boot up, before the first environment is run, curenv is initially set to NULL .
Allocating the Environments Array
In lab 2, you allocated memory in mem_init() for the pages[] array, which is a table the kernel uses to
keep track of which pages are free and which are not. You will now need to modify mem_init() further to
allocate a similar array of Env structures, called envs .
Creating and Running Environments
要知道到目前为止,JOS是没有文件系统的,那么如果在JOS上运行用户程序怎么办?——直接和内核镜像结合在一起,在编译时期就做好. 后面lab4就会让我们自己完善好文件系统.
补全env.c缺失的部分.
inc/env.h里面又关于env.c要用到的结构体或者宏定义的相关信息
这里进程运行的状态还可以联想linux做对比,linux中进程状态又TASK_RUNNING,TASK_INTERRUPTIBLE,TASK_UNINTERRUPTIBLE,TASK_STOPPED等状态
env_init部分 对envs指向的 struct Env数组做初始化操作,注意这里temp--,这种初始化的方向是故意的,
为了确保最后env_free_list指向最开始的结构体即envs[0]。
The JOS kernel keeps all of the inactive Env structures on the env_free_list . This design allows easy
allocation and deallocation of environments, as they merely have to be added to or removed from the free list.
void env_init(void) { // Set up envs array // LAB 3: Your code here. int temp = 0; env_free_list = NULL; cprintf("NENV -1 : %u\n",NENV -1); for(temp = NENV -1;temp >= 0;temp--) { envs[temp].env_id = 0; envs[temp].env_parent_id = 0; envs[temp].env_type = ENV_TYPE_USER; envs[temp].env_status = 0; envs[temp].env_runs = 0; envs[temp].env_pgdir = NULL; envs[temp].env_link = env_free_list; env_free_list = &envs[temp]; } cprintf("env_free_list : 0x%08x, &envs[temp]: 0x%08x\n",env_free_list,&envs[temp]); // Per-CPU part of the initialization env_init_percpu(); }
env_setup_vm部分为进程分配内存空间!单独的分配page directory—— p
static int env_setup_vm(struct Env *e) { int i; struct PageInfo *p = NULL; // Allocate a page for the page directory if (!(p = page_alloc(ALLOC_ZERO))) return -E_NO_MEM; // LAB 3: Your code here. (p->pp_ref)++; pde_t* page_dir = page2kva(p); memcpy(page_dir,kern_pgdir,PGSIZE); e->env_pgdir = page_dir; // UVPT maps the env's own page table read-only. // Permissions: kernel R, user R e->env_pgdir[PDX(UVPT)] = PADDR(e->env_pgdir) | PTE_P | PTE_U; return 0; }
region_alloc 为environment *e 开辟len byte大小的物理空间,并将va虚拟地址开始的len长度大小的空间和物理空间建立映射关系(其实干活儿的还是page_insert 哈哈~).
static void region_alloc(struct Env *e, void *va, size_t len) { // LAB 3: Your code here. // (But only if you need it for load_icode.) // // Hint: It is easier to use region_alloc if the caller can pass // 'va' and 'len' values that are not page-aligned. // You should round va down, and round (va + len) up. // (Watch out for corner-cases!) va = ROUNDDOWN(va,PGSIZE); len = ROUNDUP(len,PGSIZE); struct PageInfo *pp; int ret = 0; for(;len > 0; len -= PGSIZE, va += PGSIZE) { pp = page_alloc(0); if(!pp) { panic("region_alloc failed!\n"); } ret = page_insert(e->env_pgdir,pp,va,PTE_U | PTE_W); if(ret) { panic("region_alloc failed!\n"); } } }
static void load_icode(struct Env *e, uint8_t *binary) { // LAB 3: Your code here. struct Elf* elfhdr = (struct Elf*)binary; struct Proghdr* ph,*eph; if(elfhdr->e_magic != ELF_MAGIC) { panic("elf header's magic is not correct\n"); } ph = (struct Proghdr*)((uint8_t*)elfhdr + elfhdr->e_phoff); eph = ph + elfhdr->e_phnum; lcr3(PADDR(e->env_pgdir)); for(;ph < eph; ph++) { if(ph->p_type != ELF_PROG_LOAD) { continue; } if(ph->p_filesz > ph->p_memsz) { panic("file size is great than memory size\n"); } region_alloc(e,(void*)ph->p_va,ph->p_memsz); memmove((void*)ph->p_va,binary+ph->p_offset,ph->p_filesz); memset((void*)ph->p_va + ph->p_filesz,0,(ph->p_memsz - ph->p_filesz)); } e->env_tf.tf_eip = elfhdr->e_entry; // Now map one page for the program's initial stack // at virtual address USTACKTOP - PGSIZE. // LAB 3: Your code here. lcr3(PADDR(kern_pgdir)); region_alloc(e,(void*)USTACKTOP - PGSIZE,PGSIZE); }
env_create 函数
void env_create(uint8_t *binary, enum EnvType type) { // LAB 3: Your code here. int ret = 0; struct Env *e = NULL; ret = env_alloc(&e,0); if(ret < 0) { panic("env_create: %e\n",r); } load_icode(e,binary); e->env_type = type; }
env_run 函数是正真进程切换的地方.
To run an environment, the kernel must set up the CPU with both the saved registers and the appropriate address space.
void env_run(struct Env *e) { // LAB 3: Your code here. // panic("env_run not yet implemented"); if(curenv && curenv->env_status == ENV_RUNNING) { curenv->env_status = ENV_RUNNABLE; } curenv = e; e->env_status = ENV_RUNNING; e->env_runs++; lcr3(PADDR(e->env_pgdir)); env_pop_tf(&(e->env_tf)); }
这里的env_pop_tf实现了进程的真正切换,原理就是依据之前进程已经设置好的trapframe,然后把这个进程保存好的属于自己的trapframe通过弹栈的形式,输出到各个寄存器当中,实现进程环境的替换,而这里面也包括 eip,
也就意味着,当从env_pop_tf里面的iret返回的时候,就开始从调用结构体e描述的进程开始运行了.
If all goes well, your system should enter user space and execute the hello binary until it makes a system call with the int instruction. At that point there will be trouble, since JOS has not set up the hardware to allow any kind of transition from user space into the kernel. When the CPU discovers that it is not set up to handle this
system call interrupt, it will generate a general protection exception, find that it can't handle that, generate a double fault exception, find that it can't handle that either, and finally give up with what's known as a "triple fault". Usually,you would then see the CPU reset and the system reboot.
So .如果完成了以上部分,JOS启动时回不断的重启重启,这是正常现象....表慌,没错,接着做就是了.
如果没有出现这种现象引起的重启... 骚年, debug吧!... 你代码有问题.
在lab3中, 在kern/Makefrag里面会看到-b 的链接选项
The -b binary option on the linker command line causes these files to be linked in as "raw" uninterpreted binary files rather than as regular .o files produced by the compile
编译成功的时候,会发现在 ./lab/obj/kern/kernel.sym里面会有这样一段:
我不会说我这这破问题这儿折腾了两天了, 问候一下神兽.
在0x800b56这个地方(sys_cputs)的入口地方
当触发这个中断的时候系统就会重启. 如果你不能执行到这一步,就说明之前的地址空间配置还没有做好....
好的, 下一关!
一定记得搞清楚啥叫中断和异常:
这里还是推荐去看CSAPP的第八章:
http://blog.csdn.net/cinmyheart/article/details/38132521
Basics of Protected Control Transfer
Exceptions and interrupts are both "protected control transfers," which cause the processor to switch from user to kernel mode (CPL=0) without giving the usermode code any opportunity to interfere with the functioning of the kernel or other environments.
In Intel's terminology, an interrupt is a protected control transfer that is caused by an asynchronous event usually external to the processor, such as notification of external device I/O activity.
An exception, in contrast, is a protected control transfer caused synchronously by the currently running code, for example due to a divide by zero or an invalid memory access.
如果不知道啥是CPL, 可以移步这里,看看 http://blog.csdn.net/cinmyheart/article/details/40075257
为什么要有中断和异常这两种机制 ?
In order to ensure that these protected control transfers are actually protected, the processor's interrupt/exception mechanism is designed so that the code currently running when the interrupt or exception occurs does not get to choose arbitrarily where the kernel is entered or how. Instead, the processor ensures that the kernel can be entered only under carefully controlled conditions. On the x86, two mechanisms work together to provide this protection.
Handling Interrupts and Exceptions
At this point, the first int $0x30 system call instruction in user space is a dead end: once the processor gets into user mode, there is no way to get back out.
前面那个 int 0x30中断挂掉, 进入了用户空间就回不来鸟...所以现在要想办法让他回来...
Types of Exceptions and Interrupts (这里讨论了异常和中断的区别)
All of the synchronous exceptions that the x86 processor can generate internally use interrupt vectors between 0 and 31, and therefore map to IDT entries 031. For example, a page fault always causes an exception through vector 14. Interrupt vectors greater than 31 are only used by software interrupts, which can be generated by the int instruction, or asynchronous hardware interrupts, caused by external devices when they need attention.
同步的是异常, 异步的是中断. 异常的中断向量在0~31之间, 而中断的中断向量大于31.
The Interrupt Descriptor Table. The processor ensures that interrupts and exceptions can only cause the kernel to be entered at a few specific, well defined entrypoints determined by the kernel itself, and not by the code running when the interrupt or exception is taken.
处理器确保不管是中断还是异常,这两种进入内核的方式必须足够规范(welldefined),他们进入内核的入口都必须是很明确的,而不是由触发中断的用户提供入口地址.
于是乎256中不同的中断或是异常入口都有各自的中断向量(interrupt vector)
The x86 allows up to 256 different interrupt or exception entry points into the kernel, each with a different interrupt vector. A vector is a number between 0 and 255. An interrupt's vector is determined by the source of the interrupt: different devices, error conditions, and application requests to the kernel generate interrupts with different vectors.
The Task State Segment.
为什么会有TSS?
The processor needs a place to save the old processor state before the interrupt or exception occurred, such as the original values of EIP and CS before the processor invoked the exception handler, so that the exception handler can later restore that old state and resume the interrupted code from where it left off. But this save area for the old processor state must in turn be protected from unprivileged usermode code; otherwise buggy or malicious user code could compromise the kernel.
For this reason, when an x86 processor takes an interrupt or trap that causes a privilege level change from user to kernel mode, it also switches to a stack in the kernel's memory. A structure called the task state segment (TSS) specifies the segment selector and address where this stack lives. The processor pushes (on this new stack) SS , ESP , EFLAGS , CS , EIP , and an optional error code. Then it loads the CS and EIP from the interrupt descriptor, and sets the ESP and SS to refer to the new stack.
The processor can take exceptions and interrupts both from kernel and user mode. It is only when entering the kernel from user mode, however, that the x86 processor automatically switches stacks before pushing its old register state onto the stack and invoking the appropriate exception handler through the IDT.
特别的说明,x86处理器会自动的切换堆栈,然后再 把以前进程的寄存器各种状态push到栈上
If the processor is already in kernel mode when the interrupt or exception occurs (the low 2 bits of the CS register are already zero), then the CPU just pushes more values on the same kernel stack.
如果直接处于内核态,那么调用中断或者异常之后就不需要切换堆栈呗,直接把各种寄存器的值都push到stack上面去.
在kern/trap.c 里面的trapname()函数中我们可以对应的看到各种trap名字
Q1: Different exception handling in different ways, the required parameters; Therefore, each interrupt / exception need to have their own processing function; if not, in the current implementation can not distinguish what is happening, what kind of abnormal.
Q2: Executing the INT n instruction when the CPL is greater Within last The DPL of the referenced interrupt, trap, or task gate. (233) Intel technical manuals. When the CPL is 3 (user level), but I called the INT n instruction privilege level 0 (kernel level), resulting in a protection fault. If allowed to directly call the INT 14 (page fault), the user can check without a kernel permission to allocate memory, which is a big loophole.
两个exercise 一起
The Breakpoint Exception
The breakpoint exception, interrupt vector 3 ( T_BRKPT ), is normally used to allow debuggers to insert
breakpoints in a program's code by temporarily replacing the relevant program instruction with the special 1byte int3 software interrupt instruction.
这就是我们平常debug的时候熟悉使用的断点的原理.对于gdb的话,调试的时候 breakpoint line_number就可以在line_number行设置断点,程序会在这里暂停,供程序员"沉思"(我笑~)
而其中的原理就是在这行代码的机器指令之前插入一个字节的int 3异常.
我们还可以看到对于内核常常用到的panic函数,实质上就是打印相关寄存器和内核信息之后,然后调用中断 int 3
System calls
User processes ask the kernel to do things for them by invoking system calls. When the user process
invokes a system call, the processor enters kernel mode, the processor and the kernel cooperate to save the user process's state, the kernel executes appropriate code in order to carry out the system call, and then resumes the user process. The exact details of how the user process gets the kernel's attention and how it specifies which call it wants to execute vary from system to system.
Question three:
The cause of the exception to a general protection IDT is set the privilege level of the break point is set to 0 (kernel level), so by the user to access will certainly be a protection error. It is set to 3, the protection of the error will disappear.
Question 4:
softint does not allow the user to directly attributable to protect the kernel, to prevent the receipt of a vicious attack; break point mechanism provides developers with a convenient, but it will not lead to a vicious attack.
Page faults and memory protection
Memory protection is a crucial feature of an operating system, ensuring that bugs in one program cannot
corrupt other programs or corrupt the operating system itself.
嘛, page fault , trap , interrupt, exception, 这四者有区别,也有联系的...不记得去看CSAPP, 手边没书的话,我这里也有笔记.
CSAPP 8.1.2 Classes of Exceptions
补全好函数, 代码有点小多, 不贴出来了, github传送门. 去看lab3的分支中对应文件就可以了.
https://github.com/jasonleaster
先改 trap_dispatch(): 添加T_SYSCALL 选项,将 syscall的返回值储存在当前进程的 trap frame寄存器里面
再该 syscall.c里面的, github去看吧.贴代码是最浪费时间的.分析过程才有意义,代码是自己看的.
一切OK就会出现下面的输出.... 如果没有,"骚年, 尽情的debug内核吧哈哈~"
2015年2月17日 JOS lab 3 告一段落... 后期只要有问题还会继续更新
2015年4月15日 JOS lab3 更新
首先要去搞定kern/init.c 里面 i386_init()里调用的 trap_init(),这个函数用来初始化中断向量的处理机制.
我们先看看 trap_init()究竟是什么鬼~
先去翻番 inc/trap.h 看看
这里根据注释都能看出分成了"三段"
"第一段"各种类型的 trap是处理器已经定义好的 0-19号中断
"第二段"定义了T_SYSCALL (系统调用), T_DEFAULT, IRQ_OFFSET
这"两段"都是算软中断,后面是硬件触发的中断
"第三段" 定义各种硬件中断,这些硬件中断的计算方式是 IRQ_OFFSET + IRQ_Number
一定要配置好中断向量表,别马虎这地方,不然坑的是自己...
再次强调,不要分不清exception和interrupt
真忘记也木有关系嘛~下面有爱心传送门
http://blog.csdn.net/cinmyheart/article/details/38132521
TRAPHANDLER(name, num) 把name作为一个汇编的全局标记,并且标记为函数.
这个宏定义做的事情也很简单,就是把这个中断向量号压栈,然后jmp _alltraps 跳转到 _alltraps
TRAPHANDLER_NOEC(name, num) 还是把name作为一个汇编的全局标记.这里和上面不同的是,会多压栈一个0,为CPU不压栈一个错误码的traps填补一个错误码(╭(╯^╰)╮,就是要占个坑,对齐)
没别的,我们要注意的就是这段以 _alltraps开头的汇编代码.
你会看到一个
pushl %ds
pushl %es之后紧跟着一个pushal指令,这个指令把
pusha
This instruction pushes all the general purpose registers onto the stack in the following order: EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI. The value of ESP pushed is the value before the instruction is executed. It is useful for saving state before an operation that could potential change these registers.
你会发现!!这个pusha指令压栈的寄存器和下面这个PushRegs结构体的结构体是一模一样的(故意滴啦~)
后面定义结构体 PushRegs,
这个结构体干啥用的?就是为了更好的在栈上寻址.
这里看到Trapframe里面的"基类"呵呵..你就会更深刻的领悟,啊,一个中断调用发生的时候,之前用户进程的各种寄存器神马鬼都被压栈保存好!
Trapframe里面还有个各种tf_padding* 这都是为了字节对齐.es寄存器16bit哇~这里要对齐到4byte于是就填充了一下.实质上这些数据没什么用.纯粹为了对齐占坑
你会在trapentry.S里面看到这样一行汇编代码(自己实现)
_alltraps里面还有这样的一部分汇编.
把 ds(data segment)数据段寄存器和 es (extra segment)寄存器设置为 GD_KD (GD全局的段描述符,KD内核数据段),忘记的话去看看头文件inc/memlayout.h 或者再把lab2做一遍 : ) 所以嘛~ 操作系统这种实验,多做几遍不过事更加熟悉,无所谓踩坑,加油~
下面还是讲_alltraps
其实这里为什么要把栈的基地址设为 0 我不是很清楚,而后他就调用了call trap()函数
完事儿之后逆序弹栈,返回用户态,但事实上是不用后面#clean up那段代码的,为什么呢?
因为调用trap()之后,会调用env_run(),又调用env_pop_tf(),这个函数会帮我们搞定弹栈的问题,恩,没看懂就多看几遍...再不行就单步调试,把进程的运行"方向"搞清楚,加油...
我们再深入的分析一下之前调用的trap()函数.利用断言去检查中断是否被允许.不允许直接挂
在这里就能看到我们看过N次的 "Incoming TRAP frame at %p" 呵呵...
检查当前调用trap()的运行级别,是否是用户态,是就进入那个if语句,
关键的就是那句 curenv->env_tf = *tf 完成了结构体的赋值.把栈上的那些个寄存器统统拷贝到当前进程的trap fram里面去.
trap_dispatch()里面会调用syscall触发系统调用!
后面的都是一些验证性的测试,只有最后的那个env_run()开始运行切换到内核态的进程.
这里这个宏定义 ENV_CREATE用来初始化一个进程环境(就是把user_hello,对应的程序对应的进程结构体填充好啦)下面给出填充的细节.
kern/env.h
你会发现这里实质上user_hello并不是用户程序的全名,他需要ENV_PASTE3(x, y, z)这个宏定义来实现"动态的绑定" 然后调用env_create函数去创建这个进程环境.注意只是创建而已,真正的实际运行是要调用env_run()的
我们实际上调用的用户程序的入口是_binary_obj_user_hello_start
这个入口你可以在 obj/kern/kern.sym里面找到
这是env_create()的实现.之前的_obj_user_hello_start会被传入到 uint8_t *binary指针.
调用load_icode把整块程序都加载到系统中.
这样,一个进程就算创建完了,你会发现,trap_init()之后就立马调用这进程了,通过i386_init()里面的env_run()
由于之前lab3最后面的部分有点麻烦,当时做的时候也被虐成狗,以至于后面搞定了就没怎么写了,这篇笔记贴也本来很长了,打算让有心人自己去看代码的.于是后面就讲的不是很细.
话说居然还有人认真看我的笔记贴, 也是难得...哈哈
还是希望有兴趣的同学一起交流讨论学习.这里感谢@lvan.也很难得遇到能够关注问题,并致力于解决问题的人.
这里我整理了关于lab3所有用户程序的运行分析情况:
http://blog.csdn.net/cinmyheart/article/details/45171475
完成评价:
2015年2月 <<扁担和竹篮的生活>>