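Luo Chong (罗冲) + Original work + "Linux Kernel Analysis" MOOC course http://mooc.study.163.com/course/USTC-1000029000
The code for this exercise is split across two files, mymain.c and myinterrupt.c. Only excerpts are listed here (for the complete code, see Meng Ning's GitHub repository: https://github.com/mengning/mykernel). The mymain.c code: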
void __init my_start_kernel(void)
{
int pid = 0;
int i;
/* Initialize process 0*/
task[pid].pid = pid;
task[pid].state = 0;/* -1 unrunnable, 0 runnable, >0 stopped */
task[pid].task_entry = task[pid].thread.ip = (unsigned long)my_process;
task[pid].thread.sp = (unsigned long)&task[pid].stack[KERNEL_STACK_SIZE-1];
task[pid].next = &task[pid];
/*fork more process */
for(i=1;i<MAX_TASK_NUM;i++)
{
// initialize the remaining entries of the task array
... ...
}
/* start process 0 by task[0] */
pid = 0;
my_current_task = &task[pid];
asm volatile(
"movl %1,%%esp\n\t" /* set task[pid].thread.sp to esp */
"pushl %1\n\t" /* push ebp */
"pushl %0\n\t" /* push task[pid].thread.ip */
"ret\n\t" /* pop task[pid].thread.ip to eip */
"popl %%ebp\n\t"
:
: "c" (task[pid].thread.ip),"d" (task[pid].thread.sp) /* input c or d mean %ecx/%edx*/
);
}
void my_process(void)
{
int i = 0;
while(1)
{
... switching section ...
if(my_need_sched == 1)
{
my_need_sched = 0;
my_schedule();
}
... ...
}
}
}
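Code analysis:
mymain.c consists of two functions, my_start_kernel and my_process. my_start_kernel initializes the task array and starts task[0].
my_process is the body that every task runs. It also polls my_need_sched (a simple flag variable rather than a real semaphore) to decide whether a process switch is due.
The myinterrupt.c code: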
void my_timer_handler(void)
{
#if 1
if(time_count%1000 == 0 && my_need_sched != 1)
{
... ... scheduling flag control
my_need_sched = 1;
}
time_count ++ ;
#endif
return;
}
void my_schedule(void)
{
... process switching ...
if(next->state == 0)/* -1 unrunnable, 0 runnable, >0 stopped */
{
/* switch to next process */
asm volatile(
"pushl %%ebp\n\t" /* save ebp */
"movl %%esp,%0\n\t" /* save esp */
"movl %2,%%esp\n\t" /* restore esp */
"movl $1f,%1\n\t" /* save eip */
"pushl %3\n\t"
"ret\n\t" /* restore eip */
"1:\t" /* next process start here */
"popl %%ebp\n\t"
: "=m" (prev->thread.sp),"=m" (prev->thread.ip)
: "m" (next->thread.sp),"m" (next->thread.ip)
);
my_current_task = next;
printk(KERN_NOTICE ">>>switch %d to %d<<<\n",prev->pid,next->pid);
}
else
{
next->state = 0;
my_current_task = next;
printk(KERN_NOTICE ">>>switch %d to %d<<<\n",prev->pid,next->pid);
/* switch to new process */
asm volatile(
"pushl %%ebp\n\t" /* save ebp */
"movl %%esp,%0\n\t" /* save esp */
"movl %2,%%esp\n\t" /* restore esp */
"movl %2,%%ebp\n\t" /* restore ebp */
"movl $1f,%1\n\t" /* save eip */
"pushl %3\n\t"
"ret\n\t" /* restore eip */
: "=m" (prev->thread.sp),"=m" (prev->thread.ip)
: "m" (next->thread.sp),"m" (next->thread.ip)
);
}
return;
}
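myinterrupt.c also has two parts:
1) my_timer_handler: the timer handler. It is called once per timer interrupt and, once every 1000 ticks, sets my_need_sched to request a switch.
2) my_schedule: the process-switching code. The switch itself is performed by two inline assembly blocks.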
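The cooperation between the two files can be sketched as a small user-space model. This is only an illustration, not kernel code: the globals reuse the names from the excerpts above, while schedule_calls, timer_tick, process_poll and simulate are hypothetical helpers introduced here, and the model assumes the task polls the flag once after every tick.

```c
#include <assert.h>

/* User-space model only; names mirror myinterrupt.c, nothing runs in the kernel. */
static volatile int my_need_sched = 0;
static int time_count = 0;
static int schedule_calls = 0;   /* hypothetical counter standing in for my_schedule() */

/* Same condition as my_timer_handler: request a switch every 1000 ticks. */
static void timer_tick(void)
{
    if (time_count % 1000 == 0 && my_need_sched != 1)
        my_need_sched = 1;
    time_count++;
}

/* What my_process does on each polling pass: consume the flag if it is set. */
static void process_poll(void)
{
    if (my_need_sched == 1) {
        my_need_sched = 0;
        schedule_calls++;        /* the real code calls my_schedule() here */
    }
}

/* Deliver `ticks` timer interrupts, polling once after each (an assumption). */
static int simulate(int ticks)
{
    assert(ticks >= 0);
    time_count = 0;
    my_need_sched = 0;
    schedule_calls = 0;
    for (int t = 0; t < ticks; t++) {
        timer_tick();
        process_poll();
    }
    return schedule_calls;
}
```

The point of the design is that the interrupt handler never switches tasks itself. It only raises a flag, and the running task gives up the CPU voluntarily the next time it checks that flag.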
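The sections below walk through the code in execution order.
[Figure: overall execution flow]
When the kernel boots, it first calls void __init my_start_kernel(void). This function can be divided into three parts:
1) Initialize task[0]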
task[pid].pid = pid;
task[pid].state = 0;/* -1 unrunnable, 0 runnable, >0 stopped */
task[pid].task_entry = task[pid].thread.ip = (unsigned long)my_process;
task[pid].thread.sp = (unsigned long)&task[pid].stack[KERNEL_STACK_SIZE-1];
task[pid].next = &task[pid];
2) Initialize the rest of the task[] array
/*fork more process */
for(i=1;i<MAX_TASK_NUM;i++)
{
memcpy(&task[i],&task[0],sizeof(tPCB));
task[i].pid = i;
task[i].state = -1;
task[i].thread.sp = (unsigned long)&task[i].stack[KERNEL_STACK_SIZE-1];
task[i].next = task[i-1].next;
task[i-1].next = &task[i];
}
3) Start task[0]
asm volatile(
"movl %1,%%esp\n\t" /* set task[pid].thread.sp to esp */
"pushl %1\n\t" /* push ebp */
"pushl %0\n\t" /* push task[pid].thread.ip */
"ret\n\t" /* pop task[pid].thread.ip to eip */
"popl %%ebp\n\t"
:
: "c" (task[pid].thread.ip),"d" (task[pid].thread.sp) /* input c or d mean %ecx/%edx*/
);
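Compare with the stack diagram: [Figure: stack layout during startup]
The popl %%ebp after ret is never reached on this path: ret pops task[pid].thread.ip into eip, so execution continues in my_process(), which loops forever and does not return.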
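The effect of the startup sequence on esp and eip can be traced with a toy model. This is a sketch under stated assumptions: struct cpu, push, pop and start_task0 are hypothetical names, memory is a small word array rather than a real kernel stack, and the addresses used with it are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 64   /* toy memory size, chosen for illustration */

struct cpu {
    uint32_t esp, ebp, eip;
    uint32_t mem[MEM_WORDS];   /* byte address = index * 4 */
};

/* pushl: decrement esp by one word, then store. */
static void push(struct cpu *c, uint32_t v)
{
    assert(c->esp >= 4 && c->esp <= MEM_WORDS * 4);
    c->esp -= 4;
    c->mem[c->esp / 4] = v;
}

/* popl: load the word at esp, then increment esp. */
static uint32_t pop(struct cpu *c)
{
    assert(c->esp / 4 < MEM_WORDS);
    uint32_t v = c->mem[c->esp / 4];
    c->esp += 4;
    return v;
}

/* Model of the startup asm; sp and ip stand for task[0].thread.sp / .ip. */
static void start_task0(struct cpu *c, uint32_t sp, uint32_t ip)
{
    c->esp = sp;        /* movl %1,%%esp */
    push(c, sp);        /* pushl %1      */
    push(c, ip);        /* pushl %0      */
    c->eip = pop(c);    /* ret: pop the pushed ip into eip */
    /* popl %%ebp is not modeled: eip has already left this code */
}
```

In this model, after start_task0 the word just below the original stack top holds the pushed sp value, esp points at it, and eip equals the entry address. The push/ret pair is simply an indirect jump, since eip cannot be written with a plain movl.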
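Once the tick count in myinterrupt.c reaches the configured value, my_timer_handler sets my_need_sched. my_process in mymain.c then sees the flag and calls my_schedule(), which triggers the first process switch,
namely the switch from task[0] to task[1].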
next = my_current_task->next;
prev = my_current_task;
Here my_current_task is task[0] and task[0].next points to task[1], so at this point next can simply be read as task[1] and prev as task[0].
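Why task[0].next ends up pointing to task[1] follows from the initialization loop in my_start_kernel. The sketch below replays that loop in user space with a reduced PCB. The stack and thread fields are left out, MAX_TASK_NUM is assumed to be 4, and ring_length is a hypothetical helper added only to walk the list.

```c
#include <assert.h>
#include <string.h>

#define MAX_TASK_NUM 4   /* assumed value for this illustration */

typedef struct pcb {
    int pid;
    int state;           /* -1 unrunnable, 0 runnable, >0 stopped */
    struct pcb *next;
} tPCB;                  /* reduced PCB: no stack, no thread fields */

static tPCB task[MAX_TASK_NUM];

/* Same linking logic as the loop in my_start_kernel. */
static void init_tasks(void)
{
    task[0].pid = 0;
    task[0].state = 0;
    task[0].next = &task[0];          /* a ring of one element */
    for (int i = 1; i < MAX_TASK_NUM; i++) {
        memcpy(&task[i], &task[0], sizeof(tPCB));
        task[i].pid = i;
        task[i].state = -1;
        task[i].next = task[i-1].next;   /* new tail points back to task[0] */
        task[i-1].next = &task[i];       /* old tail points to the new task */
    }
}

/* Hypothetical helper: count the nodes reached before returning to task[0]. */
static int ring_length(void)
{
    int n = 1;
    for (tPCB *p = task[0].next; p != &task[0]; p = p->next) {
        n++;
        assert(n <= MAX_TASK_NUM);    /* guards against a broken ring */
    }
    return n;
}
```

Each iteration inserts the new task between the previous tail and task[0], so the result is the ring task[0] -> task[1] -> ... -> task[MAX_TASK_NUM-1] -> task[0].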
Next comes the state check:
if(next->state == 0)/* -1 unrunnable, 0 runnable, >0 stopped */
{
/* switch to next process */
... ...
}
else
{
next->state = 0;
my_current_task = next;
... ...
}
At this point task[1].state is -1, so the else branch is taken. The inline assembly in the else branch is analyzed in detail below:
asm volatile(
"pushl %%ebp\n\t" /* save ebp */
"movl %%esp,%0\n\t" /* save esp */
"movl %2,%%esp\n\t" /* restore esp */
"movl %2,%%ebp\n\t" /* restore ebp */
"movl $1f,%1\n\t" /* save eip */
"pushl %3\n\t"
"ret\n\t" /* restore eip */
: "=m" (prev->thread.sp),"=m" (prev->thread.ip)
: "m" (next->thread.sp),"m" (next->thread.ip)
);
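The first three instructions are fairly easy to follow. The corresponding stack diagram: [Figure: stack layout after the first three instructions]
In short, they push task[0]'s ebp onto task[0]'s stack, store the resulting esp in prev->thread.sp, and then load next->thread.sp (task[1]'s stack top) into esp. Nothing is pushed onto task[1]'s stack at this point.
Next comes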
"movl $1f,%1\n\t" /* save eip */
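This instruction is harder to understand, but the disassembly makes it clear. Dumping the object code with objdump
gives the following instructions: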
a1: c7 86 08 20 00 00 b2 movl $0xb2,0x2008(%esi)
a8: 00 00 00
ab: ff b3 08 20 00 00 pushl 0x2008(%ebx)
b1: c3 ret
b2: 5d pop %ebp
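Here 0x2008(%esi) is the memory location of prev->thread.ip, so the instruction stores the address 0xb2 into prev->thread.ip. In this object-file listing 0xb2 is an offset (the location of the local label 1: that $1f resolves to), and the instruction at that offset is: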
b2: 5d pop %ebp
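In other words, prev->thread.ip holds the code address of the pop %ebp instruction, which is where task[0] will resume once it is switched back in.
The last two instructions (pushl %3 and ret) push next->thread.ip and pop it into eip, so eip now points at task[1]'s entry and task[1] starts running.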
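The whole else-branch sequence can be replayed step by step on a toy machine. This is a sketch under assumptions, not the kernel's implementation: struct cpu, struct thread, push, pop and switch_to_new are hypothetical names, memory is a small word array, and LABEL_1 simply reuses the 0xb2 value from the objdump listing as an opaque token.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 64       /* toy memory size, chosen for illustration */
#define LABEL_1   0xb2u    /* stands for the address of label "1:" */

struct thread { uint32_t ip, sp; };   /* reduced thread struct */

struct cpu {
    uint32_t esp, ebp, eip;
    uint32_t mem[MEM_WORDS];          /* byte address = index * 4 */
};

static void push(struct cpu *c, uint32_t v)
{
    assert(c->esp >= 4 && c->esp <= MEM_WORDS * 4);
    c->esp -= 4;
    c->mem[c->esp / 4] = v;
}

static uint32_t pop(struct cpu *c)
{
    assert(c->esp / 4 < MEM_WORDS);
    uint32_t v = c->mem[c->esp / 4];
    c->esp += 4;
    return v;
}

/* Model of the else branch: switch to a task that has never run. */
static void switch_to_new(struct cpu *c, struct thread *prev,
                          const struct thread *next)
{
    push(c, c->ebp);       /* pushl %%ebp     : save prev's ebp on prev's stack */
    prev->sp = c->esp;     /* movl %%esp,%0   : remember where prev's stack ends */
    c->esp = next->sp;     /* movl %2,%%esp   : switch to next's (empty) stack  */
    c->ebp = next->sp;     /* movl %2,%%ebp   : empty frame, ebp == esp         */
    prev->ip = LABEL_1;    /* movl $1f,%1     : prev will resume at label 1     */
    push(c, next->ip);     /* pushl %3                                          */
    c->eip = pop(c);       /* ret             : jump to next's entry point      */
}
```

After the call, the model's eip equals next->ip and esp is back at next->sp, because the pushed address is consumed immediately by ret. The only lasting traces are on the previous task's side: its saved ebp on its own stack, plus sp and ip in its thread struct.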
Once every task has run at least once, all state values are 0, and later calls to the scheduler take the if branch, namely:
asm volatile(
"pushl %%ebp\n\t" /* save ebp */
"movl %%esp,%0\n\t" /* save esp */
"movl %2,%%esp\n\t" /* restore esp */
"movl $1f,%1\n\t" /* save eip */
"pushl %3\n\t"
"ret\n\t" /* restore eip */
"1:\t" /* next process start here */
"popl %%ebp\n\t"
: "=m" (prev->thread.sp),"=m" (prev->thread.ip)
: "m" (next->thread.sp),"m" (next->thread.ip)
);
}
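Which branch each switch takes can be checked with a short model of the decision logic alone. The sketch assumes four tasks linked in a ring with only task[0] initially runnable, matching the initialization shown earlier. enum branch, init_ring and schedule_once are hypothetical names, and the assembly is replaced by a return value.

```c
#include <assert.h>

#define MAX_TASK_NUM 4   /* assumed value for this illustration */

typedef struct pcb {
    int pid;
    int state;           /* -1 unrunnable, 0 runnable, >0 stopped */
    struct pcb *next;
} tPCB;                  /* reduced PCB */

static tPCB task[MAX_TASK_NUM];
static tPCB *my_current_task;

enum branch { RESUME_EXISTING, START_NEW };   /* if branch / else branch */

/* Build the ring directly: 0 -> 1 -> 2 -> 3 -> 0, only task[0] runnable. */
static void init_ring(void)
{
    for (int i = 0; i < MAX_TASK_NUM; i++) {
        task[i].pid = i;
        task[i].state = (i == 0) ? 0 : -1;
        task[i].next = &task[(i + 1) % MAX_TASK_NUM];
    }
    my_current_task = &task[0];
}

/* Decision part of my_schedule, with the inline asm left out. */
static enum branch schedule_once(void)
{
    assert(my_current_task && my_current_task->next);
    tPCB *next = my_current_task->next;
    enum branch taken;

    if (next->state == 0) {
        taken = RESUME_EXISTING;   /* next has run before */
    } else {
        next->state = 0;           /* first run: mark runnable */
        taken = START_NEW;
    }
    my_current_task = next;
    return taken;
}
```

Under these assumptions the first three calls return START_NEW (0 to 1, 1 to 2, 2 to 3) and every call after that returns RESUME_EXISTING, starting with the switch from task[3] back to task[0].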
Since all the work is done in the inline assembly, only the assembly is analyzed here. Before that, recall:
next = my_current_task->next;
prev = my_current_task;
Here my_current_task is task[3] and task[3].next is task[0]. Compared with the first switch to task[1], the if branch only lacks one line:
"movl %2,%%ebp\n\t" /* restore ebp */
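That line loads next->thread.sp into ebp, giving a task that has never run an empty stack frame (ebp == esp). It is needed only in the else branch, because a new task has no saved ebp on its stack yet. A task that has already run pushed its ebp onto its own stack when it was switched out, and that value is restored by the popl %%ebp at label 1, so the if branch does not need this line. The disassembly of both branches (else branch from 0x6e, if branch from 0x94):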
6e: 55 push %ebp
6f: 89 a6 0c 20 00 00 mov %esp,0x200c(%esi)
75: 8b a3 0c 20 00 00 mov 0x200c(%ebx),%esp
7b: 8b ab 0c 20 00 00 mov 0x200c(%ebx),%ebp
81: c7 86 08 20 00 00 b2 movl $0xb2,0x2008(%esi) ; movl $1f,%1
88: 00 00 00
8b: ff b3 08 20 00 00 pushl 0x2008(%ebx) ; pushl %3
91: c3 ret ; ret
92: eb 8a jmp 1e <my_schedule+0x1e>
94: 55 push %ebp
95: 89 a6 0c 20 00 00 mov %esp,0x200c(%esi)
9b: 8b a3 0c 20 00 00 mov 0x200c(%ebx),%esp
a1: c7 86 08 20 00 00 b2 movl $0xb2,0x2008(%esi)
a8: 00 00 00
ab: ff b3 08 20 00 00 pushl 0x2008(%ebx)
b1: c3 ret
b2: 5d pop %ebp
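Because the task's resume address (label 1) was stored in its thread.ip when it was switched out, pushl %3 followed by ret loads that address back into eip. The popl %%ebp at label 1 then restores the task's saved ebp, and the task continues from where it left off.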
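Resuming a previously suspended task can be traced on the same kind of toy machine. As before this is a sketch under assumptions: struct cpu, struct thread, push, pop, park and switch_to_existing are hypothetical names, LABEL_1 reuses 0xb2 from the listing as an opaque token, and park only models the saving half of a switch so that there is a suspended task to return to.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 64       /* toy memory size, chosen for illustration */
#define LABEL_1   0xb2u    /* stands for the address of label "1:" */

struct thread { uint32_t ip, sp; };   /* reduced thread struct */

struct cpu {
    uint32_t esp, ebp, eip;
    uint32_t mem[MEM_WORDS];          /* byte address = index * 4 */
};

static void push(struct cpu *c, uint32_t v)
{
    assert(c->esp >= 4 && c->esp <= MEM_WORDS * 4);
    c->esp -= 4;
    c->mem[c->esp / 4] = v;
}

static uint32_t pop(struct cpu *c)
{
    assert(c->esp / 4 < MEM_WORDS);
    uint32_t v = c->mem[c->esp / 4];
    c->esp += 4;
    return v;
}

/* Saving half of a switch: leaves task t suspended with its ebp on its stack. */
static void park(struct cpu *c, struct thread *t)
{
    push(c, c->ebp);
    t->sp = c->esp;
    t->ip = LABEL_1;
}

/* Model of the if branch: switch to a task that was suspended earlier. */
static void switch_to_existing(struct cpu *c, struct thread *prev,
                               const struct thread *next)
{
    push(c, c->ebp);       /* pushl %%ebp                               */
    prev->sp = c->esp;     /* movl %%esp,%0                             */
    c->esp = next->sp;     /* movl %2,%%esp  : back onto next's stack   */
    prev->ip = LABEL_1;    /* movl $1f,%1                               */
    push(c, next->ip);     /* pushl %3                                  */
    c->eip = pop(c);       /* ret            : lands on label 1         */
    if (c->eip == LABEL_1)
        c->ebp = pop(c);   /* 1: popl %%ebp  : restore next's saved ebp */
}
```

In this model a task parked with esp = 100 and ebp = 120 gets exactly those values back after switch_to_existing. That is the symmetry the if branch relies on: the pushl %%ebp executed when a task leaves is undone by the popl %%ebp executed when it returns.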
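Compared with an ordinary function call: a process switch saves the relevant pointers (sp and ip) in the PCB, whereas a function call saves its return address on the stack.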