Demystifying Process Switching

Student ID: SA12**6112

The previous post analyzed the main work the kernel does when a process switches from user mode to kernel mode. This post examines what the kernel does during a process switch.

In kernel mode, a process switch consists of two main steps:

1. Switch the page global directory (PGD).

2. Switch the kernel stack and the hardware context.

 

Throughout, prev points to the descriptor of the process being replaced, and next points to the descriptor of the process being activated.

The rest of this post analyzes the second step of the switch.

The second step is implemented mainly by the switch_to macro. In the 3.3 kernel, on x86, it is defined starting at line 48 of arch/x86/include/asm/system.h:

 48 #define switch_to(prev, next, last)                                     \
 49 do {                                                                    \
 50         /*                                                              \
 51          * Context-switching clobbers all registers, so we clobber      \
 52          * them explicitly, via unused output variables.                \
 53          * (EAX and EBP is not listed because EBP is saved/restored     \
 54          * explicitly for wchan access and EAX is the return value of   \
 55          * __switch_to())                                               \
 56          */                                                             \
 57         unsigned long ebx, ecx, edx, esi, edi;                          \
 58                                                                         \
 59         asm volatile("pushfl\n\t"               /* save    flags */     \
 60                      "pushl %%ebp\n\t"          /* save    EBP   */     \
 61                      "movl %%esp,%[prev_sp]\n\t"        /* save    ESP   */ \
 62                      "movl %[next_sp],%%esp\n\t"        /* restore ESP   */ \
 63                      "movl $1f,%[prev_ip]\n\t"  /* save    EIP   */     \
 64                      "pushl %[next_ip]\n\t"     /* restore EIP   */     \
 65                      __switch_canary                                    \
 66                      "jmp __switch_to\n"        /* regparm call  */     \
 67                      "1:\t"                                             \
 68                      "popl %%ebp\n\t"           /* restore EBP   */     \
 69                      "popfl\n"                  /* restore flags */     \
 70                                                                         \
 71                      /* output parameters */                            \
 72                      : [prev_sp] "=m" (prev->thread.sp),                \
 73                        [prev_ip] "=m" (prev->thread.ip),                \
 74                        "=a" (last),                                     \
 75                                                                         \
 76                        /* clobbered output registers: */                \
 77                        "=b" (ebx), "=c" (ecx), "=d" (edx),              \
 78                        "=S" (esi), "=D" (edi)                           \
 79                                                                         \
 80                        __switch_canary_oparam                           \
 81                                                                         \
 82                        /* input parameters: */                          \
 83                      : [next_sp]  "m" (next->thread.sp),                \
 84                        [next_ip]  "m" (next->thread.ip),                \
 85                                                                         \
 86                        /* regparm parameters for __switch_to(): */      \
 87                        [prev]     "a" (prev),                           \
 88                        [next]     "d" (next)                            \
 89                                                                         \
 90                        __switch_canary_iparam                           \
 91                                                                         \
 92                      : /* reloaded segment registers */                 \
 93                         "memory");                                      \
 94 } while (0)

 

I. As the code above shows, switching the kernel stack involves three main steps:

1. Push the eflags and ebp registers onto prev's kernel stack.

2. Save esp into prev->thread.sp and eip into prev->thread.ip. The value saved into prev->thread.ip is the address of label 1:, so when prev is scheduled back in, it resumes at the popl %%ebp instruction just after the switch.

3. Load next->thread.sp into esp and next->thread.ip into eip.

At this point the kernel stack switch is complete.

 

II. After the kernel stack is switched, the TSS must be updated accordingly.

This is because, on Linux, all processes running on a given CPU share a single TSS; when the running process changes, the TSS must change with it.

Linux uses the TSS in two main ways:

(1) Whenever a process traps from user mode into kernel mode, the CPU fetches the kernel stack pointer from the TSS.

(2) User-mode I/O port access is checked against the I/O permission bitmap in the TSS.

Therefore, a process switch must also update the TSS's esp0 and I/O permission bitmap. This is done mainly in the __switch_to function.

In the 3.3 kernel, on x86, it is defined at line 296 of arch/x86/kernel/process_32.c:

296 __notrace_funcgraph struct task_struct *
297 __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
298 {
299         struct thread_struct *prev = &prev_p->thread,
300                                  *next = &next_p->thread;
301         int cpu = smp_processor_id();
302         struct tss_struct *tss = &per_cpu(init_tss, cpu);
303         fpu_switch_t fpu;
304 
305         /* never put a printk in __switch_to... printk() calls wake_up*() indirectly */
306 
307         fpu = switch_fpu_prepare(prev_p, next_p, cpu);
308 
309         /*
310          * Reload esp0.
311          */
312         load_sp0(tss, next);
313 
314         /*
315          * Save away %gs. No need to save %fs, as it was saved on the
316          * stack on entry.  No need to save %es and %ds, as those are
317          * always kernel segments while inside the kernel.  Doing this
318          * before setting the new TLS descriptors avoids the situation
319          * where we temporarily have non-reloadable segments in %fs
320          * and %gs.  This could be an issue if the NMI handler ever
321          * used %fs or %gs (it does not today), or if the kernel is
322          * running inside of a hypervisor layer.
323          */
324         lazy_save_gs(prev->gs);
325 
326         /*
327          * Load the per-thread Thread-Local Storage descriptor.
328          */
329         load_TLS(next, cpu);
330 
331         /*
332          * Restore IOPL if needed.  In normal use, the flags restore
333          * in the switch assembly will handle this.  But if the kernel
334          * is running virtualized at a non-zero CPL, the popf will
335          * not restore flags, so it must be done in a separate step.
336          */
337         if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl))
338                 set_iopl_mask(next->iopl);
339 
340         /*
341          * Now maybe handle debug registers and/or IO bitmaps
342          */
343         if (unlikely(task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV ||
344                      task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT))
345                 __switch_to_xtra(prev_p, next_p, tss);
346 
347         /*
348          * Leave lazy mode, flushing any hypercalls made here.
349          * This must be done before restoring TLS segments so
350          * the GDT and LDT are properly updated, and must be
351          * done before math_state_restore, so the TS bit is up
352          * to date.
353          */
354         arch_end_context_switch(next_p);
355 
356         /*
357          * Restore %gs if needed (which is common)
358          */
359         if (prev->gs | next->gs)
360                 lazy_load_gs(next->gs);
361 
362         switch_fpu_finish(next_p, fpu);
363 
364         percpu_write(current_task, next_p);
365 
366         return prev_p;
367 }

As the code above shows, the TSS update consists mainly of:

1. load_sp0(tss, next): fetch sp0 from the next process's thread field and use it to update sp0 in the TSS.

2. __switch_to_xtra(prev_p, next_p, tss): update the I/O permission bitmap when necessary.

 

 
