Other people's discussions of Linux threads and processes
This chapter discusses how the Linux kernel manages processes:
1. how they are enumerated within the kernel
2. how they are created
3. how they ultimately die.
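Process management is one of the most important parts of any operating system.
A process is a program (object code stored on some media) in the midst of execution.
A process is more than just the executing program code, however.
A process is, in effect, the living result of running program code.
Threads are the objects of activity within a process.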
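Each thread includes:
1. a unique program counter
2. a process stack
3. a set of processor registers.
Linux has a unique implementation of threads: it does not differentiate between threads and processes. To Linux, a thread is just a special kind of process.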
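Processes provide two virtualizations: a virtualized processor and virtual memory.
The virtualized processor gives the process the illusion that it alone monopolizes the system, even though the processor may be shared among hundreds of other processes (Chapter 4).
Virtual memory lets the process allocate and manage memory as if it alone owned all the memory in the system (Chapter 12).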
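fork: creates a new process by duplicating an existing one
exec: creates a new address space and loads a program into it
clone: fork is actually implemented on top of clone, which lets the caller decide which resources the parent and child share
exit: terminates the process and frees its resources
wait4: lets a parent obtain the status of a terminated child, and lets a process wait for a specific process to terminate.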
After a process exits, it is placed into a special zombie state, which represents a terminated process, until its parent calls wait() or waitpid().
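The zombie state can be observed from user space. The following is a minimal sketch, assuming a Linux system with /proc mounted; proc_state and state_of_exited_child are hypothetical helper names, not kernel or libc APIs. It forks a child that exits immediately, uses waitid() with WNOWAIT to wait for the exit without reaping the child, reads the child's state letter from /proc, and only then reaps it.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Return the state letter from /proc/<pid>/stat, or '?' if it cannot be read. */
char proc_state(pid_t pid)
{
    char path[64], buf[512];
    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return '?';
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    /* Format is "pid (comm) state ..."; comm may contain ')', so search from the right. */
    char *p = strrchr(buf, ')');
    return (p && p[1] == ' ' && p[2]) ? p[2] : '?';
}

/* Fork a child that exits at once and report its state before it is reaped. */
char state_of_exited_child(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return '?';
    if (pid == 0)
        _exit(0);                       /* child: terminate immediately */

    char state = '?';
    siginfo_t info;
    /* WNOWAIT: block until the child has exited, but leave it unreaped. */
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) == 0)
        state = proc_state(pid);        /* 'Z' is the zombie state letter */
    waitpid(pid, NULL, 0);              /* reap the child; the zombie goes away */
    return state;
}
```

Calling state_of_exited_child() from main() and printing the result should show the letter Z on Linux; where /proc is not available the helper falls back to '?'.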
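The kernel stores the list of processes in a circular doubly linked list called the task list.
Each element in the task list is a process descriptor of type struct task_struct, defined in linux/sched.h.
The process descriptor holds all the information the kernel needs about a running program.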
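The task_struct structure is allocated via the slab allocator to provide object reuse and cache coloring (see Chapter 12).
Before the 2.6 kernel series, task_struct was stored at the end of each process's kernel stack.
Now that the process descriptor is created dynamically by the slab allocator, a new structure, struct thread_info, is placed at the bottom of the stack instead (defined in asm/thread_info.h).
The structure looks like this (the palcode field suggests this excerpt comes from the Alpha port; the members differ between architectures):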
struct thread_info {
    struct pcb_struct pcb;      /* palcode state */
    struct task_struct *task;   /* main task structure */
    // ...
    // fields not needed for now are omitted; see the kernel source for the rest
};
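thread_info sits at the end of the stack; its task member is a pointer to the task's actual task_struct.
Tasks are referenced directly through a pointer to their task_struct.
Most kernel code that deals with processes works directly with struct task_struct.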
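Each process has a unique process identification value, or PID, which the kernel stores in the process descriptor using the opaque type pid_t.
For compatibility with older Unix and Linux versions, the maximum has historically defaulted to 32768. If you are willing to break that compatibility, you can raise the maximum through /proc/sys/kernel/pid_max.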
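The current limit can be inspected from user space. The sketch below simply reads /proc/sys/kernel/pid_max; read_pid_max is a hypothetical helper name, and it returns -1 where the file is unavailable (for example on a non-Linux system).

```c
#include <stdio.h>

/* Return the value stored in /proc/sys/kernel/pid_max, or -1 if it cannot be read. */
long read_pid_max(void)
{
    long value = -1;
    FILE *f = fopen("/proc/sys/kernel/pid_max", "r");
    if (!f)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;                 /* unexpected file contents */
    fclose(f);
    return value;
}
```

Raising the limit means writing a new number to the same file, which requires root privileges.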
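The current macro returns the task_struct of the currently executing task.
On x86 (in the 2.6-era kernels these notes follow), current_thread_info() obtains the thread_info of the running task.
current then dereferences the task member of thread_info to return the task_struct: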
current_thread_info()->task;
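On those 2.6-era x86 kernels, current_thread_info() located the thread_info by masking off the low bits of the stack pointer, which works because the kernel stack is aligned to its own size and thread_info sits at its lowest address. The following is a user-space sketch of that arithmetic only, not kernel code; stack_base is a hypothetical name, and the 8 KB stack size used in the usage note is an assumption that matches the common 32-bit configuration of the time.

```c
#include <stdint.h>

/*
 * Given any address inside a kernel stack and the stack size (which must be
 * a power of two), return the lowest address of that stack. On the kernels
 * described here, that is where struct thread_info was stored.
 */
uintptr_t stack_base(uintptr_t sp, uintptr_t thread_size)
{
    return sp & ~(thread_size - 1);   /* clear the offset-within-stack bits */
}
```

For instance, with an 8192-byte stack, stack_base(0xc0123abc, 8192) yields 0xc0122000: the low 13 bits of the stack pointer are cleared.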
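State | Meaning |
---|---|
TASK_RUNNING | The process is runnable: it is either currently running or on a runqueue waiting to run |
TASK_INTERRUPTIBLE | The process is sleeping, waiting for some condition. It moves to TASK_RUNNING when the condition is met or when it receives a signal |
TASK_UNINTERRUPTIBLE | Identical to TASK_INTERRUPTIBLE, except that the process does not wake up when it receives a signal |
__TASK_TRACED | The process is being traced by another process, such as a debugger, via ptrace |
__TASK_STOPPED | Process execution has stopped; the task is not running nor is it eligible to run. This occurs if the task receives the SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU signal or if it receives any signal while it is being debugged. |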
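Kernel code often needs to change a process's state:
set_task_state(task, state); /* set task 'task' to state 'state' */
/* apart from a memory barrier on SMP systems, this is equivalent to */
task->state = state;
set_current_state(state);
/* is synonymous with */
set_task_state(current, state);
See linux/sched.h for the implementation.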
struct task_struct *my_parent = current->parent; // every task has a pointer to its parent
// the next task in the task list
#define next_task(p) list_entry((p)->tasks.next, struct task_struct, tasks)
// the previous task in the task list
#define prev_task(p) list_entry((p)->tasks.prev, struct task_struct, tasks)
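The macros above rely on list_entry(), which recovers a pointer to the enclosing structure from a pointer to the list_head embedded in it. The following is a user-space imitation for illustration, not the kernel's own headers: struct toy_task, build_ring, and the simplified list_entry and list_add_tail are stand-ins written for this sketch.

```c
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Step back from a member pointer to the start of the structure containing it. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Toy stand-in for task_struct with only the fields this example needs. */
struct toy_task {
    int pid;
    struct list_head tasks;
};

#define next_task(p) list_entry((p)->tasks.next, struct toy_task, tasks)
#define prev_task(p) list_entry((p)->tasks.prev, struct toy_task, tasks)

/* Insert node just before head, that is, at the tail of the circular list. */
void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Link t[0..n-1] into one ring, giving each task the pid equal to its index. */
void build_ring(struct toy_task *t, int n)
{
    t[0].pid = 0;
    t[0].tasks.next = t[0].tasks.prev = &t[0].tasks;   /* t[0] anchors the ring */
    for (int i = 1; i < n; i++) {
        t[i].pid = i;
        list_add_tail(&t[i].tasks, &t[0].tasks);
    }
}
```

After build_ring(t, 3), next_task(&t[0]) is the task with pid 1, prev_task(&t[0]) is the task with pid 2, and next_task(&t[2]) wraps around to &t[0], which is what makes the list circular.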
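fork() creates a child process that is a copy of its parent, differing only in its PID, its PPID (the parent's PID), and certain resources and statistics, such as pending signals, which are not inherited.
exec() loads a new executable into the address space and begins executing it.
Linux implements fork() with copy-on-write (COW) pages.
Data is copied only when it is written to; until then, parent and child share the pages read-only. If exec() is called immediately after fork(), the pages never need to be copied.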
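The fork-then-exec pattern looks like this from user space. This is a minimal sketch, assuming a POSIX system with a shell at /bin/sh; run_shell is a hypothetical helper name. The child calls exec right after fork, which is the case copy-on-write is meant to make cheap, and the parent collects the exit status with waitpid().

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run "sh -c <command>" in a child process and return its exit status, or -1 on error. */
int run_shell(const char *command)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: replace the copied address space with a new program. */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);                 /* reached only if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_shell("exit 3") returns 3, the exit code the kernel stored for the parent to retrieve.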
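Linux implements fork() via the clone() system call, and clone() in turn calls do_fork().
clone() takes a series of flags that specify which resources the parent and child share.
do_fork() handles the bulk of the work in forking.
do_fork() calls copy_process() and then starts the new process running.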
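copy_process() does the following, among other steps:
1. It calls dup_task_struct(), which creates a new kernel stack, thread_info structure, and task_struct for the new process.
2. It clears or resets the members of the process descriptor that are not inherited; most of the values in task_struct remain unchanged.
3. It calls copy_flags() to update the flags member of the task_struct. The PF_SUPERPRIV flag, which denotes whether a task used superuser privileges, is cleared (= 0). The PF_FORKNOEXEC flag, which denotes a process that has not called exec(), is set (= 1).
4. It calls alloc_pid() to assign an available PID to the new task.
vfork() has the same effect as fork(), except that the page table entries of the parent process are not copied. Now that fork() uses COW, not copying the parent's page table entries is the only benefit vfork() has left.
vfork() is implemented by passing a special flag to the clone() system call.
If Linux one day gains copy-on-write page table entries, there will no longer be any benefit.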
The Linux kernel provides no special scheduling semantics or data structures to represent threads.
A thread is merely a process that shares certain resources with other processes.
Each thread has a unique task_struct and appears to the kernel as a normal process; threads just happen to share resources, such as an address space, with other processes.
The name "lightweight process" sums up the difference in philosophies between Linux and other systems.
Systems such as Windows have explicit kernel support for threads.
On those systems, a lightweight process is a smaller, quicker unit of execution than the heavy process.
To Linux, threads are simply a manner of sharing resources between processes.
For a process consisting of four threads, in Linux there are simply four processes and thus four normal task_struct structures. The four processes are set up to share certain resources. The result is quite elegant.
Threads are created the same as normal tasks, with the exception that the clone() system call is passed flags corresponding to the specific resources to be shared:
clone(CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND, 0);
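This behaves much like a normal fork(), except that the address space, filesystem resources, file descriptors, and signal handlers are shared. In other words, the new task and its parent are what are popularly called threads.
In contrast, a normal fork() can be implemented as:
clone(SIGCHLD, 0);
and vfork() as:
clone(CLONE_VFORK | CLONE_VM | SIGCHLD, 0);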
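The effect of the sharing flags can be seen from user space with the glibc clone() wrapper, which takes a function, a stack, flags, and an argument rather than the two-argument shorthand used above. The following is a Linux-only sketch; parent_sees_write and child_fn are hypothetical names, and the 256 KB stack size is an arbitrary choice. The child writes 1 into an int owned by the parent: with CLONE_VM the two tasks share one address space and the parent sees the write, while without it the child gets a copy-on-write copy, as with fork(), and the parent's value is unchanged.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (256 * 1024)

/* Runs in the child: store 1 where the parent can look afterwards. */
static int child_fn(void *arg)
{
    *(int *)arg = 1;
    return 0;
}

/*
 * Create a child with clone(extra_flags | SIGCHLD), wait for it, and return
 * the value the parent then finds in its own variable (-1 on error).
 */
int parent_sees_write(int extra_flags)
{
    int value = 0;
    char *stack = malloc(STACK_SIZE);
    if (!stack)
        return -1;

    /* The stack grows down on most architectures, so pass its top. */
    pid_t pid = clone(child_fn, stack + STACK_SIZE, extra_flags | SIGCHLD, &value);
    if (pid == -1) {
        free(stack);
        return -1;
    }
    waitpid(pid, NULL, 0);          /* SIGCHLD makes the child waitable like a fork() child */
    free(stack);
    return value;
}
```

parent_sees_write(CLONE_VM) returns 1, whereas parent_sees_write(0) returns 0. In both cases the kernel created an ordinary task; only the set of shared resources differs.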
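The kernel needs to perform some operations in the background, and it does so with kernel threads.
Kernel threads are standard processes that exist solely in kernel space.
Kernel threads have no address space.
Their mm pointer, which points at the address space, is NULL.
Run ps -ef to see the kernel threads on a system.
New kernel threads are forked off the kthreadd kernel process. The interface for creating them is declared in linux/kthread.h: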
struct task_struct *kthread_create(int (*threadfn)(void *data),
void *data,
const char namefmt[],
...)
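The new kernel thread is created in an unrunnable state; it does not start running until it is explicitly woken with wake_up_process().
To create a thread and make it runnable in one step, use kthread_run():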
struct task_struct *kthread_run(int (*threadfn)(void *data),
void *data,
const char namefmt[],
...)
kthread_run() is typically implemented as a macro that simply calls kthread_create() and then wake_up_process().
A kernel thread continues to exist until it calls do_exit() or until another part of the kernel calls kthread_stop(), passing in the address of the task_struct returned by kthread_create():
int kthread_stop(struct task_struct * k);
However a process terminates, do_exit(), defined in kernel/exit.c, handles the bulk of the work.
do_exit() completes a number of chores, including:
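1. It calls exit_files() and exit_fs() to decrement the usage counts of objects related to file descriptors and filesystem data, respectively. If either usage count reaches zero, the object is no longer in use by any process, and it is destroyed.
2. It sets the task's exit code, stored in the exit_code member of the task_struct, to the code provided by exit() or whatever kernel mechanism forced the termination. The exit code is stored here for optional retrieval by the parent.
3. It calls exit_notify() to send signals to the task's parent, reparents any of the task's children to another thread in their thread group or the init process, and sets the task's exit state, stored in exit_state in the task_struct structure, to EXIT_ZOMBIE.
4. do_exit() calls schedule() to switch to a new process (see Chapter 4). Because the process is now not schedulable, this is the last code the task will ever execute. do_exit() never returns.
After do_exit() completes, the only memory the task still occupies is its kernel stack and its thread_info and task_struct structures. These remain solely to provide information to the parent. After the parent retrieves that information, the remaining memory is freed and returned to the system.
Keeping the process descriptor around lets the system obtain information about a child process after it has terminated.
The work of cleaning up after a process is therefore separate from the work of removing its process descriptor.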
No. | What release_task() does |
---|---|
1. | It calls __exit_signal(), which calls __unhash_process(), which in turn calls detach_pid() to remove the process from the pidhash and remove the process from the task list. |
2. | __exit_signal() releases any remaining resources used by the now dead process and finalizes statistics and bookkeeping. |
3. | If the task was the last member of a thread group, and the leader is a zombie, then release_task() notifies the zombie leader's parent. |
4. | release_task() calls put_task_struct() to free the pages containing the process's kernel stack and thread_info structure and deallocate the slab cache containing the task_struct. |
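do_exit() calls exit_notify(), which calls forget_original_parent(), which in turn calls find_new_reaper() to perform the reparenting.
Once a process has been successfully reparented, there is no risk of stray zombie processes. The init process routinely calls wait() on its children, which cleans up any zombies assigned to it.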
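Reparenting is visible from user space through getppid(). The following is a minimal sketch, assuming a POSIX system; orphan_is_reparented is a hypothetical helper name. A middle process forks a grandchild and exits at once; the grandchild polls getppid() until it no longer reports the middle process and sends the result back to the original process through a pipe.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Return 1 if an orphaned grandchild observed a new parent, 0 if not, -1 on error. */
int orphan_is_reparented(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t middle = fork();
    if (middle == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (middle == 0) {
        pid_t original_parent = getpid();
        pid_t orphan = fork();
        if (orphan == 0) {
            close(fds[0]);
            struct timespec pause = { 0, 1000000 };    /* 1 ms */
            /* Poll, for about a second at most, until the kernel has reparented us. */
            for (int i = 0; i < 1000 && getppid() == original_parent; i++)
                nanosleep(&pause, NULL);
            char changed = (getppid() != original_parent);
            if (write(fds[1], &changed, 1) != 1)
                _exit(1);
            _exit(0);
        }
        _exit(orphan == -1 ? 1 : 0);    /* exit at once, orphaning the grandchild */
    }

    close(fds[1]);
    waitpid(middle, NULL, 0);           /* reap the middle process */
    char changed = 0;
    ssize_t n = read(fds[0], &changed, 1);
    close(fds[0]);
    return n == 1 ? changed : -1;
}
```

The new parent is usually init (PID 1), but it can also be a process that has registered itself as a subreaper, so the sketch only checks that the parent changed rather than testing for a specific PID.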
In this chapter, we looked at the core operating system abstraction of the process. We discussed the generalities of the process, why it is important, and the relationship between processes and threads. We then discussed how Linux stores and represents processes (with task_struct and thread_info), how processes are created (via fork() and ultimately clone()), how new executable images are loaded into address spaces (via the exec() family of system calls), the hierarchy of processes, how parents glean information about their deceased children (via the wait() family of system calls), and how processes ultimately die (forcefully or intentionally via exit()). The process is a fundamental and crucial abstraction, at the heart of every modern operating system, and ultimately the reason we have operating systems altogether (to run programs).
The next chapter discusses process scheduling, which is the delicate and interesting manner in which the kernel decides which processes to run, at what time, and in what order.