1.4 Startup: In The Kernel
Starting execution of a program begins in the kernel, normally in the execve system call. The currently executed code is replaced with a new program. This means the address space content is replaced by the content of the file containing the program. This does not happen by simply mapping (using mmap) the content of the file. ELF files are structured and there are normally at least three different kinds of regions in the file:
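The kernel normally starts executing a program in the execve system call.
The code currently being executed is replaced by the new program.
That is, the content of the address space is replaced by the content of the program file.
This replacement is not done by simply mapping the file with mmap.
An ELF file normally has at least three different kinds of regions: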
• Code which is executed; this region is normally not writable;
• Data which is modified; this region is normally not executable;
• Data which is not used at run-time; since not needed it should not be loaded at startup.
• Code; not writable;
• Modifiable data; not executable;
• Data not used at run time; not loaded at startup.
Modern operating systems and processors can protect memory regions to allow and disallow reading, writing, and executing separately for each page of memory. It is preferable to mark as many pages as possible not writable since this means that the pages can be shared between processes which use the same application or DSO the page is from. Write protection also helps to detect and prevent unintentional or malicious modifications of data or even code.
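Modern operating systems and processors can set read, write, and execute permissions for each page of memory.
Pages marked not writable can be shared between processes that use the same application or DSO.
Write protection also helps to detect and prevent unintentional or malicious modification of data or code.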
For the kernel to find the different regions, or segments in ELF-speak, and their access permissions, the ELF file format defines a table which contains just this information, among other things. The ELF Program Header table, as it is called, must be present in every executable and DSO. It is represented by the C types Elf32_Phdr and Elf64_Phdr which are defined as can be seen in figure 1.
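To let the kernel find the different regions (segments, in ELF terms) and their access permissions, the ELF format defines a table holding this information.
This is the ELF program header table, present in every executable and DSO; see figure 1 for its layout.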
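As a rough stand-in for the layout that figure 1 refers to, the sketch below describes Elf64_Phdr with Python's struct module. It assumes a 64-bit little-endian file; the field order follows the declaration in elf.h, while the names ELF64_PHDR, Elf64Phdr, and unpack_phdr are illustrative and not part of any library.

```python
import struct
from collections import namedtuple

# Elf64_Phdr field order as declared in elf.h.
# Assumption: 64-bit, little-endian file ("<" = little-endian, no padding).
ELF64_PHDR = struct.Struct("<IIQQQQQQ")

Elf64Phdr = namedtuple(
    "Elf64Phdr",
    "p_type p_flags p_offset p_vaddr p_paddr p_filesz p_memsz p_align",
)


def unpack_phdr(raw: bytes) -> Elf64Phdr:
    """Decode one program header entry from its raw bytes."""
    return Elf64Phdr._make(ELF64_PHDR.unpack(raw))
```

Each entry occupies ELF64_PHDR.size bytes, which is the value a 64-bit file is expected to carry in e_phentsize.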
To locate the program header data structure another data structure is needed, the ELF Header. The ELF header is the only data structure which has a fixed place in the file, starting at offset zero. Its C data structure can be seen in figure 2. The e_phoff field specifies where, counting from the beginning of the file, the program header table starts. The e_phnum field contains the number of entries in the program header table and the e_phentsize field contains the size of each entry. This last value is useful only as a run-time consistency check for the binary.
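Locating the program header table requires another data structure, the ELF header.
The ELF header is the only data structure with a fixed place in the file, at offset zero.
Its data structure is shown in figure 2.
e_phoff gives the file offset at which the program header table starts.
e_phnum gives the number of entries, and e_phentsize the size of each entry.
This last value serves only as a run-time consistency check.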
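The lookup described above can be sketched as follows: read the ELF header at offset zero, then derive the file offset of every program header entry from e_phoff, e_phnum, and e_phentsize. This is a minimal sketch that assumes a 64-bit little-endian file; the function name phdr_offsets is hypothetical, and the e_phentsize comparison plays the role of the consistency check.

```python
import struct

# Elf64_Ehdr field order as declared in elf.h (64-bit little-endian assumed).
ELF64_EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
ELF64_PHDR_SIZE = 56  # size of one Elf64_Phdr entry


def phdr_offsets(image: bytes) -> list:
    """Return the file offset of each program header entry."""
    fields = ELF64_EHDR.unpack_from(image, 0)  # header sits at offset zero
    e_ident = fields[0]
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]

    if e_ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if e_ident[4] != 2 or e_ident[5] != 1:
        raise ValueError("sketch handles only 64-bit little-endian files")
    if e_phentsize != ELF64_PHDR_SIZE:
        raise ValueError("e_phentsize does not match Elf64_Phdr")

    return [e_phoff + i * e_phentsize for i in range(e_phnum)]
```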
The different segments are represented by the program header entries with the PT_LOAD value in the p_type field. The p_offset and p_filesz fields specify where in the file the segment starts and how long it is. The p_vaddr and p_memsz fields specify where the segment is located in the process's virtual address space and how large the memory region is. The value of the p_vaddr field itself is not necessarily required to be the final load address. DSOs can be loaded at arbitrary addresses in the virtual address space. But the relative position of the segments is important. For pre-linked DSOs the actual value of the p_vaddr field is meaningful: it specifies the address for which the DSO was pre-linked. But even this does not mean the dynamic linker cannot ignore this information if necessary.
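The loadable segments are the program header entries whose p_type is PT_LOAD.
p_offset and p_filesz give the start of the segment in the file and its length.
p_vaddr and p_memsz give the location of the segment in the process's virtual address space and the size of the memory region.
The value of p_vaddr is not necessarily the final load address.
A DSO can be loaded at an arbitrary address in the virtual address space.
But the relative position of the segments matters.
For pre-linked DSOs, p_vaddr is meaningful: it is the address for which the DSO was pre-linked.
Even so, the dynamic linker can ignore this information if necessary.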
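To illustrate why only the relative position of the segments matters, the sketch below shifts every PT_LOAD segment of a DSO by one common offset so that the lowest segment lands at a chosen base address. The 4096-byte page size, the (p_vaddr, p_memsz) tuple representation, and the function names are assumptions made for this example.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def load_bias(vaddrs, base):
    """Offset to add to every p_vaddr when the object is placed at base."""
    lowest = min(vaddrs) & ~(PAGE_SIZE - 1)  # round down to a page boundary
    return base - lowest


def place_segments(segments, base):
    """segments: list of (p_vaddr, p_memsz) taken from PT_LOAD entries.

    Returns the run-time (address, size) pairs. Every segment moves by the
    same bias, so the distances between segments are unchanged.
    """
    bias = load_bias([vaddr for vaddr, _ in segments], base)
    return [(vaddr + bias, memsz) for vaddr, memsz in segments]
```

For example, segments at p_vaddr 0x0 and 0x201e00 placed at base 0x7f0000000000 end up 0x201e00 bytes apart, exactly as in the file's own view.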
The size in the file can be smaller than the address space it takes up in memory. The first p_filesz bytes of the memory region are initialized from the data of the segment in the file; the remainder is initialized with zero. This can be used to handle BSS sections, sections for uninitialized variables which are, according to the C standard, initialized with zero. Handling uninitialized variables this way has the advantage that the file size can be reduced since no initialization value has to be stored, no data has to be copied from disk to memory, and the memory provided by the OS via the mmap interface is already initialized with zero.
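The size in the file can be smaller than the space the segment occupies in memory.
The first p_filesz bytes are initialized from the segment data in the file; the rest is set to zero.
This can be used to handle BSS sections (uninitialized variables, which the C standard initializes to zero).
The advantages: no initial values need to be stored, so the file is smaller; nothing has to be copied from disk; and memory obtained through mmap is already zeroed.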
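The relationship between p_filesz and p_memsz can be modeled in a few lines. The sketch below builds the in-memory content of one segment by copying p_filesz bytes from the file and padding with zeros up to p_memsz. It is only a model: a real loader maps pages instead of copying bytes, and the function name segment_image is hypothetical.

```python
def segment_image(file_bytes: bytes, p_offset: int,
                  p_filesz: int, p_memsz: int) -> bytes:
    """Return the initial memory content of one PT_LOAD segment."""
    if p_filesz > p_memsz:
        raise ValueError("p_filesz must not exceed p_memsz")
    initialized = file_bytes[p_offset:p_offset + p_filesz]
    if len(initialized) != p_filesz:
        raise ValueError("file is shorter than the segment claims")
    # The difference (for example, BSS) is not stored in the file; it is zero.
    return initialized + bytes(p_memsz - p_filesz)
```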
Finally, the p_flags field tells the kernel what permissions to use for the memory pages. This field is a bitmap with the bits given in the following table being defined. The flags are directly mapped to the flags mmap understands.
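p_flags tells the kernel which permissions to use for the memory pages.
The field is a bitmap whose bits are defined in the following table.
The flags map directly to the flags mmap understands.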
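The mapping from p_flags bits to mmap protection flags can be sketched as below. The PF_* values are those defined in elf.h; the PROT_* values are the numeric values used on Linux and are written out here as an assumption so that the example does not depend on the platform. The function name prot_from_p_flags is illustrative.

```python
# p_flags bits as defined in elf.h
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4

# mmap protection bits (assumption: Linux numeric values)
PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4


def prot_from_p_flags(p_flags: int) -> int:
    """Translate a segment's p_flags bitmap into mmap protection bits."""
    prot = 0
    if p_flags & PF_R:
        prot |= PROT_READ
    if p_flags & PF_W:
        prot |= PROT_WRITE
    if p_flags & PF_X:
        prot |= PROT_EXEC
    return prot
```

A typical code segment carries PF_R | PF_X and a typical data segment PF_R | PF_W, which matches the earlier observation that code is normally not writable and data normally not executable.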
After mapping all the PT_LOAD segments using the appropriate permissions and the specified address, or after freely allocating an address for dynamic objects which have no fixed load address, the next phase can start. The virtual address space of the dynamically linked executable itself is set up. But the binary is not complete. The kernel has to get the dynamic linker to do the rest and for this the dynamic linker has to be loaded in the same way as the executable itself (i.e., look for the loadable segments in the program header). The difference is that the dynamic linker itself must be complete and should be freely relocatable.
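Once all PT_LOAD segments are mapped with their addresses and permissions, or an address has been freely allocated for dynamic objects, the next phase can start.
The virtual address space of the dynamically linked executable itself is now set up.
But the binary is not yet complete.
The kernel has to get the dynamic linker to do the rest, so the dynamic linker is loaded the same way as the executable.
The difference is that the dynamic linker itself must be complete and should be freely relocatable.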
Which binary implements the dynamic linker is not hard-coded in the kernel. Instead the program header of the application contains an entry with the tag PT_INTERP. The p_offset field of this entry contains the offset of a NUL-terminated string which specifies the file name of this file. The only requirement on the named file is that its load address does not conflict with the load address of any possible executable it might be used with. In general this means that the dynamic linker has no fixed load address and can be loaded anywhere; this is just what dynamic binaries allow.
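Which binary implements the dynamic linker is not hard-coded in the kernel.
Instead, the program header of the application contains an entry with the tag PT_INTERP.
The p_offset field of this entry holds the offset of a NUL-terminated string naming the file.
The only requirement is that its load address does not conflict with that of any executable it might be used with.
So the dynamic linker has no fixed load address and can be loaded anywhere, which is just what dynamic binaries allow.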
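Reading the interpreter name amounts to extracting a NUL-terminated string at the p_offset of the PT_INTERP entry. The sketch below assumes the file content is already in memory as bytes and that the name is plain ASCII; the function name read_interp is hypothetical.

```python
def read_interp(image: bytes, p_offset: int) -> str:
    """Return the dynamic linker path named by a PT_INTERP entry.

    image    -- full content of the ELF file
    p_offset -- p_offset field of the PT_INTERP program header entry
    """
    end = image.index(b"\x00", p_offset)  # raises ValueError if no NUL follows
    return image[p_offset:end].decode("ascii")
```

On a typical x86-64 Linux system with glibc the string is often /lib64/ld-linux-x86-64.so.2, but the value is whatever the link editor recorded in the file.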
Once the dynamic linker has also been mapped into the memory of the to-be-started process we can start the dynamic linker. Note it is not the entry point of the application to which control is transferred. Only the dynamic linker is ready to run. Instead of calling the dynamic linker right away, one more step is performed. The dynamic linker somehow has to be told where the application can be found and where control has to be transferred to once the application is complete. For this a structured way exists. The kernel puts an array of tag-value pairs on the stack of the new process. This auxiliary vector contains, besides the two aforementioned values, several more values which allow the dynamic linker to avoid several system calls. The elf.h header file defines a number of constants with an AT_ prefix. These are the tags for the entries in the auxiliary vector.
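Once the dynamic linker has been mapped into the address space of the new process, it can be started.
Note that control is not transferred to the entry point of the application.
Only the dynamic linker is ready to run.
One more step is performed before the dynamic linker is called.
The dynamic linker has to be told where the application can be found and where to transfer control once the application is complete.
A structured mechanism exists for this.
The kernel puts an array of tag-value pairs on the stack of the new process.
Besides these two values, the auxiliary vector contains several more that let the dynamic linker avoid several system calls.
elf.h defines a number of constants with an AT_ prefix.
These are the tags for the entries in the auxiliary vector.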
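The tag-value structure of the auxiliary vector can be sketched as a simple parser. The example assumes a 64-bit little-endian process, where each entry is a pair of 8-byte integers and the vector ends with an AT_NULL entry. The tag numbers are the ones defined in elf.h; reading AT_PHDR and AT_ENTRY as the "where the application is" and "where to transfer control" values is an interpretation, and the function name parse_auxv is illustrative.

```python
import struct

# A few AT_* tags as defined in elf.h
AT_NULL = 0    # end of vector
AT_PHDR = 3    # address of the application's program headers
AT_PHENT = 4   # size of one program header entry
AT_PHNUM = 5   # number of program header entries
AT_PAGESZ = 6  # system page size
AT_BASE = 7    # base address of the dynamic linker
AT_ENTRY = 9   # entry point of the application


def parse_auxv(raw: bytes) -> dict:
    """Decode raw auxiliary vector bytes into a {tag: value} dictionary."""
    entries = {}
    for tag, value in struct.iter_unpack("<QQ", raw):
        if tag == AT_NULL:  # terminator: ignore anything after it
            break
        entries[tag] = value
    return entries
```

With the page size and the program header location available this way, the dynamic linker does not have to ask the kernel for them again.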
After setting up the auxiliary vector the kernel is finally ready to transfer control to the dynamic linker in user mode. The entry point is defined in the e_entry field of the ELF header of the dynamic linker.
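After setting up the auxiliary vector, the kernel transfers control to the dynamic linker in user mode.
The entry point is given by the e_entry field of the dynamic linker's ELF header.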