How To Write Shared Libraries(15)

1.5.5 GOT and PLT

The Global Offset Table (GOT) and Procedure Linkage Table (PLT) are the two data structures central to the ELF run-time. We will now introduce the reasons why they are used and what consequences arise from that.

Relocations are created for source constructs like

extern int foo;
extern int bar (int);
int call_bar (void) {
  return bar (foo);
}

The call to bar requires two relocations: one to load the value of foo and another one to find the address of bar. If the code were generated knowing the addresses of the variable and the function, the assembler instructions would directly load from or jump to the address. For IA-32 the code would look like this:

pushl  foo
call   bar

This would encode the addresses of foo and bar as part of the instructions in the text segment. If an address is known only to the dynamic linker, the text segment would have to be modified at run-time. According to what we learned above, this must be avoided.

Therefore the code generated for DSOs, i.e., when using -fpic or -fPIC, looks like this:

movl   foo@GOT(%ebx), %eax
pushl  (%eax)
call   bar@PLT

The address of the variable foo is now not part of the instruction. Instead it is loaded from the GOT. The address of the location in the GOT relative to the PIC register value (%ebx) is known at link-time. Therefore the text segment does not have to be changed, only the GOT.
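
The indirection described above can be modelled in plain C. This is only a sketch: the one-slot array got, the functions fill_got and load_foo, and the variable foo_storage are hypothetical names chosen for illustration, and a real GOT is laid out by the link editor and filled by the dynamic linker, not by program code.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the variable foo, imagined to be defined in another object. */
int foo_storage = 42;

/* Hypothetical one-slot GOT: it holds the address of foo, not its value. */
enum { GOT_FOO = 0, GOT_SIZE = 1 };
void *got[GOT_SIZE];

/* Plays the role of the dynamic linker: the only place an address is written.
   The "text" (load_foo below) never has to change. */
void fill_got(void)
{
    got[GOT_FOO] = &foo_storage;
}

/* Mirrors "movl foo@GOT(%ebx), %eax; pushl (%eax)":
   first load the address from the GOT slot, then load the value through it. */
int load_foo(void)
{
    assert(got[GOT_FOO] != NULL);   /* the slot must have been relocated first */
    int *addr = got[GOT_FOO];
    return *addr;
}
```

Calling fill_got() once and then load_foo() returns the current value of foo_storage. The extra pointer dereference in load_foo corresponds to the additional load instruction that GOT access costs.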

The situation for the function call is similar. The function bar is not called directly. Instead control is transferred to a stub for bar in the PLT (indicated by bar@PLT). For IA-32 the PLT itself does not have to be modified and can be placed in a read-only segment; each entry is 16 bytes in size. Only the GOT is modified, and each of its entries consists of 4 bytes. The structure of the PLT in an IA-32 DSO looks like this:

.PLT0:pushl 4(%ebx)
      jmp *8(%ebx)
      nop; nop
      nop; nop
.PLT1:jmp *name1@GOT(%ebx)
      pushl $offset1
      jmp .PLT0@PC
.PLT2:jmp *name2@GOT(%ebx)
      pushl $offset2
      jmp .PLT0@PC

This shows three entries; there are as many as needed, all having the same size. The first entry, labeled .PLT0, is special. It is used internally, as we will see. Each of the following entries belongs to exactly one function symbol. The first instruction is an indirect jump where the address is taken from a slot in the GOT. Each PLT entry has one GOT slot.

At startup time the dynamic linker fills the GOT slot with the address of the second instruction of the corresponding PLT entry. That is, when the PLT entry is used for the first time the jump ends at the following pushl instruction. The value pushed on the stack is also specific to the PLT slot: it is the offset of the relocation entry for the function which should be called. Then control is transferred to the special first PLT entry, which pushes some more values on the stack and finally jumps into the dynamic linker. The dynamic linker has to make sure that the third GOT slot (offset 8) contains the address of the entry point in the dynamic linker.

Once the dynamic linker has determined the address of the function, it stores the result in the GOT entry which was used in the jmp instruction at the beginning of the PLT entry, and then jumps to the found function. This has the effect that all future uses of the PLT entry will not go through the dynamic linker, but will instead transfer directly to the function. The overhead for all but the first call is therefore "only" one indirect jump.
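
The lazy-binding sequence just described can be imitated with a function pointer that patches itself on first use. The following is a minimal sketch, not how the dynamic linker is actually implemented: got_bar, bar_lazy_stub, call_bar_via_plt, bar_impl and resolver_calls are all hypothetical names, and the symbol lookup is reduced to a single assignment.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_t)(int);

/* The real function, imagined to live in another object. */
int bar_impl(int x) { return x + 1; }

/* Counts how often the simulated dynamic linker is entered. */
int resolver_calls = 0;

int bar_lazy_stub(int x);

/* Hypothetical GOT slot for bar. At startup it points back into the
   PLT entry (modelled by the lazy stub), not at the function itself. */
func_t got_bar = bar_lazy_stub;

/* Stands in for "pushl $offset; jmp .PLT0" plus the dynamic linker:
   find the function, patch the GOT slot, then transfer to the function. */
int bar_lazy_stub(int x)
{
    resolver_calls++;
    got_bar = bar_impl;      /* future calls bypass the resolver */
    return got_bar(x);
}

/* Stands in for "jmp *name@GOT(%ebx)": one indirect jump through the slot. */
int call_bar_via_plt(int x)
{
    assert(got_bar != NULL);
    return got_bar(x);
}
```

The first call to call_bar_via_plt goes through bar_lazy_stub and increments resolver_calls. Every later call reaches bar_impl directly through the patched slot, which matches the "only one indirect jump" overhead mentioned above.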

The PLT stub is always used if the function is not guaranteed to be defined in the object which references it. Please note that a simple definition in the referencing object is not enough to avoid the PLT entry. Looking at the symbol lookup process, it should be clear that the definition could be found in another object (interposition), in which case the PLT is needed. We will later explain exactly when and how to avoid PLT entries.
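
To see why a local definition alone does not settle the matter, the lookup can be sketched as a walk over the objects in the lookup scope where the first definition found wins. This is a simplified model under stated assumptions: object_t, resolve, demo_scope and the object names "app" and "libdemo.so" are invented for illustration, and real lookups use hash tables and take symbol binding and versioning into account.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One loaded object with the names of the symbols it defines. */
typedef struct {
    const char        *soname;
    const char *const *symbols;
    size_t             count;
} object_t;

/* Walk the lookup scope in order; the first object defining `name` wins. */
const object_t *resolve(const object_t *scope, size_t n, const char *name)
{
    assert(scope != NULL && name != NULL);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < scope[i].count; j++)
            if (strcmp(scope[i].symbols[j], name) == 0)
                return &scope[i];
    return NULL;
}

/* Hypothetical scope: the executable comes first, then the DSO.
   Both define bar. */
static const char *const app_syms[] = { "main", "bar" };
static const char *const lib_syms[] = { "call_bar", "bar" };
const object_t demo_scope[] = {
    { "app",        app_syms, 2 },
    { "libdemo.so", lib_syms, 2 },
};
```

In this model resolve(demo_scope, 2, "bar") returns the entry for "app", even though libdemo.so contains its own definition of bar. A call to bar from inside libdemo.so therefore cannot be bound at link-time and has to go through the PLT.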

How exactly the GOT and PLT are structured is architecture-specific and is specified in the respective psABI. What was said here about IA-32 is in some form applicable to some other architectures, but not to all. For instance, while the PLT on IA-32 is read-only, it must be writable on other architectures since, instead of using indirect jumps through GOT values, the PLT entries are modified directly. A reader might think that the designers of the IA-32 ABI made a mistake by requiring an indirect, and therefore slower, call instead of a direct call. This is no mistake, though. Having a writable and executable segment is a huge security problem since attackers can simply write arbitrary code into the PLT and take over the program. We can in any case summarize the costs of using the GOT and PLT like this:

• every use of a global variable which is exported uses a GOT entry and loads the variable value indirectly;
• each function which is called (as opposed to referenced as a variable) and which is not guaranteed to be defined in the calling object requires a PLT entry. The function call is performed indirectly by first transferring control to the code in the PLT entry, which in turn calls the function;
• for some architectures each PLT entry requires at least one GOT entry.

On IA-32, avoiding a jump through the PLT therefore removes 16 bytes of text and 4 bytes of data. Avoiding the GOT when accessing a global variable saves 4 bytes of data and one load instruction (i.e., at least 3 bytes of code, plus cycles during execution). In addition, each GOT entry has a relocation associated with it, with the costs described above.
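
As a rough worked example, the IA-32 figures above can be turned into a lower-bound estimate. The helper below is a sketch only: estimate_savings, savings_t and the constant names are hypothetical, the 3 bytes per load are the minimum quoted in the text, and relocation costs are not included.

```c
#include <assert.h>

/* IA-32 figures quoted in the text. */
enum {
    PLT_ENTRY_TEXT_BYTES    = 16,  /* size of one PLT entry */
    GOT_SLOT_BYTES          = 4,   /* size of one GOT slot */
    GOT_LOAD_MIN_CODE_BYTES = 3    /* minimum size of the extra load */
};

typedef struct {
    unsigned text_bytes;
    unsigned data_bytes;
} savings_t;

/* Lower bound on the bytes saved when `plt_entries` PLT entries and the GOT
   entries of `got_vars` variables are avoided, where those variables are
   accessed at `access_sites` places in the code. */
savings_t estimate_savings(unsigned plt_entries, unsigned got_vars,
                           unsigned access_sites)
{
    /* A variable only has a GOT entry if some code accesses it. */
    assert(access_sites >= got_vars);

    savings_t s;
    s.text_bytes = plt_entries * PLT_ENTRY_TEXT_BYTES
                 + access_sites * GOT_LOAD_MIN_CODE_BYTES;
    /* Each PLT entry has one GOT slot; each variable has one as well. */
    s.data_bytes = (plt_entries + got_vars) * GOT_SLOT_BYTES;
    return s;
}
```

For instance, estimate_savings(100, 20, 50) yields 1750 bytes of text (100 × 16 + 50 × 3) and 480 bytes of data ((100 + 20) × 4).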

