The previous article explained how position independent code (PIC) works, with code compiled for the x86 architecture as an example. I promised to cover PIC on x64[1] in a separate article, so here we are. This article will go into much less detail, since it assumes an understanding of how PIC works in theory. In general, the idea is similar for both platforms, but some details differ because of unique features of each architecture.
On x86, while function references (with the call instruction) use relative offsets from the instruction pointer, data references (with the mov instruction) only support absolute addresses. As we’ve seen in the previous article, this makes PIC code somewhat less efficient, since PIC by its nature requires making all offsets IP-relative; absolute addresses and position independence don’t go well together.
x64 fixes that, with a new "RIP-relative addressing mode", which is the default for all 64-bit mov instructions that reference memory (it’s used for other instructions as well, such as lea). A quote from the "Intel Architecture Manual vol 2a":
A new addressing form, RIP-relative (relative instruction-pointer) addressing, is implemented in 64-bit mode. An effective address is formed by adding displacement to the 64-bit RIP of the next instruction.
The displacement used in RIP-relative mode is 32 bits in size. Since it should be useful for both positive and negative offsets, roughly +/- 2GB is the maximal offset from RIP supported by this addressing mode.
For easier comparison, I will use the same C source as in the data reference example of the previous article:
    int myglob = 42;

    int ml_func(int a, int b)
    {
        return myglob + a + b;
    }
Let’s look at the disassembly of ml_func:
    00000000000005ec <ml_func>:
     5ec:   55                      push   rbp
     5ed:   48 89 e5                mov    rbp,rsp
     5f0:   89 7d fc                mov    DWORD PTR [rbp-0x4],edi
     5f3:   89 75 f8                mov    DWORD PTR [rbp-0x8],esi
     5f6:   48 8b 05 db 09 20 00    mov    rax,QWORD PTR [rip+0x2009db]
     5fd:   8b 00                   mov    eax,DWORD PTR [rax]
     5ff:   03 45 fc                add    eax,DWORD PTR [rbp-0x4]
     602:   03 45 f8                add    eax,DWORD PTR [rbp-0x8]
     605:   c9                      leave
     606:   c3                      ret
The most interesting instruction here is at 0x5f6: it places the address of myglob into rax, by referencing an entry in the GOT. As we can see, it uses RIP-relative addressing. Since it’s relative to the address of the next instruction, what we actually get is 0x5fd + 0x2009db = 0x200fd8. So the GOT entry holding the address of myglob is at 0x200fd8. Let’s check if it makes sense:
    $ readelf -S libmlpic_dataonly.so
    There are 35 section headers, starting at offset 0x13a8:

    Section Headers:
      [Nr] Name              Type             Address           Offset
           Size              EntSize          Flags  Link  Info  Align
    [...]
      [20] .got              PROGBITS         0000000000200fc8  00000fc8
           0000000000000020  0000000000000008  WA       0     0     8
    [...]
The GOT starts at 0x200fc8, so myglob is in its third entry. We can also see the relocation inserted for the GOT reference to myglob:
    $ readelf -r libmlpic_dataonly.so

    Relocation section '.rela.dyn' at offset 0x450 contains 5 entries:
      Offset          Info           Type           Sym. Value    Sym. Name + Addend
    [...]
    000000200fd8  000500000006 R_X86_64_GLOB_DAT 0000000000201010 myglob + 0
    [...]
Indeed, there is a relocation entry for 0x200fd8 telling the dynamic linker to place the address of myglob into it once the final address of this symbol is known.
So it should be quite clear how the address of myglob is obtained in the code. The next instruction in the disassembly (at 0x5fd) then dereferences the address to get the value of myglob into eax[2].
Translator’s addition:
    (gdb) set environment LD_LIBRARY_PATH=.
    (gdb) set disassembly-flavor intel
    (gdb) break ml_func
    Breakpoint 1 at 0x400510
    (gdb) run
    Starting program: /home/astrol/c/dynamic/position_independent_code_in_shared_libraries_on_x64/driver

    Breakpoint 1, ml_func (a=1, b=1) at ml_func.c:4
    4           return myglob + a + b;
    (gdb) disassemble
    Dump of assembler code for function ml_func:
       0x00007ffff7bd859c <+0>:     push   rbp
       0x00007ffff7bd859d <+1>:     mov    rbp,rsp
       0x00007ffff7bd85a0 <+4>:     mov    DWORD PTR [rbp-0x4],edi
       0x00007ffff7bd85a3 <+7>:     mov    DWORD PTR [rbp-0x8],esi
    => 0x00007ffff7bd85a6 <+10>:    mov    rax,QWORD PTR [rip+0x200a1b]
       0x00007ffff7bd85ad <+17>:    mov    eax,DWORD PTR [rax]
       0x00007ffff7bd85af <+19>:    add    eax,DWORD PTR [rbp-0x4]
       0x00007ffff7bd85b2 <+22>:    add    eax,DWORD PTR [rbp-0x8]
       0x00007ffff7bd85b5 <+25>:    pop    rbp
       0x00007ffff7bd85b6 <+26>:    ret
    End of assembler dump.
    (gdb) print/x *(long long *)(0x00007ffff7bd85ad + 0x200a1b)
    $1 = 0x7ffff7dd9010
    (gdb) print/x &myglob
    $2 = 0x7ffff7dd9010
    (gdb)
Now let’s see how function calls work with PIC code on x64. Once again, we’ll use the same example from the previous article:
    int myglob = 42;

    int ml_util_func(int a)
    {
        return a + 1;
    }

    int ml_func(int a, int b)
    {
        int c = b + ml_util_func(a);
        myglob += c;
        return b + myglob;
    }
Disassembling ml_func, we get:
    000000000000064b <ml_func>:
     64b:   55                      push   rbp
     64c:   48 89 e5                mov    rbp,rsp
     64f:   48 83 ec 20             sub    rsp,0x20
     653:   89 7d ec                mov    DWORD PTR [rbp-0x14],edi
     656:   89 75 e8                mov    DWORD PTR [rbp-0x18],esi
     659:   8b 45 ec                mov    eax,DWORD PTR [rbp-0x14]
     65c:   89 c7                   mov    edi,eax
     65e:   e8 fd fe ff ff          call   560 <ml_util_func@plt>
     [... snip more code ...]
The call is, as before, to ml_util_func@plt. Let’s see what’s there:
    0000000000000560 <ml_util_func@plt>:
     560:   ff 25 a2 0a 20 00       jmp    QWORD PTR [rip+0x200aa2]
     566:   68 01 00 00 00          push   0x1
     56b:   e9 d0 ff ff ff          jmp    540 <_init+0x18>
So, the GOT entry holding the actual address of ml_util_func is at 0x200aa2 + 0x566 = 0x201008.
And there’s a relocation for it, as expected:
    $ readelf -r libmlpic.so

    Relocation section '.rela.dyn' at offset 0x480 contains 5 entries:
    [...]

    Relocation section '.rela.plt' at offset 0x4f8 contains 2 entries:
      Offset          Info           Type           Sym. Value    Sym. Name + Addend
    [...]
    000000201008  000600000007 R_X86_64_JUMP_SLO 000000000000063c ml_util_func + 0
    (gdb) set environment LD_LIBRARY_PATH=.
    (gdb) set disassembly-flavor intel
    (gdb) break ml_func
    Breakpoint 1 at 0x400500
    (gdb) run
    Starting program: /home/astrol/c/dynamic/position_independent_code_in_shared_libraries_on_x64/driver

    Breakpoint 1, ml_func (a=1, b=1) at ml_func2.c:10
    10          int c = b + ml_util_func(a);
    (gdb) disassemble
    Dump of assembler code for function ml_func:
       0x00007ffff7bd860b <+0>:     push   rbp
       0x00007ffff7bd860c <+1>:     mov    rbp,rsp
       0x00007ffff7bd860f <+4>:     sub    rsp,0x20
       0x00007ffff7bd8613 <+8>:     mov    DWORD PTR [rbp-0x14],edi
       0x00007ffff7bd8616 <+11>:    mov    DWORD PTR [rbp-0x18],esi
    => 0x00007ffff7bd8619 <+14>:    mov    eax,DWORD PTR [rbp-0x14]
       0x00007ffff7bd861c <+17>:    mov    edi,eax
       0x00007ffff7bd861e <+19>:    call   0x7ffff7bd8510 <ml_util_func@plt>
       0x00007ffff7bd8623 <+24>:    add    eax,DWORD PTR [rbp-0x18]
       0x00007ffff7bd8626 <+27>:    mov    DWORD PTR [rbp-0x4],eax
       0x00007ffff7bd8629 <+30>:    mov    rax,QWORD PTR [rip+0x200998]
       0x00007ffff7bd8630 <+37>:    mov    eax,DWORD PTR [rax]
       0x00007ffff7bd8632 <+39>:    mov    edx,eax
       0x00007ffff7bd8634 <+41>:    add    edx,DWORD PTR [rbp-0x4]
       0x00007ffff7bd8637 <+44>:    mov    rax,QWORD PTR [rip+0x20098a]
       0x00007ffff7bd863e <+51>:    mov    DWORD PTR [rax],edx
       0x00007ffff7bd8640 <+53>:    mov    rax,QWORD PTR [rip+0x200981]
       0x00007ffff7bd8647 <+60>:    mov    eax,DWORD PTR [rax]
       0x00007ffff7bd8649 <+62>:    add    eax,DWORD PTR [rbp-0x18]
       0x00007ffff7bd864c <+65>:    leave
       0x00007ffff7bd864d <+66>:    ret
    End of assembler dump.
    (gdb) disassemble 0x7ffff7bd8510,+0x10
    Dump of assembler code from 0x7ffff7bd8510 to 0x7ffff7bd8520:
       0x00007ffff7bd8510 <ml_util_func@plt+0>:     jmp    QWORD PTR [rip+0x200aea]
       0x00007ffff7bd8516 <ml_util_func@plt+6>:     push   0x0
       0x00007ffff7bd851b <ml_util_func@plt+11>:    jmp    0x7ffff7bd8500
    End of assembler dump.
    (gdb) print/x *(long long *)(0x00007ffff7bd8516 + 0x200aea)
    $1 = 0x7ffff7bd8516
    (gdb) step
    ml_util_func (a=1) at ml_func2.c:5
    5           return a + 1;
    (gdb) print/x *(long long *)(0x00007ffff7bd8516 + 0x200aea)
    $2 = 0x7ffff7bd85fc
    (gdb) print &ml_util_func
    $3 = (int (*)(int)) 0x7ffff7bd85fc <ml_util_func>
    (gdb)
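As both examples show, PIC on x64 needs fewer instructions than on x86. On x86, the GOT address is loaded into a register (ebx by convention) in two steps: first the value of the instruction pointer is obtained with a special function call, and then the offset to the GOT is added to it. Neither step is needed on x64, since the linker knows the relative offset between the instruction and the GOT and can simply encode it in the instruction using RIP-relative addressing.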
When calling a function, there’s also no need to prepare the GOT address in ebx for the trampoline, as the x86 code does, since the trampoline just accesses its GOT entry directly through RIP-relative addressing.
So PIC on x64 still requires extra instructions when compared to non-PIC code, but the additional cost is smaller. The indirect cost of tying down a register to use as the GOT pointer (which is painful on x86) is also gone, since no such register is needed with RIP-relative addressing [3]. All in all, x64 PIC results in a much smaller performance hit than on x86, making it much more attractive. So attractive, in fact, that it’s the default method for writing shared libraries for this architecture.
Not only does gcc encourage you to use PIC for shared libraries on x64, it requires it by default. For instance, if we compile the first example without -fpic[4] and then try to link it into a shared library (with -shared), we’ll get an error from the linker, something like this:
    /usr/bin/ld: ml_nopic_dataonly.o: relocation R_X86_64_PC32 against symbol `myglob' can not be used when making a shared object; recompile with -fPIC
    /usr/bin/ld: final link failed: Bad value
    collect2: ld returned 1 exit status
What’s going on? Let’s look at the disassembly of ml_nopic_dataonly.o [5]:
    0000000000000000 <ml_func>:
       0:   55                      push   rbp
       1:   48 89 e5                mov    rbp,rsp
       4:   89 7d fc                mov    DWORD PTR [rbp-0x4],edi
       7:   89 75 f8                mov    DWORD PTR [rbp-0x8],esi
       a:   8b 05 00 00 00 00       mov    eax,DWORD PTR [rip+0x0]
      10:   03 45 fc                add    eax,DWORD PTR [rbp-0x4]
      13:   03 45 f8                add    eax,DWORD PTR [rbp-0x8]
      16:   c9                      leave
      17:   c3                      ret
Note how myglob is accessed here, in the instruction at address 0xa. It expects the linker to patch the actual location of myglob into the operand of the instruction (so no GOT redirection is required):
    $ readelf -r ml_nopic_dataonly.o

    Relocation section '.rela.text' at offset 0xb38 contains 1 entries:
      Offset          Info           Type           Sym. Value    Sym. Name + Addend
    00000000000c  000f00000002 R_X86_64_PC32     0000000000000000 myglob - 4
    [...]
Here is the R_X86_64_PC32 relocation the linker was complaining about. It just can’t link an object with such a relocation into a shared library. Why? Because the displacement of the mov (the part that’s added to rip) must fit in 32 bits, and when code gets into a shared library, we just can’t know in advance that 32 bits will be enough. After all, this is a full 64-bit architecture, with a vast address space. The symbol may eventually be found in some shared library that’s farther away from the reference than a 32-bit displacement can reach. This makes R_X86_64_PC32 an invalid relocation for shared libraries on x64.
But can we still somehow create non-PIC code on x64? Yes! We should instruct the compiler to use the "large code model", by adding the -mcmodel=large flag. The topic of code models is interesting, but explaining it would just take us too far from the real goal of this article[6]. So I’ll just say briefly that a code model is a kind of agreement between the programmer and the compiler, where the programmer makes a certain promise to the compiler about the size of offsets the program will be using. In exchange, the compiler can generate better code.
It turns out that to make the compiler generate non-PIC code on x64 that actually pleases the linker, only the large code model is suitable, because it’s the least restrictive. Remember how I explained why the simple relocation isn’t good enough on x64, for fear of an offset that ends up farther than 32 bits away during linking? Well, the large code model basically gives up on all offset assumptions and uses the largest 64-bit offsets for all its data references. This makes load-time relocations always safe, and enables non-PIC code generation on x64. Let’s see the disassembly of the first example compiled without -fpic and with -mcmodel=large:
    0000000000000000 <ml_func>:
       0:   55                      push   rbp
       1:   48 89 e5                mov    rbp,rsp
       4:   89 7d fc                mov    DWORD PTR [rbp-0x4],edi
       7:   89 75 f8                mov    DWORD PTR [rbp-0x8],esi
       a:   48 b8 00 00 00 00 00    mov    rax,0x0
      11:   00 00 00
      14:   8b 00                   mov    eax,DWORD PTR [rax]
      16:   03 45 fc                add    eax,DWORD PTR [rbp-0x4]
      19:   03 45 f8                add    eax,DWORD PTR [rbp-0x8]
      1c:   c9                      leave
      1d:   c3                      ret
The instruction at address 0xa places the address of myglob into rax. Note that its argument is currently 0, which tells us to expect a relocation. Note also that it has a full 64-bit address argument. Moreover, the argument is absolute and not RIP-relative[7]. Note also that two instructions are actually required here to get the value of myglob into eax. This is one reason why the large code model is less efficient than the alternatives.
Now let’s see the relocations:
    $ readelf -r ml_nopic_dataonly.o

    Relocation section '.rela.text' at offset 0xb40 contains 1 entries:
      Offset          Info           Type           Sym. Value    Sym. Name + Addend
    00000000000c  000f00000001 R_X86_64_64       0000000000000000 myglob + 0
    [...]
Note the relocation type has changed to R_X86_64_64, which is an absolute relocation that can have a 64-bit value. It’s acceptable to the linker, which will now gladly agree to link this object file into a shared library.
Translator’s addition:
    (gdb) set environment LD_LIBRARY_PATH=.
    (gdb) set disassembly-flavor intel
    (gdb) break ml_func
    Breakpoint 1 at 0x400510
    (gdb) run
    Starting program: /home/astrol/c/dynamic/position_independent_code_in_shared_libraries_on_x64/driver

    Breakpoint 1, ml_func (a=1, b=1) at ml_func.c:4
    4           return myglob + a + b;
    (gdb) disassemble
    Dump of assembler code for function ml_func:
       0x00007ffff7bd859c <+0>:     push   rbp
       0x00007ffff7bd859d <+1>:     mov    rbp,rsp
       0x00007ffff7bd85a0 <+4>:     mov    DWORD PTR [rbp-0x4],edi
       0x00007ffff7bd85a3 <+7>:     mov    DWORD PTR [rbp-0x8],esi
    => 0x00007ffff7bd85a6 <+10>:    movabs rax,0x7ffff7dd9010
       0x00007ffff7bd85b0 <+20>:    mov    eax,DWORD PTR [rax]
       0x00007ffff7bd85b2 <+22>:    add    eax,DWORD PTR [rbp-0x4]
       0x00007ffff7bd85b5 <+25>:    add    eax,DWORD PTR [rbp-0x8]
       0x00007ffff7bd85b8 <+28>:    pop    rbp
       0x00007ffff7bd85b9 <+29>:    ret
    End of assembler dump.
    (gdb) print/x &myglob
    $1 = 0x7ffff7dd9010
    (gdb)
Some critical thinking may lead you to wonder why the compiler generates code that isn’t suitable for load-time relocation by default. The answer is simple. Don’t forget that code also tends to get directly linked into executables, which don’t require load-time relocations at all. Therefore, by default the compiler assumes the small code model to generate the most efficient code. If you know your code is going to get into a shared library, and you don’t want PIC, then just tell it to use the large code model explicitly. I think gcc’s behavior makes sense here.
Another thing to think about is why there are no problems with PIC code using the small code model. The reason is that the GOT is always located in the same shared library as the code that references it, and unless a single shared library is too big for 32-bit offsets, there should be no problems addressing the GOT with 32-bit RIP-relative offsets. Such huge shared libraries are unlikely, but in case you’re working on one, the AMD64 ABI has a "large PIC code model" for this purpose.
This article complements its predecessor by showing how PIC works on the x64 architecture. This architecture has a new addressing mode that helps PIC code be faster, and thus makes it more desirable for shared libraries than on x86, where the cost is higher. Since x64 is currently the most popular architecture used in servers, desktops and laptops, this is important to know. Therefore, I tried to focus on additional aspects of compiling code into shared libraries, such as non-PIC code. If you have any questions and/or suggestions on future directions to explore, please let me know in the comments or by email.
[1] As always, I’m using x64 as a convenient short name for the architecture known as x86-64, AMD64 or Intel 64.
[2] Into eax and not rax because the type of myglob is int, which is still 32-bit on x64.
[3] By the way, it would be much less "painful" to tie down a register on x64, since it has twice as many GPRs as x86.
[4] It also happens if we explicitly specify we don’t want PIC by passing -fno-pic to gcc.
[5] Note that unlike other disassembly listings we’ve been looking at in this and the previous article, this is an object file, not a shared library or executable. Therefore it will contain some relocations for the linker.
[6] For some good information on this subject, take a look at the AMD64 ABI, and man gcc.
[7] Some assemblers call this instruction movabs to distinguish it from the other mov instructions that accept a relative argument. The Intel architecture manual, however, keeps naming it just mov. Its opcode format is REX.W + B8 + rd.