[Notes] Demystifying -fpic in GCC


References:

1. "3.18 Options for Code Generation Conventions"
2. "Options for Linking"
3. "GCC -fPIC option"
4. Baidu Baike
5. "Some questions about the gcc parameter -fPIC"


=== I'm just a good-looking divider line ===

3.18 Options for Code Generation Conventions

...
-fpic
Generate position-independent code (PIC) suitable for use in a shared library, if supported for the target machine. Such code accesses all constant addresses through a global offset table (GOT). The dynamic loader resolves the GOT entries when the program starts (the dynamic loader is not part of GCC; it is part of the operating system). If the GOT size for the linked executable exceeds a machine-specific maximum size, you get an error message from the linker indicating that -fpic does not work; in that case, recompile with -fPIC instead. (These maximums are 8k on the SPARC and 32k on the m68k and RS/6000. The x86 has no such limit.)
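In short: this option generates position-independent code (PIC), chiefly for building shared libraries, on targets that support it. Code compiled this way reaches every constant address through the global offset table (GOT). The dynamic loader, which belongs to the operating system rather than to GCC, resolves the GOT entries when the program starts. If the GOT of the linked executable exceeds the machine-specific maximum size, the linker reports that -fpic does not work, and you should recompile with -fPIC instead. (The limit is 8k on SPARC and 32k on m68k and RS/6000; x86 has no such limit.)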

Position-independent code requires special support, and therefore works only on certain machines. For the x86, GCC supports PIC for System V but not for the Sun 386i. Code generated for the IBM RS/6000 is always position-independent.
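Position-independent code needs special support, so it works only on certain machine architectures. On x86, GCC supports PIC for System V but not for the Sun 386i. Code generated for the IBM RS/6000 is always position-independent.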

When this flag is set, the macros __pic__ and __PIC__ are defined to 1. 
With this option set, the macros __pic__ and __PIC__ are both defined to 1.

-fPIC
If supported for the target machine, emit position-independent code, suitable for dynamic linking and avoiding any limit on the size of the global offset table. This option makes a difference on the m68k, PowerPC and SPARC.
Position-independent code requires special support, and therefore works only on certain machines.
Apart from avoiding the limit on the size of the global offset table, this option behaves the same as -fpic above.

When this flag is set, the macros __pic__ and __PIC__ are defined to 2. 
With this option set, the macros __pic__ and __PIC__ are both defined to 2.
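The macro values described above can be observed from C. The following is a minimal sketch; the function name pic_level is made up for illustration and is not part of GCC.

```c
/* Report which PIC mode this translation unit was compiled in.
 * Returns 0 when __PIC__ is not defined, otherwise the macro's value
 * (1 for -fpic, 2 for -fPIC, according to the manual text above).
 * pic_level is a hypothetical helper name. */
int pic_level(void) {
#ifdef __PIC__
    return __PIC__;
#else
    return 0;
#endif
}
```

Compiling the same file once with -fpic and once with -fPIC should make pic_level() return 1 and 2 respectively. On toolchains that build position-independent executables by default, the macro may be defined even when neither flag is given.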
...

=== I'm just a good-looking divider line ===


3.13 Options for Linking

-shared
Produce a shared object which can then be linked with other objects to form an executable. Not all systems support this option. For predictable results, you must also specify the same set of options used for compilation (-fpic, -fPIC, or model suboptions) when you specify this linker option.
...
On some systems, ‘gcc -shared’ needs to build supplementary stub code for constructors to work. On multi-libbed systems, ‘gcc -shared’ must select the correct support libraries to link against. Failing to supply the correct flags may lead to subtle defects. Supplying them in cases where they are not necessary is innocuous.

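Conclusion: when building a shared library with -shared, it is best to also pass -fpic or -fPIC (that is, the same option that was used at compile time).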
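As a concrete illustration of that conclusion, here is a minimal sketch of a library source file with the build commands in its header comment. The file name counter.c, the library name libcounter.so and the function counter_next are assumptions chosen for the example; only the -fPIC, -c, -o and -shared options come from the manual.

```c
/* counter.c: a tiny library source, used only to illustrate the build.
 *
 * Compile and link with the same PIC option, as the manual recommends:
 *
 *   gcc -fPIC -c counter.c -o counter.o
 *   gcc -shared -fPIC counter.o -o libcounter.so
 */

static int counter = 0;   /* library-private state */

/* Increment the counter and return its new value. */
int counter_next(void) {
    return ++counter;
}
```

The point is only that -fPIC appears on both command lines, so the link step sees the same code-generation model that the compile step used.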

=== I'm just a good-looking divider line ===


GCC -fPIC option

I have read the link about GCC's Options for Code Generation Conventions, but could not understand what is "Generate position-independent code (PIC)". Please give an example to explain me what does it mean.

Position Independent Code means that the generated machine code is not dependent on being located at a specific address in order to work.
That is, position-independent code is machine code that works correctly no matter which address it is loaded at.

E.g. jumps would be generated as relative rather than absolute.
For example, jump instructions use relative targets rather than absolute ones.

Pseudo-assembly:
PIC: This would work whether the code was at address 100 or 1000
100: COMPARE REG1, REG2
101: JUMP_IF_EQUAL CURRENT+10
...
111: NOP
Non-PIC: This will only work if the code is at address 100
100: COMPARE REG1, REG2
101: JUMP_IF_EQUAL 111
...
111: NOP
EDIT: In response to comment.

If your code is compiled with -fPIC, it's suitable for inclusion in a library - the library must be able to be relocated from its preferred location in memory to another address, there could be another already loaded library at the address your library prefers.
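The pseudo-assembly in this answer can be turned into a small simulation. The sketch below is a toy interpreter, not real machine code: the opcode names, the memory size and the function run_at are all invented for the illustration. It loads the same 12-instruction program at a chosen base address and reports whether execution reaches the final instruction.

```c
#include <stddef.h>

/* Toy instruction set; every name here is hypothetical. */
enum op { OP_TRAP = 0, OP_NOP, OP_JUMP_REL, OP_JUMP_ABS, OP_HALT };

struct insn {
    enum op op;
    int arg;   /* jump offset (relative) or jump target (absolute) */
};

#define MEM_SIZE 2048
#define PROG_LEN 12

/* PIC flavour: "JUMP CURRENT+10". Unlisted slots are zero, i.e. OP_TRAP. */
const struct insn pic_prog[PROG_LEN] = {
    [0]  = { OP_NOP, 0 },        /* stands in for COMPARE */
    [1]  = { OP_JUMP_REL, 10 },
    [11] = { OP_HALT, 0 },
};

/* Non-PIC flavour: "JUMP 111", correct only when loaded at address 100. */
const struct insn abs_prog[PROG_LEN] = {
    [0]  = { OP_NOP, 0 },
    [1]  = { OP_JUMP_ABS, 111 },
    [11] = { OP_HALT, 0 },
};

/* Load prog at address base and run it.
 * Returns 1 if OP_HALT is reached, 0 if execution goes astray. */
int run_at(const struct insn *prog, size_t base) {
    static struct insn mem[MEM_SIZE];
    size_t pc;
    int steps;

    if (base + PROG_LEN > MEM_SIZE) return 0;
    for (pc = 0; pc < MEM_SIZE; pc++) { mem[pc].op = OP_TRAP; mem[pc].arg = 0; }
    for (pc = 0; pc < PROG_LEN; pc++) mem[base + pc] = prog[pc];

    pc = base;
    for (steps = 0; steps < 100; steps++) {
        if (pc >= MEM_SIZE) return 0;
        switch (mem[pc].op) {
        case OP_NOP:      pc += 1; break;
        case OP_JUMP_REL: pc += (size_t)mem[pc].arg; break;
        case OP_JUMP_ABS: pc = (size_t)mem[pc].arg; break;
        case OP_HALT:     return 1;
        default:          return 0;   /* landed on something that is not code */
        }
    }
    return 0;
}
```

run_at(pic_prog, 100) and run_at(pic_prog, 1000) both succeed, while abs_prog succeeds only at base 100: at base 1000 its jump lands on address 111, which no longer holds the program.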
--
This example is clear, but as a user what will be the difference if I create a shared library (.so) file without the option? Are there some cases where my lib will be invalid without -fPIC?
--
To be more specific, the shared library is supposed to be shared between processes, but it may not always be possible to load the library at the same address in both. If the code were not position independent, then each process would require its own copy. 
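In other words, if the library is built without -fPIC, then processes that use it at the same time each need their own copy of the library loaded in their own address space.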
--
The problem is that, as I know, all the addresses written in the example above are virtual addresses and they will be others when this library will be loaded in a memory. So one process can call the same function from A address and the second process can call the same function from the same A address, or can copy the lib into a memory once again and call the function from B address. What is the problem? Where the error should occur without -fPIC?
--
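The error occurs if one process wants to load more than one shared library at the same virtual address. Since libraries cannot predict what other libraries could be loaded, this problem is unavoidable with the traditional shared library concept. Virtual address space doesn't help here. 
The real issue, then, is that a single process loads several shared libraries into its own (virtual) address space. Without -fPIC the code inside each library uses absolute addresses, and those addresses can collide, because a library is compiled without any knowledge of which other libraries will be present. The traditional shared library concept cannot avoid this, and virtual address space does not help.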
--
I'll try to explain more simply what has already been said.
When a shared library is loaded, the loader (the OS code that loads any program you run) changes some addresses in the code depending on where the object was loaded. In the example above, the "111" in the non-PIC code is written by the loader the first time the code is loaded.
For a non-shared object you may want it that way, because the compiler can make some optimizations on that code.
For a shared object, if another process wants to "link" to that code, it must map it at the same virtual addresses, or the "111" will make no sense. But that region of virtual address space may already be in use in the second process.
This answer explains, from another angle, why shared libraries need PIC.
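The load-time patching described in this answer can be sketched in a few lines of C. This is a simplified model, not the real loader: the four-word "image", the relocation list and the function names are all assumptions made up for the example. It shows why two copies of non-PIC code loaded at different bases end up with different bytes and so cannot share the same memory pages.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IMAGE_WORDS 4

/* Hypothetical non-PIC "code" as linked for base address 0.
 * Word 1 holds an absolute address (the "111" of the example above),
 * stored here as offset 3 from the start of the image. */
static const uint32_t image_on_disk[IMAGE_WORDS] = { 0xAAAA, 3, 0xBBBB, 0xCCCC };

/* Which words the loader must patch at load time. */
static const size_t reloc_slots[] = { 1 };

/* Copy the image to dst and add the load base to every relocated word. */
void load_image(uint32_t dst[IMAGE_WORDS], uint32_t base) {
    size_t i;
    memcpy(dst, image_on_disk, sizeof image_on_disk);
    for (i = 0; i < sizeof reloc_slots / sizeof reloc_slots[0]; i++)
        dst[reloc_slots[i]] += base;
}

/* Returns 1 if the two loaded copies are byte-identical (shareable). */
int copies_shareable(uint32_t base_a, uint32_t base_b) {
    uint32_t a[IMAGE_WORDS], b[IMAGE_WORDS];
    load_image(a, base_a);
    load_image(b, base_b);
    return memcmp(a, b, sizeof a) == 0;
}
```

copies_shareable(0x1000, 0x1000) returns 1, while copies_shareable(0x1000, 0x2000) returns 0, because the patched word differs between the two copies.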
--
Code that is built into shared libraries should normally be position-independent code, so that the shared library can readily be loaded at (more or less) any address in memory. The -fPIC option ensures that GCC produces such code.
--
Adding further...
Every process has the same virtual address space (if randomization of virtual addresses is turned off by a flag in Linux).
So if it is a single executable with no shared linking (a hypothetical scenario), we can always give the same virtual address to the same asm instruction without any harm.
In other words, an executable that links no shared libraries can keep each instruction at a fixed virtual address without any problem.

But when we want to link a shared object to the executable, we cannot be sure of the start address assigned to the shared object, as it depends on the order in which the shared objects were linked. That being said, an asm instruction inside a .so will have a different virtual address depending on the process it is linked into.
Put differently, the start address of a shared library within a process cannot be known in advance, because it depends on which other shared libraries the executable links and in what order. The instructions inside the .so therefore end up at different virtual addresses in different processes.

So one process can give start address to .so as 0x45678910 in its own virtual space and other process at the same time can give start address of 0x12131415 and if they do not use relative addressing, .so will not work at all.

So they always have to use the relative addressing mode and hence fpic option.


=== I'm just a good-looking divider line ===

Baidu Baike

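The -fPIC option makes the compiler generate PIC code. According to this entry, a .so must be PIC in order to be dynamically linked; otherwise dynamic linking cannot be achieved.
Non-PIC and PIC code differ mainly in how they access global data and how they jump to labels.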

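Take an instruction that accesses global data.
The non-PIC form is:
ld r3, var1
The PIC form is:
ld r3, var1-offset@GOT
      This means: load a value from the address stored in the GOT entry whose index is var1-offset. In other words, the 4 bytes at var1-offset@GOT are the address of var1. That address is known only at run time and is filled in by the dynamic loader (ld-linux.so).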
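The indirection through the GOT described above can be imitated in plain C. This is only a sketch of the idea: the array got, the index constant and the function fake_loader_fill_got are invented stand-ins, and a real GOT is laid out and filled by the linker and dynamic loader rather than by user code.

```c
#include <stddef.h>

int var1 = 42;               /* the global being accessed */

#define GOT_INDEX_VAR1 0     /* hypothetical slot assigned to var1 */

/* Stand-in for the GOT: a table of addresses that lives in data, not code. */
static void *got[1];

/* Stand-in for the dynamic loader: fill in the run-time address of var1.
 * Returns the number of entries it filled. */
int fake_loader_fill_got(void) {
    got[GOT_INDEX_VAR1] = &var1;
    return 1;
}

/* What "ld r3, var1-offset@GOT" amounts to: fetch the address from the
 * table first, then load the value through it. The code itself contains
 * no absolute address of var1. */
int load_var1_via_got(void) {
    const int *addr = got[GOT_INDEX_VAR1];
    return *addr;
}
```

After fake_loader_fill_got() has run, load_var1_via_got() returns 42. Only the table entry depends on where var1 ended up, so the code can stay identical in every process.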

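Now take a jump to a label.
The non-PIC form is:
jump printf
which simply calls printf.

The PIC form is:
jump printf-offset@GOT
      This means: jump to the address stored in the GOT entry whose index is printf-offset. The code at that address lives in the .plt section, with one such stub per external function. The stub calls the dynamic loader (ld-linux.so) to look up the function's address (printf in this example), writes that address into the GOT entry at index printf-offset, and then runs the function. The second call to printf therefore jumps straight to printf's address, with no further lookup.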
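The lazy lookup described above can be modelled with a function pointer that is patched on first use. The sketch below is an analogy in ordinary C, not how the PLT is actually emitted: the names got_slot, resolve_and_call and call_via_plt are hypothetical, and strlen stands in for printf to keep the signature simple.

```c
#include <stddef.h>
#include <string.h>

typedef size_t (*strlen_fn)(const char *);

static size_t resolve_and_call(const char *s);

/* Stand-in for the GOT entry: it starts out pointing at the resolver stub. */
static strlen_fn got_slot = resolve_and_call;
static int resolver_calls = 0;

/* Stand-in for the .plt stub plus the dynamic loader: look up the real
 * function, patch the slot, then forward this first call to it. */
static size_t resolve_and_call(const char *s) {
    resolver_calls++;
    got_slot = strlen;        /* the "symbol lookup" result */
    return got_slot(s);
}

/* Every call site jumps through the slot, as with "jump printf-offset@GOT". */
size_t call_via_plt(const char *s) {
    return got_slot(s);
}

/* How many times the resolver ran. */
int resolver_call_count(void) {
    return resolver_calls;
}
```

However many times call_via_plt is invoked, resolver_call_count() stays at 1 after the first call, which mirrors the statement that the second call goes directly to the function.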

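The GOT is a data section: a table in which, apart from a few reserved entries, every entry can be modified at run time.
The PLT is a text section: a series of code stubs that do not need to be modified during execution.

Each target implements PIC with its own mechanism, but the mechanisms are broadly similar.
MIPS, for example, has no .plt; it uses .stub instead, which serves the same purpose as .plt.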


=== I'm just a good-looking divider line ===


Some questions about the gcc parameter -fPIC

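What is the difference between a file compiled with -fPIC and one compiled without it? What is changed in the code to make it more relocatable, or more position-independent?
Also, a shared library compiled without this option can still be used. How does using it differ from using one compiled with the option?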
--
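My understanding:
A .so compiled without -fPIC has to be relocated again at load time, according to the address it is loaded at (because its code is not position-independent).
If several applications use it, each must keep its own copy of the .so's code (each program loads the .so at a different address, so the relocated code differs and obviously cannot be shared).
That forfeits the benefit of a shared library. In practice it is then not much different from a static library: run-time memory use is similar, and only the on-disk binaries are somewhat smaller. Relocating at load time is also expensive, which makes this approach even less worthwhile.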
--
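You have overlooked another very important role of dynamic libraries: dynamic linking itself, which lets a program depend on a binary interface. For example, libc is normally linked as a .so rather than a .a; you would not want to relink your program every time libc is updated, would you? This capability actually matters more than the so-called sharing, and it is more widely used.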
--
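You seem to have misunderstood me. I was only talking about -fPIC. The difference between -fPIC and non-fPIC code is a separate matter from the difference between .so and .a, even though we always build a .so with -fPIC and never build a .a with it.
-fPIC has essentially nothing to do with dynamic linking as such. libc.so could just as well be compiled without -fPIC; such a .so merely has to have all its entries relocated when it is loaded into the user program's address space.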
--
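So compiling a .so without -fPIC is not always bad.
If you meet all four of the following conditions:
  • The library may need frequent updates.
  • The library needs to be very efficient (especially when it uses many globals).
  • The library is not very large.
  • The library hardly needs to be shared by multiple applications.
then I think you can perfectly well compile your .so without -fPIC.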
--
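Judging from the GCC documentation, -shared ought to imply the -fPIC option, but apparently not all systems support that, so it is best to pass -fPIC explicitly. See the following: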
`-shared'
     Produce a shared object which can then be linked with other
     objects to form an executable.  Not all systems support this
     option.  For predictable results, you must also specify the same
     set of options that were used to generate code (`-fpic', `-fPIC',
     or model suboptions) when you specify this option.(1)
--
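In x86 code generated with -fPIC, the compiler uses the EBX base register for indirect addressing, because x86, unlike IA64, has no dedicated relocation register. When writing inline assembly, take care not to clobber the value in EBX.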
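One common way to respect that advice is to declare EBX in the operand list of the inline assembly, so the compiler knows the register is modified. The sketch below uses the CPUID instruction, which overwrites EBX, as the example. It assumes GCC-style extended asm; the function names are made up, and the non-x86 branch exists only so the file compiles elsewhere. Older GCC releases in 32-bit PIC mode reportedly could not assign EBX to an asm operand at all, in which case EBX had to be saved and restored by hand inside the asm; that variant is not shown here.

```c
#include <stddef.h>
#include <string.h>

/* Run CPUID leaf 0 and store EAX, EBX, ECX, EDX in regs[0..3].
 * Returns 1 on x86 targets, 0 elsewhere. Hypothetical helper. */
static int cpuid_leaf0(unsigned regs[4]) {
#if defined(__i386__) || defined(__x86_64__)
    /* CPUID overwrites EAX, EBX, ECX and EDX. Listing EBX as an output
     * tells the compiler the register changes, so it can preserve
     * whatever it was keeping there (possibly the GOT pointer). */
    __asm__ volatile("cpuid"
                     : "=a"(regs[0]), "=b"(regs[1]),
                       "=c"(regs[2]), "=d"(regs[3])
                     : "0"(0u), "2"(0u));
    return 1;
#else
    regs[0] = regs[1] = regs[2] = regs[3] = 0;
    return 0;
#endif
}

/* Length of the CPU vendor string (at most 12), or 0 on non-x86 targets. */
size_t cpu_vendor_length(void) {
    unsigned regs[4];
    char vendor[13] = { 0 };

    if (!cpuid_leaf0(regs)) return 0;
    memcpy(vendor + 0, &regs[1], 4);   /* EBX */
    memcpy(vendor + 4, &regs[3], 4);   /* EDX */
    memcpy(vendor + 8, &regs[2], 4);   /* ECX */
    return strlen(vendor);
}
```

What must be avoided is an asm statement that changes EBX without mentioning it in its outputs or clobbers, since the compiler would then keep using a stale GOT pointer.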
--
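I'd rather not wade into these muddy waters. The book 《程序员的自我修养》 explains this in detail; it mainly comes down to how symbol addresses are determined.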
--
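My understanding is this:
Compiling with -fpic produces position-independent code. Such code runs slightly slower, but it can easily be shared among multiple processes.
A static library does not need to be shared between processes, so compiling it with -fpic brings no benefit and only reduces execution efficiency.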


