C++ GCC -fPIC option

Question

Asked by Narek

I have read about GCC's Options for Code Generation Conventions, but could not understand what "Generate position-independent code (PIC)" does. Please give an example that explains what it means.

Answer 1

Answered by Erik

Position-independent code (PIC) is machine code that does not depend on being located at a specific address in order to work.

For example, jumps are generated as relative rather than absolute.

Pseudo-assembly:

PIC: This works whether the code is at address 100 or 1000.

100: COMPARE REG1, REG2
101: JUMP_IF_EQUAL CURRENT+10
...
111: NOP

Non-PIC: This works only if the code is at address 100.

100: COMPARE REG1, REG2
101: JUMP_IF_EQUAL 111
...
111: NOP
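
The two pseudo-assembly listings can be mimicked with a small toy model. This is only a sketch: the function names are made up for illustration, and the "addresses" are plain integers rather than real machine addresses.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the pseudo-assembly above. All names are illustrative only.

// PIC-style jump: the instruction stores an offset from its own address.
std::uint32_t relative_target(std::uint32_t jump_address, std::int32_t offset) {
    return jump_address + static_cast<std::uint32_t>(offset);
}

// Non-PIC jump: the instruction stores the target chosen for load address 100.
std::uint32_t absolute_target(std::uint32_t encoded_target) {
    return encoded_target;
}

// The jump sits at base+1 and the NOP at base+11, as in the listings.
bool pic_jump_hits_nop(std::uint32_t base) {
    return relative_target(base + 1, 10) == base + 11;
}

bool non_pic_jump_hits_nop(std::uint32_t base) {
    return absolute_target(111) == base + 11;
}

// Documents the expected behaviour at both load addresses.
void check_jump_model() {
    assert(pic_jump_hits_nop(100));
    assert(pic_jump_hits_nop(1000));
    assert(non_pic_jump_hits_nop(100));
    assert(!non_pic_jump_hits_nop(1000));  // code moved, absolute target is stale
}
```

Calling check_jump_model() from a main function exercises both cases: the relative jump reaches the NOP at either base, while the absolute jump is only right at base 100.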

EDIT: In response to a comment.

If your code is compiled with -fPIC, it is suitable for inclusion in a shared library. The library must be able to be relocated from its preferred location in memory to another address, because another already-loaded library may occupy the address your library prefers.

Answer 2

Answered by Roee Gavirel

I'll try to explain what has already been said in a simpler way.

Whenever a shared library is loaded, the loader (the OS code that loads any program you run) changes some addresses in the code depending on where the object was loaded.

In the example above, the "111" in the non-PIC code is written by the loader when the code is first loaded.

For objects that are not shared, you may want it that way, because the compiler can apply some optimizations to that code.

For a shared object, if another process wants to "link" to that code, it must map the code at the same virtual address, or the "111" will make no sense. But that range of virtual address space may already be in use in the second process.

Answer 3

Answered by Jonathan Leffler

Code that is built into shared libraries should normally be position-independent code, so that the shared library can readily be loaded at (more or less) any address in memory. The -fPIC option ensures that GCC produces such code.
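
As a rough illustration of where the option goes, here is a one-function library with typical build commands in the comments. The file names (mylib.cpp, libmylib.so, main.cpp, app) and the function add_one are assumptions made up for this sketch; -fPIC, -shared, -c, -L and -l are standard GCC driver options.

```cpp
// mylib.cpp: a hypothetical one-function library, used only for illustration.
//
// Typical GCC commands on Linux (file names are assumptions):
//   g++ -fPIC -c mylib.cpp -o mylib.o     compile to a position-independent object
//   g++ -shared mylib.o -o libmylib.so    link that object into a shared library
//   g++ main.cpp -L. -lmylib -o app       link a program against the library

// extern "C" keeps the exported symbol name unmangled.
extern "C" int add_one(int x) {
    return x + 1;
}
```

To run the resulting program from the build directory, the dynamic loader usually needs to be told where the library is, for example by setting LD_LIBRARY_PATH to the current directory.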

Answer 4

Answered by Ritesh

Adding further...

Every process has the same virtual address space layout, provided that address space layout randomization has been disabled with a flag on Linux. (For more details, see "Disable and re-enable address space layout randomization only for myself".)

So for a single executable with no shared linking (a hypothetical scenario), we can always give the same virtual address to the same asm instruction without any harm.

But when we want to link a shared object to the executable, we cannot be sure of the start address assigned to the shared object, because it depends on the order in which the shared objects were linked. As a result, an asm instruction inside a .so may have a different virtual address depending on the process it is linked into.

So one process may place the .so at start address 0x45678910 in its own virtual address space while another process, at the same time, places it at 0x12131415. If the .so did not use relative addressing, it would not work at all.

So a .so always has to use relative addressing, hence the -fPIC option.

Answer 5

Answered by bruziuz

The link to a function in a dynamic library is resolved when the library is loaded or at run time. Therefore, both the executable file and the dynamic library are loaded into memory when the program is run. The memory address at which a dynamic library is loaded cannot be determined in advance, because a fixed address might clash with another dynamic library requiring the same address.

There are two commonly used methods for dealing with this problem:

1. Relocation. All pointers and addresses in the code are modified, if necessary, to fit the actual load address. Relocation is done by the linker and the loader.

2. Position-independent code. All addresses in the code are relative to the current position. Shared objects in Unix-like systems use position-independent code by default. This is less efficient than relocation if the program runs for a long time, especially in 32-bit mode.

The name "position-independent code" actually implies the following:

The code section contains no absolute addresses that need relocation, only self-relative addresses. Therefore, the code section can be loaded at an arbitrary memory address and shared between multiple processes.
The data section is not shared between multiple processes because it often contains writable data. Therefore, the data section may contain pointers or addresses that need relocation.
All public functions and public data can be overridden in Linux. If a function in the main executable has the same name as a function in a shared object, then the version in main takes precedence, not only when called from main, but also when called from the shared object. Likewise, when a global variable in main has the same name as a global variable in the shared object, the instance in main is used, even when accessed from the shared object.

This so-called symbol interposition is intended to mimic the behavior of static libraries.

To implement this "override" feature, a shared object has a table of pointers to its functions, called the procedure linkage table (PLT), and a table of pointers to its variables, called the global offset table (GOT). All accesses to functions and public variables go through these tables.
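
The override behaviour can be imitated with an ordinary table of function pointers. This is a simplified stand-in, not how a real GOT or PLT is laid out: the names SymbolTable, bind, library_version and main_version are invented for the sketch, and a std::map replaces the tables that the linker and dynamic loader actually build.

```cpp
#include <cassert>
#include <map>
#include <string>

// Simplified stand-in for a GOT/PLT: one slot per public symbol name.
// Real tables are laid out by the linker and filled in by the dynamic loader.
using Function = int (*)();
using SymbolTable = std::map<std::string, Function>;

int library_version() { return 1; }  // definition inside the shared object
int main_version() { return 2; }     // same-named definition in the executable

// Bind a symbol only if no earlier module already defined it.
// Modules are bound in search order, so the executable goes first.
void bind(SymbolTable& table, const std::string& name, Function fn) {
    table.emplace(name, fn);  // emplace leaves an existing entry untouched
}

// The library reaches its own public function through the table,
// which is what makes the override visible inside the library too.
int library_calls_version(const SymbolTable& table) {
    return table.at("version")();
}

void check_interposition() {
    SymbolTable table;
    bind(table, "version", main_version);     // executable is searched first
    bind(table, "version", library_version);  // ignored: slot already taken
    assert(library_calls_version(table) == 2);
}
```

The extra indirection on every call is the cost the answer refers to: the library cannot call its own public function directly, because the slot may point somewhere else.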

P.S. Where dynamic linking cannot be avoided, there are various ways to avoid the time-consuming features of position-independent code.

You can read more in this article: http://www.agner.org/optimize/optimizing_cpp.pdf

Answer 6

Answered by user1016759

A minor addition to the answers already posted: object files not compiled to be position independent are relocatable; they contain relocation table entries.

These entries allow the loader (the bit of code that loads a program into memory) to rewrite the absolute addresses to adjust for the actual load address in the virtual address space.
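
A toy version of that rewriting step might look like the following. The Image struct and relocate function are hypothetical names for this sketch; real relocation entries carry a type and a symbol reference, whereas here each entry is just the index of a word that holds an absolute address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy image: each word is either plain data or an absolute address that was
// computed for preferred_base. relocations lists the indices of the latter.
struct Image {
    std::uint32_t preferred_base;
    std::vector<std::uint32_t> words;
    std::vector<std::size_t> relocations;
};

// Apply the relocation table: shift every listed word by the load delta.
void relocate(Image& image, std::uint32_t actual_base) {
    const std::uint32_t delta = actual_base - image.preferred_base;
    for (std::size_t index : image.relocations) {
        image.words.at(index) += delta;
    }
    image.preferred_base = actual_base;
}

void check_relocation() {
    // Word 1 holds the absolute jump target 111 from the earlier example.
    Image image{100, {7, 111, 42}, {1}};
    relocate(image, 1000);
    assert(image.words[1] == 1011);  // address adjusted for the new base
    assert(image.words[0] == 7);     // plain data is left alone
    assert(image.words[2] == 42);
}
```

Because the loader writes into the image, the patched words are specific to one load address, which is why such code is harder to share between processes than position-independent code.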

An operating system will try to share a single copy of a "shared object library" loaded into memory with all the programs that are linked to that same shared object library.

Since the code address space (unlike sections of the data space) need not be contiguous, and because most programs that link to a specific library have a fairly fixed library dependency tree, this succeeds most of the time. In the rare cases where there is a discrepancy, it may be necessary to have two or more copies of a shared object library in memory.

Obviously, any attempt to randomize the load address of a library between programs and/or program instances (so as to reduce the possibility of creating an exploitable pattern) will make such cases common rather than rare. Where a system has enabled this capability, one should make every attempt to compile all shared object libraries to be position independent.

Since calls into these libraries from the body of the main program will also be made relocatable, this makes it much less likely that a shared library will have to be copied.
