Disclaimer: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must comply with the same license and attribute it to the original authors (not me), citing the original URL and author information. StackOverflow original: http://stackoverflow.com/questions/9449845/

Time: 2020-08-06 04:49:32  Source: igfitidea

How to 'link' object file to executable/compiled binary?

c linux linker

Asked by Mike Kwan

Problem

I wish to inject an object file into an existing binary. As a concrete example, consider a source Hello.c:

#include <stdlib.h>

int main(void)
{
    return EXIT_SUCCESS;
}

It can be compiled to an executable named Hello through gcc -std=gnu99 -Wall Hello.c -o Hello. Furthermore, now consider Embed.c:

void func1(void)
{
}

An object file Embed.o can be created from this through gcc -c Embed.c. My question is how to generically insert Embed.o into Hello in such a way that the necessary relocations are performed and the appropriate ELF internal tables (e.g. symbol table, PLT, etc.) are patched properly.

Assumptions

It can be assumed that the object file to be embedded has its dependencies statically linked already. Any dynamic dependencies, such as the C runtime can be assumed to be present also in the target executable.

Current Attempts/Ideas

  • Use libbfd to copy sections from the object file into the binary. The progress I have made with this is that I can create a new object with the sections from the original binary and the sections from the object file. The problem is that since the object file is relocatable, its sections cannot be copied properly to the output without performing the relocations first.
  • Convert the binary back to an object file and relink with ld. So far I tried using objcopy to perform the conversion: objcopy --input elf64-x86-64 --output elf64-x86-64 Hello Hello.o. Evidently this does not work as I intend, since ld -o Hello2 Embed.o Hello.o then results in ld: error: Hello.o: unsupported ELF file type 2. I guess this should be expected, though, since Hello is not an object file.
  • Find an existing tool which performs this sort of insertion?
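
The "unsupported ELF file type 2" error above refers to the e_type field of the ELF header, where 2 means an executable (ET_EXEC) and 1 means a relocatable object (ET_REL). objcopy copies that field unchanged, so Hello.o is still an executable. The following is a minimal sketch of reading that field from an in-memory header; the helper names elf_file_type and is_relocatable_object are hypothetical and not part of any tool mentioned here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* e_type values as defined by the ELF specification. */
enum { ELF_ET_REL = 1, ELF_ET_EXEC = 2, ELF_ET_DYN = 3 };

/* Hypothetical helper: returns e_type from an in-memory ELF header,
 * or -1 if the buffer does not look like an ELF file. */
int elf_file_type(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    assert(buf != NULL);
    if (len < 18 || memcmp(buf, magic, 4) != 0)
        return -1;

    /* e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian.
     * e_type is the 16-bit field right after the 16-byte e_ident. */
    if (buf[5] == 1)
        return buf[16] | (buf[17] << 8);
    if (buf[5] == 2)
        return (buf[16] << 8) | buf[17];
    return -1;
}

/* Hypothetical helper: nonzero only for relocatable objects (ET_REL),
 * the kind of file that still carries unapplied relocations. */
int is_relocatable_object(const unsigned char *buf, size_t len)
{
    return elf_file_type(buf, len) == ELF_ET_REL;
}
```

Running such a check on Hello and on Embed.o would show why only the latter is accepted as linker input.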


Rationale (Optional Read)

I am making a static executable editor, where the vision is to allow the instrumentation of arbitrary user-defined routines into an existing binary. This will work in two steps:

  1. The injection of an object file (containing the user-defined routines) into the binary. This is a mandatory step and can not be worked around by alternatives such as injection of a shared object instead.
  2. Performing static analysis on the new binary and using this to statically detour routines from the original code to the newly added code.

I have, for the most part, already completed the work necessary for step 2, but I am having trouble with the injection of the object file. The problem is definitely solvable given that other tools use the same method of object injection (e.g. EEL).

Answered by bmargulies

You cannot do this in any practical way. The intended solution is to make that object into a shared lib and then call dlopen on it.

Answered by Dan Fego

If it were me, I'd look to compile Embed.c into a shared object, libembed.so, like so:

gcc -Wall -shared -fPIC -o libembed.so Embed.c

That should create a relocatable shared object from Embed.c. With that, you can force your target binary to load this shared object by setting the environment variable LD_PRELOAD when running it (see more information here):

LD_PRELOAD=/path/to/libembed.so Hello

The "trick" here will be to figure out how to do your instrumentation, especially considering it's a static executable. There, I can't help you, but this is one way to have code present in a process' memory space. You'll probably want to do some sort of initialization in a constructor, which you can do with an attribute (if you're using gcc, at least):

void __attribute__ ((constructor)) my_init()
{
    // put code here!
}
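
As a rough illustration of how the constructor could sit next to the routine from Embed.c, here is a minimal sketch. The embed_ready flag and the embed_is_ready accessor are hypothetical additions, included only to make the effect of the constructor observable; they are not part of the original answer.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical flag, only here to make the constructor's effect visible. */
static int embed_ready = 0;

/* With GCC (and Clang), this runs when the object is loaded, before main(). */
__attribute__((constructor))
static void my_init(void)
{
    embed_ready = 1;
    fputs("libembed: constructor ran\n", stderr);
}

/* The routine from Embed.c; it assumes my_init() has already run. */
void func1(void)
{
    assert(embed_ready);
}

/* Hypothetical accessor so callers can check the initialization state. */
int embed_is_ready(void)
{
    return embed_ready;
}
```

Built with the gcc -shared -fPIC command above and loaded through LD_PRELOAD into a dynamically linked program, the message should appear on stderr before the program's own main starts.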

Answered by Marco van de Voort

The problem is that .o's are not fully linked yet, and most references are still symbolic. Binaries (shared libraries and executables) are one step closer to finally linked code.

Linking into a shared lib doesn't mean you must load it via the dynamic lib loader. The suggestion is rather that writing your own loader for a binary or shared lib might be simpler than writing one for a .o.

Another possibility would be to customize the linking process yourself: call the linker and have it link the object to be loaded at some fixed address. You might also look at how e.g. bootloaders are prepared, which also involves a basic linking step to do exactly this (fixing a piece of code to a known load address).

If you don't link to a fixed address and want to relocate at runtime, you will have to write a basic linker that takes the object file and relocates it to the destination address by applying the appropriate fixups.
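
To give an idea of what such a fixup involves, here is a minimal sketch of applying one relocation to a section that has been copied to its destination address. It assumes a little-endian x86-64 target and models only two cases: a 32-bit PC-relative value computed as S + A - P (as for R_X86_64_PC32) and a 64-bit absolute value computed as S + A (as for R_X86_64_64). The Fixup struct and apply_fixup function are hypothetical names, not an existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One pending fixup, loosely modelled on an ELF Rela entry (hypothetical layout). */
typedef struct {
    size_t   offset;      /* where in the section to patch */
    uint64_t sym_addr;    /* S: final address of the referenced symbol */
    int64_t  addend;      /* A: constant stored with the relocation */
    int      pc_relative; /* nonzero: 32-bit S + A - P, zero: 64-bit S + A */
} Fixup;

/* Patch `section`, which will live at `load_addr`, according to `f`. */
void apply_fixup(uint8_t *section, size_t section_len,
                 uint64_t load_addr, const Fixup *f)
{
    if (f->pc_relative) {
        assert(f->offset + 4 <= section_len);
        uint64_t place = load_addr + f->offset; /* P: address being patched */
        int64_t value = (int64_t)(f->sym_addr + (uint64_t)f->addend - place);
        /* The result must fit in the 32-bit field. */
        assert(value >= INT32_MIN && value <= INT32_MAX);
        int32_t v32 = (int32_t)value;
        memcpy(section + f->offset, &v32, sizeof v32);
    } else {
        assert(f->offset + 8 <= section_len);
        uint64_t v64 = f->sym_addr + (uint64_t)f->addend;
        memcpy(section + f->offset, &v64, sizeof v64);
    }
}
```

A real implementation would read these entries from the object's relocation sections and resolve S through the symbol table; other architectures and relocation types need their own formulas.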

I assume you already have it, seeing as it is your master's thesis, but this book: http://www.iecc.com/linker/ is the standard introduction to this.

Answered by 80x25

Have you looked at the DyninstAPI? It appears support was recently added for linking a .o into a static executable.

From the release site:

Binary rewriter support for statically linked binaries on x86 and x86_64 platforms

Answered by elfmaster

You must make room for the relocatable code to fit in the executable by extending the executable's text segment, just like a virus infection. Then, after writing the relocatable code into that space, update the symbol table by adding symbols for anything in that relocatable object, and then apply the necessary relocation computations. I've written code that does this pretty well with 32-bit ELF files.
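
The bookkeeping for the first step can be sketched as follows. This is only an illustration under stated assumptions: the payload is appended directly after the current end of the text segment, and everything that follows in the file is shifted by a whole number of pages so that later segments keep their file offset congruent to their virtual address modulo the page size. The InjectPlan struct and plan_text_extension function are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Result of planning a text-segment extension (hypothetical layout). */
typedef struct {
    uint64_t inject_vaddr; /* virtual address where the payload will start */
    uint64_t new_filesz;   /* text segment size after appending the payload */
    uint64_t file_shift;   /* bytes by which following file content moves */
} InjectPlan;

/* Round `value` up to the next multiple of `alignment`. */
static uint64_t align_up(uint64_t value, uint64_t alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}

InjectPlan plan_text_extension(uint64_t text_vaddr, uint64_t text_filesz,
                               uint64_t payload_len, uint64_t page_size)
{
    assert(page_size != 0);

    InjectPlan plan;
    /* The payload goes right after the last byte of the text segment. */
    plan.inject_vaddr = text_vaddr + text_filesz;
    plan.new_filesz   = text_filesz + payload_len;
    /* Shift later content by whole pages to preserve offset/address congruence. */
    plan.file_shift   = align_up(payload_len, page_size);
    return plan;
}
```

The sketch does not check that the payload fits in the gap between the end of the text segment and the next segment in memory; a real tool would also have to update the program headers, section headers, and any file offsets stored in the ELF header.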

Answered by fsheikh

Assuming the source code for the first executable is available and it is built with a linker script that allocates space for later object file(s), there is a relatively simple solution. Since I am currently working on an ARM project, the examples below are compiled with the GNU ARM cross-compiler.

Primary source code file, hello.c

#include <stdio.h>

int main ()
{

   return 0;
}

is built with a simple linker script allocating space for an object to be embedded later:

SECTIONS
{
    .text :
    {
        KEEP (*(embed)) ;

        *(.text .text*) ;
    }
}

Like:

arm-none-eabi-gcc -nostartfiles -Ttest.ld -o hello hello.c
readelf -s hello

Num:    Value  Size Type    Bind   Vis      Ndx Name
 0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND 
 1: 00000000     0 SECTION LOCAL  DEFAULT    1 
 2: 00000000     0 SECTION LOCAL  DEFAULT    2 
 3: 00000000     0 SECTION LOCAL  DEFAULT    3 
 4: 00000000     0 FILE    LOCAL  DEFAULT  ABS hello.c
 5: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $a
 6: 00000000     0 FILE    LOCAL  DEFAULT  ABS 
 7: 00000000    28 FUNC    GLOBAL DEFAULT    1 main

Now let's compile the object to be embedded, whose source is in embed.c:

void func1()
{
   /* Something useful here */
}
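
Note that the linker script reserves an input section named embed, but a plain function definition like the one above lands in .text. If the intent is for the routine to be collected by the KEEP (*(embed)) rule, one possible variant (an assumption on my part, not something the original answer shows) is to place it there explicitly with GCC's section attribute. The call counter and its accessor are hypothetical, added only so the effect of calling the routine can be observed.

```c
/* Hypothetical counter, only here to make calls observable. */
static int func1_calls = 0;

/* Ask GCC to emit func1 into the input section named "embed" so that the
 * linker script's KEEP (*(embed)) rule collects it. */
__attribute__((section("embed"), used))
void func1(void)
{
    /* Something useful here */
    func1_calls++;
}

/* Hypothetical accessor for the counter above. */
int func1_call_count(void)
{
    return func1_calls;
}
```

With this variant, func1 would be placed ahead of the regular .text input sections, so the addresses would differ from a build where func1 stays in .text.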

Recompile with the same linker script, this time inserting the new symbols:

arm-none-eabi-gcc -c embed.c
arm-none-eabi-gcc -nostartfiles -Ttest.ld -o new_hello hello.c embed.o

See the results:

readelf -s new_hello
Num:    Value  Size Type    Bind   Vis      Ndx Name
 0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND 
 1: 00000000     0 SECTION LOCAL  DEFAULT    1 
 2: 00000000     0 SECTION LOCAL  DEFAULT    2 
 3: 00000000     0 SECTION LOCAL  DEFAULT    3 
 4: 00000000     0 FILE    LOCAL  DEFAULT  ABS hello.c
 5: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $a
 6: 00000000     0 FILE    LOCAL  DEFAULT  ABS 
 7: 00000000     0 FILE    LOCAL  DEFAULT  ABS embed.c
 8: 0000001c     0 NOTYPE  LOCAL  DEFAULT    1 $a
 9: 00000000     0 FILE    LOCAL  DEFAULT  ABS 
10: 0000001c    20 FUNC    GLOBAL DEFAULT    1 func1
11: 00000000    28 FUNC    GLOBAL DEFAULT    1 main