Note: this page is adapted from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/4911993/

Time: 2020-08-05 02:44:53  Source: igfitidea

How to generate and run native code dynamically?

Tags: c++, linux, compiler-construction, x86, jit

Asked by Chris Tonkinson

I'd like to write a very small proof-of-concept JIT compiler for a toy language processor I've written (purely academic), but I'm having some trouble in the middle altitudes of design. Conceptually, I'm familiar with how JIT works: you compile bytecode into (machine or assembly?) code to run. At the nuts-and-bolts level, however, I'm not quite grasping how you actually go about doing that.

My (very "newb") knee-jerk reaction, since I haven't the first clue where to start, would be to try something like the following:

  1. mmap() a block of memory, setting access to PROT_EXEC
  2. write the native code into the block
  3. store the current registers (stack pointer, et al.) someplace cozy
  4. modify the current registers to point into the native code block in the mapped region
  5. the native code would now get executed by the machine
  6. restore the previous registers

Is that even close to a/the correct algorithm? I've tried perusing different projects that I know have JIT compilers to study (such as V8), but these codebases turn out to be difficult to consume because of their size, and I've little idea where to start looking.

Accepted answer by Shelwien

Not sure about linux, but this works on x86/windows.
Update: http://codepad.org/sQoF6kR8

#include <stdio.h>
#include <windows.h>

typedef unsigned char byte;

int arg1;
int arg2;
int res1;

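// Signature of the generated routine: no arguments, no return value.
typedef void (*pfunc)(void);

// Lets the byte buffer be reinterpreted as a callable function pointer.
union funcptr {
  pfunc x;
  byte* y;
};

int main( void ) {

  // Allocate 64 KiB of memory that is readable, writable and executable.
  byte* buf = (byte*)VirtualAllocEx( GetCurrentProcess(), 0, 1<<16, MEM_COMMIT, PAGE_EXECUTE_READWRITE );

  if( buf==0 ) return 0;

  // p is the write cursor into the code buffer.
  byte* p = buf;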

  *p++ = 0x50; // push eax
  *p++ = 0x52; // push edx

  *p++ = 0xA1; // mov eax, [arg2]
  (int*&)p[0] = &arg2; p+=sizeof(int*);

  *p++ = 0x92; // xchg edx,eax

  *p++ = 0xA1; // mov eax, [arg1]
  (int*&)p[0] = &arg1; p+=sizeof(int*);

  *p++ = 0xF7; *p++ = 0xEA; // imul edx

  *p++ = 0xA3; // mov [res1],eax
  (int*&)p[0] = &res1; p+=sizeof(int*);

  *p++ = 0x5A; // pop edx
  *p++ = 0x58; // pop eax
  *p++ = 0xC3; // ret

  funcptr func;
  func.y = buf;

  arg1 = 123; arg2 = 321; res1 = 0;

  func.x(); // call generated code

  printf( "arg1=%i arg2=%i arg1*arg2=%i func(arg1,arg2)=%i\n", arg1,arg2,arg1*arg2,res1 );

}

Answer by sstn

The Android Dalvik JIT compiler might also be worth looking at. It is supposed to be fairly small and lean (not sure if this helps understanding it or makes things more complicated). It targets Linux as well.

If things are getting more serious, looking at LLVM might be a good choice as well.

The function pointer approach suggested by Jeremiah sounds good. You may want to use the caller's stack anyway and there will probably only be a few registers left (on x86) which you need to preserve or not touch. In this case, it is probably easiest if your compiled code (or the entry stub) saves them on the stack before proceeding. In the end, it all boils down to writing an assembler function and interfacing to it from C.

Answer by datenwolf

You may want to have a look at libjit, which provides exactly the infrastructure you're looking for:

The libjit library implements just-in-time compilation functionality. Unlike other JITs, this one is designed to be independent of any particular virtual machine bytecode format or language.

http://freshmeat.net/projects/libjit

Answer by MSalters

In addition to the techniques suggested so far, it might be worthwhile to look into the thread creation functions. If you create a new thread with the starting address set to your generated code, you know for sure that there are no old registers that need saving or restoring, and the OS handles the setup of the relevant registers for you. I.e., you eliminate steps 3, 4 and 6 of your list.

Answer by Corbin March

You may be interested in why the lucky stiff's Potion programming language. It's a small, incomplete language that features just-in-time compilation. Potion's small size makes it easier to understand. The repository includes a description of the language's internals (JIT content starts at heading "~ the jit ~").

The implementation is complicated by the fact it runs in the context of Potion's VM. Don't let this scare you off, though. It doesn't take long to see what he's up to. Basically, using a small set of VM opcodes allows some actions to be modeled as optimized assembly.

Answer by Matt Mahoney

The answer depends on your compiler and where you put the code. See http://encode.ru/threads/1273-Just-In-Time-Compilation-Improvement-For-ZPAQ?p=24902&posted=1#post24902

Testing in 32 bit Vista, Visual C++ gives a DEP (data execution prevention) error whether the code is put on the stack, heap, or static memory. g++, Borland, and Mars can be made to work sometimes. Data accessed by the JIT code needs to be declared volatile.

Answer by Eli Bendersky

How to JIT - an introduction is a new article (from today!) that addresses some of these issues and describes the bigger picture as well.
