Notice: this page reproduces a popular StackOverflow question and its accepted answer, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original address and author information, and attribute it to the original authors (not me). Original StackOverflow address: http://stackoverflow.com/questions/5131568/

Time: 2020-08-05 03:02:44  Source: igfitidea

Writing a Linux int 80h system-call wrapper in GNU C inline assembly

Tags: c, linux, assembly, x86, inline-assembly

Asked by RodrigoCR

I'm trying to use inline assembly... I read this page http://www.codeproject.com/KB/cpp/edujini_inline_asm.aspx but I can't understand how parameters are passed to my function.

I'm writing a C write example.. this is my function header:

write2(char *str, int len){
}

And this is my assembly code:

global write2
write2:
    push ebp
    mov ebp, esp
    mov eax, 4      ;sys_write
    mov ebx, 1      ;stdout
    mov ecx, [ebp+8]    ;string pointer
    mov edx, [ebp+12]   ;string size
    int 0x80        ;syscall
    leave
    ret

What do I have to do to pass that code to the C function... I'm doing something like this:

write2(char *str, int len){
    asm ( "movl 4, %%eax;"
          "movl 1, %%ebx;"
          "mov %1, %%ecx;"
          //"mov %2, %%edx;"
          "int 0x80;"
           :
           : "a" (str), "b" (len)
    );
}

That's because I don't have an output variable, so how do I handle that? Also, with this code:

global main
main:
    mov ebx, 5866       ;PID
    mov ecx, 9      ;SIGKILL
    mov eax, 37     ;sys_kill
    int 0x80        ;interruption
    ret 

How can I put that code inline in my code.. so I can ask the user for the pid.. like this.. This is my precode

void killp(int pid){
    asm ( "mov %1, %%ebx;"
          "mov 9, %%ecx;"
          "mov 37, %%eax;"
           :
           : "a" (pid)         /* optional */
    );
}

Accepted answer by Chris Dodd

Well, you don't say specifically, but by your post, it appears like you're using gcc and its inline asm with constraints syntax (other C compilers have very different inline syntax). That said, you probably need to use AT&T assembler syntax rather than Intel, as that's what gets used with gcc.

So with the above said, let's look at your write2 function. First, you don't want to create a stack frame, as gcc will create one, so if you create one in the asm code, you'll end up with two frames, and things will probably get very confused. Second, since gcc is laying out the stack frame, you can't access vars with "[ebp + offset]" as you don't know how it's being laid out.

That's what the constraints are for -- you say what kind of place you want gcc to put the value (any register, memory, specific register) and then use "%X" in the asm code, where X is the operand's number. Finally, if you use explicit registers in the asm code, you need to list them in the 3rd section (after the input constraints) so gcc knows you are using them. Otherwise it might put some important value in one of those registers, and you'd clobber that value.
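
As a rough illustration of those two points, here is a minimal sketch that is not part of the original answer. The function names add_any_reg and add_via_eax are made up for this example, and the non-x86 branches are only a fallback so the snippet builds elsewhere. The first function leaves the register choice to the compiler; the second names EAX in the template and therefore lists it as a clobber.

```c
/* Let the compiler pick the registers: "+r" is a read-write register
   operand, "r" is a read-only register operand. "cc" says the flags change. */
static int add_any_reg(int a, int b) {
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("addl %1, %0"
             : "+r" (a)
             : "r" (b)
             : "cc");
    return a;
#else
    return a + b;   /* portable fallback, assumption for non-x86 builds */
#endif
}

/* EAX is named explicitly in the template, so it goes in the clobber
   list (3rd section); the compiler then keeps a, b and result out of EAX. */
static int add_via_eax(int a, int b) {
#if defined(__i386__) || defined(__x86_64__)
    int result;
    __asm__ ("movl %1, %%eax\n\t"
             "addl %2, %%eax\n\t"
             "movl %%eax, %0"
             : "=r" (result)
             : "r" (a), "r" (b)
             : "eax", "cc");
    return result;
#else
    return a + b;   /* portable fallback, assumption for non-x86 builds */
#endif
}
```

Both functions should return the sum of their arguments; the first version usually compiles to a single add, which is the reason for preferring constraints over explicit registers.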

You also need to tell the compiler that inline asm will or might read from or write to memory pointed-to by the input operands; that is not implied.
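
One way to express that, sketched here under assumptions rather than taken from the answer, is to add a memory operand for the pointed-to object next to the pointer itself. The helper names load_via_asm and store_via_asm are hypothetical, and the non-x86 branches are only a fallback.

```c
/* Reads *p inside the asm. The pointer goes in a register ("r"), and the
   extra "m" (*p) operand tells the compiler that the asm reads that memory,
   even though the template never mentions operand 2. */
static int load_via_asm(const int *p) {
#if defined(__i386__) || defined(__x86_64__)
    int value;
    __asm__ ("movl (%1), %0"
             : "=r" (value)
             : "r" (p),
               "m" (*p));
    return value;
#else
    return *p;      /* portable fallback, assumption for non-x86 builds */
#endif
}

/* Writes *p inside the asm. The "=m" (*p) output tells the compiler
   that the asm stores to that memory. */
static void store_via_asm(int *p, int v) {
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("movl %2, (%1)"
             : "=m" (*p)
             : "r" (p), "r" (v));
#else
    *p = v;         /* portable fallback, assumption for non-x86 builds */
#endif
}
```

A "memory" clobber is the blunter alternative: it covers every object at once, at the cost of making the compiler assume all memory may have been read or written.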

So with all that, your write2 function looks like:

void write2(char *str, int len) {
    __asm__ volatile (
        "movl $4, %%eax;"      // SYS_write
        "movl $1, %%ebx;"      // file descriptor = stdout_fd
        "movl %0, %%ecx;"
        "movl %1, %%edx;"
        "int $0x80"
        :: "g" (str), "g" (len)       // input values we MOV from
        : "eax", "ebx", "ecx", "edx", // registers we destroy
          "memory"                    // memory has to be in sync so we can read it
    );
}

Note the AT&T syntax -- src, dest rather than dest, src, a $ before immediate constants, and % before the register name.

Now this will work, but it's inefficient as it will contain lots of extra movs. In general, you should NEVER use mov instructions or explicit registers in asm code, as you're much better off using constraints to say where you want things and let the compiler ensure that they're there. That way, the optimizer can probably get rid of most of the movs, particularly if it inlines the function (which it will do if you specify -O3). Conveniently, the i386 machine model has constraints for specific registers, so you can instead do:

void write2(char *str, int len) {
    __asm__ volatile (
        "movl $4, %%eax;"
        "movl $1, %%ebx;"
        "int $0x80"
        :: "c" (str), /* c constraint tells the compiler to put str in ecx */
           "d" (len)  /* d constraint tells the compiler to put len in edx */
        : "eax", "ebx", "memory");
}

or even better

// UNSAFE: destroys EAX (with return value) without telling the compiler
void write2(char *str, int len) {
    __asm__ volatile ("int $0x80"
        :: "a" (4), "b" (1), "c" (str), "d" (len)
        : "memory");
}

Note also the use of volatile, which is needed to tell the compiler that this can't be eliminated as dead even though its outputs (of which there are none) are not used. (asm with no output operands is already implicitly volatile, but making it explicit doesn't hurt when the real purpose isn't to calculate something; it's for a side effect like a system call.)

edit

One final note -- this function is doing a write system call, which does return a value in eax -- either the number of bytes written or an error code. So you can get that with an output constraint:

int write2(const char *str, int len) {
    __asm__ volatile ("int $0x80"
     : "=a" (len)
     : "a" (4), "b" (1), "c" (str), "d" (len),
       "m"( *(const char (*)[])str )       // "dummy" input instead of memory clobber
     );
    return len;
}

All system calls return in EAX. Values from -4095 to -1 (inclusive) are negative errno codes; other values are non-errors. (This applies globally to all Linux system calls.)
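
That range check can be written down in plain C. The sketch below is not from the original answer: the helper name syscall_result is hypothetical, and mapping the raw value to the usual "return -1 and set errno" convention is an assumption about what a caller might want.

```c
#include <errno.h>

/* Convert a raw kernel return value (as found in EAX after int $0x80)
   to the libc-style convention: -1 with errno set on failure,
   the unchanged value otherwise. */
static long syscall_result(long raw) {
    if (raw < 0 && raw >= -4095) {   /* -4095 .. -1 are negative errno codes */
        errno = (int)-raw;
        return -1;
    }
    return raw;                      /* everything else is a normal result */
}
```

For example, a raw result of -9 would become -1 with errno set to EBADF on Linux, while a byte count such as 5 passes through untouched.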

If you're writing a generic system-call wrapper, you probably need a "memory" clobber because different system calls have different pointer operands, and might be inputs or outputs. See https://godbolt.org/z/GOXBue for an example that breaks if you leave it out, and this answer for more details about dummy memory inputs/outputs.
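
A generic wrapper along those lines might look like the sketch below. It is an illustration under assumptions, not code from the answer: the name syscall3 is hypothetical, only three arguments are handled, and the non-i386 branch falls back to libc's syscall() purely so the snippet also builds on other architectures.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>   /* SYS_* numbers for the target architecture */
#include <unistd.h>        /* syscall() for the fallback path */

/* Raw 3-argument system call. Returns the kernel's value,
   i.e. a negative errno code on failure. */
static long syscall3(long nr, long a1, long a2, long a3) {
#if defined(__i386__)
    long ret;
    __asm__ volatile ("int $0x80"
                      : "=a" (ret)          /* result comes back in EAX */
                      : "a" (nr), "b" (a1), "c" (a2), "d" (a3)
                      : "memory");          /* pointer args may be read or written */
    return ret;
#else
    /* Fallback (assumption): let libc issue the call, then undo its
       "-1 and errno" convention to mimic the raw kernel result. */
    long ret = syscall(nr, a1, a2, a3);
    return ret == -1 ? -(long)errno : ret;
#endif
}
```

With this, something like syscall3(SYS_write, 1, (long)"hi\n", 3) would write to stdout, and syscall3(SYS_getpid, 0, 0, 0) would return the process ID.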

With this output operand, you need the explicit volatile -- exactly one write system call per time the asm statement "runs" in the source. Otherwise the compiler is allowed to assume that it exists only to compute its return value, and can eliminate repeated calls with the same input instead of writing multiple lines. (Or remove it entirely if you didn't check the return value.)
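
The same pattern can be applied to the kill snippet from the question. This is a sketch under assumptions, not part of the original answer: the extra sig parameter, the return convention (0 or a negative errno code) and the libc fallback for non-i386 builds are choices made for this example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>   /* kill() for the fallback path, SIGKILL */
#include <unistd.h>   /* getpid() for the usage shown below */

/* Send signal sig to process pid. Returns 0 on success or a negative
   errno code, like the raw system call. */
static int killp(int pid, int sig) {
#if defined(__i386__)
    int ret;
    __asm__ volatile ("int $0x80"
                      : "=a" (ret)      /* EAX holds the result */
                      : "a" (37),       /* sys_kill on i386 */
                        "b" (pid),      /* EBX = pid */
                        "c" (sig)       /* ECX = signal number */
                      : "memory");      /* conservative: a signal sent to
                                           ourselves may run a handler */
    return ret;
#else
    /* Fallback (assumption) so the sketch builds on other architectures. */
    return kill(pid, sig) == 0 ? 0 : -errno;
#endif
}
```

Calling killp(getpid(), 0) only checks that the process exists and should return 0; killp(pid, SIGKILL) corresponds to the question's example with signal 9. The volatile keeps the call from being dropped when the caller ignores the result.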
