How to invoke a system call via sysenter in inline assembly?

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/9506353/

Time: 2020-08-06 04:54:59

Tags: linux, gcc, x86, x86-64, system-calls

Asked by Infinite

How can we implement a system call using sysenter/syscall directly in x86 Linux? Can anybody provide help? It would be even better if you could also show the code for the amd64 platform.

I know in x86, we can use

__asm__(
"               movl $1, %eax  \n"
"               movl $0, %ebx \n"
"               call *%gs:0x10 \n"
);

to route to sysenter indirectly.

But how can we write code that uses sysenter/syscall directly to issue a system call?

I found some material at http://damocles.blogbus.com/tag/sysenter/, but I still find it difficult to figure out.

Accepted answer by Daniel Kamil Kozar

First of all, you can't safely use GNU C Basic asm(""); syntax for this (without input/output/clobber constraints). You need Extended asm to tell the compiler about the registers you modify. See the inline asm section of the GNU C manual and the inline-assembly tag wiki for links to other guides, for details on what things like "D"(1) mean as part of an asm() statement.



I'm going to show you how to execute system calls by writing a program that writes Hello World! to standard output by using the write() system call. Here's the source of the program without an implementation of the actual system call:

#include <sys/types.h>

ssize_t my_write(int fd, const void *buf, size_t size);

int main(void)
{
    const char hello[] = "Hello world!\n";
    // sizeof(hello) - 1: do not write the terminating NUL byte
    my_write(1, hello, sizeof(hello) - 1);
    return 0;
}

You can see that I named my custom system call function my_write in order to avoid name clashes with the "normal" write provided by libc. The rest of this answer contains the source of my_write for i386 and amd64.

i386

System calls in i386 Linux are implemented using interrupt vector 128 (0x80), i.e. by executing int 0x80 in your assembly code after setting up the parameters accordingly. It is possible to do the same via SYSENTER, but that instruction is actually executed by the VDSO, which is mapped into every running process. Since SYSENTER was never meant as a direct replacement for the int 0x80 API, userland applications never execute it directly. Instead, when an application needs to enter the kernel, it calls the routine mapped in the VDSO (that's what the call *%gs:0x10 in your code is for), which contains all the code supporting the SYSENTER instruction. There's quite a lot of it because of how the instruction actually works.

If you want to read more about this, have a look at this link. It contains a fairly brief overview of the techniques applied in the kernel and the VDSO. See also The Definitive Guide to (x86) Linux System Calls - some system calls like getpid and clock_gettime are so simple that the kernel can export code + data that runs in user-space, so the VDSO never needs to enter the kernel, making it much faster than even sysenter could be.



It's much easier to use the slower int $0x80 to invoke the 32-bit ABI.

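// i386 Linux
#include <sys/types.h>       // ssize_t, size_t
#include <asm/unistd.h>      // compile with -m32 for 32 bit call numbers
//#define __NR_write 4
ssize_t my_write(int fd, const void *buf, size_t size)
{
    ssize_t ret;
    asm volatile
    (
        "int $0x80"
        : "=a" (ret)
        : "0"(__NR_write), "b"(fd), "c"(buf), "d"(size)
        : "memory"    // the kernel dereferences pointer args
    );
    return ret;
}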

As you can see, using the int 0x80 API is relatively simple. The number of the syscall goes in the eax register, while the parameters needed for the syscall go into ebx, ecx, edx, esi, edi, and ebp, in that order. System call numbers can be found in the file /usr/include/asm/unistd_32.h.

Prototypes and descriptions of the functions are available in section 2 of the manual, so in this case write(2).

The kernel saves/restores all the registers (except EAX), so we can use them as input-only operands to the inline asm. See "What are the calling conventions for UNIX & Linux system calls on i386 and x86-64".

Keep in mind that the clobber list also contains the memory parameter, which means that the instruction listed in the instruction list references memory (via the buf parameter). (A pointer input to inline asm does not imply that the pointed-to memory is also an input. See "How can I indicate that the memory *pointed* to by an inline ASM argument may be used?")
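As a sketch of the alternative that the parenthetical above hints at: instead of a blanket "memory" clobber, the buffer itself can be passed as a dummy "m" input, so the compiler knows exactly which memory the asm reads. The name my_write_memop is hypothetical, the cast follows the pattern the GCC manual shows for memory of unknown length, and the x86-64 branch is included only so the sketch builds on both targets. Treat it as an illustration under those assumptions, not as the answer's own code.

```c
#include <asm/unistd.h>   // __NR_write (4 with -m32, 1 on x86-64)
#include <sys/types.h>    // ssize_t, size_t

#if !defined(__linux__) || !(defined(__i386__) || defined(__x86_64__))
#error "this sketch assumes i386 or x86-64 Linux"
#endif

// Hypothetical variant of my_write: no "memory" clobber; the pointed-to
// bytes are declared as an input through a dummy "m" operand instead.
ssize_t my_write_memop(int fd, const void *buf, size_t size)
{
    ssize_t ret;
#if defined(__i386__)
    __asm__ volatile
    (
        "int $0x80"
        : "=a" (ret)
        : "0"((ssize_t)__NR_write), "b"(fd), "c"(buf), "d"(size),
          "m"(*(const char (*)[]) buf)   // dummy input: memory behind buf
    );
#else
    __asm__ volatile
    (
        "syscall"
        : "=a" (ret)
        : "0"((ssize_t)__NR_write), "D"(fd), "S"(buf), "d"(size),
          "m"(*(const char (*)[]) buf)   // dummy input: memory behind buf
        : "rcx", "r11"                   // syscall itself overwrites these
    );
#endif
    return ret;
}
```

The dummy operand is never referenced in the template; it only exists so the compiler keeps the stores to the buffer ahead of the asm statement. For a system call that writes through a pointer (read, for example), the operand would have to be an "=m" output instead.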

amd64

Things look different on the AMD64 architecture, which sports a new instruction called SYSCALL. It is very different from the original SYSENTER instruction, and definitely much easier to use from userland applications - it really resembles a normal CALL, actually, and adapting the old int 0x80 to the new SYSCALL is pretty much trivial. (Except it uses RCX and R11 instead of the kernel stack to save the user-space RIP and RFLAGS so the kernel knows where to return.)

In this case, the number of the system call is still passed in the register rax, but the registers used to hold the arguments now nearly match the function calling convention: rdi, rsi, rdx, r10, r8 and r9, in that order. (syscall itself destroys rcx, so r10 is used instead of rcx, letting libc wrapper functions just use mov r10, rcx / syscall.)

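// x86-64 Linux
#include <sys/types.h>       // ssize_t, size_t
#include <asm/unistd.h>      // compile without -m32 for 64 bit call numbers
// #define __NR_write 1
ssize_t my_write(int fd, const void *buf, size_t size)
{
    ssize_t ret;
    asm volatile
    (
        "syscall"
        : "=a" (ret)
        //                 EDI      RSI       RDX
        : "0"(__NR_write), "D"(fd), "S"(buf), "d"(size)
        : "rcx", "r11", "memory"
    );
    return ret;
}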

(See it compile on Godbolt)

Do notice that practically the only things that needed changing were the register names and the actual instruction used for making the call. This is mostly thanks to the input/output lists provided by gcc's extended inline assembly syntax, which automagically provides the appropriate move instructions needed for executing the instruction list.
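To illustrate how the constraints do the register setup, here is a minimal sketch of a second wrapper for x86-64 Linux, assuming the call number __NR_getpid from <asm/unistd.h>. The name my_getpid is hypothetical. No mov appears in the template: the "a" input asks the compiler to load the call number into RAX, and the "=a" output reads the result back from the same register.

```c
#include <asm/unistd.h>   // __NR_getpid

#if !defined(__linux__) || !defined(__x86_64__)
#error "this sketch assumes x86-64 Linux"
#endif

// Hypothetical wrapper: getpid takes no arguments, so the call number
// in RAX is the only input the compiler has to set up.
long my_getpid(void)
{
    long ret;
    __asm__ volatile
    (
        "syscall"
        : "=a" (ret)                  // result comes back in RAX
        : "a"((long)__NR_getpid)      // call number goes in via RAX
        : "rcx", "r11"                // syscall itself overwrites these
    );
    return ret;
}
```

Compiling this with optimization and inspecting the assembly should show the compiler emitting the load of the call number on its own, immediately before the syscall instruction.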

The "0"(callnum) matching constraint could be written as "a" because operand 0 (the "=a"(ret) output) only has one register to pick from; we know it will pick EAX. Use whichever you find clearer.



Note that non-Linux OSes, like MacOS, use different call numbers, and even different arg-passing conventions for 32-bit.