C: Reading a register value into a C variable

Question

Asked by Brian

I remember seeing a way to use extended gcc inline assembly to read a register value and store it into a C variable.

I cannot though for the life of me remember how to form the asm statement.

Answer 1

Answered by ephemient

Editor's note: this way of using a local register-asm variable is now documented by GCC as "not supported". It still usually happens to work on GCC, but breaks with clang. (This wording in the documentation was added after this answer was posted, I think.)

The global fixed-register variable version has a large performance cost for 32-bit x86, which only has 7 GP-integer registers (not counting the stack pointer). This would reduce that to 6. Only consider this if you have a global variable that all of your code uses heavily.

Going in a different direction than other answers so far, since I'm not sure what you want.

GCC Manual § 5.40 Variables in Specified Registers

register int *foo asm ("a5");
Here a5 is the name of the register which should be used…
Naturally the register name is cpu-dependent, but this is not a problem, since specific registers are most often useful with explicit assembler instructions (see Extended Asm). Both of these things generally require that you conditionalize your program according to cpu type.
Defining such a register variable does not reserve the register; it remains available for other uses in places where flow control determines the variable's value is not live.

GCC Manual § 3.18 Options for Code Generation Conventions

-ffixed-reg
Treat the register named reg as a fixed register; generated code should never refer to it (except perhaps as a stack pointer, frame pointer or in some other fixed role).

This can replicate Richard's answer in a simpler way,

int main() {
    register int i asm("ebx");
    return i + 1;
}

although this is rather meaningless, as you have no idea what's in the ebx register.

If you combined these two, compiling this with gcc -ffixed-ebx,

#include <stdio.h>
register int counter asm("ebx");
void check(int n) {
    if (!(n % 2 && n % 3 && n % 5)) counter++;
}
int main() {
    int i;
    counter = 0;
    for (i = 1; i <= 100; i++) check(i);
    printf("%d Hamming numbers between 1 and 100\n", counter);
    return 0;
}

you can ensure that a C variable always resides in a register for speedy access and also will not get clobbered by other generated code. (Handily, ebx is callee-saved under the usual x86 calling conventions, so even if it gets clobbered by calls to other functions compiled without -ffixed-*, it should get restored too.)

On the other hand, this definitely isn't portable, and usually isn't a performance benefit either, as you're restricting the compiler's freedom.

Answer 2

Answered by Richard Pennington

Here is a way to get ebx:

int main()
{
    int i;
    asm("\t movl %%ebx,%0" : "=r"(i));
    return i + 1;
}

The result:

main:
    subl    , %esp
    #APP
             movl %ebx,%eax
    #NO_APP
    incl    %eax
    addl    , %esp
    ret

Edit:

The "=r"(i) is an output constraint, telling the compiler that the first output (%0) is a register whose value should be placed in the variable "i". At this optimization level (-O5) the variable i never gets stored to memory, but is held in the eax register, which also happens to be the return value register.

Answer 3

Answered by Jacob

I don't know about gcc, but in VS this is how:

int data = 0;   
__asm
{
    mov ebx, 30
    mov data, ebx
}
cout<<data;

Essentially, I moved the data in ebx to your variable data.
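
For readers on GCC or clang, a rough equivalent of the MSVC snippet above can be sketched with extended asm. This is only an illustration under stated assumptions: the helper name load_via_ebx is invented here, it targets x86 only (other targets take a plain C fallback so the sketch still compiles), and __asm__ is used instead of asm so it also builds under strict -std=c11.

```c
/* Hypothetical helper (name invented for this sketch): a GCC/clang
   extended-asm counterpart of the MSVC __asm block above. */
static inline int load_via_ebx(void) {
    int data = 0;
#if defined(__x86_64__) || defined(__i386__)
    __asm__("movl $30, %%ebx\n\t"   /* put a known value in EBX */
            "movl %%ebx, %0"        /* copy EBX into the output operand */
            : "=r"(data)            /* output: any general-purpose register */
            :                       /* no inputs */
            : "ebx");               /* declare that EBX is overwritten */
#else
    data = 30;                      /* non-x86 fallback, assumption of this sketch */
#endif
    return data;
}
```

The "ebx" clobber matters: without it the compiler is free to assume EBX still holds whatever it had put there before the statement.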

Answer 4

Answered by R Samuel Klatchko

This will move the stack pointer register into the sp variable.

#include <stdint.h>

intptr_t sp;
asm ("movl %%esp, %0" : "=r" (sp) );  // 32-bit x86; on x86-64 use "movq %%rsp, %0"

Just replace 'esp' with the actual register you are interested in (but make sure not to lose the %%) and 'sp' with your variable.
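
A hedged sketch of the same idea that builds for both 32-bit and 64-bit x86 is shown below. The helper name read_stack_pointer is an assumption made for illustration; on non-x86 targets it falls back to __builtin_frame_address(0), which returns a frame address rather than the exact stack pointer.

```c
#include <stdint.h>

/* Hypothetical helper: return the current stack pointer as an integer. */
static inline uintptr_t read_stack_pointer(void) {
    uintptr_t sp;
#if defined(__x86_64__)
    __asm__ volatile("movq %%rsp, %0" : "=r"(sp));   /* 64-bit: RSP */
#elif defined(__i386__)
    __asm__ volatile("movl %%esp, %0" : "=r"(sp));   /* 32-bit: ESP */
#else
    /* Fallback (assumption of this sketch): a nearby stack address. */
    sp = (uintptr_t)__builtin_frame_address(0);
#endif
    return sp;
}
```

The operand width has to match the register: a 64-bit variable with movl %%esp would be rejected by the assembler on x86-64, which is why the sketch switches instructions per target.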

Answer 5

Answered by Peter Cordes

You can't know what value compiler-generated code will have stored in any register when your inline asm statement runs, so the value is usually meaningless, and you'd be much better off using a debugger to look at register values when stopped at a breakpoint.

That being said, if you're going to do this strange task, you might as well do it efficiently.

On some targets (like x86) you can use specific-register output constraints to tell the compiler which register an output will be in. Use a specific-register output constraint with an empty asm template (zero instructions) to tell the compiler that your asm statement doesn't care about that register value on input, but afterward the given C variable will be in that register.

#include <stdint.h>

int foo() {
    uint64_t rax_value;           // type width determines register size
    asm("" : "=a"(rax_value));  // =letter determines which register (or partial reg)

    uint32_t ebx_value;
    asm("" : "=b"(ebx_value));

    uint16_t si_value;
    asm("" : "=S"(si_value) );

    uint8_t sil_value;  // x86-64 required to use the low 8 of a reg other than a-d
       // With -m32:  error: unsupported size for integer register
    asm("# Hi mom, my output constraint picked %0" : "=S"(sil_value) );

    return sil_value + ebx_value;
}

Compiled with clang 5.0 on Godbolt for x86-64. Notice that the 2 unused output values are optimized away, leaving no #APP / #NO_APP compiler-generated asm-comment pairs for them (these pairs switch the assembler out of / into fast-parsing mode, or at least used to, if that's no longer a thing). This is because I didn't use asm volatile, and they have an output operand so they're not implicitly volatile.

foo():                                # @foo()
# BB#0:
    push    rbx
    #APP
    #NO_APP
    #DEBUG_VALUE: foo:ebx_value <- %EBX
    #APP
    # Hi mom, my output constraint picked %sil
    #NO_APP
    #DEBUG_VALUE: foo:sil_value <- %SIL
    movzx   eax, sil
    add     eax, ebx
    pop     rbx
    ret
                                    # -- End function
                                    # DW_AT_GNU_pubnames
                                    # DW_AT_external

Notice the compiler-generated code to add two outputs together, directly from the registers specified. Also notice the push/pop of RBX, because RBX is a call-preserved register in the x86-64 System V calling convention. (And basically all 32 and 64-bit x86 calling conventions). But we've told the compiler that our asm statement writes a value there. (Using an empty asm statement is kind of a hack; there's no syntax to directly tell the compiler we just want to read a register, because like I said you don't know what the compiler was doing with the registers when your asm statement is inserted.)

The compiler will treat your asm statement as if it actually wrote that register, so if it needs the value for later, it will have copied it to another register (or spilled to memory) when your asm statement "runs".

The other x86 register constraints are b (bl/bx/ebx/rbx), c (.../rcx), d (.../rdx), S (sil/si/esi/rsi), D (.../rdi). There is no specific constraint for bpl/bp/ebp/rbp, even though it's not special in functions without a frame pointer. (Maybe because using it would make your code not compile with -fno-omit-frame-pointer.)

You can use register uint64_t rbp_var asm ("rbp"), in which case asm("" : "=r" (rbp_var)); guarantees that the "=r" constraint will pick rbp. Similarly for r8-r15, which don't have any explicit constraints either. On some architectures, like ARM, asm-register variables are the only way to specify which register you want for asm input/output constraints. (And note that asm constraints are the only supported use of register asm variables; there's no guarantee that the variable's value will be in that register any other time.)
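
To make the register-asm mechanism observable, the sketch below applies it to an input operand instead of an output: the local register variable forces the "r" input into r12, and the template then reads r12 by name. The helper name roundtrip_via_r12 is invented for this illustration, and targets other than x86-64 simply return the argument.

```c
#include <stdint.h>

/* Hypothetical helper: pass x through r12 using a register-asm variable
   as an asm operand, which is the supported use of such variables. */
static inline uint64_t roundtrip_via_r12(uint64_t x) {
#if defined(__x86_64__)
    register uint64_t in_r12 __asm__("r12") = x;  /* "r" must pick r12 */
    uint64_t out;
    __asm__("movq %%r12, %0"   /* read r12 explicitly by name */
            : "=r"(out)        /* output: any general-purpose register */
            : "r"(in_r12));    /* input: pinned to r12 by the declaration */
    return out;
#else
    return x;                  /* fallback so the sketch compiles elsewhere */
#endif
}
```

If the constraint did not honor the register-asm declaration, the template would read an unrelated value from r12 and the function would not return its argument.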

There's nothing to stop the compiler from placing these asm statements anywhere it wants within a function (or parent functions after inlining). So you have no control over where you're sampling the value of a register. asm volatile may avoid some reordering, but maybe only with respect to other volatile accesses. You could check the compiler-generated asm to see if you got what you wanted, but beware that it might have been by chance and could break later.

You can place an asm statement in the dependency chain for something else to control where the compiler places it. Use a "+rm" constraint to tell the compiler it modifies some other variable which is actually used for something that doesn't optimize away.

uint32_t ebx_value;
asm("" : "=b"(ebx_value), "+rm"(some_used_variable) );

where some_used_variable might be a return value from one function, and (after some processing) passed as an arg to another function. Or computed in a loop, and will be returned as the function's return value. In that case, the asm statement is guaranteed to come at some point after the end of the loop, and before any code that depends on the later value of that variable.
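
A complete function built around that fragment might look like the sketch below. The function name sum_then_sample_bx and its parameters are assumptions made for illustration; the sampled EBX value is still whatever the compiler happened to leave there, and on non-x86 targets the asm statement is skipped.

```c
#include <stdint.h>

/* Hypothetical helper: sum an array, then sample EBX at a point the
   compiler must place after the loop and before *sum_out is stored. */
static inline uint32_t sum_then_sample_bx(const uint32_t *data, int n,
                                          uint32_t *sum_out) {
    uint32_t sum = 0;
    for (int i = 0; i < n; i++)
        sum += data[i];

    uint32_t bx_value = 0;
#if defined(__x86_64__) || defined(__i386__)
    /* Empty template: "=b" claims EBX as an output, and "+rm" makes the
       statement read and write sum, tying it into sum's dependency chain. */
    __asm__("" : "=b"(bx_value), "+rm"(sum));
#endif
    *sum_out = sum;   /* depends on the value of sum "written" by the asm */
    return bx_value;
}
```

Because the compiler has to assume the asm statement may change sum, it can no longer fold the loop result into a constant across the statement, which is the optimization cost the answer goes on to describe.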

This will defeat optimizations like constant-propagation for that variable, though (see https://gcc.gnu.org/wiki/DontUseInlineAsm). The compiler can't assume anything about the output value; it doesn't check that the asm statement has zero instructions.

This doesn't work for some registers that gcc won't let you use as output operands or clobbers, e.g. the stack pointer.

Reading the value into a C variable might make sense for a stack pointer, though, if your program does something special with stacks.

As an alternative to inline asm, there's __builtin_frame_address(0) to get a stack address. (But IIRC, it causes that function to make a full stack frame, even when -fomit-frame-pointer is enabled, like it is by default on x86.)

Still, in many functions that's nearly free (and making a stack frame can be good for code-size, because of smaller addressing modes for RBP-relative than RSP-relative access to local variables).
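
A minimal sketch of the builtin route, assuming GCC or clang: the helper name current_frame_address is invented here, and the noinline attribute is an addition of this sketch so that the result refers to the helper's own frame rather than to whichever caller it was inlined into.

```c
#include <stdint.h>

/* Hypothetical helper: return this function's frame address as an integer.
   No inline asm is involved, so it is not tied to one architecture. */
__attribute__((noinline))
static uintptr_t current_frame_address(void) {
    return (uintptr_t)__builtin_frame_address(0);
}
```

The returned address lies on the calling thread's stack, close to the caller's local variables, which is usually enough for stack-depth estimates or debugging output.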

Using a mov instruction in an asm statement would of course work, too.

Answer 6

Answered by user108127

From the GCC docs itself: http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html

Answer 7

Answered by Mahdi Mohammadi

#include <stdio.h>

void gav(){
        //rgv_t argv = get();
        register unsigned long long i asm("rax");
        register unsigned long long ii asm("rbx");
        printf("I`m gav - first arguman is: %s - 2th arguman is: %s\n", (char *)i, (char *)ii);
}

int main(void)
{
    char *test = "I`m main";
    char *test1 = "I`m main2";
    printf("0x%llx\n", (unsigned long long)&gav);
    asm("call %P0" : :"i"((unsigned long long)&gav), "a"(test), "b"(test1));
    return 0;
}

Answer 8

Answered by Kornel Kisielewicz

Isn't this what you are looking for?

Syntax:

 asm ("fsinx %1,%0" : "=f" (result) : "f" (angle));
