Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/4153852/

Time: 2020-08-28 14:38:34  Source: igfitidea

Assembly ADC (Add with carry) to C++

c++ assembly x86

Asked by Martijn Courteaux

There is an assembly instruction, ADC. I've found that this means "add with carry", but I don't know what that means, or how to write this instruction in C++. I do know it isn't the same as ADD, so a simple addition is not correct.

INFO:
Compiled in Windows. I'm using a 32-bit Windows Installation. My processor is Core 2 Duo from Intel.

Answered by Simone

ADC is the same as ADD, but it adds an extra 1 if the processor's carry flag is set.
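
To make that concrete, here is a minimal C++ sketch of what a 32-bit ADC computes. The helper name adc32 and its signature are made up for illustration; real code would normally let the compiler pick the instruction.

```cpp
#include <cstdint>

// Hypothetical helper: models what a 32-bit ADC computes.
// Returns the low 32 bits of a + b + carry_in and writes the new carry flag.
inline std::uint32_t adc32(std::uint32_t a, std::uint32_t b,
                           bool carry_in, bool& carry_out) {
    // Widen to 64 bits so the 33rd bit (the carry) is not lost.
    std::uint64_t wide = static_cast<std::uint64_t>(a) + b + (carry_in ? 1u : 0u);
    carry_out = (wide >> 32) != 0;
    return static_cast<std::uint32_t>(wide);
}
```

With carry_in false this behaves like a plain ADD; with carry_in true the result is one larger.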

Answered by Chubsdad

From here (broken link) or here:

However, Intel processors have a special instruction called adc. It behaves like the add instruction, except that it also adds the value of the carry flag. This makes it very handy for adding large integers. Suppose you'd like to add 32-bit integers using 16-bit registers. How can we do that? Let's say that the first integer is held in the register pair DX:AX, and the second one in BX:CX. This is how:

add  ax, cx
adc  dx, bx

First, the lower 16 bits are added by add ax, cx. Then the higher 16 bits are added using adc instead of add, because if the low addition overflows, the carry bit is automatically added into the higher 16 bits. So there is no cumbersome checking. This method can be extended to 64 bits and so on. Note that if the 32-bit addition also overflows out of the higher 16 bits, the result will not be correct and the carry flag is set, e.g. when adding 3 billion to 3 billion.
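
The add/adc pair above can be mimicked in portable C++ as a rough sketch. The RegPair struct and the add_pairs function are hypothetical names, used only to mirror the DX:AX and BX:CX register pairs.

```cpp
#include <cstdint>

// Hypothetical model of a 16-bit register pair such as DX:AX.
struct RegPair {
    std::uint16_t hi;  // e.g. DX or BX
    std::uint16_t lo;  // e.g. AX or CX
};

// Adds two 32-bit values held as 16-bit halves; carry_out mirrors CF after the adc.
inline RegPair add_pairs(RegPair x, RegPair y, bool& carry_out) {
    // add ax, cx : low halves; the carry shows up in bit 16 of the widened sum
    std::uint32_t low = static_cast<std::uint32_t>(x.lo) + y.lo;
    bool cf = (low >> 16) != 0;
    // adc dx, bx : high halves plus the carry from the low halves
    std::uint32_t high = static_cast<std::uint32_t>(x.hi) + y.hi + (cf ? 1u : 0u);
    carry_out = (high >> 16) != 0;
    return RegPair{static_cast<std::uint16_t>(high), static_cast<std::uint16_t>(low)};
}
```

Adding 0x0001FFFF and 0x00000001 this way gives hi = 2, lo = 0 with no final carry, while 3 billion plus 3 billion leaves carry_out set because the true sum needs 33 bits.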

Remember that everything from here on falls pretty much into the zone of implementation-defined behavior.

Here's a small sample that works with VS 2010 (32-bit, Windows XP):

Caveat: $7.4/1- "The asm declaration is conditionally-supported; its meaning is implementation-defined. [ Note: Typically it is used to pass information through the implementation to an assembler. —end note ]"

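int main(){
   bool carry = false;
   unsigned a = 0xffffffff, b = 0xffffffff;
   __asm {
      mov eax, a
      add eax, b      ; the CPU sets CF because the 32-bit sum wraps
      jnc nocarry     ; skip the store when CF is clear
      mov carry, 1
nocarry:
   }
}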

Answered by Sparky

The ADC behaviour can be simulated in both C and C++. The following example adds two numbers (stored as arrays of unsigned as they are too large to fit into a single unsigned).

unsigned first[10];
unsigned second[10];
unsigned result[11];

/* ... first and second get defined ... */

unsigned carry = 0;
for (int i = 0; i < 10; i++) {
    result[i] = first[i] + second[i] + carry;
    carry = (first[i] > result[i]);
}
result[10] = carry;

Hope this helps.

Answered by Oshkosher

There is a bug in Sparky's answer. Try this input:

unsigned first[10] =  {0x00000001};
unsigned second[10] = {0xffffffff, 0xffffffff};

The result should be {0, 0, 1, ...} but the code produces {0, 0, 0, ...}

Changing this line:

carry = (first[i] > result[i]);

to this:

if (carry)
    carry = (first[i] >= result[i]);
else
    carry = (first[i] > result[i]);

fixes it.
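
With that change folded in, the whole loop could be packaged roughly as follows. This is only a sketch: the function name add_limbs and its signature are hypothetical, and the caller is assumed to provide room for n + 1 limbs in result.

```cpp
#include <cstddef>

// Hypothetical helper: adds two n-limb numbers (least significant limb first).
// result must have room for n + 1 limbs.
inline void add_limbs(const unsigned* first, const unsigned* second,
                      unsigned* result, std::size_t n) {
    unsigned carry = 0;
    for (std::size_t i = 0; i < n; ++i) {
        result[i] = first[i] + second[i] + carry;  // wraps modulo 2^N
        // With a carry-in, a wrapped sum can equal first[i], hence >=.
        carry = carry ? (first[i] >= result[i]) : (first[i] > result[i]);
    }
    result[n] = carry;
}
```

The >= case matters exactly when second[i] is all ones and a carry comes in: the sum wraps all the way back to first[i], which the plain > test misses.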

Answered by Peter Cordes

The C++ language doesn't have any concept of a carry flag, so making an intrinsic function wrapper around the ADC instruction is clunky. However, Intel did it anyway: unsigned char _addcarry_u32 (unsigned char c_in, unsigned a, unsigned b, unsigned * out);. Last I checked, gcc did a poor job with this (saving the carry result into an integer register, instead of leaving it in CF), but hopefully Intel's own compiler does better.
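
As a rough illustration of how that intrinsic chains, the sketch below adds two 64-bit values stored as 32-bit limbs. It assumes an x86 target and a compiler whose <immintrin.h> declares _addcarry_u32; the wrapper name add64_via_adc is hypothetical.

```cpp
#include <immintrin.h>  // declares _addcarry_u32 on x86 compilers (assumed target)

// Hypothetical helper: adds two 64-bit values given as {low, high} 32-bit limbs.
// Returns the carry out of the high limb.
inline unsigned char add64_via_adc(const unsigned a[2], const unsigned b[2],
                                   unsigned out[2]) {
    unsigned char c = _addcarry_u32(0, a[0], b[0], &out[0]);  // low limb, no carry-in
    c = _addcarry_u32(c, a[1], b[1], &out[1]);                // high limb plus carry
    return c;
}
```

Whether the compiler actually keeps the carry in CF between the two calls depends on the compiler and version, as noted above.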

See also the x86 tag wiki for assembly documentation.

The compiler will use ADC for you when adding integers wider than a single register, e.g. adding int64_t in 32-bit code, or __int128_t in 64-bit code.

#include <stdint.h>
#ifdef __x86_64__
__int128_t add128(__int128_t a, __int128_t b) { return a+b; }
#endif
    # clang 3.8 -O3  for x86-64, SystemV ABI.
    # __int128_t args passed in 2 regs each, and returned in rdx:rax
    add     rdi, rdx
    adc     rsi, rcx
    mov     rax, rdi
    mov     rdx, rsi
    ret

asm output from the Godbolt compiler explorer. clang's -fverbose-asm isn't very verbose, but gcc 5.3 / 6.1 wastes two mov instructions so it's less readable.

You can sometimes hand-hold compilers into emitting an adc, or otherwise using the carry-out of add, with the idiom uint64_t sum = a+b; / carry = sum < a;. But extending this to get a carry-out from an adc instead of an add is not possible with current compilers; c+d+carry_in can wrap all the way around, and compilers don't manage to optimize the multiple checks for carry-out on each + in c+d+carry if you do it safely.
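
A minimal sketch of that idiom, applied to a 128-bit add built from two 64-bit halves, might look like this. The U128 struct and add_u128 function are hypothetical names; only the carry-out of the low add is used, so the problematic carry-out-of-adc case does not arise.

```cpp
#include <cstdint>

// Hypothetical 128-bit value built from two 64-bit halves.
struct U128 {
    std::uint64_t lo;
    std::uint64_t hi;
};

inline U128 add_u128(U128 a, U128 b) {
    U128 r;
    r.lo = a.lo + b.lo;                  // may wrap modulo 2^64
    std::uint64_t carry = r.lo < a.lo;   // 1 exactly when the low add wrapped
    r.hi = a.hi + b.hi + carry;          // carry out of the high half is discarded
    return r;
}
```

Whether this compiles to add/adc is up to the compiler and optimization level; the C++ semantics are the same either way.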

Clang _ExtInt

There is one way I'm aware of to get a chain of add/adc/.../adc: Clang's new _ExtInt(width) feature, which provides fixed-bit-width types of any size up to 16,777,215 bits (blog post). It was added to clang's development version on April 21, 2020, so as of this writing it's not yet in any released version.

This will hopefully show up in ISO C and/or C++ at some point; the N2472 proposal is apparently "being actively considered by the ISO WG14 C Language Committee".

typedef _ExtInt(256) wide_int;

wide_int add ( wide_int a, wide_int b) {
    return a+b;
}

compiles as follows with clang trunk -O2 for x86-64 (Godbolt):

add(int _ExtInt<256>, int _ExtInt<256>):
        add     rsi, r9
        adc     rdx, qword ptr [rsp + 8]
        adc     rcx, qword ptr [rsp + 16]
        mov     rax, rdi                        # return the retval pointer
        adc     r8, qword ptr [rsp + 24]        # chain of ADD / 3x ADC!

        mov     qword ptr [rdi + 8], rdx        # store results to mem
        mov     qword ptr [rdi], rsi
        mov     qword ptr [rdi + 16], rcx
        mov     qword ptr [rdi + 24], r8
        ret

Apparently _ExtInt is passed by value in integer registers until the calling convention runs out of registers. (At least in this early version; perhaps x86-64 SysV should class it as "memory" when it's wider than 2 or maybe 3 registers, like structs larger than 16 bytes. Although, more so than with structs, having it in registers is likely to be useful. Just put other args first so they're not displaced.)

The first _ExtInt arg is in R8:RCX:RDX:RSI, and the second has its low qword in R9, with the rest in memory.

A pointer to the return-value object is passed as a hidden first arg in RDI; x86-64 System V only ever returns in up to 2 integer registers (RDX:RAX) and this doesn't change that.

Answered by Serg Stetsuk

unsigned long result;
unsigned int first;
unsigned int second;

result = first + second;
result += (result & 0x10000) ? 1 : 0;
result &= 0xFFFF;