Performance wise, how fast are Bitwise Operators vs. Normal Modulus?

Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/20393373/

Time: 2020-08-27 23:06:30  Source: igfitidea

Tags: c++, bit-manipulation, bitwise-operators

Asked by Maven

Does using bitwise operations in normal flow or conditional statements like for, if, and so on increase overall performance and would it be better to use them where possible? For example:

if(i++ & 1) {

}

vs.

if(i % 2) {

}

Answered by Jerry Coffin

Unless you're using an ancient compiler, it can already handle this level of conversion on its own. That is to say, a modern compiler can and will implement i % 2 using a bitwise AND instruction, provided it makes sense to do so on the target CPU (which, in fairness, it usually will).

In other words, don't expect to see any difference in performance between these, at least with a reasonably modern compiler with a reasonably competent optimizer. In this case, "reasonably" has a pretty broad definition too--even quite a few compilers that are decades old can handle this sort of micro-optimization with no difficulty at all.
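
As a minimal sketch of the equivalence described above (the helper names is_odd_mod and is_odd_and are made up for illustration, and unsigned operands are assumed), the two spellings can be checked against each other:

```cpp
#include <cassert>

// Two spellings of "is this value odd?" for an unsigned operand.
// With optimization enabled, a compiler is free to emit the same
// AND instruction for both; inspect the generated assembly to confirm.
constexpr bool is_odd_mod(unsigned i) { return i % 2u != 0u; }
constexpr bool is_odd_and(unsigned i) { return (i & 1u) != 0u; }

// Spot-check a few values at compile time; for unsigned operands
// the two forms agree for every input.
static_assert(is_odd_mod(0u) == is_odd_and(0u), "even value");
static_assert(is_odd_mod(7u) == is_odd_and(7u), "odd value");
static_assert(is_odd_mod(~0u) == is_odd_and(~0u), "largest unsigned value");
```

Whether the generated instructions really are identical depends on the compiler, the flags, and the target, so checking the assembly output is the only way to be sure for a given build.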

Answered by Matthieu M.

TL;DR: Write for semantics first, optimize measured hot-spots second.

At the CPU level, integer modulus and divisions are among the slowest operations. But you are not writing at the CPU level, instead you write in C++, which your compiler translates to an Intermediate Representation, which finally is translated into assembly according to the model of CPU for which you are compiling.

In this process, the compiler will apply Peephole Optimizations, among which figure Strength Reduction Optimizations such as (courtesy of Wikipedia):

Original Calculation  Replacement Calculation
y = x / 8             y = x >> 3
y = x * 64            y = x << 6
y = x * 2             y = x << 1
y = x * 15            y = (x << 4) - x

The last example is perhaps the most interesting one. Whilst multiplying or dividing by powers of 2 is easily converted (manually) into bit-shift operations, the compiler is generally taught to perform even smarter transformations than you would probably think of on your own, and which are not as easily recognized (at the very least, I do not personally immediately recognize that (x << 4) - x means x * 15).
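
To make the table concrete, here is a small sketch with hypothetical helper names. It assumes unsigned operands, where arithmetic wraps modulo 2^N and the replacement forms agree with the originals for every input:

```cpp
#include <cassert>

// Original calculations from the table, written the readable way.
constexpr unsigned times15_mul(unsigned x) { return x * 15u; }
constexpr unsigned div8_div(unsigned x)    { return x / 8u; }

// Strength-reduced replacements: 16*x - x == 15*x, and x >> 3 == x / 8.
constexpr unsigned times15_shift(unsigned x) { return (x << 4) - x; }
constexpr unsigned div8_shift(unsigned x)    { return x >> 3; }

static_assert(times15_mul(7u) == 105u, "7 * 15");
static_assert(times15_shift(7u) == 105u, "(7 << 4) - 7");
static_assert(div8_div(100u) == 12u, "100 / 8");
static_assert(div8_shift(100u) == 12u, "100 >> 3");
```

The point is only that the two columns compute the same values; writing the left-hand form in source code and leaving the rewrite to the optimizer keeps the intent readable.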

Answered by Tony Delroy

This is obviously CPU dependent, but you can expect that bitwise operations will never take more, and typically take less, CPU cycles to complete. In general, integer / and % are famously slow, as CPU instructions go. That said, with modern CPU pipelines, having a specific instruction complete earlier doesn't mean your program necessarily runs faster.

Best practice is to write code that's understandable, maintainable, and expressive of the logic it implements. It's extremely rare that this kind of micro-optimisation makes a tangible difference, so it should only be used if profiling has indicated a critical bottleneck and this is proven to make a significant difference. Moreover, if on some specific platform it did make a significant difference, your compiler optimiser may already be substituting a bitwise operation when it can see that's equivalent.

Answered by starblue

By default you should use the operation that best expresses your intended meaning, because you should optimize for readable code. (Today most of the time the scarcest resource is the human programmer.)

So use & if you extract bits, and use % if you test for divisibility, i.e. whether the value is even or odd.

For unsigned values both operations have exactly the same effect, and your compiler should be smart enough to replace the division by the corresponding bit operation. If you are worried you can check the assembly code it generates.

Unfortunately integer division is slightly irregular on signed values, as it rounds towards zero and the result of % changes sign depending on the first operand. Bit operations, on the other hand, always round down. So the compiler cannot just replace the division by a simple bit operation. Instead it may either call a routine for integer division, or replace it with bit operations with additional logic to handle the irregularity. This may depend on the optimization level and on which of the operands are constants.
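
A short sketch of that irregularity, using made-up helper names. It assumes a two's complement int and an arithmetic right shift for negative values (guaranteed since C++20, implementation-defined before that, though common in practice):

```cpp
#include <cassert>

constexpr int rem2(int x)       { return x % 2; }  // sign follows the first operand
constexpr int low_bit(int x)    { return x & 1; }  // always 0 or 1
constexpr int half_div(int x)   { return x / 2; }  // rounds toward zero
constexpr int half_shift(int x) { return x >> 1; } // rounds down (arithmetic shift assumed)

// Guaranteed since C++11: division truncates toward zero.
static_assert(rem2(-3) == -1, "remainder is negative for a negative odd operand");
static_assert(half_div(-7) == -3, "division truncates toward zero");

// Used as a bare condition, (x % 2) and (x & 1) still agree, because -1 and 1
// are both non-zero. They differ once the result is compared with 1.
static_assert((rem2(-3) != 0) == (low_bit(-3) != 0), "same truth value");
static_assert((rem2(-3) == 1) != (low_bit(-3) == 1), "different when compared to 1");
```

Under the stated assumptions, half_shift(-7) yields -4 while half_div(-7) yields -3, which is the extra case the compiler has to patch up when it turns a signed division into a shift.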

This irregularity at zero may even be a bad thing, because it is a nonlinearity. For example, I recently had a case where we used division on signed values from an ADC, which had to be very fast on an ARM Cortex M0. In this case it was better to replace it with a right shift, both for performance and to get rid of the nonlinearity.
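
The nonlinearity can be illustrated with a rough sketch. The divisor 4 and the function names below are assumptions chosen for the example, not details of the ADC case, and the shift version again assumes an arithmetic right shift for negative values:

```cpp
#include <cassert>

// Count how many inputs in [lo, hi) are mapped to 0 when scaled down by 4.
int zeros_with_division(int lo, int hi) {
    int n = 0;
    for (int x = lo; x < hi; ++x) {
        if (x / 4 == 0) ++n;   // truncation toward zero
    }
    return n;
}

int zeros_with_shift(int lo, int hi) {
    int n = 0;
    for (int x = lo; x < hi; ++x) {
        if ((x >> 2) == 0) ++n; // rounding down (arithmetic shift assumed)
    }
    return n;
}
```

Over the range [-8, 8), division maps the seven inputs -3..3 to zero, while the shift maps only the four inputs 0..3 to zero, so every output step of the shifted version covers the same number of inputs.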

Answered by AnT

C operators cannot be meaningfully compared in terms of "performance". There's no such thing as "faster" or "slower" operators at language level. Only the resultant compiled machine code can be analyzed for performance. In your specific example the resultant machine code will normally be exactly the same (if we ignore the fact that the first condition includes a postfix increment for some reason), meaning that there won't be any difference in performance whatsoever.
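
The postfix increment mentioned in passing is the one real difference between the two conditions in the question. A tiny sketch, with hypothetical function names:

```cpp
#include <cassert>

// Mirrors `if (i++ & 1)`: tests the old value, then advances i.
bool test_and_advance(unsigned& i) { return (i++ & 1u) != 0u; }

// Mirrors `if (i % 2)`: tests the value and leaves it untouched.
bool test_only(unsigned i) { return i % 2u != 0u; }
```

Both return the same truth value for the same starting value, but only the first one changes the variable, so the two snippets in the question are not drop-in replacements for each other.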

Answered by user9164692

Here is the optimized (-O3) code that the compiler (GCC 4.6) generated for both options:

int i = 34567;
int opt1 = i++ & 1;
int opt2 = i % 2;

Generated code for opt1:

l     %r1,520(%r11)
nilf  %r1,1
st    %r1,516(%r11)
asi   520(%r11),1

Generated code for opt2:

l     %r1,520(%r11)
nilf  %r1,2147483649
ltr   %r1,%r1
jhe  .L14
ahi   %r1,-1
oilf  %r1,4294967294
ahi   %r1,1
.L14: st %r1,512(%r11)

So 4 extra instructions... which are nothing for a prod environment. This would be a premature optimization and would just introduce complexity.
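
For readers who do not read this assembly dialect, here is one possible reading of the opt2 listing in C++. The function name is made up, and the instruction-by-instruction mapping in the comments is my interpretation of the listing, not something stated in the answer. It assumes a 32-bit two's complement int:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes x % 2 for a signed value using only
// masking and a small fix-up, following the shape of the opt2 listing.
std::int32_t rem2_like_listing(std::int32_t x) {
    // nilf %r1,2147483649: keep only the sign bit and the lowest bit.
    std::uint32_t r = static_cast<std::uint32_t>(x) & 0x80000001u;
    // ltr / jhe: skip the fix-up when the masked value is non-negative.
    if (r & 0x80000000u) {
        r -= 1u;           // ahi  %r1,-1
        r |= 0xFFFFFFFEu;  // oilf %r1,4294967294
        r += 1u;           // ahi  %r1,1
    }
    // Conversion back to signed is modular since C++20 and
    // implementation-defined (but conventionally modular) before that.
    return static_cast<std::int32_t>(r);
}
```

The fix-up branch is what turns a masked negative odd value into -1 and a masked negative even value into 0, which is exactly the signed behaviour that a plain `& 1` does not reproduce.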

Answered by yosim

Bitwise operations are much faster. This is why the compiler will use bitwise operations for you. Actually, I think it will be faster to implement it as:

~i & 1

Similarly, if you look at the assembly code your compiler generates, you may see things like x ^= x instead of x=0. But (I hope) you are not going to use this in your C++ code.

In summary, do yourself, and whoever will need to maintain your code, a favor. Make your code readable, and let the compiler do these micro optimizations. It will do it better.
