Which is faster: if (bool) or if (int)?

Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/5764956/

Tags: c++, assembly, int, boolean

Asked by Nawaz

Which value is better to use? Boolean true or Integer 1?

The above topic made me do some experiments with bool and int in an if condition. So just out of curiosity I wrote this program:

int f(int i) 
{
    if ( i ) return 99;   //if(int)
    else  return -99;
}
int g(bool b)
{
    if ( b ) return 99;   //if(bool)
    else  return -99;
}
int main(){}

g++ intbool.cpp -S generates the following asm code for each function:

  • asm code for f(int)

    __Z1fi:
       LFB0:
              pushl  %ebp
       LCFI0:
              movl   %esp, %ebp
       LCFI1:
              cmpl   $0, 8(%ebp)
              je     L2
              movl   $99, %eax
              jmp    L3
       L2:
              movl   $-99, %eax
       L3:
              leave
       LCFI2:
              ret

  • asm code for g(bool)

    __Z1gb:
       LFB1:
              pushl  %ebp
       LCFI3:
              movl   %esp, %ebp
       LCFI4:
              subl   $4, %esp
       LCFI5:
              movl   8(%ebp), %eax
              movb   %al, -4(%ebp)
              cmpb   $0, -4(%ebp)
              je     L5
              movl   $99, %eax
              jmp    L6
       L5:
              movl   $-99, %eax
       L6:
              leave
       LCFI6:
              ret

Surprisingly, g(bool) generates more asm instructions! Does it mean that if(bool) is a little slower than if(int)? I used to think bool is especially designed to be used in conditional statements such as if, so I was expecting g(bool) to generate fewer asm instructions, thereby making g(bool) more efficient and faster.

EDIT:

I'm not using any optimization flag as of now. But even in its absence, why it generates more asm for g(bool) is a question for which I'm looking for a reasonable answer. I should also tell you that the -O2 optimization flag generates exactly the same asm for both functions. But that isn't the question. The question is what I've asked.



Answer by Sherm Pendley

Makes sense to me. Your compiler apparently defines a bool as an 8-bit value, and your system ABI requires it to "promote" small (< 32-bit) integer arguments to 32 bits when pushing them onto the call stack. So to compare a bool, the compiler generates code to isolate the least significant byte of the 32-bit argument that g receives, and compares it with cmpb. In the first example, the int argument uses the full 32 bits that were pushed onto the stack, so it simply compares against the whole thing with cmpl.
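
As a small illustration of the sizes involved, the sketch below reports what a given compiler uses for bool and int and shows that conversions between the two work on values rather than on raw bytes. The helper names (bool_size, int_size, widen, narrow) are made up for this example and do not come from the original post; the sizes are implementation-defined, so no particular result is assumed.

```cpp
#include <cstddef>

// Hypothetical helper names, introduced only for this illustration.

// Size in bytes that this compiler gives to bool and to int.
constexpr std::size_t bool_size() { return sizeof(bool); }
constexpr std::size_t int_size()  { return sizeof(int); }

// Converting a bool to int always yields 0 or 1,
// regardless of how many bytes the bool occupies.
constexpr int widen(bool b) { return b; }

// Converting an int to bool maps every non-zero value to true;
// it is a comparison against zero, not a truncation to the low byte.
constexpr bool narrow(int i) { return i != 0; }
```

Note that narrow(256) is true even though the low byte of 256 is zero. The conversion to bool happens on the value before the call, which is presumably why the callee can get away with inspecting a single byte.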

Answer by Alexander Gessler

Compiling with -O3 gives the following for me:

f:

        pushl   %ebp
        movl    %esp, %ebp
        cmpl    $1, 8(%ebp)
        popl    %ebp
        sbbl    %eax, %eax
        andb    $58, %al
        addl    $99, %eax
        ret

g:

        pushl   %ebp
        movl    %esp, %ebp
        cmpb    $1, 8(%ebp)
        popl    %ebp
        sbbl    %eax, %eax
        andb    $58, %al
        addl    $99, %eax
        ret

.. so it compiles to essentially the same code, except for cmpl vs. cmpb. This means that the difference, if there is any, doesn't matter. Judging by unoptimized code is not fair.



Edit to clarify my point. Unoptimized code is for simple debugging, not for speed. Comparing the speed of unoptimized code is senseless.

Answer by JUST MY correct OPINION

When I compile this with a sane set of options (specifically -O3), here's what I get:

For f():

            .type   _Z1fi, @function
    _Z1fi:
    .LFB0:
            .cfi_startproc
            .cfi_personality 0x3,__gxx_personality_v0
            cmpl    $1, %edi
            sbbl    %eax, %eax
            andb    $58, %al
            addl    $99, %eax
            ret
            .cfi_endproc

For g():

            .type   _Z1gb, @function
    _Z1gb:
    .LFB1:
            .cfi_startproc
            .cfi_personality 0x3,__gxx_personality_v0
            cmpb    $1, %dil
            sbbl    %eax, %eax
            andb    $58, %al
            addl    $99, %eax
            ret
            .cfi_endproc

They still use different instructions for the comparison (cmpb for boolean vs. cmpl for int), but otherwise the bodies are identical. A quick look at the Intel manuals tells me: ... not much of anything. There's no such thing as cmpb or cmpl in the Intel manuals. They're all cmp, and I can't find the timing tables at the moment. I'm guessing, however, that there's no clock difference between comparing a byte immediate vs. comparing a long immediate, so for all practical purposes the code is identical.
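
To see why the optimized bodies contain no jump at all, here is a rough C++ emulation of the cmp / sbb / and / add sequence. It is only a sketch: the function names are hypothetical, and the immediates 1, 58 and 99 are assumed because they are the values that make the arithmetic produce 99 and -99.

```cpp
#include <cstdint>

// What the source-level function computes (same logic as f and g in the question).
int reference(int i) { return i ? 99 : -99; }

// Hypothetical step-by-step emulation of:
//   cmp $1, x ; sbb %eax, %eax ; and $58, %al ; add $99, %eax
// The immediates 1, 58 and 99 are assumptions, see the text above.
int branchless(std::uint32_t x) {
    // cmp $1, x sets the carry flag when x < 1 as unsigned, i.e. only when x == 0.
    std::uint32_t carry = (x < 1u) ? 1u : 0u;
    // sbb %eax, %eax computes eax - eax - carry: 0 or 0xFFFFFFFF.
    std::uint32_t eax = 0u - carry;
    // and $58, %al masks only the low byte and leaves the upper 24 bits alone.
    eax = (eax & 0xFFFFFF00u) | (eax & 58u);
    // add $99, %eax: 0 becomes 99, 0xFFFFFF3A (-198) becomes -99.
    eax += 99u;
    return static_cast<std::int32_t>(eax);
}
```

Both paths run the same four instructions; only the carry flag differs, which is why the byte-sized and long-sized compares lead to otherwise identical code.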



edited to add the following based on your addition

The reason the code is different in the unoptimized case is that it is unoptimized. (Yes, it's circular, I know.) When the compiler walks the AST and generates code directly, it doesn't "know" anything except what's at the immediate point of the AST it's in. At that point it lacks all contextual information needed to know that at this specific point it can treat the declared type bool as an int. A boolean is obviously by default treated as a byte, and when manipulating bytes in the Intel world you have to do things like sign-extend to bring it to certain widths to put it on the stack, etc. (You can't push a byte.)

When the optimizer views the AST and does its magic, however, it looks at surrounding context and "knows" when it can replace code with something more efficient without changing semantics. So it "knows" it can use an integer in the parameter and thereby lose the unnecessary conversions and widening.

Answer by Mat

With GCC 4.5 on Linux and Windows at least, sizeof(bool) == 1. On x86 and x86_64, you can't pass less than a general-purpose register's worth to a function (whether via the stack or a register, depending on the calling convention etc...).

So the code for bool, when un-optimized, actually goes to some length to extract that bool value from the argument stack (using another stack slot to save that byte). It's more complicated than just pulling a native register-sized variable.

Answer by DigitalRoss

At the machine level there is no such thing as bool

Very few instruction set architectures define any sort of boolean operand type, although there are often instructions that trigger an action on non-zero values. To the CPU, usually, everything is one of the scalar types or a string of them.

A given compiler and a given ABI will need to choose specific sizes for int and bool, and when, as in your case, these are different sizes, they may generate slightly different code, and at some levels of optimization one may be slightly faster.

Why is bool one byte on many systems?

It's safer to choose a char type for bool because someone might make a really large array of them.

Update: by "safer", I mean: for the compiler and library implementors. I'm not saying people need to reimplement the system type.
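
The array argument can be made concrete with a short sketch. The element count and the helper names below are hypothetical, and the sizes of bool and int are whatever the compiler at hand chooses; on a platform with a 1-byte bool and a 4-byte int, one million flags would take roughly 1 MB as bool and 4 MB as int.

```cpp
#include <cstddef>

// Hypothetical element count, chosen only for illustration.
constexpr std::size_t kCount = 1000000;

// Bytes needed to store kCount flags as an array of bool versus an array of int.
constexpr std::size_t bytes_as_bool() { return sizeof(bool[kCount]); }
constexpr std::size_t bytes_as_int()  { return sizeof(int[kCount]); }
```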

Answer by dannysauer

Yeah, the discussion's fun. But just test it:

Test code:

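#include <stdio.h>
#include <string.h>

int testi(int);
int testb(bool);
int main (int argc, char* argv[]){
  bool valb;
  int  vali;
  int loops;
  if( argc < 2 ){
    return 2;
  }
  valb = (0 != (strcmp(argv[1], "0")));
  vali = strcmp(argv[1], "0");
  printf("Arg1: %s\n", argv[1]);
  printf("BArg1: %i\n", valb ? 1 : 0);
  printf("IArg1: %i\n", vali);
  for(loops=30000000; loops>0; loops--){
    //printf("%i: %i\n", loops, testb(valb=!valb));
    printf("%i: %i\n", loops, testi(vali=!vali));
  }
  return valb;
}

int testi(int val){
  if( val ){
    return 1;
  }
  return 0;
}
int testb(bool val){
  if( val ){
    return 1;
  }
  return 0;
}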

Compiled on a 64-bit Ubuntu 10.10 laptop with: g++ -O3 -o /tmp/test_i /tmp/test_i.cpp

Integer-based comparison:

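sauer@trogdor:/tmp$ time /tmp/test_i 1 > /dev/null

real    0m8.203s
user    0m8.170s
sys     0m0.010s
sauer@trogdor:/tmp$ time /tmp/test_i 1 > /dev/null

real    0m8.056s
user    0m8.020s
sys     0m0.000s
sauer@trogdor:/tmp$ time /tmp/test_i 1 > /dev/null

real    0m8.116s
user    0m8.100s
sys     0m0.000s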

Boolean test / print uncommented (and integer commented):

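sauer@trogdor:/tmp$ time /tmp/test_i 1 > /dev/null

real    0m8.254s
user    0m8.240s
sys     0m0.000s
sauer@trogdor:/tmp$ time /tmp/test_i 1 > /dev/null

real    0m8.028s
user    0m8.000s
sys     0m0.010s
sauer@trogdor:/tmp$ time /tmp/test_i 1 > /dev/null

real    0m7.981s
user    0m7.900s
sys     0m0.050s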

The timings are the same, with 1 assignment and 2 comparisons per loop over 30 million loops. Find something else to optimize. For example, don't use strcmp unnecessarily. ;)
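
Since the loop above spends most of its time in printf, a variation that times only the calls may be of interest. The following is a sketch rather than the author's method: the run helper and its parameters are hypothetical, std::chrono::steady_clock replaces the shell's time command, and the volatile argument is only an assumption about what is needed to keep an optimizer from folding the loop away.

```cpp
#include <chrono>
#include <cstdint>

int testi(int val)  { return val ? 1 : 0; }
int testb(bool val) { return val ? 1 : 0; }

// Hypothetical helper: calls fn `loops` times, flipping the argument before
// each call as the original loop does, and reports the elapsed time in seconds.
// Returns the sum of the results so the calls cannot simply be discarded.
template <typename Arg, typename Fn>
std::int64_t run(Fn fn, int loops, double& seconds) {
    volatile Arg value = Arg(1);  // volatile: force a real load and store per iteration
    std::int64_t sum = 0;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < loops; ++i) {
        value = !value;           // same flip as vali=!vali / valb=!valb
        sum += fn(value);
    }
    auto stop = std::chrono::steady_clock::now();
    seconds = std::chrono::duration<double>(stop - start).count();
    return sum;
}
```

A caller would invoke run<int>(testi, loops, seconds) and run<bool>(testb, loops, seconds) and compare the two reported durations; any difference should be read with the usual caution about microbenchmarks.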

Answer by Aleadam

It will mostly depend on the compiler and the optimization. There's an interesting discussion (language agnostic) here:

Does "if ([bool] == true)" require one more step than "if ([bool])"?

Also, take a look at this post: http://www.linuxquestions.org/questions/programming-9/c-compiler-handling-of-boolean-variables-290996/

Answer by Artie

Approaching your question in two different ways:

If you are specifically talking about C++ or any programming language that will produce assembly code for that matter, we are bound to what code the compiler will generate in ASM. We are also bound to the representation of true and false in c++. An integer will have to be stored in 32 bits, and I could simply use a byte to store the boolean expression. Asm snippets for conditional statements:

For the integer:

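  mov eax,dword ptr[esp]    ;Load integer
  cmp eax,0                 ;Compare to 0
  je  false                 ;If int is 0, it's false
  ;Do what has to be done when true
false:
  ;Do what has to be done when false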

For the bool:

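  mov  al,1     ;Anything that is not 0 is true
  test al,1     ;See if the first bit is set
  jz   false    ;Not set, so it's false
  ;Do what has to be done when true
false:
  ;Do what has to be done when false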

So, that's why the speed comparison is so compiler-dependent. In the case above, the bool would be slightly faster, since cmp implies a subtraction to set the flags. It also contradicts what your compiler generated.

Another approach, a much simpler one, is to look at the logic of the expression on its own and try not to worry about how the compiler will translate your code, and I think this is a much healthier way of thinking. I still believe, ultimately, that the code being generated by the compiler is actually trying to give a truthful resolution. What I mean is that, maybe if you increase the test cases in the if statement and stick with boolean on one side and integer on the other, the compiler will make it so the generated code executes faster with boolean expressions at the machine level.

I consider this a conceptual question, so I'll give a conceptual answer. This discussion reminds me of discussions I commonly have about whether or not code efficiency translates to fewer lines of code in assembly. It seems that this concept is generally accepted as being true. Considering that keeping track of how fast the ALU will handle each statement is not viable, the second option would be to focus on jumps and compares in assembly. When that is the case, the distinction between boolean statements or integers in the code you presented becomes rather representative. The result of an expression in C++ will return a value that will then be given a representation. In assembly, on the other hand, the jumps and comparisons will be based on numeric values regardless of what type of expression was being evaluated back in your C++ if statement. It is important on these questions to remember that purely logical statements such as these end up with a huge computational overhead, even though a single bit would be capable of the same thing.