Which is faster: if (bool) or if (int)?
Original question: http://stackoverflow.com/questions/5764956/
Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL, and attribute the content to the original authors on StackOverflow (not to this page).
Asked by Nawaz
The above topic made me do some experiments with bool and int in an if condition. So just out of curiosity I wrote this program:
    int f(int i)
    {
        if ( i ) return 99; //if(int)
        else return -99;
    }
    int g(bool b)
    {
        if ( b ) return 99; //if(bool)
        else return -99;
    }
    int main(){}
g++ intbool.cpp -S generates asm code for each function as follows:
asm code for f(int):
    __Z1fi:
    LFB0:
        pushl   %ebp
    LCFI0:
        movl    %esp, %ebp
    LCFI1:
        cmpl    $0, 8(%ebp)
        je      L2
        movl    $99, %eax
        jmp     L3
    L2:
        movl    $-99, %eax
    L3:
        leave
    LCFI2:
        ret
asm code for g(bool):
    __Z1gb:
    LFB1:
        pushl   %ebp
    LCFI3:
        movl    %esp, %ebp
    LCFI4:
        subl    $4, %esp
    LCFI5:
        movl    8(%ebp), %eax
        movb    %al, -4(%ebp)
        cmpb    $0, -4(%ebp)
        je      L5
        movl    $99, %eax
        jmp     L6
    L5:
        movl    $-99, %eax
    L6:
        leave
    LCFI6:
        ret
Surprisingly, g(bool) generates more asm instructions! Does it mean that if(bool) is a little slower than if(int)? I used to think bool was especially designed to be used in conditional statements such as if, so I was expecting g(bool) to generate fewer asm instructions, thereby making g(bool) more efficient and faster.
EDIT:
I'm not using any optimization flag as of now. But even in its absence, why it generates more asm for g(bool) is a question for which I'm looking for a reasonable answer. I should also tell you that the -O2 optimization flag generates exactly the same asm for both functions. But that isn't the question. The question is what I've asked.
Answer by Sherm Pendley
Makes sense to me. Your compiler apparently defines a bool as an 8-bit value, and your system ABI requires it to "promote" small (< 32-bit) integer arguments to 32-bit when pushing them onto the call stack. So to compare a bool, the compiler generates code to isolate the least significant byte of the 32-bit argument that g receives, and compares it with cmpb. In the first example, the int argument uses the full 32 bits that were pushed onto the stack, so it simply compares against the whole thing with cmpl.
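The difference between the two comparisons can be modeled in C++. This is only a sketch of the idea, not what the compiler emits: the helper names are hypothetical, and the 32-bit "slot" stands in for the promoted argument on the stack.

```cpp
#include <cstdint>

// Model of a 32-bit argument slot as the callee sees it.
// Helper names are hypothetical; they are not part of the original program.

// Like cmpl $0, 8(%ebp): all 32 bits take part in the comparison.
bool whole_word_is_nonzero(std::uint32_t slot) {
    return slot != 0;
}

// Like cmpb $0, ...: only the least significant byte is examined.
bool low_byte_is_nonzero(std::uint32_t slot) {
    return static_cast<std::uint8_t>(slot) != 0;
}
```

For a slot holding 0x00000100 the two checks disagree, which is why the byte-sized bool has to be compared with a byte-sized instruction.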
Answer by Alexander Gessler
Compiling with -O3 gives the following for me:
f:
    pushl   %ebp
    movl    %esp, %ebp
    cmpl    $1, 8(%ebp)
    popl    %ebp
    sbbl    %eax, %eax
    andb    $58, %al
    addl    $99, %eax
    ret
g:
    pushl   %ebp
    movl    %esp, %ebp
    cmpb    $1, 8(%ebp)
    popl    %ebp
    sbbl    %eax, %eax
    andb    $58, %al
    addl    $99, %eax
    ret
.. so it compiles to essentially the same code, except for cmpl vs cmpb. This means that the difference, if there is any, doesn't matter. Judging by unoptimized code is not fair.
Edit to clarify my point: unoptimized code is for simple debugging, not for speed. Comparing the speed of unoptimized code is senseless.
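For readers unfamiliar with the branch-free sequence GCC produces at -O3 (cmp, sbb, and, add), here is a step-by-step model in C++. It is a sketch under the assumption that the immediates are 1, 58 and 99, which is what the arithmetic requires for the function to return 99 or -99; `branchless_select` is a hypothetical name.

```cpp
#include <cstdint>

// Models: cmp $1, x ; sbb %eax, %eax ; andb $58, %al ; addl $99, %eax
// `branchless_select` is a hypothetical name used only for this sketch.
std::int32_t branchless_select(std::uint32_t x) {
    // cmp $1, x sets the carry flag exactly when x < 1 (unsigned), i.e. x == 0.
    const bool carry = (x < 1u);
    // sbb %eax, %eax leaves 0xFFFFFFFF if carry is set, otherwise 0.
    std::uint32_t eax = carry ? 0xFFFFFFFFu : 0u;
    // andb $58, %al masks only the low byte; the upper 24 bits are untouched.
    eax = (eax & 0xFFFFFF00u) | (eax & 58u);
    // addl $99, %eax: 0 + 99 = 99, or 0xFFFFFF3A (-198) + 99 = -99.
    eax += 99u;
    return static_cast<std::int32_t>(eax);
}
```

The only part that depends on the operand width is the initial cmp, which is why the int and bool versions differ in a single instruction.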
Answer by JUST MY correct OPINION
When I compile this with a sane set of options (specifically -O3), here's what I get:
For f():
            .type   _Z1fi, @function
    _Z1fi:
    .LFB0:
            .cfi_startproc
            .cfi_personality 0x3,__gxx_personality_v0
            cmpl    $1, %edi
            sbbl    %eax, %eax
            andb    $58, %al
            addl    $99, %eax
            ret
            .cfi_endproc
For g():
            .type   _Z1gb, @function
    _Z1gb:
    .LFB1:
            .cfi_startproc
            .cfi_personality 0x3,__gxx_personality_v0
            cmpb    $1, %dil
            sbbl    %eax, %eax
            andb    $58, %al
            addl    $99, %eax
            ret
            .cfi_endproc
They still use different instructions for the comparison (cmpb for boolean vs. cmpl for int), but otherwise the bodies are identical. A quick look at the Intel manuals tells me: ... not much of anything. There's no such thing as cmpb or cmpl in the Intel manuals. They're all cmp and I can't find the timing tables at the moment. I'm guessing, however, that there's no clock difference between comparing a byte immediate vs. comparing a long immediate, so for all practical purposes the code is identical.
Edited to add the following based on your addition:
The reason the code is different in the unoptimized case is that it is unoptimized. (Yes, it's circular, I know.) When the compiler walks the AST and generates code directly, it doesn't "know" anything except what's at the immediate point of the AST it's in. At that point it lacks all contextual information needed to know that at this specific point it can treat the declared type bool as an int. A boolean is obviously by default treated as a byte, and when manipulating bytes in the Intel world you have to do things like sign-extend to bring it to certain widths to put it on the stack, etc. (You can't push a byte.)
When the optimizer views the AST and does its magic, however, it looks at surrounding context and "knows" when it can replace code with something more efficient without changing semantics. So it "knows" it can use an integer in the parameter and thereby lose the unnecessary conversions and widening.
Answer by Mat
With GCC 4.5 on Linux and Windows at least, sizeof(bool) == 1. On x86 and x86_64, you can't pass less than a general purpose register's worth to a function (whether via the stack or a register, depending on the calling convention etc...).
So the code for bool, when unoptimized, actually goes to some length to extract that bool value from the argument stack (using another stack slot to save that byte). It's more complicated than just pulling a native register-sized variable.
Answer by DigitalRoss
At the machine level there is no such thing as bool
Very few instruction set architectures define any sort of boolean operand type, although there are often instructions that trigger an action on non-zero values. To the CPU, usually, everything is one of the scalar types or a string of them.
A given compiler and a given ABI will need to choose specific sizes for int and bool, and when, like in your case, these are different sizes, they may generate slightly different code, and at some levels of optimization one may be slightly faster.
Why is bool one byte on many systems?
It's safer to choose a char type for bool because someone might make a really large array of them.
Update: by "safer", I mean: for the compiler and library implementors. I'm not saying people need to reimplement the system type.
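The array argument is easy to quantify. The sketch below uses hypothetical helper names and makes no assumption about the actual sizes, which are implementation-defined; it only computes the footprint from sizeof.

```cpp
#include <cstddef>

// Bytes needed to store `count` flags, one per element.
// Both helpers are hypothetical and exist only for this illustration.
constexpr std::size_t bytes_as_bool(std::size_t count) {
    return count * sizeof(bool);
}

constexpr std::size_t bytes_as_int(std::size_t count) {
    return count * sizeof(int);
}
```

On a platform where sizeof(bool) is 1 and sizeof(int) is 4, one million flags would take 1,000,000 bytes as bool and 4,000,000 bytes as int.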
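Answer by dannysauer
Yeah, the discussion's fun. But just test it:
Test code:
    #include <stdio.h>
    #include <string.h>
    int testi(int);
    int testb(bool);
    int main (int argc, char* argv[]){
      bool valb;
      int  vali;
      int loops;
      if( argc < 2 ){
        return 2;
      }
      valb = (0 != (strcmp(argv[1], "0")));
      vali = strcmp(argv[1], "0");
      printf("Arg1: %s\n", argv[1]);
      printf("BArg1: %i\n", valb ? 1 : 0);
      printf("IArg1: %i\n", vali);
      for(loops=30000000; loops>0; loops--){
        //printf("%i: %i\n", loops, testb(valb=!valb));
        printf("%i: %i\n", loops, testi(vali=!vali));
      }
      return valb;
    }
    int testi(int val){
      if( val ){
        return 1;
      }
      return 0;
    }
    int testb(bool val){
      if( val ){
        return 1;
      }
      return 0;
    }
Compiled on a 64-bit Ubuntu 10.10 laptop with: g++ -O3 -o /tmp/test_i /tmp/test_i.cpp
Integer-based comparison:
    sauer@trogdor:/tmp$ time /tmp/test_i 1 > /dev/null
    real    0m8.203s
    user    0m8.170s
    sys     0m0.010s
    sauer@trogdor:/tmp$ time /tmp/test_i 1 > /dev/null
    real    0m8.056s
    user    0m8.020s
    sys     0m0.000s
    sauer@trogdor:/tmp$ time /tmp/test_i 1 > /dev/null
    real    0m8.116s
    user    0m8.100s
    sys     0m0.000s
Boolean test / print uncommented (and integer commented):
    sauer@trogdor:/tmp$ time /tmp/test_i 1 > /dev/null
    real    0m8.254s
    user    0m8.240s
    sys     0m0.000s
    sauer@trogdor:/tmp$ time /tmp/test_i 1 > /dev/null
    real    0m8.028s
    user    0m8.000s
    sys     0m0.010s
    sauer@trogdor:/tmp$ time /tmp/test_i 1 > /dev/null
    real    0m7.981s
    user    0m7.900s
    sys     0m0.050s
They're the same with 1 assignment and 2 comparisons each loop over 30 million loops. Find something else to optimize. For example, don't use strcmp unnecessarily. ;)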
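The same measurement can be sketched without printf and the shell's time command by timing the loop directly with std::chrono. This is a variation on the test above, not the code that produced the numbers in this answer; `run_int`, `run_bool` and `elapsed_ms` are hypothetical helpers, and an optimizer may simplify such small loops, so any timings should be treated with caution.

```cpp
#include <chrono>
#include <cstdint>

int testi(int val)  { return val ? 1 : 0; }
int testb(bool val) { return val ? 1 : 0; }

// Flips the value on every iteration, as the original loop does,
// and returns how many calls reported "true".
std::int64_t run_int(std::int64_t loops) {
    int v = 1;
    std::int64_t hits = 0;
    for (std::int64_t i = 0; i < loops; ++i) hits += testi(v = !v);
    return hits;
}

std::int64_t run_bool(std::int64_t loops) {
    bool v = true;
    std::int64_t hits = 0;
    for (std::int64_t i = 0; i < loops; ++i) hits += testb(v = !v);
    return hits;
}

// Wall-clock milliseconds for one call of fn(loops); the result is
// stored so the caller can check that the work was actually done.
template <typename Fn>
double elapsed_ms(Fn fn, std::int64_t loops, std::int64_t& result) {
    const auto start = std::chrono::steady_clock::now();
    result = fn(loops);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling elapsed_ms(run_int, 30000000, hits) and elapsed_ms(run_bool, 30000000, hits) several times each and comparing the spread gives a rough equivalent of the three runs shown above.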
Answer by Aleadam
It will mostly depend on the compiler and the optimization. There's an interesting discussion (language agnostic) here:
Does "if ([bool] == true)" require one more step than "if ([bool])"?
Also, take a look at this post: http://www.linuxquestions.org/questions/programming-9/c-compiler-handling-of-boolean-variables-290996/
Answer by Artie
Approaching your question in two different ways:
If you are specifically talking about C++, or any programming language that will produce assembly code for that matter, we are bound to whatever code the compiler generates in ASM. We are also bound to the representation of true and false in C++. An integer will have to be stored in 32 bits, and I could simply use a byte to store the boolean expression. Asm snippets for conditional statements:
For the integer:
    mov eax,dword ptr[esp]    ;Store integer
    cmp eax,0                 ;Compare to 0
    je false                  ;If int is 0, it's false
    ;Do what has to be done when true
    false:
    ;Do what has to be done when false
For the bool:
    mov al,1     ;Anything that is not 0 is true
    test al,1    ;See if first bit is flipped
    jz false     ;Not flipped, so it's false
    ;Do what has to be done when true
    false:
    ;Do what has to be done when false
So, that's why the speed comparison is so compiler dependent. In the case above, the bool would be slightly faster, since cmp would imply a subtraction for setting the flags. It also contradicts what your compiler generated.
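One caveat about the two snippets: comparing against zero and testing the lowest bit agree only when the byte holds 0 or 1, which is how compilers typically store a C++ bool. A small C++ model, with hypothetical helper names, makes the difference visible:

```cpp
#include <cstdint>

// Mirrors "cmp eax,0 / je false": true for any nonzero value.
bool nonzero_check(std::uint8_t v) {
    return v != 0;
}

// Mirrors "test al,1 / jz false": true only when the lowest bit is set.
bool low_bit_check(std::uint8_t v) {
    return (v & 1u) != 0;
}
```

For the value 2, nonzero_check reports true while low_bit_check reports false, so the bit test is a valid shortcut only under the 0-or-1 assumption.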
Another approach, a much simpler one, is to look at the logic of the expression on its own and try not to worry about how the compiler will translate your code, and I think this is a much healthier way of thinking. I still believe, ultimately, that the code being generated by the compiler is actually trying to give a truthful resolution. What I mean is that, maybe if you increase the test cases in the if statement and stick with boolean on one side and integer on the other, the compiler will make it so the code generated will execute faster with boolean expressions at the machine level.
I'm considering this a conceptual question, so I'll give a conceptual answer. This discussion reminds me of discussions I commonly have about whether or not code efficiency translates to fewer lines of code in assembly. It seems that this concept is generally accepted as being true. Considering that keeping track of how fast the ALU will handle each statement is not viable, the second option would be to focus on jumps and compares in assembly. When that is the case, the distinction between boolean statements or integers in the code you presented becomes rather representative. The result of an expression in C++ will return a value that will then be given a representation. In assembly, on the other hand, the jumps and comparisons will be based on numeric values regardless of what type of expression was being evaluated back in your C++ if statement. It is important on these questions to remember that purely logical statements such as these end up with a huge computational overhead, even though a single bit would be capable of the same thing.