How to prevent GCC from optimizing out a busy-wait loop?
Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/7083482/
Asked by Denilson Sá Maia
I want to write firmware in C for Atmel AVR microcontrollers, and I will compile it with GCC. I also want to enable compiler optimizations (-Os or -O2), as I see no reason not to: the compiler will probably produce better assembly, and far faster, than I could by writing it by hand.
But I want one small piece of code left unoptimized. I want to delay the execution of a function for a while, so I wrote a do-nothing loop just to waste some time. It does not need to be precise; it just has to wait a while.
/* How to NOT optimize this, while optimizing other code? */
unsigned char i, j;
j = 0;
while(--j) {
    i = 0;
    while(--i);
}
Since memory access in AVR is a lot slower, I want i and j to be kept in CPU registers.
Update: I just found util/delay.h and util/delay_basic.h in AVR Libc. Although most of the time it is probably a better idea to use those functions, this question remains valid and interesting.
Answered by Denilson Sá Maia
I developed this answer after following a link from dmckee's answer, but it takes a different approach from that answer.
The Function Attributes documentation from GCC mentions:
noinline: This function attribute prevents a function from being considered for inlining. If the function does not have side-effects, there are optimizations other than inlining that causes function calls to be optimized away, although the function call is live. To keep such calls from being optimized away, put asm ("");
This gave me an interesting idea... Instead of adding a nop instruction in the inner loop, I tried adding an empty assembly statement there, like this:
unsigned char i, j;
j = 0;
while(--j) {
    i = 0;
    while(--i)
        asm("");
}
And it worked! The loop was not optimized out, and no extra nop instructions were inserted.
What's more, if you use volatile, GCC will store those variables in RAM and add a bunch of ldd and std instructions to copy them to and from temporary registers. This approach, on the other hand, doesn't use volatile and generates no such overhead.
Update: If you are compiling code using -ansi or -std, you must replace the asm keyword with __asm__, as described in the GCC documentation.
In addition, you can use __asm__ __volatile__("") if your assembly statement must execute exactly where you put it (i.e. it must not be moved out of a loop as an optimization).
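As a minimal sketch of the approach in this answer, the loop can be wrapped in a function so that its behaviour is easy to check on a desktop build. The function name busy_wait_asm and the iteration counter are illustrative additions, not part of the original answer; the counter is there only so the number of passes can be observed.

```c
/* Hypothetical helper: the name and the counter are illustrative only. */
unsigned long busy_wait_asm(void)
{
    unsigned long iterations = 0;  /* added for observation; not needed for the delay */
    unsigned char i, j;

    j = 0;
    while (--j) {                  /* j wraps from 0 to 255, so this runs 255 times */
        i = 0;
        while (--i) {              /* likewise 255 inner passes per outer pass */
            /* Empty asm statement: GCC treats it as having a side effect,
               and __volatile__ asks it not to delete or hoist the statement. */
            __asm__ __volatile__("");
            iterations++;
        }
    }
    return iterations;             /* 255 * 255 passes in total */
}
```

On an actual AVR build the counter would presumably be dropped, leaving only the two 8-bit loop variables, which the compiler is then free to keep in registers.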
Answered by ks1322
Declare the i and j variables as volatile. This prevents the compiler from optimizing away code that involves these variables.
unsigned volatile char i, j;
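For illustration, here is a sketch of the whole loop with volatile counters. The function name busy_wait_volatile and the iteration counter are assumptions added for this example so the result can be checked; they are not from the answer.

```c
/* Hypothetical helper: the name and the counter are illustrative only. */
unsigned long busy_wait_volatile(void)
{
    unsigned long iterations = 0;   /* added for observation only */
    volatile unsigned char i, j;    /* every read and write of i and j must be performed */

    j = 0;
    while (--j) {                   /* 255 outer passes */
        i = 0;
        while (--i) {               /* 255 inner passes each */
            iterations++;
        }
    }
    return iterations;
}
```

The trade-off, as noted in another answer on this page, is that volatile objects are typically kept in memory, so each decrement costs a load and a store.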
Answered by R.. GitHub STOP HELPING ICE
I'm not sure why it hasn't been mentioned yet that this approach is completely misguided and easily broken by compiler upgrades, etc. It would make a lot more sense to determine the time value you want to wait until and then spin, polling the current time until the desired value is exceeded. On x86 you could use rdtsc for this purpose, but the more portable way would be to call clock_gettime (or the equivalent for your non-POSIX OS) to get the time. Current x86_64 Linux will even avoid the syscall for clock_gettime and use rdtsc internally. Or, if you can handle the cost of a syscall, just use clock_nanosleep to begin with...
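The polling idea might look roughly like the following on a hosted POSIX system (it does not apply to bare-metal AVR as written). The helper names monotonic_ns and spin_delay_ns are hypothetical, and the choice of CLOCK_MONOTONIC is an assumption; the answer only names clock_gettime.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical helper: current monotonic time in nanoseconds, or -1 on error. */
static long long monotonic_ns(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1;
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical helper: spin until at least delay_ns nanoseconds have passed.
   Returns the elapsed time in nanoseconds, or -1 if the clock could not be read. */
long long spin_delay_ns(long long delay_ns)
{
    long long start = monotonic_ns();
    long long now;

    if (start < 0)
        return -1;
    do {
        now = monotonic_ns();       /* a real library call, so it cannot be optimized away */
        if (now < 0)
            return -1;
    } while (now - start < delay_ns);
    return now - start;
}
```

Because the loop condition depends on the clock rather than on an iteration count, the delay does not change with the optimization level or compiler version.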
Answered by dmckee --- ex-moderator kitten
I don't know off the top of my head whether the AVR version of the compiler supports the full set of #pragmas (the interesting ones in the link all date from GCC version 4.4), but that is where you would usually start.
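The link from this answer is not preserved here. Assuming it refers to GCC's function-specific option pragmas (push_options, optimize, pop_options), a sketch could look like the one below. The function name busy_wait_unoptimized and the iteration counter are illustrative, and whether a given avr-gcc build honours these pragmas is exactly the open question raised above.

```c
#pragma GCC push_options            /* remember the command-line optimization options */
#pragma GCC optimize ("O0")         /* compile the functions that follow without optimization */

/* Hypothetical helper: the name and the counter are illustrative only. */
unsigned long busy_wait_unoptimized(void)
{
    unsigned long iterations = 0;   /* added for observation only */
    unsigned char i, j;

    j = 0;
    while (--j) {                   /* 255 outer passes */
        i = 0;
        while (--i) {               /* 255 inner passes each */
            iterations++;
        }
    }
    return iterations;
}

#pragma GCC pop_options             /* restore the previous options for the rest of the file */
```

Note that at -O0 the loop variables usually live on the stack rather than in registers, which runs against the question's wish to keep i and j out of RAM.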
Answered by BiS
For me, on GCC 4.7.0, the empty asm was optimized away anyway with -O3 (I didn't try -O2), and using i++ on a register or volatile variable resulted in a big performance penalty (in my case).
What I did was link against another, empty function that the compiler couldn't see when compiling the "main program".
Basically this:
I created "helper.c" containing this function definition (an empty function):
void donotoptimize(){}
Then I compiled it with "gcc helper.c -c -o helper.o" and called the function from the loop in the main program:
while (...) { donotoptimize();}
This gave me the best results (and, I believe, no overhead at all, although I can't test that because my program won't work without it :) )
I think it should work with icc too. It might not if you enable link-time optimization, but with gcc it does.
Answered by old_timer
Put that loop in a separate .c file and do not optimize that one file. Even better, write that routine in assembler and call it from C; either way the optimizer won't get involved.
I sometimes do the volatile thing, but normally I create an asm function that simply returns and put a call to that function in the loop. The optimizer will make the for/while loop tight, but it won't optimize it out, because it has to make all the calls to the dummy function. The nop answer from Denilson Sá does the same thing, but even tighter...
Answered by Groovy
Using a volatile asm statement should help. You can read more about this here:
http://www.nongnu.org/avr-libc/user-manual/optimization.html
If you are working on Windows, you can even try putting the code under pragmas, as explained in detail below:
Hope this helps.
Answered by Michel Megens
You can also use the register keyword. Variables declared with register are kept in CPU registers where the compiler is able to do so (the keyword is a hint, not a guarantee).
In your case:
register unsigned char i, j;
j = 0;
while(--j) {
    i = 0;
    while(--i);
}

