Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/19689014/

Time: 2020-08-27 23:03:34  Source: igfitidea

GCC: Difference between -O3 and -Os

Tags: c++, c, gcc, compiler-construction, g++

Asked by Saqlain

I am quite familiar with the GCC -O3 flag, but how does it differ from -Os, and in which situations should we prefer one over the other?

Accepted answer by CmdrMoozy

The GCC documentation describes what these options do very explicitly.

-O3 tries to optimize code very heavily for performance. It includes all of the optimizations -O2 includes, plus some more.

-Os, on the other hand, instructs GCC to "optimize for size." It enables all -O2 optimizations which do not increase the size of the executable, and then it also toggles some optimization flags to further reduce executable size.
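
As a rough way to check which family of options a build actually used, GCC predefines the macro `__OPTIMIZE__` whenever optimization is enabled and `__OPTIMIZE_SIZE__` when optimizing for size. The sketch below assumes a GCC-compatible compiler; the function name `optimization_mode` is only an illustrative choice, not something from the original answer.

```cpp
#include <string>

// Reports how this translation unit was compiled, based on GCC's
// predefined macros. With -Os both macros are defined, so the size
// check has to come first.
std::string optimization_mode() {
#if defined(__OPTIMIZE_SIZE__)
    return "optimizing for size (e.g. -Os)";
#elif defined(__OPTIMIZE__)
    return "optimizing (e.g. -O1, -O2 or -O3)";
#else
    return "not optimizing (e.g. -O0)";
#endif
}
```

Printing the result from a build compiled with different flags should show the mode changing; the macros do not tell -O2 and -O3 apart.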

Note that I've been deliberately a bit vague with my descriptions - read the GCC documentation for a more in-depth discussion of exactly which flags are enabled for either optimization level.

I believe the -O* optimization levels are just that: mutually exclusive, distinct levels of optimization. It doesn't really make sense to mix them, since each level enables or leaves out flags that another level intentionally leaves out or enables. If you want to mix and match (you probably don't, unless you have a really good reason to want a specific set of flags), you are best off reading the documentation and combining the individual flags each level enables by hand.

I think I'll also link this article from the Gentoo Linux Wiki, which talks about optimization flags as they relate to building the packages for the operating system. Obviously not all of this is applicable, but it still contains some interesting information - for one:

Compiling with -O3 is not a guaranteed way to improve performance, and in fact in many cases can slow down a system due to larger binaries and increased memory usage. -O3 is also known to break several packages. Therefore, using -O3 is not recommended.

According to that article, -O2 is, most of the time, "as good as" -O3, and it is safer to use when it comes to broken executable output.

Answer by Basile Starynkevitch

I suggest reading the GCC documentation. -O3 is for getting fast-running code (even at the expense of some code bloat), while -Os optimizes for the size of the generated code.

There are tons of other (obscure) GCC optimization flags (e.g. -fgcse-sm), many of which are not enabled even at -O3.

You might also be interested in -flto (for Link-Time Optimization), to be used in addition to e.g. -O3 or -Os, both at compile time and at link time. Then see also this answer.

Finally, take care to use the latest version of GCC (4.8 as of the end of 2013), because GCC keeps improving its optimizations significantly.

You might also want to use -mtune=native (at least for x86).

And you might even write your own optimization pass, specific to your own particular libraries and APIs, perhaps using the MELT plugin.

As CmdrMoozy answered, you might prefer using -O2 over -O3 (but notice that recent GCC versions have improved their -O3 a lot, so the Gentoo citation recommending against -O3 and in favor of -O2 is becoming less relevant).

Also, as this SlashDot-ed Stack paper (by Xi Wang, Nickolai Zeldovich, M. Frans Kaashoek, and Armando Solar-Lezama) shows, many programs are not entirely C standard compliant and are not happy (and behave incorrectly) when some valid optimizations are done. Undefined behavior is a tricky subject.
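
To make the undefined behavior point concrete, here is a commonly cited kind of example (not taken from the paper itself): an overflow check that relies on signed overflow, next to a version that stays within defined behavior. The function names are hypothetical. Because signed overflow is undefined, an optimizing compiler is allowed to assume it never happens and may simplify the first check away.

```cpp
#include <limits>

// Relies on signed overflow wrapping around, which is undefined behavior.
// An optimizing compiler may assume "a + b < a" cannot be true for b > 0
// and fold the whole expression to false.
bool overflows_unsafe(int a, int b) {
    return b > 0 && a + b < a;
}

// Well-defined alternative: compare against the limits before adding,
// so no overflowing addition is ever evaluated.
bool overflows_safe(int a, int b) {
    if (b > 0) {
        return a > std::numeric_limits<int>::max() - b;
    }
    return a < std::numeric_limits<int>::min() - b;
}
```

Code written like `overflows_safe` should behave the same at every optimization level, whereas code like `overflows_unsafe` may appear to work at -O0 and change behavior at higher levels.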

BTW, notice that using -O3 usually makes your compilation time much longer, and often (but not always) brings at most a few percent more performance than -O2 or even -O1 (it is even worse with -flto). This is why I rarely use it.

Answer by opalenzuela

It depends. Do you need to optimize speed or size?

-O3
Optimize yet more. -O3 turns on all optimizations specified by -O2 and also turns on the -finline-functions, -funswitch-loops, -fpredictive-commoning, -fgcse-after-reload, -ftree-loop-vectorize, -ftree-slp-vectorize, -fvect-cost-model, -ftree-partial-pre and -fipa-cp-clone options.

-O0
Reduce compilation time and make debugging produce the expected results. This is the default.

-Os
Optimize for size. -Os enables all -O2 optimizations that do not typically increase code size. It also performs further optimizations designed to reduce code size.
-Os disables the following optimization flags:

-falign-functions -falign-jumps -falign-loops -falign-labels -freorder-blocks -freorder-blocks-and-partition -fprefetch-loop-arrays

http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

Actually, -O is shorthand for a long list of independent optimizations. If you don't know what you need, just go for -O3.

Answer by Nathan

-O3 optimizes for speed, whereas -Os optimizes for space. That means -O3 will give you a fast executable, but it may be rather large, and -Os gives you a smaller executable, but it might be slower.

Space and time efficiency is usually a trade-off. Faster algorithms tend to take up more space, whereas in-place algorithms (algorithms that don't increase the space usage) tend to be less efficient.
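
A small sketch of that trade-off, using bit counting as an arbitrary example (the function names are made up for illustration): the first version needs no extra storage but loops over the bits, while the second spends 256 bytes on a table so that each query is a single lookup.

```cpp
#include <array>
#include <cstdint>

// Compact version: no extra storage, but it loops once per remaining bit.
int popcount_loop(std::uint8_t x) {
    int bits = 0;
    while (x != 0) {
        bits += x & 1;
        x = static_cast<std::uint8_t>(x >> 1);
    }
    return bits;
}

// Builds the 256-entry table once, on first use.
const std::array<std::uint8_t, 256>& popcount_table() {
    static const std::array<std::uint8_t, 256> table = [] {
        std::array<std::uint8_t, 256> t{};
        for (int i = 0; i < 256; ++i) {
            t[i] = static_cast<std::uint8_t>(
                popcount_loop(static_cast<std::uint8_t>(i)));
        }
        return t;
    }();
    return table;
}

// Table version: trades 256 bytes of memory for a single lookup per call.
int popcount_lookup(std::uint8_t x) {
    return popcount_table()[x];
}
```

Whether the table version is actually faster depends on the target and on cache behavior, so it is worth measuring rather than assuming.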

Modern computers usually have plenty of memory, so -O3 is usually preferable. However, if you're programming for something with little RAM (like a small device), you might prefer -Os.

Answer by galop1n

This is not really possible to answer in general. A simple rule would be to optimize for speed on critical code paths and to optimize for size on non-critical code paths such as loading.
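
One possible way to express that rule inside a single program, assuming GCC or a compatible compiler, is the `hot` and `cold` function attributes: GCC documents `hot` functions as being optimized more aggressively and `cold` functions as being optimized for size rather than speed. The macro and function names below are hypothetical.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

#if defined(__GNUC__)
#define HOT_PATH  __attribute__((hot))
#define COLD_PATH __attribute__((cold))
#else
#define HOT_PATH   // no-op on compilers without these attributes
#define COLD_PATH
#endif

// Critical path: called all the time, so favor speed.
HOT_PATH long long sum_samples(const std::vector<int>& samples) {
    long long total = 0;
    for (int s : samples) {
        total += s;
    }
    return total;
}

// Non-critical path: runs once at startup, so favor small code.
COLD_PATH std::vector<int> load_samples(std::size_t count) {
    std::vector<int> samples(count);
    std::iota(samples.begin(), samples.end(), 1);  // stand-in for real loading
    return samples;
}
```

Another option is to compile whole source files with different -O levels, which keeps the decision in the build system instead of the code.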

Some compilers can work in two passes to decide for you. The first pass creates a special executable with profiling support; you run the application to collect data, and a second compilation then decides what is best based on that data. This enables de-virtualization, better branch prediction, and so on.
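
With GCC, this two-pass approach corresponds to the -fprofile-generate and -fprofile-use options. The sketch below shows a small branchy function of the kind that can benefit, with one possible workflow in the comments; the function, file, and binary names are placeholders.

```cpp
#include <vector>

// Hypothetical hot function: an instrumented run records how often the
// branch is taken, and the second build can use that for code layout
// and inlining decisions.
int count_positive(const std::vector<int>& values) {
    int count = 0;
    for (int v : values) {
        if (v > 0) {
            ++count;
        }
    }
    return count;
}

// Possible workflow with GCC (names are placeholders):
//   g++ -O2 -fprofile-generate app.cpp -o app   // 1. instrumented build
//   ./app                                       // 2. run on typical input
//   g++ -O2 -fprofile-use app.cpp -o app        // 3. rebuild using the profile
```

The profile is only as good as the training run, so the input used in step 2 should resemble real usage.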
