What information does GCC Profile Guided Optimization (PGO) collect and which optimizations use it?
Original question: http://stackoverflow.com/questions/13881292/
Provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by JohnTortugo
What information does GCC collect when I enable -fprofile-generate, and which optimizations actually use the collected information (when the -fprofile-use flag is set)?
I need citations here. I've searched for a while but didn't find anything documented.
Information regarding link-time optimization (LTO) would be a plus! =D
Accepted answer by chill
-fprofile-generate enables -fprofile-arcs, -fprofile-values and -fvpt.
-fprofile-use enables -fbranch-probabilities, -fvpt, -funroll-loops, -fpeel-loops and -ftracer.
Source: http://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Optimize-Options.html#Optimize-Options
P.S. Information about LTO is also on that page.
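As an illustration of the kind of code these options target: the GCC documentation describes -fvpt as enabling value-profile transformations, such as specializing a division when the profile shows which denominator value is common. The following is a minimal sketch under that assumption. The function name scale_down is hypothetical, and whether GCC actually transforms this particular function depends on the compiler version, the flags, and the recorded profile.

```cpp
#include <cstdint>

// Hypothetical example function (not from the original answer).
// If the profiling runs almost always pass the same divisor (say 16),
// that is the kind of denominator knowledge the -fvpt description refers to.
// The function behaves identically with or without PGO; only the generated
// machine code may differ.
std::uint32_t scale_down(std::uint32_t value, std::uint32_t divisor) {
    // Division by a runtime value; the caller must pass divisor > 0.
    return value / divisor;
}
```

To see any effect, the function would have to be called from a driver that is built with -fprofile-generate, run on representative input, and then rebuilt with -fprofile-use.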
Answer by MichaelMoser
"What Every Programmer Should Know About Memory" by Ulrich Drepper:
https://people.freebsd.org/~lstewart/articles/cpumemory.pdf
http://www.akkadia.org/drepper/cpumemory.pdf
In section 7.4
- Compiling with -fprofile-generate generates a .gcno file for each object file (the same file that is used for gcov coverage reports).
- Then you must run a few tests; at runtime the instrumented program records coverage data into .gcda files.
- Recompile with -fprofile-use: GCC reads the coverage data and infers whether each branch is likely (as with __builtin_expect(.., 1)) or unlikely (as with __builtin_expect(.., 0)).
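The steps above can be sketched as follows. The file names pgo_demo.cpp and driver.cpp, the function name count_errors, and the exact command lines in the comments are assumptions for illustration. The hand-written __builtin_expect hint shows roughly the kind of branch information that the profile is described as supplying; it is not what GCC literally emits.

```cpp
// pgo_demo.cpp (hypothetical file name)
//
// Assumed build-and-run sequence with GCC, following the three steps above.
// A separate driver.cpp with a main() that calls count_errors on typical
// input is assumed and not shown here:
//   g++ -O2 -fprofile-generate pgo_demo.cpp driver.cpp -o demo   (instrumented build)
//   ./demo                                                       (run records .gcda data)
//   g++ -O2 -fprofile-use pgo_demo.cpp driver.cpp -o demo        (rebuild using the profile)

#include <cstddef>
#include <vector>

// Counts negative entries. On typical input the branch is rarely taken,
// which is spelled out by hand here with __builtin_expect (a GCC/Clang builtin).
std::size_t count_errors(const std::vector<int>& codes) {
    std::size_t errors = 0;
    for (int code : codes) {
        if (__builtin_expect(code < 0, 0)) {  // hint: condition is expected to be false
            ++errors;
        }
    }
    return errors;
}
```

With a profile available, the manual hint should in principle be unnecessary, since the recorded branch counts carry the same information for the inputs that were actually run.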
The result should run faster because code on the likely paths can be prefetched into the processor's instruction cache more effectively.