Original question: http://stackoverflow.com/questions/14666665/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Trying to understand gcc option -fomit-frame-pointer
Asked by rashok
I asked Google for the meaning of the gcc option -fomit-frame-pointer, which led me to the statement below.
-fomit-frame-pointer
Don't keep the frame pointer in a register for functions that don't need one. This avoids the instructions to save, set up and restore frame pointers; it also makes an extra register available in many functions. It also makes debugging impossible on some machines.
As far as I know, for each function call an activation record is created on the stack of the process to hold all local variables and some other information. I assume the frame pointer is the address of a function's activation record.
In that case, for which kinds of functions does the frame pointer not need to be kept in a register? If I knew this, I would try to design new functions accordingly (where possible), because if the frame pointer is not kept in a register, some instructions are omitted from the binary. That should noticeably improve performance in an application with many functions.
Accepted answer by Mats Petersson
Most smaller functions don't need a frame pointer - larger functions MAY need one.
It's really about how well the compiler manages to track how the stack is used, and where things are on the stack (local variables, arguments passed to the current function and arguments being prepared for a function about to be called). I don't think it's easy to characterize the functions that need or don't need a frame pointer (technically, NO function HAS to have a frame pointer - it's more a case of "if the compiler deems it necessary to reduce the complexity of other code").
I don't think you should "attempt to make functions not have a frame pointer" as part of your strategy for coding - like I said, simple functions don't need them, so use -fomit-frame-pointer, and you'll get one more register available for the register allocator, and save 1-3 instructions on entry/exit to functions. If your function needs a frame pointer, it's because the compiler decides that's a better option than not using a frame pointer. It's not a goal to have functions without a frame pointer, it's a goal to have code that works both correctly and fast.
Note that "not having a frame pointer" should give better performance, but it's not some magic bullet that gives enormous improvements, particularly not on x86-64, which already has 16 registers to start with. 32-bit x86 has only 8 registers, one of which is the stack pointer, so taking up another as the frame pointer means 25% of the register space is taken. Changing that to 12.5% is quite an improvement. Of course, compiling for 64-bit will help quite a lot too.
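The percentages in that paragraph are plain ratios of reserved registers to the total register count. A minimal sketch of the arithmetic follows; the function name reserved_share_percent is made up for illustration, and the 16-register figures are derived here rather than quoted from the answer.

```c
/* Share of the general-purpose register file, in percent, that is
 * unavailable to the register allocator.
 * Hypothetical helper, for illustration only. */
double reserved_share_percent(int total_registers, int reserved_registers)
{
    return 100.0 * reserved_registers / total_registers;
}

/* 32-bit x86, 8 registers:
 *   stack pointer + frame pointer reserved: reserved_share_percent(8, 2)  -> 25.0
 *   stack pointer only:                     reserved_share_percent(8, 1)  -> 12.5
 * x86-64, 16 registers:
 *   stack pointer + frame pointer reserved: reserved_share_percent(16, 2) -> 12.5
 *   stack pointer only:                     reserved_share_percent(16, 1) -> 6.25
 */
```

On x86-64 the register freed by omitting the frame pointer is a smaller fraction of the total, which is consistent with the answer's remark that the gain is less pronounced there.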
Answer by Maxim Masiutin
This is all about the BP/EBP/RBP register on Intel platforms. Memory accesses that use this register as the base default to the stack segment (no segment-override prefix is needed to access the stack segment).
The EBP is the best choice of register for accessing data structures, variables and dynamically allocated work space within the stack. EBP is often used to access elements on the stack relative to a fixed point on the stack rather than relative to the current TOS. It typically identifies the base address of the current stack frame established for the current procedure. When EBP is used as the base register in an offset calculation, the offset is calculated automatically in the current stack segment (i.e., the segment currently selected by SS). Because SS does not have to be explicitly specified, instruction encoding in such cases is more efficient. EBP can also be used to index into segments addressable via other segment registers.
(source: http://css.csail.mit.edu/6.858/2017/readings/i386/s02_03.htm)
On most 32-bit platforms the data segment and the stack segment are the same, so this association of EBP with the stack segment is no longer an issue. The same holds on 64-bit platforms: the x86-64 architecture, introduced by AMD in 2003, has largely dropped support for segmentation in 64-bit mode, where the bases of four of the segment registers (CS, SS, DS and ES) are treated as 0. In practice this means that on 32-bit and 64-bit x86 platforms the EBP/RBP register can be used, without any prefix, in any processor instruction that accesses memory.
So the compiler option you asked about allows BP/EBP/RBP to be used for other purposes, e.g. to hold a local variable.
The phrase "This avoids the instructions to save, set up and restore frame pointers" refers to avoiding the following code at the entry of each function:
push ebp
mov ebp, esp
or the enter instruction, which was very useful on Intel 80286 and 80386 processors.
Also, before function return, the following code is used:
mov esp, ebp
pop ebp
or the leave instruction.
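To see these entry and exit sequences in practice, one can compile a small function both ways and compare the generated assembly. The sketch below is only an illustration: the file name frame_demo.c and the function add_three are made up, and the instruction sequences mentioned in the comments are what one would expect from the description above, not verified output of any particular GCC version.

```c
/* frame_demo.c (hypothetical file name)
 *
 * Suggested experiment on a 32-bit x86 target:
 *
 *   gcc -m32 -O1 -fno-omit-frame-pointer -masm=intel -S frame_demo.c
 *   gcc -m32 -O1 -fomit-frame-pointer    -masm=intel -S frame_demo.c
 *
 * With -fno-omit-frame-pointer the function body is expected to be
 * bracketed by "push ebp / mov ebp, esp" on entry and by "pop ebp"
 * (or "leave") before "ret", with the arguments addressed relative
 * to EBP. With -fomit-frame-pointer those instructions are expected
 * to disappear and the arguments to be addressed relative to ESP.
 * The exact output depends on the compiler version and target.
 */
int add_three(int a, int b, int c)
{
    return a + b + c;
}
```

Without -masm=intel, GCC prints AT&T syntax, so the same instructions appear with the operands in the opposite order and with % register prefixes.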
Debugging tools may scan the stack and use these pushed EBP values to locate call sites, i.e. to display the names of the functions and their arguments in the order in which they were called.
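The layout that such tools rely on can be checked from C with two GCC builtins. This is a minimal sketch, assuming GCC or Clang on x86 or x86-64: there the frame pointer points at the saved caller EBP/RBP, and the return address sits in the next slot above it. The function name is made up for illustration; a real debugger or unwinder does considerably more than this.

```c
/* Assumption: GCC or Clang targeting x86 or x86-64.
 * Returns 1 if the word just above the saved frame pointer equals the
 * return address reported by the compiler, 0 otherwise. */
__attribute__((noinline))
int return_address_sits_above_saved_frame_pointer(void)
{
    /* Asking for the frame address makes the compiler keep a frame
     * pointer in this function, whatever -fomit-frame-pointer says. */
    void **frame = (void **)__builtin_frame_address(0);

    /* frame[0] holds the caller's saved EBP/RBP, frame[1] the return address. */
    void *from_frame = frame[1];
    void *from_builtin = __builtin_return_address(0);

    return from_frame == from_builtin;
}
```

A backtrace repeats this step by following frame[0] to the caller's frame. That chain is only reliable when every function on the call path keeps a frame pointer, which is why omitting it can defeat tools that do not use separate unwind information.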
Programmers may have questions about stack frames not in the broad sense (a single entity on the stack that serves one function call and holds the return address, arguments and local variables) but in a narrow sense, when the term stack frames is mentioned in the context of compiler options. From the compiler's perspective, a stack frame is just the entry and exit code for the routine, which pushes an anchor onto the stack; that anchor can also be used for debugging and for exception handling. Debugging tools may scan the stack data and use these anchors for back-tracing while locating call sites in the stack, i.e. to display the names of the functions in the order in which they were called.
That's why it is important for a programmer to understand what a stack frame means in terms of compiler options: the compiler can control whether this code is generated or not.
In some cases, the stack frame (entry and exit code for the routine) can be omitted by the compiler, and the variables are then accessed directly via the stack pointer (SP/ESP/RSP) rather than the convenient base pointer (BP/EBP/RBP). The conditions under which a compiler omits the stack frame for a function vary, for example: (1) the function is a leaf function (i.e. an end-entity that doesn't call other functions); (2) no exceptions are used; (3) no routines are called with outgoing parameters on the stack; (4) the function has no parameters.
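As an illustration of the two ends of that spectrum, the sketch below contrasts a leaf function with a function whose stack usage is not known at compile time. Both function names are made up, and whether a given compiler actually keeps or drops the frame pointer here depends on its version, target and optimization level; the comments state the usual expectation, not a guarantee.

```c
#include <stddef.h>

/* Leaf function with a fixed, tiny frame: a typical candidate for
 * being compiled without a frame pointer. */
int leaf_add(int a, int b)
{
    return a + b;
}

/* Uses a variable-length array, so the frame size is only known at run
 * time. Dynamic stack allocation like this is a common reason for a
 * compiler to keep a frame pointer even when asked to omit it.
 * Returns 0*0 + 1*1 + ... + (n-1)*(n-1). */
long sum_of_squares(size_t n)
{
    long squares[n > 0 ? n : 1];   /* VLA: size depends on the argument */
    long total = 0;

    for (size_t i = 0; i < n; i++)
        squares[i] = (long)(i * i);
    for (size_t i = 0; i < n; i++)
        total += squares[i];

    return total;
}
```

Comparing the assembly of both functions built with -fomit-frame-pointer is a quick way to see which of them, if any, still sets up EBP/RBP on a particular toolchain.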
Omitting stack frames (entry and exit code for the routine) can make code smaller and faster, but may also hamper the debugger's ability to back-trace the data on the stack and display it to the programmer. Compiler options determine which conditions a function must satisfy for the compiler to give it the stack frame entry and exit code. For example, a compiler may have options to add such entry and exit code to functions (a) always, (b) never, or (c) when needed (specifying the conditions).
Returning from generalities to particulars: if you use the -fomit-frame-pointer GCC compiler option, you may gain both from the removed entry and exit code for the routine and from having an additional register. If the option is already turned on, either by default or implicitly by other options, you are already benefiting from the use of the EBP/RBP register, and specifying the option explicitly brings no additional gain. Please note, however, that in 16-bit and 32-bit modes the BP register has no separately accessible 8-bit parts the way AX has (AL and AH).
Besides allowing the compiler to use EBP as a general-purpose register in optimizations, this option also prevents generation of the entry and exit code for the stack frame, which complicates debugging. That is why the GCC documentation explicitly states (unusually, emphasizing it in bold) that enabling this option makes debugging impossible on some machines.
Please also be aware that other compiler options, related to debugging or optimization, may implicitly turn the -fomit-frame-pointer option ON or OFF.
I didn't find any official information at gcc.gnu.org about how other options affect -fomit-frame-pointer on x86 platforms; https://gcc.gnu.org/onlinedocs/gcc-3.4.4/gcc/Optimize-Options.html only states the following:
-O also turns on -fomit-frame-pointer on machines where doing so does not interfere with debugging.
So it is not clear from the documentation per se whether -fomit-frame-pointer will be turned on if you compile with just a single -O option on an x86 platform. It may be tested empirically, but in that case there is no commitment from the GCC developers not to change the behavior of this option in the future without notice.
However, Peter Cordes has pointed out in comments that the default setting of -fomit-frame-pointer differs between x86-16 platforms and x86-32/64 platforms.
This option, -fomit-frame-pointer, is also relevant to the Intel C++ Compiler 15.0, not only to GCC:
For the Intel Compiler, this option has an alias /Oy.
Here is what Intel wrote about it:
These options determine whether EBP is used as a general-purpose register in optimizations. Options -fomit-frame-pointer and /Oy allow this use. Options -fno-omit-frame-pointer and /Oy- disallow it.
Some debuggers expect EBP to be used as a stack frame pointer, and cannot produce a stack backtrace unless this is so. The -fno-omit-frame-pointer and /Oy- options direct the compiler to generate code that maintains and uses EBP as a stack frame pointer for all functions so that a debugger can still produce a stack backtrace without doing the following:
For -fno-omit-frame-pointer: turning off optimizations with -O0
For /Oy-: turning off /O1, /O2, or /O3 optimizations
The -fno-omit-frame-pointer option is set when you specify option -O0 or the -g option. The -fomit-frame-pointer option is set when you specify option -O1, -O2, or -O3.
The /Oy option is set when you specify the /O1, /O2, or /O3 option. Option /Oy- is set when you specify the /Od option.
Using the -fno-omit-frame-pointer or /Oy- option reduces the number of available general-purpose registers by 1 and can result in slightly less efficient code.
NOTE For Linux* systems: There is currently an issue with GCC 3.2 exception handling. Therefore, the Intel compiler ignores this option when GCC 3.2 is installed for C++ and exception handling is turned on (the default).
Please be aware that the above quote applies only to the Intel C++ 15 compiler, not to GCC.

