C++: What is "stack alignment"?

Notice: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/672461/

Time: 2020-08-27 16:39:05  Source: igfitidea

what is "stack alignment"?

c++ data-structures mingw visual-c++ compiler-construction

Asked by DanJ

What is stack alignment? Why is it used? Can it be controlled by compiler settings?

The details of this question come from a problem I hit when trying to use the ffmpeg libraries with MSVC; however, what I'm really interested in is an explanation of what "stack alignment" is.

The Details:

  • When running my MSVC-compiled program, which links to avcodec, I get the following error: "Compiler did not align stack variables. Libavcodec has been miscompiled", followed by a crash in avcodec.dll.
  • avcodec.dll was not compiled with MSVC, so I'm unable to see what is going on inside.
  • When running ffmpeg.exe with the same avcodec.dll, everything works well.
  • ffmpeg.exe was not compiled with MSVC; it was compiled with gcc / mingw (same as avcodec.dll).

Thanks,

Dan

Answered by Toon Krijthe

Alignment of variables in memory (a short history).

In the past, computers had an 8-bit data bus. This means that 8 bits of information could be processed each clock cycle, which was fine at the time.

Then came 16-bit computers. Due to backward compatibility and other issues, the 8-bit byte was kept and the 16-bit word was introduced. Each word was 2 bytes, and 16 bits of information could be processed each clock cycle. But this posed a small problem.

Let's look at a memory map:

+----+
|0000| 
|0001|
+----+
|0002|
|0003|
+----+
|0004|
|0005|
+----+
| .. |

At each address there is a byte which can be accessed individually. But words can only be fetched at even addresses. So if we read a word at 0000, we read the bytes at 0000 and 0001. But if we want to read the word at position 0001, we need two read accesses. First 0000,0001 and then 0002,0003 and we only keep 0001,0002.
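
The cost described above can be sketched in a few lines of C++. This is only a model under stated assumptions: `bus_accesses` is a hypothetical helper, and it assumes a bus that can only fetch whole, naturally aligned blocks of `bus_width` bytes (2 for the 16-bit machine in the diagram).

```cpp
#include <cassert>
#include <cstdint>

// Number of bus transfers needed to read `size` bytes starting at `addr`
// when the bus can only fetch whole, aligned blocks of `bus_width` bytes.
// (Hypothetical model of the 16-bit machine described in the text.)
constexpr std::uint32_t bus_accesses(std::uint32_t addr,
                                     std::uint32_t size,
                                     std::uint32_t bus_width) {
    // index of the last block touched - index of the first block touched + 1
    return (addr + size - 1) / bus_width - addr / bus_width + 1;
}

// A word at an even address fits in one fetch; a word at an odd address
// straddles two blocks and needs two fetches.
static_assert(bus_accesses(0x0000, 2, 2) == 1, "aligned word: one access");
static_assert(bus_accesses(0x0001, 2, 2) == 2, "misaligned word: two accesses");
```

The same formula applies to wider buses: with `bus_width` set to 4, a 4-byte read starting at address 2 also touches two blocks.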

Of course this took some extra time and that was not appreciated. So that's why they invented alignment. So we store word variables at word boundaries and byte variables at byte boundaries.

For example, if we have a structure with a byte field (B) and a word field (W) (and a very naive compiler), we get the following:

+----+
|0000| B
|0001| W
+----+
|0002| W
|0003|
+----+

Which is not fun. But when using word alignment we find:

+----+
|0000| B
|0001| -
+----+
|0002| W
|0003| W
+----+

Here memory is sacrificed for access speed.
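
The padded layout in the second diagram can be observed directly in C++. The sketch below uses a hypothetical `Record` struct that mirrors the B and W fields; the exact offsets are decided by the platform ABI, so the values in the comments are what mainstream compilers are expected to produce rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical record matching the diagram: one byte field, one word field.
struct Record {
    std::uint8_t  b;  // "B" in the diagram
    std::uint16_t w;  // "W" in the diagram
};

// Offset of the word field: expected to be 2, i.e. one padding byte after b.
inline std::size_t word_offset() { return offsetof(Record, w); }

// Total size: expected to be 4 (1 byte + 1 padding byte + 2-byte word).
inline std::size_t record_size() { return sizeof(Record); }

// Alignment the compiler requires for the word type (expected to be 2).
inline std::size_t word_alignment() { return alignof(std::uint16_t); }
```

Whatever the concrete numbers are on a given platform, `word_offset()` should always be a multiple of `word_alignment()`.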

You can imagine that when using double words (4 bytes) or quad words (8 bytes) this is even more important. That's why most modern compilers let you choose which alignment to use when compiling the program.
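
One common way to make that choice in source code is `#pragma pack`, which MSVC, GCC and Clang all accept. The sketch below is illustrative only: the struct names are hypothetical, and the sizes in the comments are what mainstream compilers are expected to report.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#pragma pack(push, 1)   // request 1-byte packing: no padding between fields
struct PackedRecord {
    std::uint8_t  b;
    std::uint16_t w;    // may now sit at an odd offset
};
#pragma pack(pop)       // restore the previous packing setting

// Same fields with the compiler's default alignment rules.
struct DefaultRecord {
    std::uint8_t  b;
    std::uint16_t w;
};

inline std::size_t packed_size()  { return sizeof(PackedRecord); }   // expected 3
inline std::size_t default_size() { return sizeof(DefaultRecord); }  // expected 4
```

Packing saves the padding byte, at the price of the misaligned access to `w` that alignment was introduced to avoid.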

Answered by snemarch

Some CPU architectures require specific alignment of various datatypes, and will throw exceptions if you don't honor this rule. In standard mode, x86 doesn't require this for the basic data types, but can suffer performance penalties (check www.agner.org for low-level optimization tips).

However, the SSE instruction set (often used for high-performance audio/video processing) has strict alignment requirements and will throw exceptions if you attempt to use it on unaligned data (unless you use the unaligned versions of the instructions, which are much slower on some processors).
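
To make the requirement concrete, here is a portable sketch of the check a routine could perform before choosing between an aligned and an unaligned load (with SSE intrinsics these would be, for example, `_mm_load_ps` and `_mm_loadu_ps`). The names `is_aligned`, `LoadKind`, `pick_load` and `demo` are hypothetical, and no actual SSE instruction is issued here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when `p` is a multiple of `alignment` (a power of two), e.g. 16 for SSE.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

enum class LoadKind { Aligned, Unaligned };

// Hypothetical dispatcher: take the aligned path only when it is safe.
inline LoadKind pick_load(const float* p) {
    return is_aligned(p, 16) ? LoadKind::Aligned : LoadKind::Unaligned;
}

// Demo: a 16-byte-aligned buffer, and a pointer one float (4 bytes) into it.
inline bool demo() {
    alignas(16) float buffer[8] = {};
    return pick_load(buffer) == LoadKind::Aligned &&
           pick_load(buffer + 1) == LoadKind::Unaligned;
}
```

Code that skips this check and always takes the aligned path is exactly the kind that faults when a caller hands it a misaligned stack buffer.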

Your issue is probably that one compiler expects the caller to keep the stack aligned, while the other expects the callee to align the stack when necessary.

EDIT: as for why the exception happens, a routine in the DLL probably wants to use SSE instructions on some temporary stack data, and fails because the two different compilers don't agree on calling conventions.
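
When the callee is the one responsible, it typically realigns the stack pointer on entry by rounding it down to the required boundary (the stack grows downward on x86). The arithmetic can be sketched as below; `align_down` and `realign_cost` are hypothetical helpers that only model the address computation and do not touch the real stack. As far as I know, GCC exposes this behavior through the `-mstackrealign` option and the `force_align_arg_pointer` function attribute, but check your compiler's documentation before relying on either.

```cpp
#include <cassert>
#include <cstdint>

// Round a (downward-growing) stack address down to a multiple of
// `alignment`, which must be a power of two.
constexpr std::uintptr_t align_down(std::uintptr_t sp, std::uintptr_t alignment) {
    return sp & ~(alignment - 1);
}

// Bytes of stack a callee gives up by realigning an incoming stack pointer.
constexpr std::uintptr_t realign_cost(std::uintptr_t sp, std::uintptr_t alignment) {
    return sp - align_down(sp, alignment);
}

// A caller that only guarantees 4-byte alignment may pass in 0x123C;
// a callee that needs 16-byte alignment would move down to 0x1230.
static_assert(align_down(0x123C, 16) == 0x1230, "rounded down to 16");
static_assert(realign_cost(0x123C, 16) == 12, "12 bytes skipped");
```

If neither side performs this step, the callee's 16-byte-aligned locals end up wherever the caller's stack pointer happened to be.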

Answered by Shaun Bouckaert

IIRC, stack alignment is when variables are placed on the stack "aligned" to a particular number of bytes. So if you are using a 16 bit stack alignment, each variable on the stack is going to start from a byte that is a multiple of 2 bytes from the current stack pointer within a function.

This means that if you use a variable that is < 2 bytes, such as a char (1 byte), there will be 8 bits of unused "padding" between it and the next variable. This allows certain optimisations with assumptions based on variable locations.

When calling functions, one method of passing arguments to the next function is to place them on the stack (as opposed to placing them directly into registers). Whether or not alignment is being used here is important, as the calling function places the variables on the stack, to be read off by the called function using offsets. If the calling function aligns the variables, and the called function expects them to be non-aligned, then the called function won't be able to find them.

It seems that the msvc compiled code is disagreeing about variable alignment. Try compiling with all optimisations turned off.

Answered by Dan Olson

As far as I know, compilers don't typically align variables that are on the stack. The library may be depending on some set of compiler options that isn't supported on your compiler. The normal fix is to declare the variables that need to be aligned as static, but if you go about doing this in other people's code, you'll want to be sure that the variables in question are initialized later on in the function rather than in the declaration.

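// Some compilers won't align this because it's on the stack...
int __declspec(align(32)) needsToBe32Aligned = 0;
// ...so change it to a static, which is not stored on the stack,
// and assign it separately so it is still reset on every call:
static int __declspec(align(32)) needsToBe32Aligned;
needsToBe32Aligned = 0;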

Alternatively, find a compiler switch that aligns the variables on the stack. Obviously, the "__declspec" align syntax I've used here may not be what your compiler uses.
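
Since C++11 there is also a standard spelling, `alignas`, that does not depend on a vendor extension. The sketch below repeats the `__declspec(align(32))` snippet above with `alignas(32)` and checks the resulting addresses at run time. The function names are hypothetical, and support for alignments this large on local variables is implementation-defined, so treat the result as something to verify on your own toolchain rather than a guarantee.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if a local declared with alignas(32) really is 32-byte aligned.
inline bool local_is_32_aligned() {
    alignas(32) int needsToBe32Aligned = 0;
    return reinterpret_cast<std::uintptr_t>(&needsToBe32Aligned) % 32 == 0;
}

// Same check for the static-storage workaround.
inline bool static_is_32_aligned() {
    alignas(32) static int needsToBe32Aligned;
    needsToBe32Aligned = 0;  // assigned separately so it is reset on every call
    return reinterpret_cast<std::uintptr_t>(&needsToBe32Aligned) % 32 == 0;
}
```

If `local_is_32_aligned()` returns false on a particular compiler, that compiler is not realigning the stack for over-aligned locals, and the static workaround (or a compiler switch) is the fallback.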
