When and why will a compiler initialise memory to 0xCD, 0xDD, etc. on malloc/free/new/delete?
Source: http://stackoverflow.com/questions/370195/ (Stack Overflow). This content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors.
Asked by LeopardSkinPillBoxHat
I know that the compiler will sometimes initialize memory with certain patterns such as 0xCD and 0xDD. What I want to know is when and why this happens.
When
Is this specific to the compiler used?
Do malloc/new and free/delete work in the same way with regard to this?
Is it platform specific?
Will it occur on other operating systems, such as Linux or VxWorks?
Why
My understanding is this only occurs in Win32 debug configuration, and it is used to detect memory overruns and to help the compiler catch exceptions.
Can you give any practical examples as to how this initialization is useful?
I remember reading something (maybe in Code Complete 2) saying that it is good to initialize memory to a known pattern when allocating it, and certain patterns will trigger interrupts in Win32 which will result in exceptions showing in the debugger.
How portable is this?
Answered by Michael Burr
A quick summary of the values Microsoft's compilers use for various bits of unowned/uninitialized memory when compiling in debug mode (support may vary by compiler version):
Value       Name                Description
----------  ------------------  -------------------------------------------------
0xCD        Clean Memory        Allocated memory via malloc or new but never
                                written by the application.

0xDD        Dead Memory         Memory that has been released with delete or free.
                                It is used to detect writing through dangling
                                pointers.

0xED or     Aligned Fence       'No man's land' for aligned allocations. Using a
0xBD                            different value here than 0xFD allows the runtime
                                to detect not only writing outside the allocation,
                                but to also identify mixing alignment-specific
                                allocation/deallocation routines with the regular
                                ones.

0xFD        Fence Memory        Also known as "no man's land." This is used to wrap
                                the allocated memory (surrounding it with a fence)
                                and is used to detect indexing arrays out of
                                bounds or other accesses (especially writes) past
                                the end (or start) of an allocated block.

0xFD or     Buffer slack        Used to fill slack space in some memory buffers
0xFE                            (unused parts of `std::string` or the user buffer
                                passed to `fread()`). 0xFD is used in VS 2005 (maybe
                                some prior versions, too), 0xFE is used in VS 2008
                                and later.

0xCC                            When the code is compiled with the /GZ option,
                                uninitialized variables are automatically assigned
                                to this value (at byte level).

// the following magic values are done by the OS, not the C runtime:

0xAB        (Allocated Block?)  Memory allocated by LocalAlloc().

0xBAADF00D  Bad Food            Memory allocated by LocalAlloc() with LMEM_FIXED,
                                but not yet written to.

0xFEEEFEEE                      OS fill heap memory, which was marked for usage,
                                but wasn't allocated by HeapAlloc() or LocalAlloc().
                                Or that memory just has been freed by HeapFree().
Disclaimer: the table is from some notes I have lying around - they may not be 100% correct (or coherent).
Many of these values are defined in vc/crt/src/dbgheap.c:
/*
 * The following values are non-zero, constant, odd, large, and atypical
 *      Non-zero values help find bugs assuming zero filled data.
 *      Constant values are good, so that memory filling is deterministic
 *          (to help make bugs reproducible).  Of course, it is bad if
 *          the constant filling of weird values masks a bug.
 *      Mathematically odd numbers are good for finding bugs assuming a cleared
 *          lower bit.
 *      Large numbers (byte values at least) are less typical and are good
 *          at finding bad addresses.
 *      Atypical values (i.e. not too often) are good since they typically
 *          cause early detection in code.
 *      For the case of no man's land and free blocks, if you store to any
 *          of these locations, the memory integrity checker will detect it.
 *
 *      _bAlignLandFill has been changed from 0xBD to 0xED, to ensure that
 *      4 bytes of that (0xEDEDEDED) would give an inaccessible address under 3gb.
 */

static unsigned char _bNoMansLandFill = 0xFD;   /* fill no-man's land with this */
static unsigned char _bAlignLandFill  = 0xED;   /* fill no-man's land for aligned routines */
static unsigned char _bDeadLandFill   = 0xDD;   /* fill free objects with this */
static unsigned char _bCleanLandFill  = 0xCD;   /* fill new objects with this */
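To make the roles of these fill bytes concrete, here is a minimal portable sketch of a fence-checking allocator. It is not the CRT's actual code: the helper names (`debug_alloc`, `fences_intact`, `debug_free`) and the 4-byte fence width are assumptions made for illustration, and unlike a real debug heap it releases the block immediately instead of keeping it around for later checks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Fill bytes copied from the dbgheap.c excerpt above.
constexpr unsigned char kNoMansLandFill = 0xFD;
constexpr unsigned char kDeadLandFill   = 0xDD;
constexpr unsigned char kCleanLandFill  = 0xCD;
constexpr std::size_t   kFenceSize      = 4;  // assumed fence width for this sketch

// Hypothetical helper: allocate `size` user bytes wrapped in two fences.
unsigned char* debug_alloc(std::size_t size) {
    unsigned char* raw =
        static_cast<unsigned char*>(std::malloc(size + 2 * kFenceSize));
    if (raw == nullptr) return nullptr;
    std::memset(raw, kNoMansLandFill, kFenceSize);                      // front fence
    std::memset(raw + kFenceSize, kCleanLandFill, size);                // clean memory
    std::memset(raw + kFenceSize + size, kNoMansLandFill, kFenceSize);  // back fence
    return raw + kFenceSize;
}

// Returns true when both fences still hold the no-man's-land value.
bool fences_intact(const unsigned char* user, std::size_t size) {
    const unsigned char* front = user - kFenceSize;
    const unsigned char* back  = user + size;
    for (std::size_t i = 0; i < kFenceSize; ++i) {
        if (front[i] != kNoMansLandFill || back[i] != kNoMansLandFill) return false;
    }
    return true;
}

// Hypothetical helper: check the fences, stamp the user bytes with the dead
// fill, then release the block. Returns whether the fences were intact.
bool debug_free(unsigned char* user, std::size_t size) {
    const bool ok = fences_intact(user, size);
    std::memset(user, kDeadLandFill, size);
    std::free(user - kFenceSize);
    return ok;
}
```

Writing even one byte past the end of the user region changes a fence byte, so the check at free time reports the overrun.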
There are also a few times where the debug runtime will fill buffers (or parts of buffers) with a known value, for example, the 'slack' space in std::string's allocation or the buffer passed to fread(). Those cases use a value given the name _SECURECRT_FILL_BUFFER_PATTERN (defined in crtdefs.h). I'm not sure exactly when it was introduced, but it was in the debug runtime by at least VS 2005 (VC++8).
Initially, the value used to fill these buffers was 0xFD - the same value used for no man's land. However, in VS 2008 (VC++9) the value was changed to 0xFE. I assume that's because there could be situations where the fill operation would run past the end of the buffer, for example, if the caller passed in a buffer size that was too large to fread(). In that case, the value 0xFD might not trigger detecting this overrun since if the buffer size were too large by just one, the fill value would be the same as the no man's land value used to initialize that canary. No change in no man's land means the overrun wouldn't be noticed.
So the fill value was changed in VS 2008 so that such a case would change the no man's land canary, resulting in the detection of the problem by the runtime.
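The reasoning about 0xFD versus 0xFE can be checked with a small model. This is only a sketch under stated assumptions: the function name `fence_survives_fill` and the layout (user bytes followed by a single fence byte) are hypothetical and much simpler than the real debug heap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr unsigned char kNoMansLandFill = 0xFD;

// Hypothetical model: a block holds `capacity` user bytes followed by one
// fence byte. The caller claims the buffer is `claimed_size` bytes long and
// the runtime fills that many bytes with `fill`. Returns true if the fence
// byte still reads 0xFD afterwards, i.e. the integrity check sees no damage.
bool fence_survives_fill(unsigned char fill, std::size_t capacity,
                         std::size_t claimed_size) {
    unsigned char block[64];
    assert(capacity + 1 <= sizeof block && claimed_size <= capacity + 1);
    std::memset(block, 0, sizeof block);
    block[capacity] = kNoMansLandFill;       // the canary
    std::memset(block, fill, claimed_size);  // slack fill, possibly too long
    return block[capacity] == kNoMansLandFill;
}
```

With a size that is too large by one, `fence_survives_fill(0xFD, 16, 17)` returns true (the overrun is hidden), while `fence_survives_fill(0xFE, 16, 17)` returns false (the canary changed, so the overrun is detectable).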
As others have noted, one of the key properties of these values is that if a pointer variable with one of these values is de-referenced, it will result in an access violation, since on a standard 32-bit Windows configuration, user mode addresses will not go higher than 0x7fffffff.
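As a rough check of that property, the sketch below repeats a fill byte across a 32-bit word and compares it with the top of user space. The helper names are hypothetical, and the 0xBFFFFFFF limit is my assumption for the "3gb" configuration mentioned in the dbgheap.c comment.

```cpp
#include <cassert>
#include <cstdint>

// Repeat one fill byte across a 32-bit word, e.g. 0xCD -> 0xCDCDCDCD.
constexpr std::uint32_t repeat_byte(std::uint8_t fill) {
    return 0x01010101u * fill;
}

// Highest user-mode address on a standard 32-bit Windows configuration.
constexpr std::uint32_t kDefaultUserTop = 0x7FFFFFFFu;
// Assumed highest user-mode address when user space is enlarged to 3 GB.
constexpr std::uint32_t kLargeUserTop = 0xBFFFFFFFu;

// True if a pointer made of repeated `fill` bytes lies above user space,
// so dereferencing it from user mode cannot succeed.
constexpr bool outside_user_space(std::uint8_t fill, std::uint32_t user_top) {
    return repeat_byte(fill) > user_top;
}
```

Under these assumptions 0xCD, 0xDD and 0xFD all land outside the default 2 GB user space. 0xBDBDBDBD still falls inside a 3 GB user space while 0xEDEDEDED does not, which matches the stated reason for changing _bAlignLandFill.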
Answered by Adam Rosenfield
One nice property about the fill value 0xCCCCCCCC is that in x86 assembly, the opcode 0xCC is the int3 opcode, which is the software breakpoint interrupt. So, if you ever try to execute code in uninitialized memory that's been filled with that fill value, you'll immediately hit a breakpoint, and the operating system will let you attach a debugger (or kill the process).
Answered by Martin Beckett
It's compiler and OS specific. Visual Studio sets different kinds of memory to different values so that in the debugger you can easily see if you have overrun into malloced memory, a fixed array or an uninitialised object. Somebody will post the details while I am googling them...
Answered by Airsource Ltd
It's not the OS - it's the compiler. You can modify the behaviour too - see down the bottom of this post.
Microsoft Visual Studio generates (in Debug mode) a binary that pre-fills stack memory with 0xCC. It also inserts a space between every stack frame in order to detect buffer overflows. A very simple example of where this is useful is here (in practice Visual Studio would spot this problem and issue a warning):
...
bool error; // uninitialised value
if (something)
{
    error = true;
}
return error; // indeterminate when `something` is false
If Visual Studio didn't preinitialise variables to a known value, then this bug could potentially be hard to find. With preinitialised variables (or rather, preinitialised stack memory), the problem is reproducible on every run.
However, there is a slight problem. The value Visual Studio uses is TRUE - anything except 0 would be. It is actually quite likely that when you run your code in Release mode, an uninitialised variable ends up in a piece of stack memory that happens to contain 0, which means you can have an uninitialised variable bug which only manifests itself in Release mode.
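The difference between the two builds can be modelled without relying on undefined behaviour. In this sketch the flag lives in a byte that is explicitly pre-filled, standing in for whatever the stack happened to contain. The function name `returns_error` and its `prefill` parameter are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Models the function from the example above, but reads the flag from a byte
// of storage pre-filled with `prefill`, so the behaviour is well defined.
bool returns_error(bool something, unsigned char prefill) {
    unsigned char error_storage;
    std::memset(&error_storage, prefill, sizeof error_storage);
    if (something) {
        error_storage = 1;  // the only path that really assigns the flag
    }
    return error_storage != 0;
}
```

With the Debug pre-fill, `returns_error(false, 0xCC)` is true on every run. With leftover zeroes, `returns_error(false, 0x00)` is false, which is the Release-only behaviour described above.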
That annoyed me, so I wrote a script to modify the pre-fill value by directly editing the binary, allowing me to find uninitialized variable problems that only show up when the stack contains a zero. This script only modifies the stack pre-fill; I never experimented with the heap pre-fill, though it should be possible. Might involve editing the run-time DLL, might not.
Answered by Adrian McCarthy
Is this specific to the compiler used?
Actually, it's almost always a feature of the runtime library (like the C runtime library). The runtime is usually strongly correlated with the compiler, but there are some combinations you can swap.
I believe on Windows, the debug heap (HeapAlloc, etc.) also uses special fill patterns which are different from the ones that come from the malloc and free implementations in the debug C runtime library. So it may also be an OS feature, but most of the time, it's just the language runtime library.
Do malloc/new and free/delete work in the same way with regard to this?
The memory management portions of new and delete are usually implemented with malloc and free, so memory allocated with new and delete usually has the same features.
Is it platform specific?
The details are runtime specific. The actual values used are often chosen not only to look unusual and obvious in a hex dump, but also to have certain properties that may take advantage of features of the processor. For example, odd values are often used, because they could cause an alignment fault. Large values are used (as opposed to 0), because they cause surprising delays if you loop to an uninitialized counter. On x86, 0xCC is an int 3 instruction, so if you execute uninitialized memory, it'll trap.
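The "odd" and "large" properties are easy to verify for a concrete fill byte. The following is just an illustrative sketch; the helper names are hypothetical and 0xCD is used only as an example value.

```cpp
#include <cassert>
#include <cstdint>

// Repeat one fill byte across a 32-bit word, e.g. 0xCD -> 0xCDCDCDCD.
constexpr std::uint32_t fill_word(std::uint8_t fill) {
    return 0x01010101u * fill;
}

// An odd value is never a multiple of an even alignment, so a pointer holding
// it is misaligned for any type that needs 2-byte or stricter alignment.
constexpr bool is_misaligned(std::uint32_t address, std::uint32_t alignment) {
    return address % alignment != 0;
}

// Number of iterations of `for (i = 0; i < bound; ++i)` when `bound` was
// never assigned and still holds the fill word.
constexpr std::uint32_t iterations_with_unset_bound(std::uint8_t fill) {
    return fill_word(fill);
}
```

For 0xCD the word is misaligned for 2-byte and 4-byte accesses, and a loop bounded by it would run 3452816845 times rather than silently running zero times.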
Will it occur on other operating systems, such as Linux or VxWorks?
It mostly depends on the runtime library you use.
Can you give any practical examples as to how this initialisation is useful?
I listed some above. The values are generally chosen to increase the chances that something unusual happens if you do something with invalid portions of memory: long delays, traps, alignment faults, etc. Heap managers also sometimes use special fill values for the gaps between allocations. If those patterns ever change, the heap manager knows there was a bad write (like a buffer overrun) somewhere.
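The same "check that the pattern is unchanged" idea also works for freed memory. Below is a minimal sketch, assuming a heap manager that keeps freed blocks around for a while. The `QuarantinedBlock` type and both functions are hypothetical, and 0xDD is the dead fill value used by the Microsoft debug runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr unsigned char kDeadLandFill = 0xDD;

// Hypothetical fixed-size block that stays readable after being "freed".
struct QuarantinedBlock {
    unsigned char bytes[16];
    bool freed = false;
};

// Mark the block as freed and stamp every byte with the dead fill.
void quarantine_free(QuarantinedBlock& block) {
    std::memset(block.bytes, kDeadLandFill, sizeof block.bytes);
    block.freed = true;
}

// True if a freed block no longer holds the dead fill everywhere, which
// means something wrote through a dangling pointer.
bool written_after_free(const QuarantinedBlock& block) {
    if (!block.freed) return false;
    for (unsigned char byte : block.bytes) {
        if (byte != kDeadLandFill) return true;
    }
    return false;
}
```

A write of the fill value itself would go unnoticed, which is one reason atypical byte values are preferred for the pattern.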
I remember reading something (maybe in Code Complete 2) that it is good to initialise memory to a known pattern when allocating it, and certain patterns will trigger interrupts in Win32 which will result in exceptions showing in the debugger.
How portable is this?
Writing Solid Code (and maybe Code Complete) talks about things to consider when choosing fill patterns. I've mentioned some of them here, and the Wikipedia article on Magic Number (programming) also summarizes them. Some of the tricks depend on the specifics of the processor you're using (like whether it requires aligned reads and writes and what values map to instructions that will trap). Other tricks, like using large values and unusual values that stand out in a memory dump, are more portable.
Answered by paxdiablo
It's so you can easily see that memory has changed from its initial starting value, generally during debugging but sometimes for release code as well, since you can attach debuggers to the process while it's running.
It's not just memory either: many debuggers will set register contents to a sentinel value when the process starts (some versions of AIX will set some registers to 0xdeadbeef, which is mildly humorous).
Answered by FryGuy
The obvious reason for the "why" is that suppose you have a class like this:
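#include <iostream>

struct SomeObject { int value; }; // minimal stand-in so the example compiles

class Foo
{
public:
    void SomeFunction()
    {
        // _obj is never initialised, so this dereferences a garbage pointer
        std::cout << _obj->value << std::endl;
    }
private:
    SomeObject *_obj;
};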
If you then instantiate a Foo and call SomeFunction, it will give an access violation trying to read 0xCDCDCDCD. This means that you forgot to initialize something. That's the "why" part. Without the fill pattern, the pointer might have lined up with some other memory, and it would be more difficult to debug. It's just letting you know the reason that you get an access violation. Note that this case was pretty simple, but in a bigger class it's easy to make that mistake.
AFAIK, this only works on the Visual Studio compiler when in debug mode (as opposed to release).
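To see where a value like 0xCDCDCDCD comes from, the sketch below emulates the fill on any platform instead of relying on the debug runtime: it fills pointer-sized storage with a byte and reads the bits back. The function name is hypothetical, and `SomeObject` is the same stand-in type as in the class above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct SomeObject { int value; };  // stand-in type for illustration

// Returns the bit pattern a pointer member would hold if its storage had been
// pre-filled with `fill` and the constructor never assigned it.
std::uintptr_t unassigned_pointer_bits(unsigned char fill) {
    unsigned char storage[sizeof(SomeObject*)];
    std::memset(storage, fill, sizeof storage);
    std::uintptr_t bits = 0;
    static_assert(sizeof bits == sizeof storage,
                  "sketch assumes uintptr_t is pointer-sized");
    std::memcpy(&bits, storage, sizeof bits);
    return bits;
}
```

On a 32-bit build `unassigned_pointer_bits(0xCD)` is 0xCDCDCDCD. On a 64-bit build the same byte is simply repeated eight times.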
Answered by Stephen Kellett
This article describes unusual memory bit patterns and various techniques you can use if you encounter these values.
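One simple technique when you meet such a value in the debugger is a lookup against the known patterns. The helper below is a hypothetical sketch that only covers some of the values quoted in Michael Burr's answer on this page. It is not taken from the linked article.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Map a suspicious 32-bit value to the kind of memory it probably came from,
// using the Microsoft-specific values listed earlier on this page.
std::string classify_fill(std::uint32_t word) {
    switch (word) {
        case 0xCDCDCDCDu: return "clean heap memory (allocated, never written)";
        case 0xDDDDDDDDu: return "dead heap memory (already freed)";
        case 0xFDFDFDFDu: return "no man's land fence";
        case 0xCCCCCCCCu: return "uninitialised stack variable";
        case 0xBAADF00Du: return "LocalAlloc(LMEM_FIXED) memory, not yet written";
        case 0xFEEEFEEEu: return "OS heap memory, unallocated or freed by HeapFree()";
        default:          return "unrecognised";
    }
}
```

For example, `classify_fill(0xDDDDDDDD)` points at a use-after-free, while `classify_fill(0xCCCCCCCC)` points at a local variable that was never assigned.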
Answered by Anthony Giorgio
The IBM XLC compiler has an "initauto" option that will assign automatic variables a value that you specify. I used the following for my debug builds:
-Wc,'initauto(deadbeef,word)'
If I looked at the storage of an uninitialized variable, it would be set to 0xdeadbeef.