C++: How to debug heap corruption errors?
Original question: http://stackoverflow.com/questions/1010106/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How to debug heap corruption errors?
Asked by
I am debugging a (native) multi-threaded C++ application under Visual Studio 2008. On seemingly random occasions, I get a "Windows has triggered a break point..." error with a note that this might be due to a corruption in the heap. These errors don't always crash the application right away, although it is likely to crash shortly after.
The big problem with these errors is that they pop up only after the corruption has actually taken place, which makes them very hard to track and debug, especially on a multi-threaded application.
What sort of things can cause these errors?
How do I debug them?
Tips, tools, methods, enlightenment... are welcome.
Accepted answer by leander
Application Verifier combined with Debugging Tools for Windows is an amazing setup. You can get both as a part of the Windows Driver Kit or the lighter Windows SDK. (I found out about Application Verifier when researching an earlier question about a heap corruption issue.) I've used BoundsChecker and Insure++ (mentioned in other answers) in the past too, although I was surprised how much functionality was in Application Verifier.
Electric Fence (aka "efence"), dmalloc, valgrind, and so forth are all worth mentioning, but most of these are much easier to get running under *nix than Windows. Valgrind is ridiculously flexible: I've debugged large server software with many heap issues using it.
When all else fails, you can provide your own global operator new/delete and malloc/calloc/realloc overloads. How to do so will vary a bit depending on compiler and platform, and it will be a bit of an investment, but it may pay off over the long run. The desirable feature list should look familiar from dmalloc and Electric Fence, and from the surprisingly excellent book Writing Solid Code:
- sentry values: allow a little more space before and after each alloc, respecting maximum alignment requirement; fill with magic numbers (helps catch buffer overflows and underflows, and the occasional "wild" pointer)
- alloc fill: fill new allocations with a magic non-0 value -- Visual C++ will already do this for you in Debug builds (helps catch use of uninitialized vars)
- free fill: fill in freed memory with a magic non-0 value, designed to trigger a segfault if it's dereferenced in most cases (helps catch dangling pointers)
- delayed free: don't return freed memory to the heap for a while, keep it free filled but not available (helps catch more dangling pointers, catches proximate double-frees)
- tracking: being able to record where an allocation was made can sometimes be useful
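To make the first three items in the list concrete, here is a minimal sketch of sentry values, alloc fill and free fill layered on top of malloc/free. The namespace `guarded`, the function names and the magic byte values are all assumptions made for illustration; this is not the code from dmalloc, Electric Fence or Writing Solid Code, and it leaves out delayed free and tracking.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Illustrative sketch only: every name and magic value here is an assumption.
namespace guarded {

constexpr std::size_t kAlign  = alignof(std::max_align_t);
constexpr std::size_t kHeader = 2 * kAlign;  // [size field, padded][front sentry]
constexpr std::size_t kTail   = kAlign;      // back sentry
constexpr unsigned char kSentry    = 0xFD;   // guard bytes around the block
constexpr unsigned char kAllocFill = 0xCD;   // fresh, uninitialized memory
constexpr unsigned char kFreeFill  = 0xDD;   // memory that has been released

// Returns a max-aligned block of `size` bytes surrounded by sentry bytes.
void* alloc(std::size_t size) {
    if (size > static_cast<std::size_t>(-1) - kHeader - kTail) return nullptr;
    unsigned char* raw =
        static_cast<unsigned char*>(std::malloc(kHeader + size + kTail));
    if (raw == nullptr) return nullptr;
    std::memcpy(raw, &size, sizeof size);               // remember the user size
    std::memset(raw + kAlign, kSentry, kAlign);         // front sentry
    std::memset(raw + kHeader, kAllocFill, size);       // alloc fill
    std::memset(raw + kHeader + size, kSentry, kTail);  // back sentry
    return raw + kHeader;
}

// True if neither sentry region has been overwritten.
bool guardsIntact(const void* user) {
    const unsigned char* raw = static_cast<const unsigned char*>(user) - kHeader;
    std::size_t size = 0;
    std::memcpy(&size, raw, sizeof size);
    for (std::size_t i = 0; i < kAlign; ++i)
        if (raw[kAlign + i] != kSentry) return false;        // underflow
    for (std::size_t i = 0; i < kTail; ++i)
        if (raw[kHeader + size + i] != kSentry) return false; // overflow
    return true;
}

// Checks the sentries, free-fills the user bytes, then releases the block.
// Returns false if corruption was detected.
bool release(void* user) {
    if (user == nullptr) return true;
    const bool ok = guardsIntact(user);
    unsigned char* raw = static_cast<unsigned char*>(user) - kHeader;
    std::size_t size = 0;
    std::memcpy(&size, raw, sizeof size);
    std::memset(raw + kHeader, kFreeFill, size);  // free fill
    std::free(raw);  // a delayed-free scheme would queue the block here instead
    return ok;
}

}  // namespace guarded
```

In this sketch a one-byte overrun lands in the back sentry, so `guarded::release` reports it at free time. Calling `guarded::guardsIntact` at suspicious points narrows down when the damage happened. Without delayed free, the free fill only survives until the underlying allocator reuses the block.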
Note that in our local homebrew system (for an embedded target) we keep the tracking separate from most of the other stuff, because the run-time overhead is much higher.
If you're interested in more reasons to overload these allocation functions/operators, take a look at my answer to "Any reason to overload global operator new and delete?"; shameless self-promotion aside, it lists other techniques that are helpful in tracking heap corruption errors, as well as other applicable tools.
Because I keep finding my own answer here when searching for alloc/free/fence values MS uses, here's another answer that covers Microsoft dbgheap fill values.
Answer by Canopus
You can detect a lot of heap corruption problems by enabling Page Heap for your application. To do this you need to use gflags.exe, which comes as a part of Debugging Tools for Windows.
Run Gflags.exe and, in the Image File options for your executable, check the "Enable page heap" option.
Now restart your exe and attach a debugger to it. With Page Heap enabled, the application will break into the debugger whenever any heap corruption occurs.
Answer by Canopus
A very relevant article is Debugging Heap corruption with Application Verifier and Debugdiag.
Answer by Dave Van Wagner
To really slow things down and perform a lot of runtime checking, try adding the following at the top of your main() or equivalent in Microsoft Visual C++:
// Requires #include <crtdbg.h>; the flags take effect only in Debug builds.
_CrtSetDbgFlag(_CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF | _CRTDBG_CHECK_ALWAYS_DF);
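For anyone who wants this in a reusable form, here is a small sketch that wraps the call so it also compiles on other toolchains and in Release builds. The helper name `enableCrtHeapChecks` is made up for the example. It also preserves any flags that are already set, which the one-liner does not do.

```cpp
#ifdef _MSC_VER
#include <crtdbg.h>
#endif

// Hypothetical helper (the name is an assumption). Returns true when the CRT
// debug-heap checks were switched on, false when they are not available.
bool enableCrtHeapChecks() {
#if defined(_MSC_VER) && defined(_DEBUG)
    // Query the current flags first so existing settings are preserved.
    int flags = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);
    flags |= _CRTDBG_ALLOC_MEM_DF       // route allocations through the debug heap
           | _CRTDBG_LEAK_CHECK_DF      // dump leaked blocks at program exit
           | _CRTDBG_CHECK_ALWAYS_DF;   // validate the heap on every alloc/free (slow)
    _CrtSetDbgFlag(flags);
    return true;
#else
    return false;  // no MSVC debug runtime: nothing to enable
#endif
}
```

Call it as the first statement of main(). If checking on every allocation is too slow, one option is to drop `_CRTDBG_CHECK_ALWAYS_DF` and call `_CrtCheckMemory()` by hand around the code you suspect.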
Answer by StackedCrooked
One quick tip that I got from "Detecting access to freed memory" is this:
If you want to locate the error quickly, without checking every statement that accesses the memory block, you can set the memory pointer to an invalid value after freeing the block:
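#ifdef _DEBUG
// Detect access to freed memory: free the block, then overwrite the pointer
// with an invalid address so that any later use faults.
// The do/while wrapper keeps the macro safe inside an unbraced if/else.
#undef free
#define free(p) do { _free_dbg((p), _NORMAL_BLOCK); *(int*)&(p) = 0x666; } while (0)
#endif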
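The same idea can be written without a macro. Below is a sketch of a function template that frees a block and then poisons the caller's pointer. `freeAndPoison`, `isPoisoned` and `kPoisonAddress` are names invented for the example, and the value 0x666 is simply carried over from the macro above on the assumption that it is never a valid address on your platform.

```cpp
#include <cstdint>
#include <cstdlib>

// Assumption: this address is never mapped, so dereferencing it faults.
constexpr std::uintptr_t kPoisonAddress = 0x666;

// Frees the block and overwrites the caller's pointer with the poison value.
// Taking the pointer by reference is what lets us modify the caller's copy.
template <typename T>
void freeAndPoison(T*& p) {
    std::free(p);
    p = reinterpret_cast<T*>(kPoisonAddress);
}

// Lets debug code ask whether a pointer has already been released this way.
template <typename T>
bool isPoisoned(const T* p) {
    return reinterpret_cast<std::uintptr_t>(p) == kPoisonAddress;
}
```

Like the macro, this only poisons the one pointer variable passed in. Other copies of the pointer still hold the stale address and will not be caught.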
Answer by ChrisW
What sort of things can cause these errors?
Doing naughty things with memory, e.g. writing after the end of a buffer, or writing to a buffer after it's been freed back to the heap.
How do I debug them?
Use a tool that adds automated bounds-checking to your executable: e.g. valgrind on Unix, or a tool like BoundsChecker (Wikipedia also suggests Purify and Insure++) on Windows.
Beware that these will slow your application, so they may be unusable if yours is a soft-real-time application.
Another possible debugging aid/tool might be MicroQuill's HeapAgent.
Answer by Shing Yip
The best tool I have found, and one that has worked every time, is code review (with good code reviewers).
Other than code review, I'd first try Page Heap. Page Heap takes a few seconds to set up and with luck it might pinpoint your problem.
If you have no luck with Page Heap, download Debugging Tools for Windows from Microsoft and learn to use WinDbg. Sorry I can't give you more specific help, but debugging multi-threaded heap corruption is more an art than a science. Google for "WinDbg heap corruption" and you should find many articles on the subject.
Answer by dreadpirateryan
You may also want to check to see whether you're linking against the dynamic or static C runtime library. If your DLL files are linking against the static C runtime library, then the DLL files have separate heaps.
Hence, if you were to create an object in one DLL and try to free it in another DLL, you would get the same message you're seeing above. This problem is referenced in another Stack Overflow question, Freeing memory allocated in a different DLL.
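One common way to avoid the cross-DLL free is to make the module that allocates an object also responsible for releasing it. Here is a single-file sketch of that pattern. `Widget`, `widget_create` and `widget_destroy` are hypothetical names; in a real project the two functions would be exported from the same DLL, with whatever export decoration your build uses.

```cpp
#include <memory>

// Hypothetical library type and entry points (assumed names). Both functions
// would live in the same DLL, so new and delete use that DLL's heap.
struct Widget {
    int value;
};

Widget* widget_create(int value) { return new Widget{value}; }
void widget_destroy(Widget* w) { delete w; }

// Client side: the deleter hands the object back to the allocating module
// instead of calling delete in the client's own runtime.
struct WidgetDeleter {
    void operator()(Widget* w) const { widget_destroy(w); }
};
using WidgetPtr = std::unique_ptr<Widget, WidgetDeleter>;

WidgetPtr makeWidget(int value) { return WidgetPtr(widget_create(value)); }
```

With this arrangement the client never calls delete or free on memory it did not allocate, so it no longer matters which C runtime each module links against.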
Answer by Vladimir Obrizan
If these errors occur randomly, there is a high probability that you have run into data races. Please check: do you modify shared memory pointers from different threads? Intel Thread Checker may help to detect such issues in a multithreaded program.
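As an illustration of the kind of shared-pointer update that needs protection, here is a sketch in which several threads reallocate the same buffer. `SharedLog` and `runWriters` are invented for the example, and it uses the C++11 `std::mutex` and `std::thread`, which Visual Studio 2008 does not provide, so treat it as a sketch of the idea, not a drop-in fix.

```cpp
#include <cstddef>
#include <cstring>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical class: a growable byte log whose buffer pointer is shared.
class SharedLog {
public:
    ~SharedLog() { delete[] data_; }

    void append(char c) {
        // Without this lock two threads could both read data_, both delete it
        // (a double free), or copy from a block that was just released.
        std::lock_guard<std::mutex> lock(mutex_);
        char* bigger = new char[size_ + 1];
        if (size_ != 0) std::memcpy(bigger, data_, size_);
        bigger[size_] = c;
        delete[] data_;
        data_ = bigger;
        ++size_;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return size_;
    }

private:
    mutable std::mutex mutex_;  // also makes the class non-copyable
    char* data_ = nullptr;
    std::size_t size_ = 0;
};

// Starts `threads` writers that each append `perThread` bytes and returns
// the final length of the log.
std::size_t runWriters(int threads, int perThread) {
    SharedLog log;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&log, perThread] {
            for (int i = 0; i < perThread; ++i) log.append('x');
        });
    }
    for (std::thread& th : pool) th.join();
    return log.size();
}
```

If you remove the lock in `append`, this becomes the sort of program that corrupts the heap only occasionally, which matches the random symptoms described in the question.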
Answer by JaredPar
What type of allocation functions are you using? I recently hit a similar error using the Heap* style allocation functions.
It turned out that I was mistakenly creating the heap with the HEAP_NO_SERIALIZE option. This essentially makes the Heap functions run without thread safety. It's a performance improvement if used properly, but it should never be used if you are calling HeapAlloc from a multi-threaded program [1]. I only mention this because your post mentions you have a multi-threaded app. If you are using HEAP_NO_SERIALIZE anywhere, remove it and that will likely fix your problem.
[1] There are certain situations where this is legal, but it requires you to serialize calls to Heap* and is typically not the case for multi-threaded programs.
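For reference, here is a sketch of creating and using a private heap with serialization left on, which is what you get when HEAP_NO_SERIALIZE is absent from the flags. The function name `useSerializedPrivateHeap` is made up for the example. The Win32 calls are guarded so the snippet still compiles on other platforms, where it does nothing.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical demo function (the name is an assumption). Returns true if a
// serialized private heap could be created, used and destroyed. Off Windows
// the Heap* API does not exist, so it trivially returns true.
bool useSerializedPrivateHeap() {
#ifdef _WIN32
    // flOptions = 0: HEAP_NO_SERIALIZE is not set, so the heap takes its own
    // lock on every call. Initial size 0 and maximum size 0 give a growable heap.
    HANDLE heap = HeapCreate(0, 0, 0);
    if (heap == NULL) return false;

    // dwFlags = 0 here as well: do not pass HEAP_NO_SERIALIZE per call either.
    void* block = HeapAlloc(heap, 0, 64);
    bool ok = (block != NULL);
    if (block != NULL) ok = (HeapFree(heap, 0, block) != 0);

    return (HeapDestroy(heap) != 0) && ok;
#else
    return true;
#endif
}
```

It is worth searching the code base for HEAP_NO_SERIALIZE in both HeapCreate and the individual HeapAlloc/HeapFree calls, because the flag can be passed in either place.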