C++ program only crashes as a release build: how to debug?

Question

Asked by Nik Reiman

I've got a "Schroedinger's Cat" type of problem here -- my program (actually the test suite for my program, but a program nonetheless) is crashing, but only when built in release mode, and only when launched from the command line. Through caveman debugging (ie, nasty printf() messages all over the place), I have determined the test method where the code is crashing, though unfortunately the actual crash seems to happen in some destructor, since the last trace messages I see are in other destructors which execute cleanly.

When I attempt to run this program inside of Visual Studio, it doesn't crash. Same goes when launching from WinDbg.exe. The crash only occurs when launching from the command line. This is happening under Windows Vista, btw, and unfortunately I don't have access to an XP machine right now to test on.

It would be really nice if I could get Windows to print out a stack trace, or something other than simply terminating the program as if it had exited cleanly. Does anyone have any advice as to how I could get some more meaningful information here and hopefully fix this bug?

Edit: The problem was indeed caused by an out-of-bounds array, which I describe more in this post. Thanks everybody for your help in finding this problem!

Answer 1

Answered by James Curran

In 100% of the cases I've seen or heard of, where a C or C++ program runs fine in the debugger but fails when run outside, the cause has been writing past the end of a function local array. (The debugger puts more on the stack, so you're less likely to overwrite something important.)
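
A minimal sketch of one way to surface this kind of bug (not code from the question): a write past the end of a raw function-local array is undefined behavior and may silently corrupt neighboring stack data, while `std::array::at()` turns the same mistake into a catchable `std::out_of_range`. The helper name `try_write` and the buffer size of 4 are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: writes `value` at `index` into a function-local buffer
// and reports whether the index was in range.
bool try_write(std::size_t index, int value) {
    std::array<int, 4> local{};   // valid indices are 0..3
    try {
        local.at(index) = value;  // at() throws std::out_of_range instead of overrunning
        return true;
    } catch (const std::out_of_range&) {
        // With a raw `int local[4]; local[index] = value;` this case would be
        // undefined behavior rather than a reported error.
        return false;
    }
}
```

Calling `try_write(4, 7)` reports the overrun instead of scribbling on the stack, which is the situation that tends to differ between a run under the debugger and a run from the command line.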

Answer 2

Answered by David Dibben

When I have encountered problems like this before, it has generally been due to variable initialization. In debug mode, variables and pointers get initialized to zero automatically, but in release mode they do not. Therefore, if you have code like this:

int* p;   // never initialized
// ...
if (p != 0) {
    // do stuff with *p
}

In debug mode the code in the if is not executed, but in release mode p contains an undefined value, which is unlikely to be 0, so the code is executed, often causing a crash.

I would check your code for uninitialized variables. This can also apply to the contents of arrays.
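
As a rough illustration of the advice above, default member initializers give every field a defined value regardless of build configuration, so the code no longer depends on whatever happens to be in memory. The `Buffer` type and `has_data` helper are hypothetical names, not taken from the question.

```cpp
#include <cstddef>

// Hypothetical type: default member initializers give every field a defined
// value in both debug and release builds.
struct Buffer {
    int*        data = nullptr;
    std::size_t size = 0;
};

// Returns true only when the buffer actually points at something.
bool has_data(const Buffer& b) {
    return b.data != nullptr && b.size > 0;
}
```

A default-constructed `Buffer` then behaves the same everywhere: `has_data` returns false until both fields are set explicitly.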

Answer 3

Answered by Sebastian

No answer so far has tried to give a serious overview about the available techniques for debugging release applications:

Release and Debug builds behave differently for many reasons. Here is an excellent overview. Each of these differences might cause a bug in the Release build that doesn't exist in the Debug build.
The presence of a debugger may change the behavior of a program too, both for release and debug builds. See this answer. In short, at least the Visual Studio Debugger uses the Debug Heap automatically when attached to a program. You can turn the debug heap off by using the environment variable _NO_DEBUG_HEAP. You can specify this either in your computer properties, or in the Project Settings in Visual Studio. That might make the crash reproducible with the debugger attached.
More on debugging heap corruption here.
If the previous solution doesn't work, you need to catch the unhandled exception and attach a post-mortem debugger the instant the crash occurs. You can use e.g. WinDbg for this; details about the available post-mortem debuggers and their installation are at MSDN.
You can improve your exception handling code, and if this is a production application, you should:
a. Install a custom termination handler using std::set_terminate.
If you want to debug this problem locally, you could run an endless loop inside the termination handler and output some text to the console to notify you that std::terminate has been called. Then attach the debugger and check the call stack. Or you print the stack trace as described in this answer.
In a production application you might want to send an error report back home, ideally together with a small memory dump that allows you to analyze the problem as described here.
b. Use Microsoft's structured exception handling mechanism, which allows you to catch both hardware and software exceptions. See MSDN. You could guard parts of your code using SEH and use the same approach as in a) to debug the problem. SEH gives more information about the exception that occurred, which you could use when sending an error report from a production app.
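
Option a) above could look roughly like the following. This is a sketch using only the standard library; the names `on_terminate` and `install_terminate_handler` are hypothetical, and the handler here simply reports and aborts rather than looping or writing a dump.

```cpp
#include <cstdio>
#include <cstdlib>
#include <exception>

// Hypothetical handler: report that std::terminate was reached, then abort so
// the OS or a post-mortem debugger can take over.
[[noreturn]] void on_terminate() {
    std::fputs("std::terminate called\n", stderr);
    std::abort();
}

// Installs the handler and returns the previously installed one.
std::terminate_handler install_terminate_handler() {
    return std::set_terminate(on_terminate);
}
```

Calling `install_terminate_handler()` early in `main` is usually enough. For local debugging, the `std::abort()` call could be swapped for a wait loop, as the answer suggests, so that a debugger can be attached while the process is still alive.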

Answer 4

Answered by morechilli

Things to look out for:

Array overruns - the visual studio debugger inserts padding which may stop crashes.

Race conditions - do you have multiple threads involved? If so, a race condition may only show up when an application is executed directly.

Linking - is your release build pulling in the correct libraries?

Things to try:

Minidump - really easy to use (just look it up in MSDN), and it will give you a full crash dump for each thread. You just load the output into Visual Studio and it is as if you were debugging at the time of the crash.

Answer 5

Answered by Franci Penov

You can set WinDbg as your postmortem debugger. This will launch the debugger and attach it to the process when the crash occurs. To install WinDbg for postmortem debugging, use the /I option (note it is capitalized):

windbg /I

More details here.

As to the cause, it's most probably an uninitialized variable, as the other answers suggest.

Answer 6

Answered by Nik Reiman

After many hours of debugging, I finally found the cause of the problem, which was indeed a buffer overflow caused by a single-byte difference:

char *end = static_cast<char*>(attr->data) + attr->dataSize;

This is a fencepost error (off-by-one error) and was fixed by:

char *end = static_cast<char*>(attr->data) + attr->dataSize - 1;
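
To make the fencepost distinction concrete, here is a small sketch of the two pointer conventions involved. The helper names are hypothetical and the buffer is a plain `char` array rather than the `attr` structure from the question.

```cpp
#include <cstddef>

// Hypothetical helpers illustrating the two conventions.

// One-past-the-end: valid to compare against, never valid to dereference.
char* past_the_end(char* data, std::size_t size) {
    return data + size;
}

// Last valid byte: the form the fix uses; requires size > 0.
char* last_byte(char* data, std::size_t size) {
    return data + size - 1;
}
```

For a 4-byte buffer `buf`, `last_byte(buf, 4)` is `&buf[3]` and may be read or written, while `past_the_end(buf, 4)` is only a boundary marker. Treating the second as if it were the first is the one-byte overflow described above.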

The weird thing was, I put several calls to _CrtCheckMemory() around various parts of my code, and they always returned 1. I was able to find the source of the problem by placing "return false;" statements in the test case, and then eventually determining through trial and error where the fault was.

Thanks everybody for your comments -- I learned a lot about windbg.exe today! :)

Answer 7

Answered by Greg Whitfield

Even though you have built your exe as a release one, you can still generate PDB (program database) files that will allow you to get stack traces and do a limited amount of variable inspection. In your build settings there is an option to create the PDB files. Turn this on and relink. Then try running from the IDE first to see if you get the crash. If so, then great - you're all set to look at things. If not, then when running from the command line you can do one of two things:

Run EXE, and before the crash do an Attach To Process (Tools menu on Visual Studio).
After the crash, select the option to launch debugger.

When asked to point to PDB files, browse to find them. If the PDB's were put in the same output folder as your EXE or DLL's they will probably be picked up automatically.

The PDB's provide a link to the source with enough symbol information to make it possible to see stack traces, variables etc. You can inspect the values as normal, but do be aware that you can get false readings as the optimisation pass may mean things only appear in registers, or things happen in a different order than you expect.

NB: I'm assuming a Windows/Visual Studio environment here.

Answer 8

Answered by Cruachan

Crashes like this are almost always caused by the fact that an IDE will usually set the contents of uninitialized variables to zeros, null or some other such 'sensible' value, whereas when running natively you'll get whatever random rubbish the system picks up.

Your error is therefore almost certainly that you are using something like a pointer before it has been properly initialized, and you're getting away with it in the IDE because it doesn't point anywhere dangerous - or the value is handled by your error checking - but in release mode it does something nasty.

Answer 9

Answered by Yuval Peled

In order to have a crash dump that you can analyze:

Generate PDB files for your code.
Rebase your exe and DLLs so that they are loaded at the same addresses on every run.
Enable a post-mortem debugger such as Dr. Watson.
Look up the crash address using a tool such as Crash Finder.

You should also check out the tools in Debugging tools for windows. You can monitor the application and see all the first chance exceptions that were prior to your second chance exception.

Hope it helps...

Answer 10

Answered by Mohamad mehdi Kharatizadeh

Sometimes this happens because you have wrapped an important operation inside the "assert" macro. As you may know, "assert" evaluates its expression only in debug mode.
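
A small sketch of the pitfall, assuming the release build defines NDEBUG (the usual convention, which makes the standard `assert` expand to nothing). The function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical operation with a side effect the program depends on.
bool load_value(int& out) {
    out = 42;
    return true;
}

// Fragile: when NDEBUG is defined, the whole assert expression is removed,
// so load_value is never called and `value` stays 0.
int init_fragile() {
    int value = 0;
    assert(load_value(value));
    return value;
}

// Robust: the call always runs; only the check disappears under NDEBUG.
int init_robust() {
    int value = 0;
    const bool ok = load_value(value);
    assert(ok);
    (void)ok;  // avoids an unused-variable warning when assert compiles away
    return value;
}
```

`init_robust` yields the loaded value in every configuration, whereas the result of `init_fragile` depends on whether NDEBUG was defined at compile time.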
