Linux segmentation fault handling

Question

Asked by user1225606

I have an application in which I want to catch any segmentation fault or Ctrl-C. With the code below I can catch the segmentation fault, but the handler is called again and again. How can I stop that? For your information, I don't want to exit my application; I just want to free all the corrupted buffers.

Is it possible?

void SignalInit(void)
{
  struct sigaction sigIntHandler;

  sigIntHandler.sa_handler = mysighandler;
  sigemptyset(&sigIntHandler.sa_mask);
  sigIntHandler.sa_flags = 0;
  sigaction(SIGINT, &sigIntHandler, NULL);
  sigaction(SIGSEGV, &sigIntHandler, NULL);
}

And the handler looks like this:

void mysighandler()
{
  MyfreeBuffers(); /* related to my application */
}

For the segmentation fault signal, the handler is called multiple times and, as you would expect, MyfreeBuffers() gives me errors for freeing memory that has already been freed. I want to free it only once, but I still don't want to exit the application.

Please help.

Answer 1

Accepted answer by Pavan Manjunath

The default action for signals like SIGSEGV is to terminate your process, but because you've installed a handler for it, your handler is called instead, overriding the default behavior. The problem is that the faulting instruction may be retried after your handler finishes, and if you haven't taken measures to fix the cause of the first segfault, the retried instruction will fault again, and so on indefinitely.

So first find the instruction that resulted in SIGSEGV and try to fix it (you can call something like backtrace() in the handler and see for yourself what went wrong).

Also, the POSIX standard says:

The behavior of a process is undefined after it returns normally from a signal-catching function for a [XSI] SIGBUS, SIGFPE, SIGILL, or SIGSEGV signal that was not generated by kill(), [RTS] sigqueue(), or raise().

So the ideal thing to do is to fix your segfault in the first place. A handler for SIGSEGV is not meant to bypass the underlying error condition.

So the best suggestion would be: don't catch SIGSEGV. Let it dump core. Analyze the core. Fix the invalid memory reference and there you go!

Answer 2

Answered by g13n

Well, you could set a state variable and only free the memory if it's not set. The signal handler will be called every time; you can't control that, AFAIK.
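
The state-variable idea could look roughly like the sketch below. The names buffer, buffers_freed and free_buffers_once are hypothetical stand-ins for whatever MyfreeBuffers() manages. It only prevents the double free; it does not make free() safe to call from a signal handler.

```c
#include <signal.h>
#include <stdlib.h>

static char *buffer;                        /* hypothetical application buffer */
static volatile sig_atomic_t buffers_freed; /* 0 until the cleanup has run once */

/* Frees the buffer on the first call only; returns 1 if it did the work. */
static int free_buffers_once(void)
{
  if (buffers_freed)
    return 0;

  buffers_freed = 1; /* set before freeing so a nested call does nothing */
  free(buffer);
  buffer = NULL;
  return 1;
}
```

With this guard, a handler that fires repeatedly would release the buffer on the first call and do nothing on later ones.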

Answer 3

Answered by caf

If the SIGSEGV fires again, the obvious conclusion is that the call to MyfreeBuffers() has not fixed the underlying problem (and if that function really does only free() some allocated memory, I'm not sure why you would think it would).

Roughly, a SIGSEGV fires when an attempt is made to access an inaccessible memory address. If you are not going to exit the application, you need to either make that memory address accessible or change the execution path with longjmp().

Answer 4

Answered by JeremyP

You shouldn't try to continue after SIGSEGV. It basically means that the environment of your application is corrupted in some way. It could be that you have just dereferenced a null pointer, or it could be that some bug has caused your program to corrupt its stack, the heap, or some pointer variable; you just don't know. The only safe thing to do is terminate the program.

It's perfectly legitimate to handle Ctrl-C. Lots of applications do it, but you have to be really careful about exactly what you do in your signal handler. You can't call any function that's not re-entrant. So if your MyfreeBuffers() calls the stdlib free() function, you are probably screwed. If the user hits Ctrl-C while the program is in the middle of malloc() or free(), and thus halfway through manipulating the data structures they use to track heap allocations, you will almost certainly corrupt the heap if you call malloc() or free() in the signal handler.

About the only safe thing you can do in a signal handler is set a flag to say you caught the signal. Your app can then poll the flag at intervals to decide if it needs to perform some action.

Answer 5

Answered by newlogic

I can see a case for recovering from a SIGSEGV: if you're handling events in a loop and one of those events causes a segmentation violation, you may only want to skip that event and continue processing the remaining ones. In my eyes SIGSEGV is similar to NullPointerException in Java. Yes, the state will be inconsistent and unknown after either of these, but in some cases you would like to handle the situation and carry on. For instance, in algo trading you would pause the execution of an order and allow a trader to take over manually, without crashing the entire system and ruining all the other orders.

Answer 6

Answered by Champignac

I do not agree at all with the statement "Don't catch the SIGSEGV".

That's a pretty good practice for dealing with unexpected conditions. And it's much cleaner to cope with NULL pointers (as returned by malloc failures) using the signal mechanism together with setjmp/longjmp than to spread error-condition handling all through your code.

Note, however, that if you use sigaction on SIGSEGV, you must not forget to set SA_NODEFER in sa_flags, or find another way to deal with the fact that SIGSEGV will otherwise trigger your handler just once.

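#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

static void do_segv(void)
{
  int *volatile segv; /* volatile so the compiler cannot optimize the null store away */

  segv = 0; /* malloc(a_huge_amount); */

  *segv = 1; /* faults: writes through a null pointer */
}

static sigjmp_buf point;

static void handler(int sig, siginfo_t *dont_care, void *dont_care_either)
{
  (void)sig;
  (void)dont_care;
  (void)dont_care_either;

  siglongjmp(point, 1); /* jump back to sigsetjmp() instead of returning */
}

int main(void)
{
  struct sigaction sa;

  memset(&sa, 0, sizeof(sa));
  sigemptyset(&sa.sa_mask);

  /* SA_SIGINFO selects the three-argument sa_sigaction handler */
  sa.sa_flags     = SA_NODEFER | SA_SIGINFO;
  sa.sa_sigaction = handler;

  sigaction(SIGSEGV, &sa, NULL); /* ignore whether it works or not */

  if (sigsetjmp(point, 1) == 0) /* nonzero second argument saves the signal mask */
    do_segv();
  else
    fprintf(stderr, "rather unexpected error\n");

  return 0;
}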

Answer 7

Answered by Ivan Uskov

At least under Linux, it looks like the trick with the -fnon-call-exceptions option can be a solution. It gives you the ability to convert the signal into an ordinary C++ exception and handle it in the usual way. See the question 'linux3/gcc46: "-fnon-call-exceptions", which signals are trapping instructions?' for an example.
