Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must comply with the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/2350489/

Date: 2020-08-27 23:09:53  Source: igfitidea

How to catch segmentation fault in Linux?

Tags: c++, segmentation-fault, try-catch

Asked by Alex F

I need to catch a segmentation fault in a third-party library's cleanup operations. This sometimes happens just before my program exits, and I cannot fix the root cause. In Windows programming I could do this with __try / __except. Is there a cross-platform or platform-specific way to do the same? I need this on Linux with gcc.

Accepted answer by P Shved

On Linux we can have these as exceptions, too.

Normally, when your program causes a segmentation fault, it is sent a SIGSEGV signal. You can set up your own handler for this signal and mitigate the consequences. Of course, you should really be sure that you can recover from the situation. In your case, I think, you should debug your code instead.

Back to the topic. I recently encountered a library (short manual) that transforms such signals to exceptions, so you can write code like this:

try
{
    *(int*) 0 = 0;
}
catch (std::exception& e)
{
    std::cerr << "Exception caught : " << e.what() << std::endl;
}

Works on my x86-64 Gentoo box. It has a platform-specific backend (borrowed from gcc's Java implementation), so it can work on many platforms. Out of the box it supports only x86 and x86-64, but you can get other backends from libjava, which resides in the gcc sources.

Answer by JayM

Here's an example of how to do it in C.

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

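/* Note: printf() and exit() are not async-signal-safe. They are tolerable
   in a demo, but a production handler should use write() and _exit(). */
void segfault_sigaction(int signal, siginfo_t *si, void *arg)
{
    (void)signal;   /* unused */
    (void)arg;      /* unused */
    printf("Caught segfault at address %p\n", si->si_addr);
    exit(0);
}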

int main(void)
{
    int *foo = NULL;
    struct sigaction sa;

    memset(&sa, 0, sizeof(struct sigaction));
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = segfault_sigaction;
    sa.sa_flags   = SA_SIGINFO;

    sigaction(SIGSEGV, &sa, NULL);

    /* Cause a seg fault */
    *foo = 1;

    return 0;
}

Answer by revo

C++ solution found here (http://www.cplusplus.com/forum/unices/16430/)

#include <signal.h>
#include <stdio.h>
#include <unistd.h>
void ouch(int sig)
{
    printf("OUCH! - I got signal %d\n", sig);
}
int main()
{
    struct sigaction act;
    act.sa_handler = ouch;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    sigaction(SIGINT, &act, 0);
    while(1) {
        printf("Hello World!\n");
        sleep(1);
    }
}

Answer by 18446744073709551615

Sometimes we want to catch a SIGSEGV to find out if a pointer is valid, that is, if it references a valid memory address. (Or even check if some arbitrary value may be a pointer.)

One option is to check it with isValidPtr() (worked on Android):

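#include <fcntl.h>   /* open, O_WRONLY */
#include <unistd.h>  /* write, close */

/* Returns 1 if len bytes at p are readable: write() fails with EFAULT
   instead of raising SIGSEGV when the source buffer is not mapped. */
int isValidPtr(const void*p, int len) {
    if (!p) {
        return 0;
    }
    int ret = 1;
    int nullfd = open("/dev/random", O_WRONLY);
    if (nullfd < 0) {
        return 0;   /* cannot probe; report the pointer as not validated */
    }
    if (write(nullfd, p, len) < 0) {
        ret = 0;
        /* Not OK */
    }
    close(nullfd);
    return ret;
}
int isValidOrNullPtr(const void*p, int len) {
    return !p||isValidPtr(p, len);
}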

Another option is to read the memory protection attributes, which is a bit more tricky (worked on Android):

re_mprot.c:

#include <errno.h>
#include <malloc.h>
#include <stdio.h>   /* FILE, fopen, fgetc, fclose */
//#define PAGE_SIZE 4096
#include "dlog.h"
#include "stdlib.h"
#include "re_mprot.h"

struct buffer {
    int pos;
    int size;
    char* mem;
};

char* _buf_reset(struct buffer*b) {
    b->mem[b->pos] = 0;
    b->pos = 0;
    return b->mem;
}

struct buffer* _new_buffer(int length) {
    struct buffer* res = malloc(sizeof(struct buffer)+length+4);
    res->pos = 0;
    res->size = length;
    res->mem = (void*)(res+1);
    return res;
}

int _buf_putchar(struct buffer*b, int c) {
    b->mem[b->pos++] = c;
    return b->pos >= b->size;
}

void show_mappings(void)
{
    DLOG("-----------------------------------------------\n");
    int a;
    FILE *f = fopen("/proc/self/maps", "r");
    struct buffer* b = _new_buffer(1024);
    while ((a = fgetc(f)) >= 0) {
        if (_buf_putchar(b,a) || a == '\n') {
            DLOG("/proc/self/maps: %s",_buf_reset(b));
        }
    }
    if (b->pos) {
        DLOG("/proc/self/maps: %s",_buf_reset(b));
    }
    free(b);
    fclose(f);
    DLOG("-----------------------------------------------\n");
}

unsigned int read_mprotection(void* addr) {
    int a;
    unsigned int res = MPROT_0;
    FILE *f = fopen("/proc/self/maps", "r");
    struct buffer* b = _new_buffer(1024);
    while ((a = fgetc(f)) >= 0) {
        if (_buf_putchar(b,a) || a == '\n') {
            /* each line starts with "start-end perms ...", addresses in hex */
            char*end0 = (void*)0;
            unsigned long addr0 = strtoul(b->mem, &end0, 0x10);
            char*end1 = (void*)0;
            unsigned long addr1 = strtoul(end0+1, &end1, 0x10);
            /* mappings are half-open ranges [addr0, addr1), so the start
               address itself belongs to the mapping */
            if ((void*)addr0 <= addr && addr < (void*)addr1) {
                res |= (end1+1)[0] == 'r' ? MPROT_R : 0;
                res |= (end1+1)[1] == 'w' ? MPROT_W : 0;
                res |= (end1+1)[2] == 'x' ? MPROT_X : 0;
                res |= (end1+1)[3] == 'p' ? MPROT_P
                     : (end1+1)[3] == 's' ? MPROT_S : 0;
                break;
            }
            _buf_reset(b);
        }
    }
    free(b);
    fclose(f);
    return res;
}

int has_mprotection(void* addr, unsigned int prot, unsigned int prot_mask) {
    unsigned prot1 = read_mprotection(addr);
    return (prot1 & prot_mask) == prot;
}

char* _mprot_tostring_(char*buf, unsigned int prot) {
    buf[0] = prot & MPROT_R ? 'r' : '-';
    buf[1] = prot & MPROT_W ? 'w' : '-';
    buf[2] = prot & MPROT_X ? 'x' : '-';
    buf[3] = prot & MPROT_S ? 's' : prot & MPROT_P ? 'p' :  '-';
    buf[4] = 0;
    return buf;
}

re_mprot.h:

#include <alloca.h>
#include "re_bits.h"
#include <sys/mman.h>

void show_mappings(void);

enum {
    MPROT_0 = 0, // not found at all
    MPROT_R = PROT_READ,                                 // readable
    MPROT_W = PROT_WRITE,                                // writable
    MPROT_X = PROT_EXEC,                                 // executable
    MPROT_S = FIRST_UNUSED_BIT(MPROT_R|MPROT_W|MPROT_X), // shared
    MPROT_P = MPROT_S<<1,                                // private
};

// returns a non-zero value if the address is mapped (because either MPROT_P or MPROT_S will be set for valid addresses)
unsigned int read_mprotection(void* addr);

// check memory protection against the mask
// returns true if all bits corresponding to non-zero bits in the mask
// are the same in prot and read_mprotection(addr)
int has_mprotection(void* addr, unsigned int prot, unsigned int prot_mask);

// convert the protection mask into a string. Uses alloca(), no need to free() the memory!
#define mprot_tostring(x) ( _mprot_tostring_( (char*)alloca(8) , (x) ) )
char* _mprot_tostring_(char*buf, unsigned int prot);

PS: DLOG() is printf() to the Android log. FIRST_UNUSED_BIT() is defined here.

PPS: It may not be a good idea to call alloca() in a loop -- the memory is not freed until the function returns.

Answer by Julien Villemure-Fréchette

For portability, one should probably use std::signal from the standard C++ library, but there are many restrictions on what a signal handler can do. Unfortunately, it is not possible to catch a SIGSEGV from within a C++ program without introducing undefined behavior, because the specification says:

  1. it is undefined behavior to call any library function from within the handler other than a very narrow subset of the standard library functions (abort, _Exit, quick_exit, some atomic functions, reinstalling the current signal handler, memcpy, memmove, type traits, move, forward, and some more).
  2. it is undefined behavior if the handler uses a throw expression.
  3. it is undefined behavior if the handler returns when handling SIGFPE, SIGILL, or SIGSEGV.

This proves that it is impossible to catch SIGSEGV from within a program using strictly standard and portable C++. SIGSEGV is still caught by the operating system and is normally reported to the parent process when a wait family function is called.

You will probably run into the same kind of trouble using POSIX signals, because 2.4.3 Signal Actions contains this clause:

The behavior of a process is undefined after it returns normally from a signal-catching function for a SIGBUS, SIGFPE, SIGILL, or SIGSEGV signal that was not generated by kill(), sigqueue(), or raise().

A word about longjmp. Assuming we are using POSIX signals, using longjmp to simulate stack unwinding won't help:

Although longjmp() is an async-signal-safe function, if it is invoked from a signal handler which interrupted a non-async-signal-safe function or equivalent (such as the processing equivalent to exit() performed after a return from the initial call to main()), the behavior of any subsequent call to a non-async-signal-safe function or equivalent is undefined.

This means that the continuation invoked by the call to longjmp cannot reliably call commonly useful library functions such as printf, malloc or exit, or return from main, without inducing undefined behavior. As such, the continuation can only perform a restricted set of operations and may only exit through some abnormal termination mechanism.

In short, catching a SIGSEGV and resuming execution of the program in a portable way is probably infeasible without introducing UB. Even if you are working on a Windows platform where you have access to structured exception handling, it is worth mentioning that MSDN suggests never attempting to handle hardware exceptions: Hardware Exceptions
