Notice: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/28403852/
JVM Crash due to SIGSEGV
Asked by Manisankar
Our server hung because of a SIGSEGV fault.
A fatal error has been detected by the Java Runtime Environment:
SIGSEGV (0xb) at pc=0x00007ff5c7195aaa, pid=262778, tid=140690480097024
JRE version: 6.0_35-b10
Java VM: Java HotSpot(TM) 64-Bit Server VM (20.10-b01 mixed mode linux-amd64 compressed oops)
Problematic frame:
C [libdtagentcore.so+0xb7aaa] long double restrict+0x506f6
I am curious to know what the root cause of this could be.
Any help is highly appreciated. Thanks.
Answered by CodeWalker
Signal -- Description
SIGSEGV, SIGBUS, SIGFPE, SIGPIPE, SIGILL -- Used in the implementation for implicit null check, and so forth.
SIGQUIT -- Thread dump support: to dump Java stack traces at the standard error stream. (Optional.)
SIGTERM, SIGINT, SIGHUP -- Used to support the shutdown hook mechanism (java.lang.Runtime.addShutdownHook) when the VM is terminated abnormally. (Optional.)
SIGUSR1 -- Used in the implementation of the java.lang.Thread.interrupt method. (Configurable.) Not used starting with Solaris 10 OS. Reserved on Linux.
SIGUSR2 -- Used internally. (Configurable.) Not used starting with Solaris 10 OS.
SIGABRT -- The HotSpot VM does not handle this signal. Instead it calls the abort function after fatal error handling. If an application uses this signal then it should terminate the process to preserve the expected semantics.
The fatal error log indicates that the crash was in a native library, so there might be a bug in native code or JNI library code. The crash could of course be caused by something else, but analysis of the library and any core file or crash dump is a good starting place.
In this case a SIGSEGV occurred with a thread executing in the library libdtagentcore.so. In some cases a bug in a native library manifests itself as a crash in Java VM code instead: for example, a JavaThread can fail while in the _thread_in_vm state (meaning that it is executing in Java VM code).
- If you get a crash in a native application library (as in your case), then you might be able to attach the native debugger to the core file or crash dump, if it is available. Depending on the operating system, the native debugger is dbx, gdb, or windbg.
- Another approach is to run with the `-Xcheck:jni` option added to the command line. This option is not guaranteed to find all issues with JNI code, but it can help identify a significant number of issues.
- If the native library where the crash occurred is part of the Java runtime environment (for example awt.dll, net.dll, and so forth), then it is possible that you have encountered a library or API bug. If after further analysis you conclude this is a library or API bug, then gather as much data as possible and submit a bug or support call.
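The triage in the list above can be sketched as a small script. This is only an illustration, not part of any JDK tooling: the function name `triage` and the `JRE_LIBRARIES` set are made up for this example, the set is deliberately incomplete, and the regular expression assumes the "Problematic frame" layout shown in the question (`C` for native code, `V` for VM code).

```python
import re

# Illustrative, non-exhaustive set of libraries that ship with a Linux JRE (assumption).
JRE_LIBRARIES = {"libjvm.so", "libjava.so", "libawt.so", "libnet.so", "libnio.so", "libzip.so"}

# Matches frames such as: C [libdtagentcore.so+0xb7aaa] long double restrict+0x506f6
FRAME_RE = re.compile(r"^(?P<kind>[CV])\s+\[(?P<lib>[^\]+]+)\+0x[0-9a-fA-F]+\]")


def triage(frame_line):
    """Suggest a next step based on the 'Problematic frame' line of an hs_err log."""
    match = FRAME_RE.match(frame_line.strip())
    if match is None:
        return "unrecognized frame format"
    kind, lib = match.group("kind"), match.group("lib")
    if kind == "V":
        # The crash is in VM code, but a native-library bug can still be the cause.
        return f"VM code in {lib}: examine the core file or crash dump"
    if lib in JRE_LIBRARIES:
        return f"native code in JRE library {lib}: possibly a library or API bug"
    return f"native code in {lib}: attach a native debugger to the core file, or rerun with -Xcheck:jni"


if __name__ == "__main__":
    print(triage("C [libdtagentcore.so+0xb7aaa] long double restrict+0x506f6"))
```

For the frame in the question this takes the last branch, because libdtagentcore.so is not in the JRE list.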
Answered by Steves
It is telling you that an error occurred in code loaded from libdtagentcore.so. More specifically, it happened in a function named restrict, at offset 0x506f6 from the start of that function. The first offset mentioned (0xb7aaa) is the offset within the library itself. If the library was built with debugging symbols (-g), you can look at the code that caused the exception; on Linux, something along the lines of:
addr2line -e libdtagentcore.so -C -f 0xb7aaa
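To make the two offsets concrete, here is a minimal sketch that pulls them out of the log lines quoted in the question and rebuilds the addr2line command above. The function name `analyze` and the dictionary keys are invented for this example, and the regular expressions assume the exact line layout shown in the question. The load address is derived on the assumption that the bracketed offset is simply the crashing pc minus the address where the library was mapped.

```python
import re
import shlex

# The two lines from the question's hs_err log.
HEADER = "SIGSEGV (0xb) at pc=0x00007ff5c7195aaa, pid=262778, tid=140690480097024"
FRAME = "C [libdtagentcore.so+0xb7aaa] long double restrict+0x506f6"

PC_RE = re.compile(r"pc=(0x[0-9a-fA-F]+)")
FRAME_RE = re.compile(
    r"^[A-Za-z]\s+\[(?P<lib>[^\]+]+)\+(?P<lib_off>0x[0-9a-fA-F]+)\]\s*"
    r"(?P<sym>.*?)(?:\+(?P<sym_off>0x[0-9a-fA-F]+))?$"
)


def analyze(header_line, frame_line):
    """Extract the library, both offsets, and the implied load address."""
    pc_match = PC_RE.search(header_line)
    frame_match = FRAME_RE.match(frame_line.strip())
    if pc_match is None or frame_match is None:
        raise ValueError("unexpected hs_err line format")
    pc = int(pc_match.group(1), 16)
    lib = frame_match.group("lib")
    lib_off = int(frame_match.group("lib_off"), 16)
    sym_off = frame_match.group("sym_off")
    return {
        "library": lib,
        "library_offset": lib_off,            # offset within the library file
        "symbol": frame_match.group("sym"),   # nearest symbol the VM could resolve
        "symbol_offset": int(sym_off, 16) if sym_off else None,
        "load_base": pc - lib_off,            # where the library was mapped (assumption above)
        "addr2line": "addr2line -e {} -C -f {:#x}".format(shlex.quote(lib), lib_off),
    }


if __name__ == "__main__":
    info = analyze(HEADER, FRAME)
    print(info["addr2line"])
    print(hex(info["load_base"]))
```

Only the library offset (0xb7aaa) is passed to addr2line; the pc value from the header is a runtime address and changes with every run.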
In case this is read by someone on Windows, see https://community.oracle.com/blogs/kohsuke/2009/02/19/crash-course-jvm-crash-analysis
More details in https://www.youtube.com/watch?v=jd6dJa7tSNU
Answered by Ales Teska
There is one tricky situation in JNI code: when such code blocks the SIGSEGV signal, e.g. because it blocks all signals (quite a common approach in threaded C code to ensure that only the main thread will process signals), AND it calls 'back' into the Java VM (aka a callback), then it can result in a quite random SIGSEGV-triggered abort of the process.
And there is actually nothing wrong: the Java VM triggers SIGSEGV itself in order to detect certain conditions in memory (it acts as a memory barrier … etc.) and it expects to handle that signal itself. Unfortunately, when SIGSEGV is blocked, the 'standard' SIGSEGV reaction is triggered instead => the VM process crashes.
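One way to avoid this, sketched below, is to block every signal except the ones the VM relies on. The native code in question would normally be C calling pthread_sigmask; Python's signal.pthread_sigmask wraps the same POSIX call, so it is used here only to show which signals to leave alone. The set of signals comes from the signal table quoted in the first answer. The function names are made up for this example, and the sketch assumes a Unix system with Python 3.8 or later.

```python
import signal

# Signals the HotSpot VM uses for implicit null checks and similar tricks,
# taken from the signal table in the first answer.
JVM_SYNCHRONOUS_SIGNALS = {
    signal.SIGSEGV,
    signal.SIGBUS,
    signal.SIGFPE,
    signal.SIGPIPE,
    signal.SIGILL,
}


def signals_safe_to_block(all_signals=None):
    """Return every signal except those the VM must be able to receive."""
    if all_signals is None:
        all_signals = signal.valid_signals()
    return set(all_signals) - JVM_SYNCHRONOUS_SIGNALS


def block_async_signals():
    """Block the 'safe' signals in the calling thread and return the previous mask."""
    return signal.pthread_sigmask(signal.SIG_BLOCK, signals_safe_to_block())


if __name__ == "__main__":
    old_mask = block_async_signals()
    try:
        print(signal.SIGSEGV in signal.pthread_sigmask(signal.SIG_BLOCK, []))
    finally:
        # Restore the original mask so the rest of the program is unaffected.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

In C the equivalent would be sigfillset followed by sigdelset for each of the five signals before calling pthread_sigmask.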