Java application crashes - how do I find the cause of a Java crash?

Question

Asked by Libor Havlicek

My Java server has started crashing repeatedly, and I can't figure out why.

The server has 7.5 GB of memory, and I have allocated 3 GB to the Java process.

The server was running fine and had run garbage collection many times, but the JVM crashed when it came under memory pressure.

Here is the info from JConsole, after the crash:

Current heap size: 2,958,868 kbytes
Maximum heap size: 3,066,816 kbytes
Committed memory: 3,066,816 kbytes
Pending finalization: 0 objects
Garbage collector: Name = 'PS MarkSweep', Collections = 66, Total time spent = 7 minutes
Garbage collector: Name = 'PS Scavenge', Collections = 43,055, Total time spent = 44 minutes

Operating System: Linux 2.6.31-302-ec2
Architecture: amd64
Number of processors: 2
Committed virtual memory: 8,405,760 kbytes
Total physical memory: 7,882,780 kbytes
Free physical memory: 34,540 kbytes
Total swap space: 0 kbytes
Free swap space: 0 kbytes

About 0.5 GB is in use after a GC run, so heap usage keeps rising from 0.5 GB to 3 GB and then falling back to 0.5 GB; this is definitely not a problem with lingering objects. In any case, the JVM should throw an OutOfMemoryError instead of crashing. I am using these parameters:

-Xmn256m -Xms768m -Xmx3000m -XX:NewRatio=2 -server -verbosegc -XX:PermSize=256m -XX:MaxPermSize=256m -XX:SurvivorRatio=8 -XX:+UseParallelGC -XX:ParallelGCThreads=2 -XX:+UseParallelOldGC

What is wrong, and what should I do? The crash output was:

Current thread (0x00007fe899755800):  JavaThread "508616253@qtp-1871151428-3352" [_thread_in_vm, id=11941, stack(0x00007fe86a4e5000,0x00007fe86a5e6000)]

siginfo:si_signo=SIGSEGV: si_errno=0, si_code=128 (), si_addr=0x0000000000000000

Registers:
RAX=0x00007fe9c60333b8, RBX=0x00007fe899755800, RCX=0x0d00007fe8f58787, RDX=0x00007fe9c6031888
RSP=0x00007fe86a5e3fd0, RBP=0x00007fe86a5e4020, RSI=0x00007fe899755800, RDI=0x00007fe95bae1770
R8 =0x00007fe9be341620, R9 =0x0000000000000001, R10=0x00007fe9c5b84460, R11=0x00007fe9c051a52b
R12=0x00007fe9c051a529, R13=0x00007fe9c6034ac0, R14=0x00007fe9c051a599, R15=0x0900007fe8f58787
RIP=0x00007fe9c5bd562d, EFL=0x0000000000010246, CSGSFS=0x000000000000e033, ERR=0x0000000000000000
  TRAPNO=0x000000000000000d

Stack: [0x00007fe86a4e5000,0x00007fe86a5e6000],  sp=0x00007fe86a5e3fd0,  free space=3fb0000000000000030k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x64d62d]
V  [libjvm.so+0x5fc4df]

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
v  ~RuntimeStub::_complete_monitor_locking_Java
J  sun.nio.ch.SocketChannelImpl.write(Ljava/nio/ByteBuffer;)I
J  org.mortbay.io.nio.ChannelEndPoint.flush(Lorg/mortbay/io/Buffer;)I
J  org.mortbay.jetty.HttpGenerator.flush()J
...

Answer 1

Answered by Mike Tunnicliffe

From the crash doc you linked, the error is a SIGSEGV, which is a fault reading from or writing to native memory. The thread stack shows that it crashed in JVM code.

Current thread (0x00007fe899755800):  JavaThread "508616253@qtp-1871151428-3352" [_thread_in_vm, id=11941, stack(0x00007fe86a4e5000,0x00007fe86a5e6000)]

siginfo:si_signo=SIGSEGV: si_errno=0, si_code=128 (), si_addr=0x0000000000000000

Registers:
RAX=0x00007fe9c60333b8, RBX=0x00007fe899755800, RCX=0x0d00007fe8f58787, RDX=0x00007fe9c6031888
RSP=0x00007fe86a5e3fd0, RBP=0x00007fe86a5e4020, RSI=0x00007fe899755800, RDI=0x00007fe95bae1770
R8 =0x00007fe9be341620, R9 =0x0000000000000001, R10=0x00007fe9c5b84460, R11=0x00007fe9c051a52b
R12=0x00007fe9c051a529, R13=0x00007fe9c6034ac0, R14=0x00007fe9c051a599, R15=0x0900007fe8f58787
RIP=0x00007fe9c5bd562d, EFL=0x0000000000010246, CSGSFS=0x000000000000e033, ERR=0x0000000000000000
  TRAPNO=0x000000000000000d

Stack: [0x00007fe86a4e5000,0x00007fe86a5e6000],  sp=0x00007fe86a5e3fd0,  free space=3fb0000000000000030k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x64d62d]
V  [libjvm.so+0x5fc4df]

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
v  ~RuntimeStub::_complete_monitor_locking_Java
J  sun.nio.ch.SocketChannelImpl.write(Ljava/nio/ByteBuffer;)I
J  org.mortbay.io.nio.ChannelEndPoint.flush(Lorg/mortbay/io/Buffer;)I
J  org.mortbay.jetty.HttpGenerator.flush()J
<snip>

This could be a JVM bug, or perhaps memory corruption.
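To make the reading of the crash output above concrete, here is a minimal sketch that pulls the signal name and faulting address out of the siginfo line and classifies a frame by the legend printed in the log (J, j, V/v, C). The class and method names (SigInfoParser, parse, frameKind) are made up for illustration and are not part of any JVM tooling.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SigInfoParser {
    // Captures the signal name and the faulting address from a "siginfo" line.
    private static final Pattern SIGINFO =
        Pattern.compile("si_signo=(\\w+).*si_addr=(0x[0-9a-fA-F]+)");

    /** Returns {signalName, faultAddress}, or null if the line does not match. */
    public static String[] parse(String line) {
        Matcher m = SIGINFO.matcher(line);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    /** Maps the first character of a frame line to the legend in the crash log. */
    public static String frameKind(String frameLine) {
        if (frameLine.isEmpty()) return "unknown";
        switch (frameLine.charAt(0)) {
            case 'J': return "compiled Java code";
            case 'j': return "interpreted";
            case 'V':
            case 'v': return "VM code";
            case 'C': return "native code";
            default:  return "unknown";
        }
    }

    public static void main(String[] args) {
        // Both lines are copied from the crash output quoted above.
        String siginfo = "siginfo:si_signo=SIGSEGV: si_errno=0, si_code=128 (), si_addr=0x0000000000000000";
        String topFrame = "V  [libjvm.so+0x64d62d]";

        String[] parsed = parse(siginfo);
        System.out.println("signal=" + parsed[0] + " address=" + parsed[1]);
        System.out.println("top native frame is " + frameKind(topFrame));
    }
}
```

The top native frame is marked V, which is why the crash is attributed to code inside libjvm.so rather than to application code.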

Answer 2

Answered by Andreas Dolk

Sounds like a memory leak. The GC can only clean up objects that are no longer referenced. If your application (or the server itself) doesn't "free" unused resources, then after a while even 3 GB is not enough.

A profiler might help to identify data structures that grow unexpectedly.

Idea: start the server with the -verbose:gc option and check what happens just before it dies. Decrease the heap space for the test so that you don't have to wait too long. If it's a memory leak, I expect you will see regular full GC cycles in which the GC frees less memory each time it runs.
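The pattern described here (each full GC recovering less than the one before) can be checked mechanically once the post-GC heap readings have been copied out of the GC log. The following is only a crude heuristic sketch: the class name, method name, and sample readings are all hypothetical, and a real log would need noise tolerance.

```java
public class LeakHeuristic {
    /**
     * Returns true when every post-full-GC heap reading is higher than the one
     * before it, i.e. each collection leaves more memory in use than the last.
     */
    public static boolean floorKeepsRising(long[] usedAfterFullGcMb) {
        if (usedAfterFullGcMb.length < 2) return false;
        for (int i = 1; i < usedAfterFullGcMb.length; i++) {
            if (usedAfterFullGcMb[i] <= usedAfterFullGcMb[i - 1]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        long[] leaking = { 500, 900, 1400, 2100, 2800 }; // hypothetical readings, in MB
        long[] steady  = { 500, 510, 495, 505, 500 };    // hypothetical readings, in MB
        System.out.println("leaking pattern: " + floorKeepsRising(leaking));
        System.out.println("steady pattern:  " + floorKeepsRising(steady));
    }
}
```

A heap that falls back to roughly the same level after every collection, as the question describes with 0.5 GB, corresponds to the steady case and does not look like a leak under this check.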

Update

I was misled by the outofmemoryerror tag. In fact, it's a JVM crash, and all you can do is try updating the installed Java. There are already some reports of "SIGSEGV (0xb)" crashes for builds 1.6.0_17 and 1.6.0_18 (like this question on SO).

It's a JVM-internal problem.

Answer 3

Answered by Thorbjørn Ravn Andersen

If by "a memory problem" you mean that you are worried your physical hardware is defective, you should strongly consider stress testing it.

For traditional PCs, the usual way to do this is with memtest86. The latest version appears to be available here: http://www.memtest.org/

If the memory passes an all-night test with memtest86, you can be pretty certain it works correctly.

Answer 4

Answered by Peter Lawrey

When you say you allocated 3 GB to the JVM, was this the heap size or the total size (which can be quite a bit bigger)? For example, a JVM with a 3 GB heap could use close to 4 GB in total.
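To see the difference between heap size and total size on a running JVM, a rough sketch using the standard java.lang.management API is shown below. The class name JvmFootprint and its helper methods are illustrative only. The figures it prints cover the heap and the JVM-managed non-heap pools; thread stacks and other native allocations are not included, so the real process size will be larger still.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class JvmFootprint {
    /** Converts bytes to whole megabytes (an undefined value of -1 prints as 0). */
    public static long toMb(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Lower bound on the footprint implied by the heap and perm gen limits alone. */
    public static long lowerBoundMb(long maxHeapMb, long maxPermMb) {
        return maxHeapMb + maxPermMb;
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();

        System.out.println("heap committed:     " + toMb(heap.getCommitted()) + " MB");
        System.out.println("heap max:           " + toMb(heap.getMax()) + " MB");
        System.out.println("non-heap committed: " + toMb(nonHeap.getCommitted()) + " MB");

        // Values taken from the question's flags: -Xmx3000m and -XX:MaxPermSize=256m.
        System.out.println("limit from flags:   " + lowerBoundMb(3000, 256) + " MB, before native memory");
    }
}
```

With the flags from the question, the heap and permanent generation limits alone add up to 3256 MB, which is already above the 3 GB the asker had in mind.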

If the JVM crashed during a GC, I would check that you have a current version of the JVM, such as Java 6 update 23.
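One way to confirm which build the server is actually running is to read the standard system properties from inside the process. The sketch below also extracts the update number from a Java 6 style version string such as 1.6.0_18; the class name and the updateNumber helper are hypothetical, and the parsing assumes the old "major.minor.patch_update" format.

```java
public class JvmVersion {
    /** Returns the update number from a string like "1.6.0_18", or -1 if absent. */
    public static int updateNumber(String version) {
        int underscore = version.indexOf('_');
        if (underscore < 0) return -1;
        int end = underscore + 1;
        while (end < version.length() && Character.isDigit(version.charAt(end))) {
            end++;
        }
        if (end == underscore + 1) return -1; // no digits after the underscore
        return Integer.parseInt(version.substring(underscore + 1, end));
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        System.out.println("java.version    = " + version);
        System.out.println("java.vm.name    = " + System.getProperty("java.vm.name"));
        System.out.println("java.vm.version = " + System.getProperty("java.vm.version"));
        System.out.println("update number   = " + updateNumber(version));
    }
}
```

Running this inside the affected server shows whether it is still on an older update than the one suggested above.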

What was the crash? Sometimes the same crash has been reported by other people and you can google it. Sometimes there is a suggested solution.
