Java: Old Gen heap is full and the Eden and Survivor are low and almost empty
Original source: http://stackoverflow.com/questions/19404207/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by Bogdan
A production environment recently became very slow. The process was using 200% CPU, but it kept working. After I restarted the service it functioned normally again. I noticed several symptoms: the Par Survivor Space heap was empty for a long time, and garbage collection took about 20% of the CPU time.
JVM options:
-XX:+CMSParallelRemarkEnabled, -XX:+HeapDumpOnOutOfMemoryError, -XX:+UseConcMarkSweepGC, -XX:+UseParNewGC, -XX:HeapDumpPath=heapdump.hprof, -XX:MaxNewSize=700m, -XX:MaxPermSize=786m, -XX:NewSize=700m, -XX:ParallelGCThreads=8, -XX:SurvivorRatio=25, -Xms2048m, -Xmx2048m
Arch: amd64
Dispatcher: Apache Tomcat
Dispatcher Version: 7.0.27
Framework: java
Heap initial (MB): 2048.0
Heap max (MB): 2022.125
Java version: 1.6.0_35
Log path: /opt/newrelic/logs/newrelic_agent.log
OS: Linux
Processors: 8
System Memory: 8177.964, 8178.0
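As a rough illustration of what these flags imply, the generation sizes can be worked out by hand. The sketch below assumes the usual HotSpot convention that SurvivorRatio is the ratio of Eden to one survivor space, and it ignores the alignment rounding the JVM applies; the class and method names are made up for this example.

```java
// Sketch only: derives approximate generation sizes from the flags listed above.
// Assumption: SurvivorRatio = eden size / size of ONE survivor space (HotSpot convention).
// JVM alignment rounding is ignored, so real values can differ slightly.
class HeapLayout {
    /** One survivor space in MB: young = eden + 2 survivors = (ratio + 2) survivors. */
    static double survivorMb(double youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    /** Eden in MB: survivorRatio times the size of one survivor space. */
    static double edenMb(double youngMb, int survivorRatio) {
        return survivorRatio * survivorMb(youngMb, survivorRatio);
    }

    /** Old generation in MB: the part of the heap not taken by the young generation. */
    static double oldGenMb(double heapMb, double youngMb) {
        return heapMb - youngMb;
    }

    public static void main(String[] args) {
        double heapMb = 2048;   // -Xms2048m -Xmx2048m
        double youngMb = 700;   // -XX:NewSize=700m -XX:MaxNewSize=700m
        int survivorRatio = 25; // -XX:SurvivorRatio=25

        System.out.printf("Eden:          %.1f MB%n", edenMb(youngMb, survivorRatio));
        System.out.printf("Each survivor: %.1f MB%n", survivorMb(youngMb, survivorRatio));
        System.out.printf("Old gen:       %.1f MB%n", oldGenMb(heapMb, youngMb));
    }
}
```

Under these assumptions Eden is roughly 648 MB, each survivor space roughly 26 MB, and the old generation 1348 MB. The reported Heap max of 2022.125 MB is about one survivor space below 2048 MB, which would be consistent with this layout.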
More info is in the attached picture. When the problem occurred, on the non-heap side the used code cache and used CMS perm gen dropped to half.
I took the info from New Relic.
The question is: why does the server become so slow?
Sometimes the server stops completely, but we found that this is a problem with PDFBox: when someone uploads certain PDFs containing particular fonts, it crashes the JVM.
More info: I observed that the Old Gen fills up every day. I now restart the server daily. After a restart everything is fine, but the Old Gen fills up again by the next day and the server slows down until it needs another restart.
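One possible way to watch this fill-up from inside the JVM, rather than only through an external monitor, is the standard java.lang.management API. The sketch below prints, for each heap pool, the current occupancy and the occupancy after the most recent collection; the class name and the percentUsed helper are made up for this example, and pool names such as "CMS Old Gen" depend on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Sketch only: reports heap pool occupancy using the standard platform MXBeans.
class HeapPoolReport {
    /** Percent of max in use, or -1 when the pool reports no defined maximum. */
    static double percentUsed(long usedBytes, long maxBytes) {
        return maxBytes > 0 ? 100.0 * usedBytes / maxBytes : -1;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip code cache, perm gen and other non-heap pools
            }
            MemoryUsage now = pool.getUsage();               // current occupancy
            MemoryUsage afterGc = pool.getCollectionUsage(); // after the last GC; may be null
            if (now == null) {
                continue; // pool is no longer valid
            }
            System.out.printf("%s: %.1f%% used now", pool.getName(),
                    percentUsed(now.getUsed(), now.getMax()));
            if (afterGc != null) {
                System.out.printf(", %.1f%% after last collection",
                        percentUsed(afterGc.getUsed(), afterGc.getMax()));
            }
            System.out.println();
        }
    }
}
```

If the old generation's occupancy after collection keeps climbing from one sample to the next, that would point toward objects being retained (a possible leak) rather than just a high allocation rate.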
Accepted answer by R.Moeller
By default, CMS starts a concurrent collection once OldGen is 70% full. If it cannot free enough memory to get back below this boundary, it will run concurrent collections permanently, which slows down operation significantly. If OldGen gets close to full, CMS will panic and fall back to a stop-the-world GC pause, which can be very long (like 20 seconds). You probably need more headroom in OldGen (and make sure your app does not leak memory, of course!). Additionally, you can lower the threshold at which a concurrent collection starts (default 70%) using
-XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=50
This will trigger concurrent collection starting at 50% occupancy and increases the chance that CMS finishes the GC in time. This will only help if your allocation rate is too high. From your charts, it looks like not enough headroom or a memory leak, plus a too-high -XX:CMSInitiatingOccupancyFraction. Give OldGen at least 500 MB to 1 GB more space.
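To put rough numbers on the headroom argument, here is a small arithmetic sketch. It assumes the old generation is 1348 MB (that is, -Xmx2048m minus NewSize=700m from the question) and takes the 70% figure from this answer as given; the class and method names, and the "+1 GB" scenario, are hypothetical and only serve the illustration.

```java
// Sketch only: how much old-gen space is left for CMS to work with once a
// concurrent cycle starts, for different trigger fractions and old-gen sizes.
class CmsTrigger {
    /** Old-gen occupancy in MB at which a concurrent cycle is started. */
    static double triggerMb(double oldGenMb, int occupancyPercent) {
        return oldGenMb * occupancyPercent / 100.0;
    }

    /** Space in MB still free when the cycle starts; CMS must finish before it is used up. */
    static double headroomMb(double oldGenMb, int occupancyPercent) {
        return oldGenMb - triggerMb(oldGenMb, occupancyPercent);
    }

    public static void main(String[] args) {
        double oldGenMb = 1348;              // assumed: 2048 MB heap minus 700 MB young gen
        double largerOldGenMb = 1348 + 1024; // hypothetical: same young gen, 1 GB more heap

        System.out.printf("70%% of %.0f MB: headroom %.1f MB%n",
                oldGenMb, headroomMb(oldGenMb, 70));
        System.out.printf("50%% of %.0f MB: headroom %.1f MB%n",
                oldGenMb, headroomMb(oldGenMb, 50));
        System.out.printf("50%% of %.0f MB: headroom %.1f MB%n",
                largerOldGenMb, headroomMb(largerOldGenMb, 50));
    }
}
```

With these assumed sizes, lowering the fraction from 70 to 50 raises the headroom from about 404 MB to 674 MB, and adding 1 GB of old generation on top raises it to 1186 MB. None of this helps if live data keeps growing because of a leak.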