Java GC Tuning - Preventing a Full GC
Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under CC BY-SA, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/9792590/
GC Tuning - preventing a Full GC
Asked by Kalisen
I'm trying to avoid the Full GC (see the gc.log sample below) when running a Grails application in Tomcat in production. Any suggestions on how to better configure the GC?
14359.317: [Full GC 14359.317: [CMS: 3453285K->3099828K(4194304K), 13.1778420 secs] 4506618K->3099828K(6081792K), [CMS Perm : 261951K->181304K(264372K)] icms_dc=0 , 13.1786310 secs] [Times: user=13.15 sys=0.04, real=13.18 secs]
My VM params are as follows:
-Xms=6G
-Xmx=6G
-XX:MaxPermSize=1G
-XX:NewSize=2G
-XX:MaxTenuringThreshold=8
-XX:SurvivorRatio=7
-XX:+UseConcMarkSweepGC
-XX:+CMSClassUnloadingEnabled
-XX:+CMSPermGenSweepingEnabled
-XX:+CMSIncrementalMode
-XX:CMSInitiatingOccupancyFraction=60
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+HeapDumpOnOutOfMemoryError
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintTenuringDistribution
-Dsun.reflect.inflationThreshold=0
14169.764: [GC 14169.764: [ParNew
Desired survivor size 107347968 bytes, new threshold 8 (max 8)
- age 1: 15584312 bytes, 15584312 total
- age 2: 20053704 bytes, 35638016 total
- age 3: 13624872 bytes, 49262888 total
- age 4: 14469608 bytes, 63732496 total
- age 5: 10553288 bytes, 74285784 total
- age 6: 11797648 bytes, 86083432 total
- age 7: 12591328 bytes, 98674760 total
: 1826161K->130133K(1887488K), 0.1726640 secs] 5216326K->3537160K(6081792K) icms_dc=0 , 0.1733010 secs] [Times: user=0.66 sys=0.03, real=0.17 secs]
14218.712: [GC 14218.712: [ParNew
Desired survivor size 107347968 bytes, new threshold 8 (max 8)
- age 1: 25898512 bytes, 25898512 total
- age 2: 10308160 bytes, 36206672 total
- age 3: 16927792 bytes, 53134464 total
- age 4: 13493608 bytes, 66628072 total
- age 5: 14301832 bytes, 80929904 total
- age 6: 10448408 bytes, 91378312 total
- age 7: 11724056 bytes, 103102368 total
- age 8: 12299528 bytes, 115401896 total
: 1807957K->147911K(1887488K), 0.1664510 secs] 5214984K->3554938K(6081792K) icms_dc=0 , 0.1671290 secs] [Times: user=0.61 sys=0.00, real=0.17 secs]
14251.429: [GC 14251.430: [ParNew
Desired survivor size 107347968 bytes, new threshold 7 (max 8)
- age 1: 25749296 bytes, 25749296 total
- age 2: 20111888 bytes, 45861184 total
- age 3: 7580776 bytes, 53441960 total
- age 4: 16819072 bytes, 70261032 total
- age 5: 13209968 bytes, 83471000 total
- age 6: 14088856 bytes, 97559856 total
- age 7: 10371160 bytes, 107931016 total
- age 8: 11426712 bytes, 119357728 total
: 1825735K->155304K(1887488K), 0.1888880 secs] 5232762K->3574222K(6081792K) icms_dc=0 , 0.1895340 secs] [Times: user=0.74 sys=0.06, real=0.19 secs]
14291.342: [GC 14291.343: [ParNew
Desired survivor size 107347968 bytes, new threshold 7 (max 8)
- age 1: 25786480 bytes, 25786480 total
- age 2: 21991848 bytes, 47778328 total
- age 3: 16650000 bytes, 64428328 total
- age 4: 7387368 bytes, 71815696 total
- age 5: 16777584 bytes, 88593280 total
- age 6: 13098856 bytes, 101692136 total
- age 7: 14029704 bytes, 115721840 total
: 1833128K->151603K(1887488K), 0.1941170 secs] 5252046K->3591384K(6081792K) icms_dc=0 , 0.1947390 secs] [Times: user=0.82 sys=0.04, real=0.20 secs]
14334.142: [GC 14334.143: [ParNew
Desired survivor size 107347968 bytes, new threshold 6 (max 8)
- age 1: 31541800 bytes, 31541800 total
- age 2: 20826888 bytes, 52368688 total
- age 3: 19155264 bytes, 71523952 total
- age 4: 16422240 bytes, 87946192 total
- age 5: 7235616 bytes, 95181808 total
- age 6: 16549000 bytes, 111730808 total
- age 7: 13026064 bytes, 124756872 total
: 1829427K->167467K(1887488K), 0.1890190 secs] 5269208K->3620753K(6081792K) icms_dc=0 , 0.1896630 secs] [Times: user=0.80 sys=0.03, real=0.19 secs]
14359.317: [Full GC 14359.317: [CMS: 3453285K->3099828K(4194304K), 13.1778420 secs] 4506618K->3099828K(6081792K), [CMS Perm : 261951K->181304K(264372K)] icms_dc=0 , 13.1786310 secs] [Times: user=13.15 sys=0.04, real=13.18 secs]
14373.287: [GC [1 CMS-initial-mark: 3099828K(4194304K)] 3100094K(6081792K), 0.0107380 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
14373.298: [CMS-concurrent-mark-start]
14472.579: [GC 14472.579: [ParNew
Desired survivor size 107347968 bytes, new threshold 8 (max 8)
- age 1: 42849392 bytes, 42849392 total
: 1677824K->86719K(1887488K), 0.1056680 secs] 4777652K->3186547K(6081792K) icms_dc=0 , 0.1063280 secs] [Times: user=0.61 sys=0.00, real=0.11 secs]
14506.980: [GC 14506.980: [ParNew
Desired survivor size 107347968 bytes, new threshold 8 (max 8)
- age 1: 42002904 bytes, 42002904 total
- age 2: 35733928 bytes, 77736832 total
: 1764543K->96136K(1887488K), 0.0982790 secs] 4864371K->3195964K(6081792K) icms_dc=0 , 0.0988960 secs] [Times: user=0.53 sys=0.01, real=0.10 secs]
14544.285: [GC 14544.286: [ParNew
Desired survivor size 107347968 bytes, new threshold 8 (max 8)
- age 1: 26159736 bytes, 26159736 total
- age 2: 37842840 bytes, 64002576 total
- age 3: 33192784 bytes, 97195360 total
: 1773960K->130799K(1887488K), 0.1208590 secs] 4873788K->3230628K(6081792K) icms_dc=0 , 0.1215900 secs] [Times: user=0.59 sys=0.02, real=0.13 secs]
14589.266: [GC 14589.266: [ParNew
Desired survivor size 107347968 bytes, new threshold 4 (max 8)
- age 1: 28010360 bytes, 28010360 total
- age 2: 21136704 bytes, 49147064 total
- age 3: 35081376 bytes, 84228440 total
- age 4: 32468056 bytes, 116696496 total
: 1808623K->148284K(1887488K), 0.1423150 secs] 4908452K->3248112K(6081792K) icms_dc=0 , 0.1429440 secs] [Times: user=0.70 sys=0.02, real=0.14 secs]
14630.947: [GC 14630.947: [ParNew
Desired survivor size 107347968 bytes, new threshold 8 (max 8)
- age 1: 28248240 bytes, 28248240 total
- age 2: 20712320 bytes, 48960560 total
- age 3: 18217168 bytes, 67177728 total
- age 4: 34834832 bytes, 102012560 total
: 1826108K->140347K(1887488K), 0.1784680 secs] 4925936K->3275469K(6081792K) icms_dc=0 , 0.1790920 secs] [Times: user=0.98 sys=0.03, real=0.18 secs]
14664.779: [GC 14664.779: [ParNew
Desired survivor size 107347968 bytes, new threshold 5 (max 8)
- age 1: 25841000 bytes, 25841000 total
- age 2: 22264960 bytes, 48105960 total
- age 3: 17730104 bytes, 65836064 total
- age 4: 17988048 bytes, 83824112 total
- age 5: 34739384 bytes, 118563496 total
: 1818171K->147603K(1887488K), 0.1714160 secs] 4953293K->3282725K(6081792K) icms_dc=0 , 0.1720530 secs] [Times: user=0.82 sys=0.11, real=0.17 secs]
14702.488: [GC 14702.489: [ParNew
Desired survivor size 107347968 bytes, new threshold 8 (max 8)
- age 1: 26887368 bytes, 26887368 total
- age 2: 21403352 bytes, 48290720 total
- age 3: 18732224 bytes, 67022944 total
- age 4: 17640576 bytes, 84663520 total
- age 5: 17942952 bytes, 102606472 total
: 1825427K->142695K(1887488K), 0.2118320 secs] 4960549K->3312168K(6081792K) icms_dc=0 , 0.2124630 secs] [Times: user=1.13 sys=0.14, real=0.21 secs]
The strategy I was aiming at: I want to limit what gets tenured to the minimum. I'm serving requests and expect that, beyond a certain amount of shared objects, every other object is useful only to the request at hand. Therefore, by using a big NewSize and an increased TenuringThreshold, I was hoping to have none of these single-serving objects stick around.
The following are there to support my strategy:
-Xms=6G
-Xmx=6G
-XX:NewSize=2G // big space so that ParNew doesn't occur too often and objects have time to expire
-XX:MaxTenuringThreshold=8 // to limit the tenuring some more
-XX:SurvivorRatio=7 // based on examples
-XX:CMSInitiatingOccupancyFraction=60 // to prevent a Full GC caused by a failed promotion allocation
-XX:+UseCMSInitiatingOccupancyOnly // to go with the one above, based on examples
MaxPermSize=1G and "-Dsun.reflect.inflationThreshold=0" are related to another issue I'd rather keep separated.
"-XX:+CMSClassUnloadingEnabled" and "-XX:+CMSPermGenSweepingEnabled" are there because of Grails, which relies heavily on extra classes for closures and reflection.
-XX:+CMSIncrementalMode is an experiment which hasn't yielded much success.
Accepted answer by Matt
The log snippet posted shows you have a substantial number of objects that are live for >320s (approx 40s per young collection and objects survive through 8 collections before promotion). The remaining objects then bleed into tenured and eventually you hit an apparently unexpected full gc which doesn't actually collect very much.
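The >320s estimate can be reproduced from the timestamps that open each ParNew entry in the log. The following is a minimal sketch, not part of the original answer: the class and method names (`TenuringAge`, `meanIntervalSeconds`, `secondsUntilPromotion`) are made up for illustration, and the timestamps are the first five young collections copied from the question's log.

```java
import java.util.List;

// Illustrative helper (not a JVM API): estimates how long an object has to
// stay live before it is promoted to the tenured generation.
class TenuringAge {

    // Mean gap between consecutive young collections, in seconds.
    static double meanIntervalSeconds(List<Double> timestamps) {
        if (timestamps.size() < 2) {
            throw new IllegalArgumentException("need at least two timestamps");
        }
        double span = timestamps.get(timestamps.size() - 1) - timestamps.get(0);
        return span / (timestamps.size() - 1);
    }

    // An object is promoted after surviving `tenuringThreshold` young collections.
    static double secondsUntilPromotion(List<Double> timestamps, int tenuringThreshold) {
        return meanIntervalSeconds(timestamps) * tenuringThreshold;
    }

    public static void main(String[] args) {
        // Start times of the first five ParNew collections in the posted log.
        List<Double> parNewStarts =
                List.of(14169.764, 14218.712, 14251.429, 14291.342, 14334.142);
        System.out.printf("mean interval: %.1f s%n", meanIntervalSeconds(parNewStarts));
        System.out.printf("time to promotion at threshold 8: %.0f s%n",
                secondsUntilPromotion(parNewStarts, 8));
    }
}
```

With these inputs the mean interval comes out at roughly 41 seconds, so an object promoted at threshold 8 has been live for well over five minutes.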
3453285K->3099828K(4194304K)
i.e. you have a 4G tenured which is ~82% full (3453285/4194304) when it is triggered and is ~74% full after 13 long seconds.
This means it took 13s to collect a grand total of ~350M which, in the context of a 6G heap, is not very much.
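The percentages and the ~350M figure follow directly from the `before->after(capacity)` triple in the Full GC line. Here is a small sketch of that arithmetic; the `GcOccupancy` class is a hypothetical helper written for this example, and the regular expression only assumes the `NNNK->NNNK(NNNK)` shape visible in the posted log.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative parser for the "usedBefore->usedAfter(capacity)" triple
// printed in the GC log, e.g. 3453285K->3099828K(4194304K).
class GcOccupancy {
    private static final Pattern TRIPLE =
            Pattern.compile("(\\d+)K->(\\d+)K\\((\\d+)K\\)");

    final long beforeK;
    final long afterK;
    final long capacityK;

    GcOccupancy(long beforeK, long afterK, long capacityK) {
        this.beforeK = beforeK;
        this.afterK = afterK;
        this.capacityK = capacityK;
    }

    // Parses the first triple found in the given text.
    static GcOccupancy parse(String text) {
        Matcher m = TRIPLE.matcher(text);
        if (!m.find()) {
            throw new IllegalArgumentException("no occupancy triple in: " + text);
        }
        return new GcOccupancy(Long.parseLong(m.group(1)),
                Long.parseLong(m.group(2)), Long.parseLong(m.group(3)));
    }

    double percentBefore() { return 100.0 * beforeK / capacityK; }

    double percentAfter() { return 100.0 * afterK / capacityK; }

    // Memory reclaimed by the collection, in MB (1 MB = 1024 KB).
    double reclaimedMb() { return (beforeK - afterK) / 1024.0; }

    public static void main(String[] args) {
        GcOccupancy cms = parse("[CMS: 3453285K->3099828K(4194304K), 13.1778420 secs]");
        System.out.printf("before: %.0f%%, after: %.0f%%, reclaimed: %.0f MB%n",
                cms.percentBefore(), cms.percentAfter(), cms.reclaimedMb());
    }
}
```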
This basically means your heap is not big enough or, perhaps more likely, you have a memory leak. A leak like this is a terrible thing for CMS. A concurrent tenured collection is a non-compacting event, so tenured is managed as a collection of free lists and fragmentation can become a big problem. As fragmentation grows, your utilisation of tenured becomes increasingly inefficient and promotion failure events become more likely (though if this were such an event, I'd expect to see a log message saying so): the collector wants to promote (or thinks it will need to promote) X MB into tenured but does not have a (contiguous) free chunk >= X MB available. This triggers an unexpected tenured collection, which is a not remotely concurrent stop-the-world (STW) event. If you actually have very little to collect (as you do), then it is no surprise you're sitting twiddling your thumbs.
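The fragmentation argument can be illustrated with a toy model. This is a deliberate simplification and not how CMS actually implements its free lists: it only shows that a promotion can fail even when the total free space is larger than the request, because no single free chunk is big enough. The `FreeListModel` class and the chunk sizes are invented for the example.

```java
import java.util.List;

// Toy model of a non-compacting old generation: free space is a set of
// separate chunks, and a promotion needs one chunk that is large enough.
class FreeListModel {

    // Total free space across all chunks, in KB.
    static long totalFree(List<Long> freeChunksKb) {
        return freeChunksKb.stream().mapToLong(Long::longValue).sum();
    }

    // A request succeeds only if some single chunk can hold it.
    static boolean canPromote(List<Long> freeChunksKb, long requestKb) {
        return freeChunksKb.stream().anyMatch(chunk -> chunk >= requestKb);
    }

    public static void main(String[] args) {
        List<Long> chunks = List.of(256L, 512L, 128L, 384L); // hypothetical sizes
        long request = 600L;
        System.out.println("total free KB: " + totalFree(chunks));
        System.out.println("can promote " + request + " KB: " + canPromote(chunks, request));
    }
}
```

In this model 1280 KB is free in total, yet a 600 KB request fails. That is the situation in which the collector falls back to a compacting stop-the-world collection.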
Some general pointers, to a large extent reiterating what Vladimir Sitnikov has said...
- using iCMS on a multicore box makes no sense (unless you have lots of JVMs or other processes running on that box such that the JVM really is short of CPU), therefore remove this switch
- your young collections are unnecessarily long because of the impact of copying relatively substantial quantities of memory between the survivor spaces on every collection; 150-200ms is a really quite massive ParNew collection
- the right answer to the young gen issue depends on what the allocation behaviour really is (e.g. perhaps you'd be better off tenuring early and reducing the impact of fragmentation on tenured collections, OR perhaps you'd be better off having a much more massive new gen and reducing the frequency of young gen collections such that fewer objects are promoted and there is minimal bleed into tenured).
Some questions...
- does it eventually go OoM or does it recover?
- is the application in a steady state (subject to consistent load at some point well beyond startup) during this log snippet or is it under stress?
Answer by DNA
Your survivor sizes aren't decreasing much, if at all - ideally they should be decreasing steeply, because you only want a minority of objects to survive long enough to reach the Old generation.
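One rough way to quantify "not decreasing much" is to compare the bytes reported for the oldest age with those for age 1 in a single `PrintTenuringDistribution` entry. The sketch below does that for the first ParNew entry of the posted log. `TenuringDistribution` is a hypothetical helper, and because each age in one entry is a different cohort of objects, the ratio is only an approximate indicator of survival, not an exact rate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative parser for the "- age N: X bytes, Y total" lines that
// -XX:+PrintTenuringDistribution adds to each young collection entry.
class TenuringDistribution {
    private static final Pattern AGE =
            Pattern.compile("- age\\s+(\\d+):\\s+(\\d+) bytes");

    // Returns the bytes held at each age, in the order they appear.
    static List<Long> bytesByAge(String logEntry) {
        List<Long> bytes = new ArrayList<>();
        Matcher m = AGE.matcher(logEntry);
        while (m.find()) {
            bytes.add(Long.parseLong(m.group(2)));
        }
        return bytes;
    }

    // Ratio of the oldest age's bytes to age 1's bytes; a value close to 1
    // means objects are not dying off as they age.
    static double oldestToYoungestRatio(List<Long> bytesByAge) {
        if (bytesByAge.isEmpty()) {
            throw new IllegalArgumentException("no age lines found");
        }
        return (double) bytesByAge.get(bytesByAge.size() - 1) / bytesByAge.get(0);
    }

    public static void main(String[] args) {
        String entry = "- age 1: 15584312 bytes, 15584312 total "
                + "- age 2: 20053704 bytes, 35638016 total "
                + "- age 3: 13624872 bytes, 49262888 total "
                + "- age 4: 14469608 bytes, 63732496 total "
                + "- age 5: 10553288 bytes, 74285784 total "
                + "- age 6: 11797648 bytes, 86083432 total "
                + "- age 7: 12591328 bytes, 98674760 total";
        List<Long> bytes = bytesByAge(entry);
        System.out.printf("ages: %d, oldest/youngest ratio: %.2f%n",
                bytes.size(), oldestToYoungestRatio(bytes));
    }
}
```

For this entry the ratio is about 0.8, whereas a workload dominated by short-lived request objects would be expected to show a ratio much closer to zero.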
This suggests that many objects are living a relatively long time - which can happen when you have many open connections, threads etc that are not handled quickly, for example.
(Do you have any options to change the application, incidentally, or can you only modify the GC settings? There might also be Tomcat settings that would have an effect...)
Answer by Vladimir Sitnikov
How many CPUs can Tomcat use? 4?
What Java version are you using? (>1.6.0u23 ?)
0) From the Full GC output, it definitely looks like you are hitting the memory limit: even after the full GC, there is still 3099828K of used memory (out of 4194304K). There is just no way to prevent a Full GC when you are out of memory.
Is a 3.1GB working set expected for your application? That is 3.1GB of non-garbage memory!
If that is expected, it is time to increase -Xmx/-Xms. Otherwise, it is time to collect and analyze a heap dump to identify the memory hog.
After you solve the problem of the 3GB working set, you may find the following advice useful. From my point of view, regular (non-incremental) CMS mode and a reduced NewSize are worth trying.
1) Incremental mode is targeted at single-CPU machines, where the CMS thread yields the CPU to other threads.
In case you have some spare CPU (e.g. you are running a multicore machine), it is better to perform GC in the background without yielding.
Thus I would recommend removing -XX:+CMSIncrementalMode.
2) -XX:CMSInitiatingOccupancyFraction=60 tells CMS to start background GC after OLD gen is 60% full.
In case there is garbage in the heap and CMS does not keep up with it, it makes sense to lower CMSInitiatingOccupancyFraction. For instance, with -XX:CMSInitiatingOccupancyFraction=30, CMS would start a concurrent collection when the old gen is 30% full. Currently it is hard to tell if that is the case, since you just do not have garbage in the heap.
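To see why the fraction cannot help while the live set is this large, compare the initiating threshold with the old gen occupancy left after the Full GC. This is a sketch under one stated assumption: that the 3099828K remaining after the Full GC approximates the live set. The `CmsTrigger` class is hypothetical and written only for this example.

```java
// Illustrative arithmetic for -XX:CMSInitiatingOccupancyFraction: the old
// gen occupancy (in KB) at which a concurrent cycle is started.
class CmsTrigger {

    static long thresholdK(long oldGenCapacityK, int initiatingOccupancyFraction) {
        return oldGenCapacityK * initiatingOccupancyFraction / 100;
    }

    // True when even a freshly collected old gen already sits above the
    // threshold, so concurrent cycles would start back to back.
    static boolean liveSetAboveThreshold(long liveAfterFullGcK,
                                         long oldGenCapacityK,
                                         int initiatingOccupancyFraction) {
        return liveAfterFullGcK >= thresholdK(oldGenCapacityK, initiatingOccupancyFraction);
    }

    public static void main(String[] args) {
        long capacityK = 4194304L; // old gen capacity from the Full GC line
        long liveK = 3099828L;     // occupancy after the Full GC (assumed live set)
        for (int fraction : new int[] {30, 60, 80}) {
            System.out.printf("fraction=%d threshold=%dK liveSetAbove=%b%n",
                    fraction, thresholdK(capacityK, fraction),
                    liveSetAboveThreshold(liveK, capacityK, fraction));
        }
    }
}
```

At both 60 and 30 the live set alone already exceeds the threshold, which is consistent with the CMS-initial-mark that appears in the log right after the Full GC.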
3) Looks like "extended tenuring" does not help -- the objects just do not die out even after 7-8 tenurings. I would recommend reducing SurvivorRatio (e.g., SurvivorRatio=2, or just remove the option and stick with default). That would reduce the number of tenurings resulting in reduced minor gc pauses.
4) -XX:NewSize=2G. Did you try lower values for NewSize? Say, NewSize=512m. That should reduce minor gc pauses and make promotions young->old less massive, simplifying work for CMS.
Answer by the8472
"I'm serving requests and expect that beyond a certain amount of shared objects, every other object is useful only to the request at hand."
That's the theory, but any kind of cache can easily void that assumption and create objects that live beyond the request.
As others have noted neither your huge young generation nor the extended tenuring seems to work.
You should profile your application and analyze the age-distribution of objects. I'm pretty sure Grails caches all kinds of things beyond the scope of a request and that's what leaks into the old gen.
What you're essentially trying is to sacrifice the young generation pause times (for a young gen of 2GB) to postpone the inevitable - an old gen collection of 6GB. This is not exactly a good tradeoff you're making there.
Instead you probably should aim for better young gen pause times and allow CMS to burn more CPU time: more concurrent-phase GC threads (can't remember the option for that one), a higher GCTimeRatio, and a MaxGCPauseMillis > MaxGCMinorPauseMillis to take pressure off the minor collections and allow them to meet their pause goals instead of having to resize to fit the major collection limit.
To make major GCs less painful you might want to read this: http://blog.ragozin.info/2012/03/secret-hotspot-option-improving-gc.html (this patch should be in j7u4). CMSParallelRemarkEnabled should be enabled too; I'm not sure if this is the default.
Alternative: Use G1GC
Personally I have some horrible experiences with G1GC working itself into a corner due to some very large LRU-like workloads and then falling back to a large, stop-the-world collection far more often than CMS experienced concurrent mode failures for the same workload.
But for other workloads (like yours) it might actually do the job and collect the old generation incrementally, while also compacting and thus avoiding any big pauses.
Give it a try if you haven't already. Again, update to the newest Java 7 before you do so; G1 still has some issues with its heuristics that they're trying to iron out.
Edit: Oracle has improved G1GC's heuristics and some bottlenecks since I wrote this answer. It should definitely be worth a try now.
Another alternative: Throughput collector
As you're already using a parallel collector for a 2GB young gen and get away with 200ms pause times... why not try the parallel old gen collector on your 6G heap? It would probably take less than the 10s+ major collections you're seeing with CMS. Whenever CMS runs into one of its failure modes it does a single-threaded, stop-the-world collection.
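When switching between CMS, G1 and the throughput collector, it helps to confirm which collectors the JVM actually picked up. The sketch below uses the standard `java.lang.management` API to list them; the `ActiveCollectors` class is a hypothetical helper, and the bean names mentioned afterwards are what HotSpot typically reports, stated here as an expectation and not a guarantee.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Lists the garbage collectors active in the running JVM together with
// their cumulative collection counts and times.
class ActiveCollectors {

    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(String.format("%s: %d collections, %d ms total",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        describe().forEach(System.out::println);
    }
}
```

Run with the flags under test, the output would typically name "ParNew" and "ConcurrentMarkSweep" for CMS, "PS Scavenge" and "PS MarkSweep" for the throughput collector, or "G1 Young Generation" and "G1 Old Generation" for G1. Sampling it periodically gives a quick view of how much time each generation's collector consumes.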
Answer by stones333
Below are my settings for a 4-core Linux box.
In my experience, you can tune -XX:NewSize -XX:MaxNewSize -XX:GCTimeRatio to achieve high throughput and low latency.
-server
-Xms2048m
-Xmx2048m
-Dsun.rmi.dgc.client.gcInterval=86400000
-Dsun.rmi.dgc.server.gcInterval=86400000
-XX:+AggressiveOpts
-XX:GCTimeRatio=20
-XX:+UseParNewGC
-XX:ParallelGCThreads=4
-XX:+CMSParallelRemarkEnabled
-XX:ParallelCMSThreads=2
-XX:+CMSScavengeBeforeRemark
-XX:+UseConcMarkSweepGC
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=50
-XX:NewSize=512m
-XX:MaxNewSize=512m
-XX:PermSize=256m
-XX:MaxPermSize=256m
-XX:SurvivorRatio=90
-XX:TargetSurvivorRatio=90
-XX:MaxTenuringThreshold=15
-XX:MaxGCMinorPauseMillis=1
-XX:MaxGCPauseMillis=5
-XX:+PrintGCDateStamps
-XX:+PrintGCDetails
-XX:+PrintTenuringDistribution
-Xloggc:./logs/gc.log