原文地址: http://stackoverflow.com/questions/7034931/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Troubleshooting Java process with very high CPU usage - Tomcat application
Asked by Nicole S.
I have a java application that runs on Tomcat (which runs as a service on Windows), the java process for which continues to eat up CPU before eventually requiring me to restart the Tomcat service.
First, my setup:
Windows 2003 Server
Tomcat 6, running as a service using Wrapper
JDK: 1.6.0_20
I was seeing catch issues here and there leading up to yesterday. I had to restart midday yesterday, then again at 2:30 this morning, and today I could barely restart the application and open jconsole to monitor it before it hit 99% CPU usage again. Through a combination of things I'm not quite sure of, I seem to have gotten the JVM to cycle itself, and the app hovered in the 10-30% CPU usage range for a couple of hours. Then it started to creep up again, finally going back into its 99% CPU usage breakdown. I was also having trouble with high memory usage, but that has stayed fairly normal and steady since I got the JVM to "cycle" (bad terminology perhaps, but this is really what it seemed to do - and the wrapper log afterwards contained a dump of all the classes it was reloading).
Then I dug around some more and found JRE 6 Update 24 installed on the server (I didn't install it, as I do thorough testing with each Java update - but maybe my server admin did the update). I tried to uninstall it but couldn't. As a result, I get different versions from java -version and javac -version:
java -version
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) Client VM (build 19.1-b02, mixed mode, sharing)
javac -version
javac 1.6.0_20
Could this difference be causing a JVM conflict of sorts? JAVA_HOME and my PATH variables both point to the correct JDK installation.
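One way to narrow this down: java -version only reports whichever java executable is found first on the PATH, which is not necessarily the JVM that the service wrapper launches for Tomcat. A minimal sketch, assuming you can run a few lines inside the webapp at startup (the class name RuntimeVersionCheck is hypothetical), is to log the standard system properties of the JVM that is actually executing the code:

```java
// Hypothetical helper: reports which JVM is really running this code.
// Uses only standard system properties, so it also compiles on Java 6.
final class RuntimeVersionCheck {

    /** Returns a one-line description of the JVM executing this code. */
    static String describeRuntime() {
        return "java.version=" + System.getProperty("java.version")
                + ", java.home=" + System.getProperty("java.home")
                + ", java.vm.name=" + System.getProperty("java.vm.name");
    }

    public static void main(String[] args) {
        System.out.println(describeRuntime());
    }
}
```

If the java.home logged from inside Tomcat points at the intended JDK, the stray JRE on the PATH is probably not involved in the CPU problem.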
Hoping for more stability, I decided to change my app to run on the previous JDK that was still installed - JDK 1.6.0_04. I changed the wrapper.conf, set env variables, cleaned and rebuilt, and started. This does seem more stable and has been up for about 4 hours. The CPU usage has climbed to the 90s, then it seems to clear itself out again.
I've taken heap dumps and run them through the Memory Analyzer in Eclipse (nothing new found there), and I've used jconsole with jtop to look at threads - nothing jumps out, which is why I keep wondering whether it's a Java/JVM issue. I know this is a long post, but I don't really know where to go from here. Any ideas?
(I've done exhaustive web searching on this and some articles have pointed to possibly a Quartz issue or Hibernate queries not flushing. Nothing has changed in the app since I started seeing the CPU issues, so I'm not sure where to start troubleshooting if it could indeed be linked to either.)
Accepted answer by chubbsondubs
This isn't an easy problem. You are doing all of the basics to see if something jumps out. It sounds like there is either a slow leak that builds up over time to the point where the app can't operate, in which case GC is thrashing and the app becomes unresponsive, or one or more runaway background jobs that eat CPU and never complete, which might explain the long delay. You could try turning off any Quartz jobs to see if the app stays up longer, which might point you in a direction, or crank them up so the problem shows up sooner.
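To test the runaway-job theory without extra tooling, per-thread CPU time can be read from the standard java.lang.management API. This is only a sketch, assuming the JVM supports thread CPU time measurement; the class name BusyThreads is hypothetical. The values are cumulative since each thread started, so take two snapshots a minute apart and compare them to see which threads are burning CPU right now.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: lists live threads ordered by cumulative CPU time.
final class BusyThreads {

    /** Returns up to 'limit' lines of "cpuMillis ms  threadName", busiest first. */
    static List<String> snapshot(int limit) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        List<String> lines = new ArrayList<String>();
        if (!threads.isThreadCpuTimeSupported()) {
            return lines; // this JVM cannot measure per-thread CPU time
        }
        if (!threads.isThreadCpuTimeEnabled()) {
            threads.setThreadCpuTimeEnabled(true);
        }
        List<long[]> usage = new ArrayList<long[]>(); // each entry: {threadId, cpuNanos}
        for (long id : threads.getAllThreadIds()) {
            long nanos = threads.getThreadCpuTime(id); // -1 if the thread has died
            if (nanos >= 0) {
                usage.add(new long[] {id, nanos});
            }
        }
        Collections.sort(usage, new Comparator<long[]>() {
            public int compare(long[] a, long[] b) {
                return a[1] < b[1] ? 1 : (a[1] > b[1] ? -1 : 0); // descending by CPU time
            }
        });
        for (int i = 0; i < usage.size() && lines.size() < limit; i++) {
            ThreadInfo info = threads.getThreadInfo(usage.get(i)[0]);
            if (info != null) { // null means the thread ended in the meantime
                lines.add((usage.get(i)[1] / 1000000L) + " ms  " + info.getThreadName());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : snapshot(10)) {
            System.out.println(line);
        }
    }
}
```

A thread whose name keeps appearing at the top with a steadily growing figure is the one to look up in a thread dump.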
I know you've done some jconsole watching, but I think you need to revisit it and watch your memory usage, the threads' run time, how much time you're spending in GC, and which portions of memory are being eaten up (is it Eden or Tenured that's running out?).
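The same figures jconsole displays can also be logged periodically from inside the application, which helps when the problem takes hours to appear. Below is a minimal sketch using the standard java.lang.management beans; the class name GcSnapshot is hypothetical, and the pool names it prints vary by JVM and garbage collector, so expect something like an Eden pool and an old or tenured pool rather than exact names.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper: reports cumulative GC time and per-pool memory usage.
final class GcSnapshot {

    /** Total milliseconds spent in all collectors since JVM start. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** One line per memory pool: name, used KB, and max KB (or "undefined"). */
    static String describePools() {
        StringBuilder sb = new StringBuilder();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            String max = usage.getMax() < 0 ? "undefined" : (usage.getMax() / 1024) + "K";
            sb.append(pool.getName())
              .append(": used=").append(usage.getUsed() / 1024).append("K")
              .append(" max=").append(max)
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("GC time so far: " + totalGcMillis() + " ms");
        System.out.print(describePools());
    }
}
```

If the GC time grows almost as fast as wall-clock time while the old generation pool stays near its maximum, that points toward the leak-and-thrash explanation rather than a runaway job.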
I'd make sure you are writing out start and end messages for your background jobs running in Quartz. Then you can correlate when they start and finish with when this problem starts. It will also tell you whether your jobs are finishing or not.
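The start and end logging can be sketched as a small wrapper around the job body. This example deliberately avoids Quartz classes so that it runs standalone: LoggedJob and its members are hypothetical names, and in a real Quartz job the same try/finally pattern would go inside the job's execute method (Quartz also provides a JobListener interface that could host this logging).

```java
import java.util.logging.Logger;

// Hypothetical wrapper: logs a START and an END line around any job body.
final class LoggedJob implements Runnable {
    private static final Logger LOG = Logger.getLogger(LoggedJob.class.getName());

    private final String name;
    private final Runnable body;
    private volatile long lastDurationMillis = -1; // -1 until the first run completes

    LoggedJob(String name, Runnable body) {
        this.name = name;
        this.body = body;
    }

    public void run() {
        long start = System.currentTimeMillis();
        LOG.info("START " + name);
        try {
            body.run();
        } finally {
            // Runs even if the body throws, so a missing END line means the job never returned.
            lastDurationMillis = System.currentTimeMillis() - start;
            LOG.info("END " + name + " after " + lastDurationMillis + " ms");
        }
    }

    long getLastDurationMillis() {
        return lastDurationMillis;
    }

    public static void main(String[] args) {
        new LoggedJob("demo", new Runnable() {
            public void run() {
                // the real job work would go here
            }
        }).run();
    }
}
```

A START line with no matching END line in the log around the time CPU climbs identifies the job that is not finishing.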
It's probably time to drop it into a profiler (instead of jconsole) so you can see where in the code it's spending time or what's blowing up memory. A real profiler will let you see all of that data mapped onto your code and classes. My favorite is JProfiler, but YourKit is also good. You can get a 7-30 day trial, so you'll have plenty of time to profile and figure out your issue without having to buy it.
Start this early in the morning so you'll hopefully see something by early night.