Growing resident memory usage (RSS) of Java Process
Disclaimer: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must do so under the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/26041117/
Asked by Erhan Bagdemir
Our recent observations on our production system tell us that the resident memory usage of our Java container keeps growing. Regarding this problem, we made some investigations, using native tools like pmap, to understand why the Java process consumes much more memory than Heap + Thread Stacks + Shared Objects + Code Cache + etc. As a result, we found some 64M memory blocks (in pairs) allocated by the native process (probably with malloc/mmap):
0000000000400000 4K r-x-- /usr/java/jdk1.7.0_17/bin/java
0000000000600000 4K rw--- /usr/java/jdk1.7.0_17/bin/java
0000000001d39000 4108K rw--- [ anon ]
0000000710000000 96000K rw--- [ anon ]
0000000715dc0000 39104K ----- [ anon ]
00000007183f0000 127040K rw--- [ anon ]
0000000720000000 3670016K rw--- [ anon ]
00007fe930000000 62876K rw--- [ anon ]
00007fe933d67000 2660K ----- [ anon ]
00007fe934000000 20232K rw--- [ anon ]
00007fe9353c2000 45304K ----- [ anon ]
00007fe938000000 65512K rw--- [ anon ]
00007fe93bffa000 24K ----- [ anon ]
00007fe940000000 65504K rw--- [ anon ]
00007fe943ff8000 32K ----- [ anon ]
00007fe948000000 61852K rw--- [ anon ]
00007fe94bc67000 3684K ----- [ anon ]
00007fe950000000 64428K rw--- [ anon ]
00007fe953eeb000 1108K ----- [ anon ]
00007fe958000000 42748K rw--- [ anon ]
00007fe95a9bf000 22788K ----- [ anon ]
00007fe960000000 8080K rw--- [ anon ]
00007fe9607e4000 57456K ----- [ anon ]
00007fe968000000 65536K rw--- [ anon ]
00007fe970000000 22388K rw--- [ anon ]
00007fe9715dd000 43148K ----- [ anon ]
00007fe978000000 60972K rw--- [ anon ]
00007fe97bb8b000 4564K ----- [ anon ]
00007fe980000000 65528K rw--- [ anon ]
00007fe983ffe000 8K ----- [ anon ]
00007fe988000000 14080K rw--- [ anon ]
00007fe988dc0000 51456K ----- [ anon ]
00007fe98c000000 12076K rw--- [ anon ]
00007fe98cbcb000 53460K ----- [ anon ]
I interpret the line with 0000000720000000 3670016K as the heap space, whose size we define using the JVM parameter "-Xmx". Right after that, the pairs begin, each of which sums to exactly 64M. We are using CentOS release 5.10 (Final), 64-bit arch, and JDK 1.7.0_17.
The question is, what are those blocks? Which subsystem allocates them?
Update: We do not use JIT and/or JNI native code invocations.
Accepted answer by Lari Hotari
I ran into the same problem. This is a known problem with glibc >= 2.10.
The cure is to set this environment variable:
export MALLOC_ARENA_MAX=4
IBM article about setting MALLOC_ARENA_MAX https://www.ibm.com/developerworks/community/blogs/kevgrig/entry/linux_glibc_2_10_rhel_6_malloc_may_show_excessive_virtual_memory_usage?lang=en
Google for MALLOC_ARENA_MAX or search for it on SO to find a lot of references.
You might also want to tune other malloc options to optimize for low fragmentation of allocated memory:
# tune glibc memory allocation, optimize for low fragmentation
# limit the number of arenas
export MALLOC_ARENA_MAX=2
# disable dynamic mmap threshold, see M_MMAP_THRESHOLD in "man mallopt"
export MALLOC_MMAP_THRESHOLD_=131072
export MALLOC_TRIM_THRESHOLD_=131072
export MALLOC_TOP_PAD_=131072
export MALLOC_MMAP_MAX_=65536
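As a rough check of the effect (a sketch; <pid> stands for the Java process id), you can re-run pmap after restarting the process with these settings and see whether the number of large anonymous mappings has dropped:
# list the largest anonymous mappings; the ~64M arena pairs should become fewer
pmap <pid> | grep '\[ anon \]' | sort -k2 -rn | head -20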
Answered by Lari Hotari
It's also possible that there is a native memory leak. A common problem is native memory leaks caused by not closing a ZipInputStream/GZIPInputStream.
A typical way that a ZipInputStream is opened is by a call to Class.getResource/ClassLoader.getResource and calling openConnection().getInputStream() on the java.net.URL instance, or by calling Class.getResourceAsStream/ClassLoader.getResourceAsStream. One must ensure that these streams always get closed.
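As an illustration (a minimal sketch; the class and method names are mine, not from the original answer), try-with-resources is the usual way to guarantee that such streams are closed even when an exception is thrown:

import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class ResourceLoading {
    // Loads a properties file from the classpath; the stream (often backed by
    // a zip/jar entry when the resource lives inside a jar) is closed by
    // try-with-resources on every path.
    static Properties loadProps(String name) throws IOException {
        Properties props = new Properties();
        try (InputStream in = ResourceLoading.class.getResourceAsStream(name)) {
            if (in != null) {
                props.load(in);
            }
        }
        return props;
    }
}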
Some commonly used open source libraries have had bugs that leak unclosed java.util.zip.Inflater or java.util.zip.Deflater instances. For example, the Nimbus Jose JWT library fixed a related memory leak in version 6.5.1. Java JWT (jjwt) had a similar bug that was fixed in version 0.10.7. The bug pattern in these two cases was the fact that calls to DeflaterOutputStream.close() and InflaterInputStream.close() do not call Deflater.end()/Inflater.end() when a Deflater/Inflater instance is provided. In those cases, it's not enough to check the code for streams being closed. Every Deflater/Inflater instance created in the code must have handling that ensures .end() gets called.
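A sketch of the safe pattern (the helper class and method names are mine): when you hand your own Deflater to a DeflaterOutputStream, call end() yourself in a finally block, because close() will not do it for you in that case:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

public class DeflateUtil {
    // Compresses data with an explicitly created Deflater and releases the
    // Deflater's native memory with end() no matter what happens.
    static byte[] compress(byte[] data) throws IOException {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED, true);
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (DeflaterOutputStream dos = new DeflaterOutputStream(bos, deflater)) {
                dos.write(data);
            }
            return bos.toByteArray();
        } finally {
            deflater.end(); // frees the native zlib buffers held by the Deflater
        }
    }
}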
One way to check for Zip*Stream leaks is to get a heap dump and search for instances of any class with "zip", "Inflater" or "Deflater" in the name. This is possible in many heap dump analysis tools such as YourKit Java Profiler, JProfiler or Eclipse MAT. It's also worth checking objects in finalization state, since in some cases memory is released only after finalization. Checking for classes that might use native libraries is useful. This applies to TLS/SSL libraries too.
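For example (a sketch; <pid> stands for the Java process id and the dump path is arbitrary), a heap dump can be taken with jmap and then opened in one of the tools above:
# dump only live (reachable) objects in binary format
jmap -dump:live,format=b,file=/tmp/heap.hprof <pid>
In Eclipse MAT, an OQL query such as SELECT * FROM java.util.zip.Inflater then lists the suspect instances.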
There is an OSS tool called leakchecker from Elastic, a Java agent that can be used to find the sources of java.util.zip.Inflater instances that haven't been closed (.end() not called).
For native memory leaks in general (not just zip library leaks), you can use jemalloc to debug native memory leaks by enabling malloc sampling profiling via settings in the MALLOC_CONF environment variable. Detailed instructions are available in this blog post: http://www.evanjones.ca/java-native-leak-bug.html. That blog post also has information about using jemalloc to debug a native memory leak in Java applications. There's also a blog post from Elastic featuring jemalloc and mentioning leakchecker, the tool that Elastic has open-sourced to track down problems caused by unclosed zip inflater resources.
There is also a blog post about a native memory leak related to ByteBuffers. Java 8u102 has a special system property jdk.nio.maxCachedBufferSize to limit the cache issue described in that blog post.
-Djdk.nio.maxCachedBufferSize=262144
It's also good to always check open file handles to see if the memory leak is caused by a large number of mmap:ed files. On Linux, lsof can be used to list open files and open sockets:
lsof -Pan -p PID
The report of the memory map of the process could also help investigate native memory leaks:
pmap -x PID
For Java processes running in Docker, it should be possible to execute the lsof or pmap command on the "host". You can find the PID of the containerized process with this command:
docker inspect --format '{{.State.Pid}}' container_id
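For example (a sketch, run on the host; container_id is your container's id or name), the PID can then be passed to the commands shown earlier:
PID=$(docker inspect --format '{{.State.Pid}}' container_id)
pmap -x "$PID"
lsof -Pan -p "$PID"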
It's also useful to get a thread dump (or use jconsole/JMX) to check the number of threads, since each thread consumes 1MB of native memory for its stack by default. A large number of threads would use a lot of memory.
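A quick way to count the threads from the command line (a sketch; <pid> is the Java process id):
# count the Java threads reported in a thread dump
jstack <pid> | grep -c 'java.lang.Thread.State'
# or count the OS-level threads of the process
ls /proc/<pid>/task | wc -l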
There is also Native Memory Tracking (NMT) in the JVM. That might be useful to check if it's the JVM itself that is using up the native memory.
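NMT has to be enabled when the JVM starts and can then be queried with jcmd (a sketch; <pid> and the application jar are placeholders, and enabling NMT adds some overhead):
# start the JVM with NMT enabled
java -XX:NativeMemoryTracking=summary -jar yourapp.jar
# ask the running JVM for a native memory summary
jcmd <pid> VM.native_memory summary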
The jattach tool can also be used in a containerized (docker) environment to trigger thread dumps or heap dumps from the host. It is also able to run jcmd commands, which is needed for controlling NMT.
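Example jattach invocations (a sketch based on jattach's documented commands; <pid> is the target JVM's PID as seen from the host):
# trigger a thread dump
jattach <pid> threaddump
# run a jcmd command, e.g. query NMT
jattach <pid> jcmd VM.native_memory summary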