How to create a memory leak in Java?

Notice: this page is a Chinese-English translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/6470651/

Date: 2020-08-16 06:14:10  Source: igfitidea

java memory memory-leaks

Asked by Mat B.

I just had an interview, and I was asked to create a memory leak with Java.
Needless to say, I felt pretty dumb having no clue on how to even start creating one.

What would an example be?

Accepted answer by Daniel Pryden

Here's a good way to create a true memory leak (objects inaccessible by running code but still stored in memory) in pure Java:

  1. The application creates a long-running thread (or uses a thread pool to leak even faster).
  2. The thread loads a class via an (optionally custom) ClassLoader.
  3. The class allocates a large chunk of memory (e.g. new byte[1000000]), stores a strong reference to it in a static field, and then stores a reference to itself in a ThreadLocal. Allocating the extra memory is optional (leaking the class instance is enough), but it will make the leak work that much faster.
  4. The application clears all references to the custom class or the ClassLoader it was loaded from.
  5. Repeat.

Due to the way ThreadLocal is implemented in Oracle's JDK, this creates a memory leak:

  • Each Thread has a private field threadLocals, which actually stores the thread-local values.
  • Each key in this map is a weak reference to a ThreadLocal object, so after that ThreadLocal object is garbage-collected, its entry is removed from the map.
  • But each value is a strong reference, so when a value (directly or indirectly) points to the ThreadLocal object that is its key, that object will neither be garbage-collected nor removed from the map as long as the thread lives.

In this example, the chain of strong references looks like this:

Thread object → threadLocals map → instance of example class → example class → static ThreadLocal field → ThreadLocal object.

(The ClassLoader doesn't really play a role in creating the leak; it just makes the leak worse because of this additional reference chain: example class → ClassLoader → all the classes it has loaded. It was even worse in many JVM implementations, especially prior to Java 7, because classes and ClassLoaders were allocated straight into permgen and were never garbage-collected at all.)

A variation on this pattern is why application containers (like Tomcat) can leak memory like a sieve if you frequently redeploy applications which happen to use ThreadLocals that in some way point back to themselves. This can happen for a number of subtle reasons and is often hard to debug and/or fix.

Update: Since lots of people keep asking for it, here's some example code that shows this behavior in action.
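
The code linked from the original answer is not reproduced on this page. As a separate, simplified sketch of the same mechanism (not the author's code), the example below skips the ClassLoader and the static field and lets the stored value point straight back at the ThreadLocal that is its key. The names SelfReferencingThreadLocalDemo, Holder and leakOnCurrentThread are made up for illustration. A WeakReference is used only to observe that the ThreadLocal stays reachable after every other reference to it is gone.

```java
import java.lang.ref.WeakReference;

final class SelfReferencingThreadLocalDemo {

    /** Hypothetical value type that points back to the ThreadLocal that is its own key. */
    static final class Holder {
        final ThreadLocal<Holder> key;               // back-reference to the key
        final byte[] payload = new byte[1_000_000];  // optional: only makes the leak bigger

        Holder(ThreadLocal<Holder> key) {
            this.key = key;
        }
    }

    /**
     * Stores a self-referencing value in the current thread's map, drops every
     * other reference to the ThreadLocal, and returns only a weak reference to it.
     */
    static WeakReference<ThreadLocal<Holder>> leakOnCurrentThread() {
        ThreadLocal<Holder> local = new ThreadLocal<>();
        local.set(new Holder(local));
        return new WeakReference<>(local);
    }

    public static void main(String[] args) {
        WeakReference<ThreadLocal<Holder>> ref = leakOnCurrentThread();
        // System.gc() is only a hint, but an object that is still strongly
        // reachable from a live thread can never be collected.
        System.gc();
        System.out.println("still reachable: " + (ref.get() != null)); // prints "still reachable: true"
    }
}
```

Running code can no longer reach the Holder or its ThreadLocal, yet both stay in memory for as long as the thread lives, because the map entry's strong value reference keeps the weakly referenced key alive.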

Answer by Vineet Reynolds

The following is a pretty pointless example if you do not understand JDBC, or at least how JDBC expects a developer to close Connection, Statement and ResultSet instances before discarding them or losing references to them, instead of relying on the implementation of finalize.

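import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

void doWork()
{
   try
   {
       // ConnectionFactory and log() are the application's own helpers
       Connection conn = ConnectionFactory.getConnection();
       PreparedStatement stmt = conn.prepareStatement("some query"); // prepares a valid query
       ResultSet rs = stmt.executeQuery();
       while(rs.next())
       {
          // ... process the result set
       }
       // conn, stmt and rs are never closed: this is the leak
   }
   catch(SQLException sqlEx)
   {
       log(sqlEx);
   }
}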

The problem with the above is that the Connection object is not closed, and hence the physical connection will remain open until the garbage collector comes around and sees that it is unreachable. GC will invoke the finalize method, but there are JDBC drivers that do not implement finalize, at least not in the same way that Connection.close is implemented. The resulting behavior is that while memory will be reclaimed due to unreachable objects being collected, resources (including memory) associated with the Connection object might simply not be reclaimed.

In such an event where the Connection's finalize method does not clean up everything, one might actually find that the physical connection to the database server will last several garbage collection cycles, until the database server eventually figures out that the connection is not alive (if it does), and should be closed.
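
To make the difference between dropping a reference and closing a resource concrete without needing a database, here is a minimal sketch. FakeConnection is a hypothetical stand-in for a JDBC Connection, not a real driver class, and it merely counts how many handles are still open. The leakyWork method mirrors the doWork example above, while safeWork uses try-with-resources, which is one common way to guarantee that close() is called.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class CloseDemo {

    /** Hypothetical stand-in for a JDBC Connection; counts handles that are still open. */
    static final class FakeConnection implements AutoCloseable {
        static final AtomicInteger OPEN = new AtomicInteger();

        FakeConnection() {
            OPEN.incrementAndGet();
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    /** Opens a connection and loses the reference without closing it. */
    static void leakyWork() {
        FakeConnection conn = new FakeConnection();
        // ... use conn, then simply return
    }

    /** Opens a connection and lets try-with-resources close it. */
    static void safeWork() {
        try (FakeConnection conn = new FakeConnection()) {
            // ... use conn; close() runs even if this block throws
        }
    }

    public static void main(String[] args) {
        leakyWork();
        leakyWork();
        leakyWork();
        safeWork();
        System.out.println(FakeConnection.OPEN.get()); // prints 3
    }
}
```

Garbage collection may eventually free the three FakeConnection objects, but nothing ever decrements the counter for them, which is the analogue of the physical connection staying open on the database server.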

Even if the JDBC driver were to implement finalize, it is possible for exceptions to be thrown during finalization. The resulting behavior is that any memory associated with the now "dormant" object will not be reclaimed, as finalize is guaranteed to be invoked only once.

The above scenario of encountering exceptions during object finalization is related to another scenario that could possibly lead to a memory leak: object resurrection. Object resurrection is often done intentionally by creating a strong reference, from another object, to the object being finalized. When object resurrection is misused, it will lead to a memory leak in combination with other sources of memory leaks.

There are plenty more examples that you can conjure up, such as:

  • Managing a List instance where you only add to the list and never delete from it (although you should be getting rid of elements you no longer need).
  • Opening Sockets or Files, but not closing them when they are no longer needed (similar to the above example involving the Connection class).
  • Not unloading Singletons when bringing down a Java EE application. Apparently, the ClassLoader that loaded the singleton class will retain a reference to the class, and hence the singleton instance will never be collected. When a new instance of the application is deployed, a new class loader is usually created, and the former class loader will continue to exist due to the singleton.

Answer by Pavel Molchanov

I want to give some advice on how to monitor an application for memory leaks with the tools that are available in the JVM. It doesn't show how to generate a memory leak, but explains how to detect one with the minimum tools available.

You need to monitor Java memory consumption first.

The simplest way to do this is to use the jstat utility that comes with the JVM.

jstat -gcutil <process_id> <interval>

It will report memory utilization for each memory area (the survivor spaces, Eden and Old) and garbage collection counts and times (Young and Full).

As soon as you spot that Full Garbage Collection is executed too often and takes too much time, you can suspect that the application is leaking memory.
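
If attaching jstat from outside is not convenient, similar numbers can be read from inside the application through the standard java.lang.management API. This is a complementary sketch rather than part of the original answer; HeapMonitor and its method names are made up, while the MXBean calls are the standard JDK ones.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class HeapMonitor {

    /** Bytes of heap currently in use, as reported by the JVM. */
    static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** One line per garbage collector: its name, collection count and accumulated time. */
    static String gcSummary() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("used heap bytes: " + usedHeapBytes());
        System.out.print(gcSummary());
    }
}
```

Logging these values periodically gives a rough trend: used heap that keeps climbing after full collections, together with rising collection counts and times, points in the same direction as the jstat output described above.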

Then you need to create a memory dump using the jmap utility:

jmap -dump:live,format=b,file=heap.bin <process_id>

Then you need to analyse the heap.bin file with a memory analyzer, for example Eclipse Memory Analyzer (MAT).

MAT will analyze the memory and provide you with information about suspected memory leaks.

Answer by Praveen Kumar

A real-life example of a memory leak before JDK 1.7

Suppose you read a file of 1000 lines of text and keep it in a String object:

String fileText = readChars(file, 1000); // readChars is a hypothetical helper: 1000 characters from file

fileText = fileText.substring(900, fileText.length());

In the code above, I initially read 1000 characters and then took a substring to keep only the last 100 characters. Now fileText should refer to only 100 characters, and all the other characters should be garbage collected, since I lost the reference to them. But before JDK 1.7, the substring method returned a String that still shared the original string's character array, which prevented the whole string from being garbage collected: all 1000 characters stay in memory until you lose the reference to the substring.
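
On those older JVMs, a widely used workaround was to copy the substring into a new String so that it no longer shared the large array. The sketch below shows that idiom; SubstringCopyDemo and lastChars are illustrative names. On current JDKs substring already copies, so the extra copy is redundant there and is shown only to illustrate the old fix.

```java
final class SubstringCopyDemo {

    /** Returns the last n characters of text as a String with its own storage. */
    static String lastChars(String text, int n) {
        String tail = text.substring(text.length() - n);
        // On old JVMs tail shared text's character array; new String(tail)
        // copied just the needed characters and let the big array be collected.
        return new String(tail);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append((char) ('a' + i % 26));
        }
        String fileText = lastChars(sb.toString(), 100);
        System.out.println(fileText.length()); // prints 100
    }
}
```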

You can create a memory leak example like the one above.

Answer by Praveen Kumar

A memory leak in Java is not your typical C/C++ memory leak.

To understand how the JVM works, read Understanding Memory Management.

Basically, the important part is:

The Mark and Sweep Model

The JRockit JVM uses the mark and sweep garbage collection model for performing garbage collections of the whole heap. A mark and sweep garbage collection consists of two phases, the mark phase and the sweep phase.

During the mark phase all objects that are reachable from Java threads, native handles and other root sources are marked as alive, as well as the objects that are reachable from these objects and so forth. This process identifies and marks all objects that are still used, and the rest can be considered garbage.

During the sweep phase the heap is traversed to find the gaps between the live objects. These gaps are recorded in a free list and are made available for new object allocation.

The JRockit JVM uses two improved versions of the mark and sweep model. One is mostly concurrent mark and sweep and the other is parallel mark and sweep. You can also mix the two strategies, running for example mostly concurrent mark and parallel sweep.

So, to create a memory leak in Java, the easiest way is to create a database connection, do some work, and simply not close() it; then generate a new database connection while staying in scope. This isn't hard to do in a loop, for example. If you have a worker that pulls from a queue and pushes to a database, you can easily create a memory leak by forgetting to close() connections, or by opening them when not necessary, and so forth.

Eventually, you'll consume the heap that has been allocated to the JVM by forgetting to close() the connection. This will result in the JVM garbage collecting like crazy, eventually resulting in java.lang.OutOfMemoryError: Java heap space errors. It should be noted that the error may not mean there is a memory leak; it could just mean you don't have enough memory. Databases like Cassandra and ElasticSearch, for example, can throw that error because they don't have enough heap space.

It's worth noting that this is true for all GC languages. Below are some examples I've seen while working as an SRE:

  • Node using Redis as a queue; the development team created new connections every 12 hours and forgot to close the old ones. Eventually Node was OOM'd because it consumed all the memory.
  • Golang (I'm guilty of this one); parsing large JSON files with json.Unmarshal and then passing the results by reference and keeping them open. Eventually, this resulted in the entire heap being consumed by accidental refs I kept open to decode JSON.

Answer by Viraj

The String.substring method in Java 1.6 creates a memory leak. This blog post explains it:

http://javarevisited.blogspot.com/2011/10/how-substring-in-java-works.html

Answer by Audrius Meskauskas

A thread that does not terminate (say, one that sleeps indefinitely in its run method). It will not be garbage collected even if we lose all references to it. You can add fields to make the thread object as big as you want.
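
A minimal sketch of that idea follows. SleepingThreadDemo, Sleeper and startAndForget are made-up names, the 10 MB field is an arbitrary size, and the thread is marked as a daemon only so that the demo JVM can still exit. The WeakReference is there purely to observe that the thread object survives once our own reference is gone.

```java
import java.lang.ref.WeakReference;

final class SleepingThreadDemo {

    /** A thread that never finishes on its own and carries a large field with it. */
    static final class Sleeper extends Thread {
        final byte[] ballast = new byte[10_000_000]; // make it as big as you want

        @Override
        public void run() {
            while (true) {
                try {
                    Thread.sleep(Long.MAX_VALUE);
                } catch (InterruptedException e) {
                    return; // allows the demo thread to be stopped explicitly
                }
            }
        }
    }

    /** Starts a sleeper and returns only a weak reference, dropping the strong one. */
    static WeakReference<Thread> startAndForget() {
        Sleeper t = new Sleeper();
        t.setDaemon(true); // demo only: do not keep the JVM alive
        t.start();
        return new WeakReference<Thread>(t);
    }

    public static void main(String[] args) {
        WeakReference<Thread> ref = startAndForget();
        System.gc(); // a live thread is a GC root, so it cannot be collected
        System.out.println("thread still in memory: " + (ref.get() != null)); // prints "thread still in memory: true"
    }
}
```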

The current top answer lists more tricks around this, but these seem redundant.

Answer by user1050755

If you don't use a compacting garbage collector, you can have some sort of a memory leak due to heap fragmentation.

Answer by arnt

Most of the memory leaks I've seen in Java concern processes getting out of sync.

Process A talks to B via TCP, and tells process B to create something. B issues the resource an ID, say 432423, which A stores in an object and uses while talking to B. At some point the object in A is reclaimed by garbage collection (maybe due to a bug), but A never tells B that (maybe another bug).

Now A doesn't have the ID of the object it's created in B's RAM any more, and B doesn't know that A has no more reference to the object. In effect, the object is leaked.

Answer by Wesley Tarle

The interviewer might have been looking for a circular reference solution:

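    // Minimal node type assumed by the example
    static class Element {
        Element next;
    }

    public static void main(String[] args) {
        while (true) {
            // Two objects that reference each other and nothing else
            Element first = new Element();
            first.next = new Element();
            first.next.next = first;
        }
    }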

This is a classic problem with reference counting garbage collectors. You would then politely explain that JVMs use a much more sophisticated algorithm that doesn't have this limitation.

-Wes Tarle

Answer by mschonaker

I think that a valid example could be using ThreadLocal variables in an environment where threads are pooled.

For instance, using ThreadLocal variables in Servlets to communicate with other web components, with the threads being created by the container, which keeps the idle ones in a pool. ThreadLocal variables, if not correctly cleaned up, will live there until, possibly, the same web component overwrites their values.
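
A small sketch of that situation, using a single-thread executor as a stand-in for the container's thread pool. PooledThreadLocalDemo, CURRENT_USER and the "alice" value are hypothetical. It shows that a value set during one task is still attached to the pooled thread when the next task runs, unless the first task calls remove().

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class PooledThreadLocalDemo {

    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    /** Runs two "requests" on the same pooled thread and returns what the second one sees. */
    static String secondRequestSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable firstRequest = () -> {
                CURRENT_USER.set("alice");
                try {
                    // ... handle the first request
                } finally {
                    if (cleanUp) {
                        CURRENT_USER.remove(); // the fix: clean up before returning the thread
                    }
                }
            };
            Callable<String> secondRequest = CURRENT_USER::get;

            pool.submit(firstRequest).get();         // wait until the first task is done
            return pool.submit(secondRequest).get(); // runs on the same worker thread
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(secondRequestSees(false)); // prints alice
        System.out.println(secondRequestSees(true));  // prints null
    }
}
```

Calling remove() in a finally block, as in the cleanUp branch, is the usual way to keep values from outliving the request that set them.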

Of course, once identified, the problem can be solved easily.
