Why do I get an OutOfMemoryError when inserting 50,000 objects into HashMap?

Note: this page is a translation of a popular StackOverFlow question and its answers, provided under the CC BY-SA 4.0 license. If you use or share it, you must likewise comply with CC BY-SA and attribute it to the original authors (not me): StackOverFlow
Original question: http://stackoverflow.com/questions/235047/
Asked by Frank Krueger
I am trying to insert about 50,000 objects (and therefore 50,000 keys) into a java.util.HashMap<java.awt.Point, Segment>. However, I keep getting an OutOfMemoryError. (Segment is my own class, very lightweight: one String field and three int fields.)
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.HashMap.resize(HashMap.java:508)
    at java.util.HashMap.addEntry(HashMap.java:799)
    at java.util.HashMap.put(HashMap.java:431)
    at bus.tools.UpdateMap.putSegment(UpdateMap.java:168)
This seems quite ridiculous since I see that there is plenty of memory available on the machine - both in free RAM and HD space for virtual memory.
Is it possible Java is running with some stringent memory requirements? Can I increase these?
Is there some weird limitation with HashMap? Am I going to have to implement my own? Are there any other classes worth looking at?
(I am running Java 5 under OS X 10.5 on an Intel machine with 2GB RAM.)
Accepted answer by Michael Myers
You can increase the maximum heap size by passing -Xmx128m (where 128 is the number of megabytes) to java. I can't remember the default size, but it strikes me that it was something rather small.
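For example, launching with a 512 MB heap (the main class name is taken from the asker's stack trace; your actual command line may differ):

java -Xmx512m bus.tools.UpdateMap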
You can programmatically check how much memory is available by using the Runtime class.
// Get current size of heap in bytes
long heapSize = Runtime.getRuntime().totalMemory();
// Get maximum size of heap in bytes. The heap cannot grow beyond this size.
// Any attempt will result in an OutOfMemoryError.
long heapMaxSize = Runtime.getRuntime().maxMemory();
// Get amount of free memory within the heap in bytes. This size will increase
// after garbage collection and decrease as new objects are created.
long heapFreeSize = Runtime.getRuntime().freeMemory();
(Example from Java Developers Almanac)
This is also partially addressed in Frequently Asked Questions About the Java HotSpot VM, and in the Java 6 GC Tuning page.
Answered by JasonTrue
You probably need to set the flag -Xmx512m or some larger number when starting java. I think 64mb is the default.
Edited to add: After you figure out how much memory your objects are actually using with a profiler, you may want to look into weak references or soft references to make sure you're not accidentally holding some of your memory hostage from the garbage collector when you're no longer using them.
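A minimal sketch of the soft-reference idea (the cache shape, key, and byte[] payload below are hypothetical, not from the question):

import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

public class SoftCacheSketch {
    public static void main(String[] args) {
        // Values are wrapped in SoftReference, so under memory pressure the
        // garbage collector may clear them instead of the JVM throwing
        // OutOfMemoryError.
        Map<String, SoftReference<byte[]>> cache =
                new HashMap<String, SoftReference<byte[]>>();
        cache.put("key", new SoftReference<byte[]>(new byte[1024]));

        SoftReference<byte[]> ref = cache.get("key");
        byte[] value = (ref == null) ? null : ref.get();
        if (value == null) {
            // The GC cleared the entry (or it was never cached); reload it.
            value = new byte[1024];
            cache.put("key", new SoftReference<byte[]>(value));
        }
        System.out.println("cached bytes: " + value.length);
    }
}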
Answered by davetron5000
Implicit in these answers is that Java has a fixed maximum memory size and doesn't grow beyond the configured maximum heap size. This is unlike, say, C, where it's constrained only by the machine on which it's being run.
Answered by erickson
By default, the JVM uses a limited heap space. The limit is JVM implementation-dependent, and it's not clear what JVM you are using. On OS's other than Windows, a 32-bit Sun JVM on a machine with 2 Gb or more will use a default maximum heap size of 1/4 of the physical memory, or 512 Mb in your case. However, the default for a "client" mode JVM is only 64 Mb maximum heap size, which may be what you've run into. Other vendor's JVM's may select different defaults.
Of course, you can specify the heap limit explicitly with the -Xmx<NN>m option to java, where <NN> is the number of megabytes for the heap.
As a rough guess, your hash table should only be using about 16 Mb, so there must be some other large objects on the heap. If you could use a Comparable key in a TreeMap, that would save some memory.
See "Ergonomics in the 5.0 JVM"for more details.
Answered by sk.
Another thing to try if you know the number of objects beforehand is to use the HashMap(int initialCapacity, float loadFactor) constructor instead of the default no-arg one, which uses defaults of (16, 0.75). If the number of elements in your HashMap exceeds (capacity * loadFactor), then the underlying array in the HashMap will be resized to the next power of 2 and the table will be rehashed. This array also requires a contiguous area of memory, so for example if you're doubling from a 32768-element to a 65536-element array you'll need a 256kB chunk of memory free. To avoid the extra allocation and rehashing penalties, just use a larger hash table from the start. It'll also decrease the chance that you won't have a contiguous area of memory large enough to fit the map.
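A sketch of that constructor in use (the Segment stub and the 50,000-entry count are placeholders based on the question; the sizing arithmetic follows the (capacity * loadFactor) rule above):

import java.awt.Point;
import java.util.HashMap;
import java.util.Map;

public class PresizedMapSketch {
    // Stub standing in for the asker's Segment class (one String, three ints).
    static class Segment {
        String name;
        int a, b, c;
    }

    public static void main(String[] args) {
        int expectedEntries = 50000;
        float loadFactor = 0.75f;
        // capacity * loadFactor must stay above the entry count, or the map
        // will resize and rehash mid-insertion.
        int initialCapacity = (int) Math.ceil(expectedEntries / loadFactor);
        Map<Point, Segment> map =
                new HashMap<Point, Segment>(initialCapacity, loadFactor);
        for (int i = 0; i < expectedEntries; i++) {
            map.put(new Point(i, i), new Segment());
        }
        System.out.println("Inserted " + map.size() + " entries, no rehash.");
    }
}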
Answered by Josh
Map implementations are usually backed by arrays. Arrays are fixed-size blocks of memory. The hashmap implementation starts by storing data in one of these arrays at a given capacity, say 100 objects.
If the array fills up and you keep adding objects, the map needs to grow its array behind the scenes. Since arrays are fixed in size, it does this by allocating an entirely new, slightly larger array in memory alongside the current one. This is referred to as growing the array. Then all the items from the old array are copied into the new array, and the old array is dereferenced in the hope that it will be garbage collected and the memory freed at some point.
Usually the code that increases the capacity of the map by copying items into a larger array is the cause of such a problem. There are "dumb" implementations and smart ones that use a growth factor to determine the size of the new array based on the size of the old array. Some implementations hide these parameters and some do not, so you cannot always set them. The problem is that when you cannot set it, it chooses some default growth factor, like 2, so the new array is twice the size of the old. Now your supposedly 50k map has a backing array of 100k.
Look to see if you can raise the load factor (for example to 1.0, instead of the default 0.75). This causes more hash map collisions, which hurts performance, but you are hitting a memory bottleneck and need to make that trade.
Use this constructor:
(http://java.sun.com/javase/6/docs/api/java/util/HashMap.html#HashMap(int, float))
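For instance (sizes are illustrative: 65536 is the smallest power of two above 50,000, and 1.0f lets the table fill completely before it would resize):

// Assumes java.util.Map, java.util.HashMap, and java.awt.Point are imported.
Map<Point, String> denseMap = new HashMap<Point, String>(65536, 1.0f);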
Answered by Uri
The Java heap space is limited by default, but that still sounds extreme (though how big are your 50000 segments?)
I am suspecting that you have some other problem, like the bucket chains in the map growing too big because everything gets assigned to the same "slot" (which also affects performance, of course). However, that seems unlikely if your points are uniformly distributed.
I'm wondering though why you're using a HashMap rather than a TreeMap? Even though points are two dimensional, you could subclass them with a compare function and then do log(n) lookups.
Answered by Michael Myers
Some people are suggesting changing the parameters of the HashMap to tighten up the memory requirements. I would suggest measuring instead of guessing; it might be something else causing the OOME. In particular, I'd suggest using either the NetBeans Profiler or VisualVM (which comes with Java 6, but I see you're stuck with Java 5).
Answered by Kevin Day
Random thought: the hash buckets associated with HashMap are not particularly memory efficient. You may want to try out TreeMap as an alternative and see if it still provides sufficient performance.