JAXB - unmarshal OutOfMemory: Java Heap Space
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not the translator): StackOverflow
Original question: http://stackoverflow.com/questions/7968694/
Asked by TyC
I'm currently trying to use JAXB to unmarshal an XML file, but the file seems to be too large (~500 MB) for the unmarshaller to handle. I keep getting java.lang.OutOfMemoryError: Java heap space on:
Unmarshaller um = JAXBContext.newInstance("com.sample.xml").createUnmarshaller();
Export e = (Export) um.unmarshal(new File("SAMPLE.XML"));
I'm guessing this is because it's trying to load the entire XML file into memory as an object tree, but the file is just too large for the Java heap space.
Is there any more memory-efficient method of parsing large (~500 MB) XML files? Or perhaps an unmarshaller property that would help me handle the large XML file?
Here's what my XML looks like:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Export xmlns="wwww.foo.com" xmlns:xsi="www.foo1.com" xsi:schemaLocation="www.foo2.com/.xsd">
    <Origin ID="foooo" />
    <WorkSets>
        <WorkSet>
            <Work>
                .....
            </Work>
            <Work>
                ....
            </Work>
            <Work>
                .....
            </Work>
        </WorkSet>
        <WorkSet>
            ....
        </WorkSet>
    </WorkSets>
</Export>
I'd like to unmarshal at the WorkSet level, while still being able to read through all of the Work elements for each WorkSet.
Answered by bdoughan
What does your XML look like? For large documents I typically recommend using a StAX XMLStreamReader so that JAXB can unmarshal the document in chunks.
input.xml
In the document below there are many instances of the person element. We can use JAXB with a StAX XMLStreamReader to unmarshal the corresponding Person objects one at a time, to avoid running out of memory.
<people>
    <person>
        <name>Jane Doe</name>
        <address>
            ...
        </address>
    </person>
    <person>
        <name>John Smith</name>
        <address>
            ...
        </address>
    </person>
    ....
</people>
Demo
import java.io.*;
import javax.xml.bind.*;
import javax.xml.stream.*;

public class Demo {

    public static void main(String[] args) throws Exception {
        XMLInputFactory xif = XMLInputFactory.newInstance();
        XMLStreamReader xsr = xif.createXMLStreamReader(new FileReader("input.xml"));
        xsr.nextTag(); // Advance to the people element

        JAXBContext jc = JAXBContext.newInstance(Person.class);
        Unmarshaller unmarshaller = jc.createUnmarshaller();
        while (xsr.nextTag() == XMLStreamConstants.START_ELEMENT) {
            Person person = (Person) unmarshaller.unmarshal(xsr);
            // Process one Person at a time, then let it be garbage collected.
        }
        xsr.close();
    }
}
Person
Instead of matching on the root element of the XML document, we need to add an @XmlRootElement annotation on the local root of the XML fragment that we will be unmarshalling from.
@XmlRootElement
public class Person {
}
Answered by Dave Newton
You could increase the heap space using the -Xmx startup argument.
For large files, SAX processing is more memory-efficient, since it is event-driven and does not load the entire structure into memory.
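As a minimal illustration of that event-driven style (the element names here are illustrative, not from the question's schema), a SAX handler can stream a document and react to each element without ever building a tree:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxCountDemo {

    // Stream the document and count occurrences of one element name.
    // Only the current event is held in memory, never the whole document.
    static int countElements(String xml, String elementName) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if (elementName.equals(qName)) {
                    count[0]++;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<people><person><name>Jane Doe</name></person>"
                   + "<person><name>John Smith</name></person></people>";
        System.out.println("persons: " + countElements(xml, "person")); // prints "persons: 2"
    }
}
```

The same handler works unchanged on a 500 MB file streamed from disk, because memory use is bounded by the deepest element nesting, not the file size.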
Answered by Lolke Dijkstra
I've been doing a lot of research, in particular on parsing very large input sets conveniently. It's true that you can combine StAX and JAXB to selectively parse XML fragments, but that's not always possible or preferable. If you're interested in reading more on the topic, have a look at:
http://xml2java.net/documents/XMLParserTechnologyForProcessingHugeXMLfiles.pdf
In this document I describe an alternative approach that is very straightforward and convenient to use. It parses arbitrarily large input sets while giving you access to your data in a JavaBeans fashion.
Answered by JB Nizet
Use SAX or StAX. But if the goal is to have an in-memory object representation of the file, you'll still need lots of memory to hold the contents of such a big file. In that case, your only hope is to increase the heap size using the -Xmx1024m JVM option (which sets the maximum heap size to 1024 MB).
Answered by JustTry
You can try this too. It's not exactly good practice, but it works :) who cares
http://amitsavm.blogspot.in/2015/02/partially-parsing-xml-using-jaxb-by.html
Otherwise use StAX or SAX; what Blaise Doughan suggests is also good, and you could call it the standard way. But if you have a complex XML structure and you don't want to annotate your classes manually or use the XJC tool, the approach above might be helpful.
Answered by AdrianS
Use SAX, but you will have to construct your Export object yourself.
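A minimal sketch of that idea. The Work and WorkSet classes and the ID attribute here are hypothetical, modeled loosely on the XML in the question; the asker's real Export type would need its own fields and handler logic:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Placeholder model classes shaped after the question's XML, not real code from the asker.
class Work {
    String id;
}

class WorkSet {
    final List<Work> works = new ArrayList<>();
}

public class ExportHandler extends DefaultHandler {

    final List<WorkSet> workSets = new ArrayList<>();
    private WorkSet current;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("WorkSet".equals(qName)) {
            current = new WorkSet();      // open a new set
        } else if ("Work".equals(qName) && current != null) {
            Work w = new Work();
            w.id = attrs.getValue("ID");  // assumes a hypothetical ID attribute
            current.works.add(w);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("WorkSet".equals(qName)) {
            workSets.add(current);        // close the set; process or discard it here
            current = null;
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Export><WorkSets>"
                   + "<WorkSet><Work ID=\"a\"/><Work ID=\"b\"/></WorkSet>"
                   + "<WorkSet><Work ID=\"c\"/></WorkSet>"
                   + "</WorkSets></Export>";
        ExportHandler h = new ExportHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), h);
        System.out.println(h.workSets.size() + " work sets"); // prints "2 work sets"
    }
}
```

To match the question's goal of working at the WorkSet level, you could process and drop each WorkSet inside endElement instead of accumulating them all, keeping memory use flat.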