Note: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/4182355/

Time: 2020-10-30 05:12:03  Source: igfitidea

Maximum size when parsing XML with DOM

java, android, xml

Asked by Robert Strauch

Currently I'm implementing a REST client that will parse XML response messages. It is intended to run on an Android device later, so memory and processing speed are real concerns. However, there will be only one XML response at a time, so processing or holding multiple XML documents at once is not an issue.

As far as I understand, there are three ways of parsing XML with the Android SDK:

  • SAX
  • XmlPullParser
  • DOM

Reading about these parsing methods, I gathered that SAX is recommended for large XML files because it does not hold the complete tree in memory as DOM does.

However, I'm asking myself what "large" means in terms of kilobytes, megabytes, ...? Is there a practical size up to which it does not really matter whether you use SAX or DOM?

Thanks,
Robert

Answered by James Anderson

There are no standard limits set for XML documents or DOM size so it depends entirely on what the host machine can cope with.

As you are implementing on Android, you should assume a pretty limited amount of memory, and remember that the DOM, the XML parser, your program logic, the display logic, the JVM and Android itself all have to fit in the available memory!

As a rule of thumb, you can expect the DOM to occupy about four times as much memory as the source XML document. So assume 512MB of available memory and aim to use no more than half of it for your DOM: you end up with 512/8, or a practical maximum of 64MB for the XML doc.

Just to be on the safe side, I would halve that again to a 32MB maximum. So if you expect many documents of this size, I would switch to SAX parsing!
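
The arithmetic in this rule of thumb can be written down as a minimal sketch. The class name DomBudget, its method names, and the constant are hypothetical and only restate the assumptions of this answer (DOM at roughly 4x the source size, half of the available memory for the DOM, then a further halving as a safety margin); none of it is a measured value or an Android API.

```java
// Hypothetical helper that restates the rule of thumb from the answer above.
// The factor of 4 and the "half of memory" budget are assumptions, not measurements.
final class DomBudget {
    // Assumed DOM overhead: the tree takes about 4x the size of the source XML.
    static final long DOM_OVERHEAD_FACTOR = 4;

    // Practical maximum: give the DOM at most half of the available memory.
    static long practicalMaxXmlBytes(long availableBytes) {
        long domBudget = availableBytes / 2;
        return domBudget / DOM_OVERHEAD_FACTOR;
    }

    // Safe maximum: halve the practical maximum once more.
    static long safeMaxXmlBytes(long availableBytes) {
        return practicalMaxXmlBytes(availableBytes) / 2;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println(practicalMaxXmlBytes(512 * mb) / mb + " MB"); // prints "64 MB"
        System.out.println(safeMaxXmlBytes(512 * mb) / mb + " MB");      // prints "32 MB"

        // The heap limit of the running VM is usually a better input than total device RAM.
        long heapLimit = Runtime.getRuntime().maxMemory();
        System.out.println("Safe XML size for this VM: " + safeMaxXmlBytes(heapLimit) + " bytes");
    }
}
```

Feeding in Runtime.getRuntime().maxMemory() instead of a fixed 512MB is a design choice of this sketch: the heap available to a single app is typically much smaller than the memory of the device, so the resulting budget shrinks accordingly.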

If you want the app to respond with any speed on large documents, SAX is the way to go. A SAX parser can start returning results as soon as the first element is read, whereas a DOM parser needs to read the whole document before any output can be sent to your program.
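
To illustrate the streaming behaviour, here is a minimal SAX sketch using the standard javax.xml.parsers and org.xml.sax APIs. The class name SaxTitles and the element names items, item and title are hypothetical and chosen only for this example; a real response will have its own structure.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example: collects the text of every <title> element while streaming.
final class SaxTitles {
    static List<String> parseTitles(String xml) throws Exception {
        final List<String> titles = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside a <title> element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("title".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one element, so append rather than assign.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("title".equals(qName)) {
                    // A result is available here, long before the end of the document.
                    titles.add(text.toString());
                    text = null;
                }
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<items><item><title>First</title></item>"
                   + "<item><title>Second</title></item></items>";
        System.out.println(parseTitles(xml)); // prints "[First, Second]"
    }
}
```

Only the text of the current title is held at any time, so memory use depends on the largest single element rather than on the size of the whole document. In a real client, the endElement callback could hand each result to the UI instead of collecting everything in a list.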

Answered by Robert Strauch

Excerpt from this article:

DOM parsers suffer from memory bloat. With smaller XML sets this isn't such an issue but as the XML size grows DOM parsers become less and less efficient making them not very scaleable in terms of growing your XML. Push parsers are a happy medium since they allow you to control parsing, thereby eliminating any kind of complex state management since the state is always known, and they don't suffer from the memory bloat of DOM parsers.

This could be the reason SAX is recommended over DOM: SAX functions as an XML push parser. Also, check out the Wikipedia article for SAX here.

EDIT: To address size specifically, you would have to look at your implementation. An example of the in-memory size of a DOM Document object in a Java-based XML parser is here. Java, like a lot of languages, defines some memory-based limitations such as the JVM heap size, and the Android web services/XML DOM API may also define some internal limits at the programmers' discretion (mentioned in part here). There is no one definitive answer as to the maximum allowed size.

Answered by mauretto

In my experience, DOM uses about 2x the file size in memory, but of course that's just an indication. If the XML tree has just one field containing the entire data, the memory used is similar to the file size!
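
One way to see why the shape of the document matters is to count the nodes a DOM parser creates, since each node is a separate object with its own overhead. The sketch below is only an illustration with hypothetical names (DomNodeCount, the elements r and i); it counts nodes and does not measure memory, which varies by parser and VM.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical example: compares node counts for two documents of the same length.
final class DomNodeCount {
    // Counts the given node plus all of its descendants.
    static int countNodes(Node node) {
        int count = 1;
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            count += countNodes(child);
        }
        return count;
    }

    // Parses the XML with DOM and counts the nodes under the root element.
    static int nodesIn(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        doc.normalize(); // merge adjacent text nodes so the count is stable
        return countNodes(doc.getDocumentElement());
    }

    static String repeat(String s, int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        String manyFields = "<r>" + repeat("<i>x</i>", 100) + "</r>"; // 807 characters
        String oneField = "<r>" + repeat("x", 800) + "</r>";          // 807 characters
        System.out.println(nodesIn(manyFields)); // prints "201"
        System.out.println(nodesIn(oneField));   // prints "2"
    }
}
```

Both inputs have the same length, but the first produces 201 node objects and the second only 2, which is consistent with the observation that a document holding its data in a single field costs little more than its file size.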
