Original source: http://stackoverflow.com/questions/487213/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
What's the best XML parser for Perl?
Asked by Xetius
I have tried many of the Perl XML parsers. I was quite interested in the Sablotron parser, but it is such a pain to install on a Windows box. I have now started using XML::LibXML and XML::LibXSLT, both of which seem to do everything I need.
They seem to be quite standard as well. Are there any better XML parsers to use than these?
Accepted answer by mmcdole
I think you are using a pretty good one. XML::LibXML, Matt Sergeant and Christian Glahn's Perl interface to Daniel Veillard's libxml2, is one of the faster XML parsers that I know of.
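For readers who have not used XML::LibXML, here is a minimal sketch of parsing a string and querying it with XPath. The sample document and its element names are made up for illustration; only load_xml, findnodes, getAttribute, and findvalue come from the module itself.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample document, for illustration only.
my $xml = <<'XML';
<library>
  <book id="b1"><title>Programming Perl</title></book>
  <book id="b2"><title>Learning Perl</title></book>
</library>
XML

# Parse the string into a DOM tree; load_xml dies on malformed input.
my $doc = XML::LibXML->load_xml(string => $xml);

# Select nodes with XPath, then read an attribute and a child's text.
my @lines;
for my $book ($doc->findnodes('/library/book')) {
    my $id    = $book->getAttribute('id');
    my $title = $book->findvalue('./title');
    push @lines, "$id: $title";
}

print "$_\n" for @lines;
```

To read from disk instead, load_xml also accepts location => $path in place of string => $xml.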
Answer by Dotan Dimet
It really depends on your needs, as people have said. To parse XML files that were ~100 MB in size (gene annotations from TAIR, 1 file per chromosome), I used mirod's XML::Twig module, which lets you set callbacks to parse the elements that interest you, presenting each sub-document as an XML::Simple tree. It combines the benefits of a SAX parser (scanning the file as a stream) with those of a DOM parser (working more easily with the interesting pieces).
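The callback style described above might look roughly like this. It is a sketch, not the answerer's actual code: the gene and name elements are invented stand-ins for the real annotation format, and a large file would normally be read with parsefile rather than from a string.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample data; the element names are assumptions.
my $xml = <<'XML';
<genes>
  <gene id="g1"><name>ABC1</name></gene>
  <gene id="g2"><name>XYZ2</name></gene>
</genes>
XML

my @seen;

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once for each <gene>, as soon as that element is complete.
        gene => sub {
            my ($t, $gene) = @_;
            push @seen, $gene->att('id') . ' ' . $gene->first_child_text('name');
            # Release everything parsed so far so memory use stays flat.
            $t->purge;
        },
    },
);

$twig->parse($xml);    # for a file on disk: $twig->parsefile($path)
print "$_\n" for @seen;
```

Calling purge inside the handler is what keeps memory bounded on large inputs: each small subtree is handled like a DOM fragment and then discarded.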
Answer by Joe Casadonte
If you need speed, power or features, XML::LibXML is the way to go. If you're after ease of use, though, XML::Simple is a viable alternative.
Answer by aekeus
In my experience XML::Simple is best for quick and dirty parsing of XML. We use it for parsing data from third parties that do not always conform to the XML standard. XML::Simple throws informative errors and gets you up and running extremely quickly.
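As a rough illustration of the quick and dirty style, here is a minimal XML::Simple sketch. The sample document is invented, and the ForceArray and KeyAttr settings are one reasonable choice rather than a recommendation from the answers. Bear in mind that the module's own documentation discourages its use in new code.

```perl
use strict;
use warnings;
use XML::Simple qw(:strict);    # strict mode requires ForceArray and KeyAttr

# Hypothetical sample document, for illustration only.
my $xml = <<'XML';
<library>
  <book id="b1"><title>Programming Perl</title></book>
  <book id="b2"><title>Learning Perl</title></book>
</library>
XML

# ForceArray keeps <book> an array even if only one is present;
# an empty KeyAttr disables folding of elements into hashes keyed by id.
my $data = XMLin($xml, ForceArray => ['book'], KeyAttr => []);

my @lines = map { "$_->{id}: $_->{title}" } @{ $data->{book} };
print "$_\n" for @lines;
```

Setting both options explicitly avoids the most common surprise, where the shape of the returned structure changes with the number of child elements.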
Answer by Zvika
(Actually it's not an answer, but a comment - however, I cannot comment...)
XML::Simple has been mentioned here.
(I know this question is from a few years ago, but it showed up in Google today...)
However, its site (http://metacpan.org/pod/XML::Simple) now says:
STATUS OF THIS MODULE
The use of this module in new code is discouraged. Other modules are available which provide more straightforward and consistent interfaces. In particular, XML::LibXML is highly recommended.
The major problems with this module are the large number of options and the arbitrary ways in which these options interact - often with unexpected results.
Patches with bug fixes and documentation fixes are welcome, but new features are unlikely to be added.
Answer by singingfish
You could also look at XML::Liberal, which uses LibXML underneath.
Answer by alexk
I think you should give XML::MyXML a try, too. It's very easy to use.
Answer by HoldOffHunger
I'll offer one that SHOULD NOT be used: XML::Parser.
It automatically expands HTML entities to their UTF-8 equivalents, and the option to disable this behavior does not work on the most characteristic of all entities, &amp;.
Additionally, its XMLDecl parser will interpret and display the standalone attribute in the <?xml ... ?> block as "standalone"="1", which is absolutely incorrect; it should be "standalone"="yes".
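The standalone behavior can be looked at with a small sketch. It assumes the XMLDecl handler as described in the XML::Parser documentation, which passes the version, encoding, and standalone values as arguments, with standalone reported as undefined, true, or false rather than as the literal text. The %decl hash and the mapping back to "yes"/"no" are illustrative only, and this may not be the interface the answer's author was using.

```perl
use strict;
use warnings;
use XML::Parser;

my %decl;

my $parser = XML::Parser->new(
    Handlers => {
        # Invoked for the <?xml ... ?> declaration, if the document has one.
        XMLDecl => sub {
            my ($expat, $version, $encoding, $standalone) = @_;
            %decl = (
                version  => $version,
                encoding => $encoding,
                # Map the boolean back to the literal XML values;
                # leave it undef when the attribute was not given.
                standalone => defined $standalone
                    ? ($standalone ? 'yes' : 'no')
                    : undef,
            );
        },
    },
);

$parser->parse('<?xml version="1.0" encoding="UTF-8" standalone="yes"?><root/>');

print "standalone: ", $decl{standalone} // 'not given', "\n";
```

If round-tripping the declaration verbatim matters, converting the flag back this way is one workaround; a DOM parser that preserves the declaration is another.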

