C++: Reading and writing XML files with Boost
Original question: http://stackoverflow.com/questions/1042855/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Using Boost to read and write XML files
Asked by Nuno
Is there any good way (and a simple way too) using Boost to read and write XML files?
I can't seem to find any simple sample for reading XML files using Boost. Can you point me to a simple sample that uses Boost for reading and writing XML files?
If not Boost, is there any good and simple library to read and write XML files that you can recommend? (it must be a C++ library)
Accepted answer by Cristian Adam
You should try pugixml: a light-weight, simple and fast XML parser for C++.
The nicest thing about pugixml is the XPath support, which TinyXML and RapidXML lack.
Quoting RapidXML's author: "I would like to thank Arseny Kapoulkine for his work on pugixml, which was an inspiration for this project" and "5% - 30% faster than pugixml, the fastest XML parser I know of". He tested against version 0.3 of pugixml, which has recently reached version 0.42.
Here is an excerpt from pugixml documentation:
The main features are:
- low memory consumption and fragmentation (the win over pugxml is ~1.3 times, TinyXML - ~2.5 times, Xerces (DOM) - ~4.3 times [1]). Exact numbers can be seen in the Comparison with existing parsers section.
- extremely high parsing speed (the win over pugxml is ~6 times, TinyXML - ~10 times, Xerces-DOM - ~17.6 times [1])
- extremely high parsing speed (well, I'm repeating myself, but it's so fast that it outperforms Expat by 2.8 times on test XML) [2]
- more or less standard-conformant (it will parse any standard-compliant file correctly, with the exception of DTD related issues)
- pretty much error-ignorant (it will not choke on something like You & Me, like expat will; it will parse files with data in wrong encoding; and so on)
- clean interface (a heavily refactored pugxml's one)
- more or less Unicode-aware (actually, it assumes UTF-8 encoding of the input data, though it will readily work with ANSI; no UTF-16 for now (see Future work)), with helper conversion functions (UTF-8 <-> UTF-16/32, whichever is the default for std::wstring & wchar_t)
- fully standard compliant C++ code (approved by Comeau strict mode); the library is multiplatform (see reference for platforms list)
- high flexibility. You can control many aspects of file parsing and DOM tree building via parsing options.
Okay, you might ask - what's the catch? Everything is so cute - it's a small, fast, robust, clean solution for parsing XML. What is missing? Ok, we are fair developers - so here is a misfeature list:
- memory consumption. It beats every DOM-based parser that I know of - but when a SAX parser comes along, there is no chance. You can't process a 2 Gb XML file with less than 4 Gb of memory - and do it fast. Still, pugixml behaves better than all other DOM-based parsers, so if you're stuck with DOM, it's not a problem.
- memory consumption. Ok, I'm repeating myself. Again. Where other parsers will allow you to provide the XML file in constant storage (or even as a memory mapped area), pugixml will not. So you'll have to copy the entire data into non-constant storage. Moreover, it should persist during the parser's lifetime (the reasons for that and more about lifetimes are written below). Again, if you're ok with DOM - it should not be a problem, because the overall memory consumption is less (well, though you'll need a contiguous chunk of memory, which can be a problem).
- lack of validation, DTD processing, XML namespaces, proper handling of encoding. If you need those - go take MSXML or XercesC or anything like that.
Answer by stephan
TinyXML is probably a good choice. As for Boost:
There is the Property_Tree library in the Boost Repository. It has been accepted, but support seems to be lacking at the moment (EDIT: Property_Tree has been part of Boost since version 1.41; read the documentation regarding its XML functionality).
Daniel Nuffer has implemented an XML parser for Boost Spirit.
Answer by Anteru
Answer by olibre
Boost uses RapidXML, as described in the chapter XML Parser of the page How to Populate a Property Tree:
Unfortunately, there is no XML parser in Boost as of the time of this writing. The library therefore contains the fast and tiny RapidXML parser (currently in version 1.13) to provide XML parsing support. RapidXML does not fully support the XML standard; it is not capable of parsing DTDs and therefore cannot do full entity substitution.
Please also refer to the XML boost tutorial.
As the OP wants a "simple way to use boost to read and write xml files", I provide below a very basic example:
<main>
    <owner>Matt</owner>
    <cats>
        <cat>Scarface Max</cat>
        <cat>Moose</cat>
        <cat>Snowball</cat>
        <cat>Powerball</cat>
        <cat>Miss Pudge</cat>
        <cat>Needlenose</cat>
        <cat>Sweety Pie</cat>
        <cat>Peacey</cat>
        <cat>Funnyface</cat>
    </cats>
</main>
(cat names are from Matt Mahoney's homepage)
The corresponding structure in C++:
#include <set>
#include <string>

struct Catowner
{
    std::string owner;
    std::set<std::string> cats;
};
read_xml() usage:
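#include <boost/foreach.hpp>
#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/xml_parser.hpp>

// Load a Catowner from the XML file at the given path.
Catowner load(const std::string &file)
{
    boost::property_tree::ptree pt;
    boost::property_tree::read_xml(file, pt);

    Catowner co;
    co.owner = pt.get<std::string>("main.owner");

    // Each child of <cats> is a (key, subtree) pair; the cat name is the subtree's data.
    BOOST_FOREACH(
        boost::property_tree::ptree::value_type &v,
        pt.get_child("main.cats"))
        co.cats.insert(v.second.data());

    return co;
}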
write_xml() usage:
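// Write a Catowner to the XML file at the given path.
void save(const Catowner &co, const std::string &file)
{
    boost::property_tree::ptree pt;
    pt.put("main.owner", co.owner);

    // add() appends a new <cat> node each time; put() would overwrite the first one.
    BOOST_FOREACH(
        const std::string &name, co.cats)
        pt.add("main.cats.cat", name);

    boost::property_tree::write_xml(file, pt);
}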
Answer by StackedCrooked
Boost does not provide an XML parser at the moment.
Poco XML (part of the Poco C++ libs) is good and simple.
Answer by Skurmedel
Well, there is no specific library in Boost for XML parsing, but there are lots of alternatives. Here are a few: libxml, Xerces, Expat
Of course you could use some of the other libraries in boost to aid you in making your own library, but that will probably be quite an undertaking.
And here is a whole article on the subject by IBM.
Answer by Stuart
It would appear that Boost serialization can read from and write to archives in XML, if that's sufficient for your purposes.
Answer by StfnoPad
Definitely use TinyXML *thumbs up*
Answer by Vladimir Prus
If you are looking for DOM functionality only, there are some suggestions already in this thread. I personally would probably not bother with a library lacking XPath support, and in C++ I would use Qt. There's also TinyXPath, and Arabica claims to have XPath support, but I cannot say anything at all about those.