XML parsing in Python

Question

Asked by Alex

I'd like to parse a simple, small XML file using Python; however, work on PyXML seems to have ceased. I'd like to use Python 2.6 if possible. Can anyone recommend an XML parser that will work with 2.6?

Thanks

Answer 1

Answered by Eli Courtwright

If it's small and simple then just use the standard library:

from xml.dom.minidom import parse
doc = parse("filename.xml")

This will return a DOM tree implementing the standard Document Object Model API.
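
To give an idea of what working with that DOM tree looks like, here is a minimal sketch. It uses parseString on an inline document instead of a file so that it runs on its own; the servers/server element names and their attributes are made up for illustration and are not from the question.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, inlined so the example is self-contained.
xml_text = """<servers>
  <server name="alpha" port="8080">primary</server>
  <server name="beta" port="8081">backup</server>
</servers>"""

doc = parseString(xml_text)

servers = []
for element in doc.getElementsByTagName("server"):
    name = element.getAttribute("name")
    port = int(element.getAttribute("port"))
    # firstChild is the text node inside <server>...</server>
    role = element.firstChild.data.strip()
    servers.append((name, port, role))

print(servers)
```

getElementsByTagName and getAttribute are part of the standard DOM API, so the same calls should work on the doc returned by parse("filename.xml").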

If you later need to do complex things like schema validation or XPath querying then I recommend the third-party lxml module, which is a wrapper around the popular libxml2 C library.

Answer 2

Answered by Alex

For most of my tasks I have used minidom, the lightweight DOM implementation in the standard library. From the official documentation page:

from xml.dom.minidom import parse, parseString

# Raw strings keep the backslashes literal (otherwise '\t' becomes a tab).
dom1 = parse(r'c:\temp\mydata.xml')  # parse an XML file by name

datasource = open(r'c:\temp\mydata.xml')
dom2 = parse(datasource)  # parse an open file

dom3 = parseString('<myxml>Some data<empty/> some more data</myxml>')
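
Once you have a document object, you can walk it with the usual DOM calls. The following is only a small sketch built on the dom3 example above; the variable names root, text_parts and empty_elements are mine, not from the documentation page.

```python
from xml.dom.minidom import parseString

dom3 = parseString('<myxml>Some data<empty/> some more data</myxml>')
root = dom3.documentElement

# Collect the text nodes that are direct children of the root element.
text_parts = [node.data for node in root.childNodes
              if node.nodeType == node.TEXT_NODE]

# Find the <empty/> element by tag name.
empty_elements = dom3.getElementsByTagName('empty')

print(root.tagName)
print(text_parts)
print(len(empty_elements))
```

The root element here has three child nodes: a text node, the empty element, and another text node.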

Answer 3

Answered by Andrei Vajna II

Here is also a very good example of how to use minidom, along with explanations.

Answer 4

Answered by Il-Bhima

Would lxml suit your needs? It's the first tool I turn to for XML parsing.

Answer 5

Answered by steveha

A few years ago, I wrote a library for working with structured XML. It makes XML simpler by making some limiting assumptions.

You could use XML for something like a word processor document, in which case you have a complicated soup of stuff with XML tags embedded all over the place; my library would not be a good fit for that.

But if you are using XML for something like a config file, my library is rather convenient. You define classes that describe the structure of the XML you want, and once you have the classes done, there is a method to slurp in XML and parse it. The actual parsing is done by xml.dom.minidom, but then my library extracts the data and puts it in the classes.

The best part: you can declare a "Collection" type that will be a Python list with zero or more other XML elements inside it. This is great for things like Atom or RSS feeds (which was the original reason I designed the library).
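
To illustrate the general idea (parse with xml.dom.minidom, then copy the data into your own classes and a Python list), here is a rough sketch. It is not the API of the library linked below; the Entry class, the child_text and load_entries helpers, and the feed/entry/title/link element names are all hypothetical.

```python
from xml.dom.minidom import parseString


class Entry(object):
    """Hypothetical class holding one <entry> of a feed-like document."""

    def __init__(self, title, link):
        self.title = title
        self.link = link


def child_text(element, tag):
    """Return the text of the first <tag> descendant, or '' if absent or empty."""
    matches = element.getElementsByTagName(tag)
    if not matches or matches[0].firstChild is None:
        return ""
    return matches[0].firstChild.data.strip()


def load_entries(xml_text):
    """Parse with minidom, then copy the data into a plain list of Entry objects."""
    doc = parseString(xml_text)
    return [Entry(child_text(e, "title"), child_text(e, "link"))
            for e in doc.getElementsByTagName("entry")]


feed_xml = """<feed>
  <entry><title>First post</title><link>http://example.com/1</link></entry>
  <entry><title>Second post</title><link>http://example.com/2</link></entry>
</feed>"""

entries = load_entries(feed_xml)
print([(e.title, e.link) for e in entries])
```

The list returned by load_entries holds zero or more Entry objects, so a document with no entry elements simply gives an empty list.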

Here's the URL: http://home.avvanta.com/~steveha/xe.html

I'd be happy to answer questions if you have any.
