Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/3605680/

Date: 2020-08-18 11:54:48  Source: igfitidea

Creating a simple XML file using python

Tags: python, xml

Asked by Blankman

What are my options if I want to create a simple XML file in python? (library wise)

The xml I want looks like:

<root>
 <doc>
     <field1 name="blah">some value1</field1>
     <field2 name="asdfasd">some value2</field2>
 </doc>

</root>

Accepted answer by ssokolow

These days, the most popular (and very simple) option is the ElementTree API, which has been included in the standard library since Python 2.5.

The available options for that are:

  • ElementTree (Basic, pure-Python implementation of ElementTree. Part of the standard library since 2.5)
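  • cElementTree (Optimized C implementation of ElementTree. Also offered in the standard library since 2.5; deprecated since Python 3.3, when xml.etree.ElementTree began using the C accelerator automatically, and removed in 3.9)
  • LXML (Based on libxml2. Offers a rich superset of the ElementTree API as well as XPath, CSS Selectors, and more)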

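Here's an example of how to generate your example document using the in-stdlib ElementTree (on Python 2 you could import xml.etree.cElementTree instead; that module was removed in Python 3.9, and xml.etree.ElementTree already uses the C accelerator when it is available):

import xml.etree.ElementTree as ET

# Build the tree: <root><doc>...</doc></root>
root = ET.Element("root")
doc = ET.SubElement(root, "doc")

# Keyword arguments become attributes; .text sets the element's content
ET.SubElement(doc, "field1", name="blah").text = "some value1"
ET.SubElement(doc, "field2", name="asdfasd").text = "some value2"

# Wrap the root in an ElementTree to serialize it to a file
tree = ET.ElementTree(root)
tree.write("filename.xml")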

I've tested it and it works, but I'm assuming whitespace isn't significant. If you need "prettyprint" indentation, let me know and I'll look up how to do that. (It may be an LXML-specific option. I don't use the stdlib implementation much)
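
On the pretty-printing question, here is a hedged sketch rather than the answer author's own method: assuming Python 3.9 or newer, the standard library's xml.etree.ElementTree.indent() adds indentation to a tree in place, so lxml is not strictly required for this. The build_tree helper name is only illustrative.

```python
import xml.etree.ElementTree as ET


def build_tree():
    """Build the example document from the question (helper name is illustrative)."""
    root = ET.Element("root")
    doc = ET.SubElement(root, "doc")
    ET.SubElement(doc, "field1", name="blah").text = "some value1"
    ET.SubElement(doc, "field2", name="asdfasd").text = "some value2"
    return root


root = build_tree()
# ET.indent() (Python 3.9+) inserts whitespace into the tree in place.
ET.indent(root, space="  ")
pretty_xml = ET.tostring(root, encoding="unicode")
print(pretty_xml)
```

On older Python versions, lxml's pretty_print=True option to tostring() is one alternative.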

For further reading, here are some useful links:

As a final note, either cElementTree or LXML should be fast enough for all your needs (both are optimized C code), but in the event you're in a situation where you need to squeeze out every last bit of performance, the benchmarks on the LXML site indicate that:

  • LXML clearly wins for serializing (generating) XML
  • As a side-effect of implementing proper parent traversal, LXML is a bit slower than cElementTree for parsing.

Answer by whaley

For the simplest choice, I'd go with minidom: http://docs.python.org/library/xml.dom.minidom.html. It is built in to the python standard library and is straightforward to use in simple cases.

Here's a pretty easy to follow tutorial: http://www.boddie.org.uk/python/XML_intro.html
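
For reference, a minimal sketch of what the minidom approach might look like for the document in the question. It uses only xml.dom.minidom from the standard library; the variable names and the loop over field tuples are illustrative choices, not taken from the linked tutorial.

```python
from xml.dom import minidom

# Create an empty document and build <root><doc>...</doc></root> node by node.
document = minidom.Document()
root = document.createElement("root")
document.appendChild(root)
doc = document.createElement("doc")
root.appendChild(doc)

for tag, name, value in [("field1", "blah", "some value1"),
                         ("field2", "asdfasd", "some value2")]:
    field = document.createElement(tag)
    field.setAttribute("name", name)
    field.appendChild(document.createTextNode(value))
    doc.appendChild(field)

# toprettyxml() returns the serialized document, including the XML declaration.
xml_text = document.toprettyxml(indent="  ")
print(xml_text)
```

Compared with ElementTree, the DOM API is more verbose because every node is created from the document and attached explicitly.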

Answer by rescdsk

The lxml library includes a very convenient syntax for XML generation, called the E-factory. Here's how I'd make the example you give:

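#!/usr/bin/python
import lxml.etree
import lxml.builder

# The E-factory turns attribute access into element constructors
E = lxml.builder.ElementMaker()
ROOT = E.root
DOC = E.doc
FIELD1 = E.field1
FIELD2 = E.field2

# Positional arguments become text or children; keywords become attributes
the_doc = ROOT(
        DOC(
            FIELD1('some value1', name='blah'),
            FIELD2('some value2', name='asdfasd'),
            )
        )

# encoding='unicode' returns str rather than bytes, so print() shows clean text on Python 3
print(lxml.etree.tostring(the_doc, pretty_print=True, encoding='unicode'))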

Output:

<root>
  <doc>
    <field1 name="blah">some value1</field1>
    <field2 name="asdfasd">some value2</field2>
  </doc>
</root>

It also supports adding to an already-made node, e.g. after the above you could say:

the_doc.append(FIELD2('another value again', name='hithere'))

Answer by scls

Yattag (http://www.yattag.org/ or https://github.com/leforestier/yattag) provides an interesting API to create such XML documents (and also HTML documents).

It uses context managers and the with keyword.

from yattag import Doc, indent

doc, tag, text = Doc().tagtext()

with tag('root'):
    with tag('doc'):
        with tag('field1', name='blah'):
            text('some value1')
        with tag('field2', name='asdfasd'):
            text('some value2')

result = indent(
    doc.getvalue(),
    indentation = ' '*4,
    newline = '\r\n'
)

print(result)

So you will get:

<root>
    <doc>
        <field1 name="blah">some value1</field1>
        <field2 name="asdfasd">some value2</field2>
    </doc>
</root>

Answer by bigh_29

For such a simple XML structure, you may not want to involve a full-blown XML module. Consider a string template for the simplest structures, or Jinja for something a little more complex. Jinja can handle looping over a list of data to produce the inner XML of your document list. That is a bit trickier with raw Python string templates.

For a Jinja example, see my answer to a similar question.

Here is an example of generating your xml with string templates.

import string
from xml.sax.saxutils import escape

inner_template = string.Template('    <field${id} name="${name}">${value}</field${id}>')

outer_template = string.Template("""<root>
 <doc>
${document_list}
 </doc>
</root>
 """)

data = [
    (1, 'foo', 'The value for the foo document'),
    (2, 'bar', 'The <value> for the <bar> document'),
]

# Escape each value, then render one <fieldN> line per data tuple
inner_contents = [
    inner_template.substitute(id=field_id, name=name, value=escape(value))
    for (field_id, name, value) in data
]
result = outer_template.substitute(document_list='\n'.join(inner_contents))
print(result)

Output:

<root>
 <doc>
    <field1 name="foo">The value for the foo document</field1>
    <field2 name="bar">The &lt;value&gt; for the &lt;bar&gt; document</field2>
 </doc>
</root>

The downside of the template approach is that you won't get escaping of < and > for free. I danced around that problem by pulling in a util from xml.sax.
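
A related gap worth noting with templates: by default escape() handles only &, < and >, so attribute values that contain quotes need separate treatment. The following is a minimal sketch using xml.sax.saxutils.quoteattr, which escapes the value and wraps it in suitable quotes. The template and sample data here are illustrative and not part of the answer above.

```python
import string
from xml.sax.saxutils import escape, quoteattr

# Note: no quotes around ${name} in the template; quoteattr() supplies them.
field_template = string.Template('<field name=${name}>${value}</field>')

line = field_template.substitute(
    name=quoteattr('a "quoted" & <odd> name'),
    value=escape('1 < 2 & 3 > 2'),
)
print(line)
```

Because quoteattr() picks the quote character itself, the template must leave the attribute value unquoted.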

Answer by Cloughie

I just finished writing an xml generator, using bigh_29's method of Templates ... it's a nice way of controlling what you output without too many Objects getting 'in the way'.

As for the tag and value, I used two arrays: one that gave the tag name and position in the output XML, and another that referenced a parameter file having the same list of tags. The parameter file, however, also has the position number in the corresponding input (CSV) file where the data will be taken from. This way, if there are any changes to the position of the data coming in from the input file, the program doesn't change; it dynamically works out the data field position from the appropriate tag in the parameter file.
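
To make the idea concrete, here is a rough sketch under stated assumptions. The parameter file is modeled as an in-memory dict mapping each tag to its column position in the CSV input, and the names TAG_ORDER, CSV_POSITIONS and row_to_xml are hypothetical, not taken from the actual program described above.

```python
import csv
import io
import string
from xml.sax.saxutils import escape

# Hypothetical stand-ins for the two structures described above.
TAG_ORDER = ["title", "author"]             # tag names, in output order
CSV_POSITIONS = {"title": 1, "author": 0}   # tag -> column index in the CSV

FIELD = string.Template("    <${tag}>${value}</${tag}>")
RECORD = string.Template("  <record>\n${fields}\n  </record>")


def row_to_xml(row):
    """Render one CSV row, looking up each tag's column via CSV_POSITIONS."""
    fields = [
        FIELD.substitute(tag=tag, value=escape(row[CSV_POSITIONS[tag]]))
        for tag in TAG_ORDER
    ]
    return RECORD.substitute(fields="\n".join(fields))


# Sample input (made up for illustration): author in column 0, title in column 1.
csv_input = io.StringIO("Ann,Cats & Dogs\nBob,XML <Basics>\n")
records = [row_to_xml(row) for row in csv.reader(csv_input)]
xml_text = "<root>\n" + "\n".join(records) + "\n</root>"
print(xml_text)
```

If the column order of the input file changes, only CSV_POSITIONS needs updating; the rendering code stays the same.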
