How do I get Python's ElementTree to pretty print to an XML file?
Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/28813876/
Asked by Kimbluey
Background
I am using SQLite to access a database and retrieve the desired information. I'm using ElementTree in Python version 2.6 to create an XML file with that information.
Code
import sqlite3
import xml.etree.ElementTree as ET
# NOTE: Omitted code where I access the database,
# pull data, and add elements to the tree
tree = ET.ElementTree(root)
# Pretty printing to Python shell for testing purposes
from xml.dom import minidom
print minidom.parseString(ET.tostring(root)).toprettyxml(indent = " ")
####### Here lies my problem #######
tree.write("New_Database.xml")
Attempts
I've tried using tree.write("New_Database.xml", "utf-8") in place of the last line of code above, but it did not change the XML's layout at all; it's still a jumbled mess.
I also decided to fiddle around and tried assigning tree = minidom.parseString(ET.tostring(root)).toprettyxml(indent = " ") instead of printing this to the Python shell, which gives the error AttributeError: 'unicode' object has no attribute 'write'.
Questions
When I write my tree to an XML file on the last line, is there a way to pretty print to the XML file as it does to the Python shell?
Can I use toprettyxml() here or is there a different way to do this?
Accepted answer by Jonathan Eunice
Whatever your XML string is, you can write it to the file of your choice by opening a file for writing and writing the string to the file.
from xml.dom import minidom
xmlstr = minidom.parseString(ET.tostring(root)).toprettyxml(indent=" ")
with open("New_Database.xml", "w") as f:
    f.write(xmlstr)
There is one possible complication, especially in Python 2, which is both less strict and less sophisticated about Unicode characters in strings. If your toprettyxml method hands back a Unicode string (u"something"), then you may want to encode it with a suitable file encoding, such as UTF-8. E.g. replace the one write line with:
f.write(xmlstr.encode('utf-8'))
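For readers on Python 3, the same idea can be sketched as follows. This is a minimal illustration, not part of the original answer: the helper name write_pretty_xml, the sample tree, and the temporary output directory are all assumptions made for the example. In Python 3, toprettyxml() called without an encoding argument returns str, so the file is opened in text mode with an explicit encoding instead of calling .encode() by hand.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from xml.dom import minidom


def write_pretty_xml(root, path, indent="  "):
    """Hypothetical helper: write an ElementTree element to `path`, indented."""
    # ET.tostring() returns bytes here; minidom.parseString() accepts bytes.
    rough = ET.tostring(root, encoding="utf-8")
    pretty = minidom.parseString(rough).toprettyxml(indent=indent)
    # `pretty` is a str, so open the file in text mode and let open() encode it.
    with open(path, "w", encoding="utf-8") as f:
        f.write(pretty)
    return pretty


# Stand-in tree; the real one would be built from the database rows.
root = ET.Element("root")
ET.SubElement(root, "record", id="1").text = "example"

# Written to a temporary directory so the sketch does not touch the working directory.
out_path = os.path.join(tempfile.mkdtemp(), "New_Database.xml")
pretty = write_pretty_xml(root, out_path)
print(pretty)
```

Passing encoding="utf-8" to open() keeps the output independent of the platform's default encoding.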
Answer by RJX
Install bs4
pip install bs4
Use this code to pretty print:
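from bs4 import BeautifulSoup
# Placeholder input; substitute your own XML string.
x = "<root><child>text</child></root>"
# The "xml" parser is backed by lxml, which must be installed as well.
print(BeautifulSoup(x, "xml").prettify())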
Answer by vadimk
Take a look at the vkbeautify module.
Input and output can be a string or a file in any combination. It is very compact and doesn't have any dependencies.
import vkbeautify as vkb
# a) Return the pretty-printed XML as a string
pretty_text = vkb.xml(your_xml_text)
# b) Save the pretty-printed XML to a file
vkb.xml(your_xml_text, 'path/to/dest/file')
Answer by Nick
If one wants to use lxml, it could be done in the following way:
from lxml import etree
# root must be an lxml element (built with lxml.etree, not xml.etree.ElementTree)
xml_object = etree.tostring(root,
                            pretty_print=True,
                            xml_declaration=True,
                            encoding='UTF-8')
with open("xmlfile.xml", "wb") as writer:
    writer.write(xml_object)
If you see XML namespaces, e.g. py:pytype="TREE", you might want to add the following before creating xml_object:
etree.cleanup_namespaces(root)
This should be sufficient for any adaptation in your code.
Answer by Ben Anderson
I found a way using straight ElementTree, but it is rather complex.
ElementTree elements have text and tail attributes that you can set, for example, element.text="text" and element.tail="tail". You have to use these in a specific way to get things to line up, so make sure you know your escape characters.
As a basic example, I have the following file:
<?xml version='1.0' encoding='utf-8'?>
<root>
    <data version="1">
        <data>76939</data>
    </data>
    <data version="2">
        <data>266720</data>
        <newdata>3569</newdata>
    </data>
</root>
To place a third element in and keep it pretty, you need the following code:
addElement = ET.Element("data") # Make a new element
addElement.set("version", "3") # Set the element's attribute
addElement.tail = "\n" # Edit the element's tail
addElement.text = "\n\t\t" # Edit the element's text
newData = ET.SubElement(addElement, "data") # Make a subelement and attach it to our element
newData.tail = "\n\t" # Edit the subelement's tail
newData.text = "5431" # Edit the subelement's text
root[-1].tail = "\n\t" # Edit the previous element's tail, so that our new element is properly placed
root.append(addElement) # Add the element to the tree.
To indent the internal tags (like the internal data tag), you have to add it to the text of the parent element. If you want to indent anything after an element (usually after subelements), you put it in the tail.
This code gives the following result when you write it to a file:
<?xml version='1.0' encoding='utf-8'?>
<root>
    <data version="1">
        <data>76939</data>
    </data>
    <data version="2">
        <data>266720</data>
        <newdata>3569</newdata>
    </data> <!--root[-1].tail-->
    <data version="3"> <!--addElement's text-->
        <data>5431</data> <!--newData's tail-->
    </data> <!--addElement's tail-->
</root>
As another note, if you wish to make the program uniformly use \t, you may want to read the file as a string first and replace all of the spaces used for indentation with \t.
This code was written in Python 3.7, but still works in Python 2.7.
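The text/tail rules described in this answer can also be applied to a whole tree by a small recursive function, rather than element by element. The following is only a sketch under stated assumptions: indent_tree is a hypothetical helper name, not part of ElementTree, and the sample XML is a shortened version of the file above. On Python 3.9 and later, the standard library's ET.indent() does a similar job.

```python
import xml.etree.ElementTree as ET


def indent_tree(elem, level=0, unit="\t"):
    """Hypothetical helper: set text/tail whitespace in place so the tree prints indented."""
    children = list(elem)
    if not children:
        return
    # Whitespace before the first child lives in the parent's text.
    if not (elem.text and elem.text.strip()):
        elem.text = "\n" + unit * (level + 1)
    for child in children:
        indent_tree(child, level + 1, unit)
        # Whitespace after a child lives in that child's tail.
        if not (child.tail and child.tail.strip()):
            child.tail = "\n" + unit * (level + 1)
    # The last child's tail positions the parent's closing tag one level out.
    last = children[-1]
    if not (last.tail and last.tail.strip()):
        last.tail = "\n" + unit * level


root = ET.fromstring('<root><data version="1"><data>76939</data></data></root>')
indent_tree(root)
result = ET.tostring(root, encoding="unicode")
print(result)
```

Elements whose text or tail already holds real character data are left untouched, so mixed content is not altered.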