Original source: http://stackoverflow.com/questions/127606/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
Editing XML as a dictionary in python?
Asked by Jon Clegg
I'm trying to generate customized xml files from a template xml file in python.
Conceptually, I want to read in the template xml, remove some elements, change some text attributes, and write the new xml out to a file. I wanted it to work something like this:
conf_base = ConvertXmlToDict('config-template.xml')
conf_base_dict = conf_base.UnWrap()
del conf_base_dict['root-name']['level1-name']['leaf1']
del conf_base_dict['root-name']['level1-name']['leaf2']
conf_new = ConvertDictToXml(conf_base_dict)
# now I want to write to file, but I don't see how to get to ElementTree.ElementTree.write()
conf_new.write('config-new.xml')
Is there some way to do this, or can someone suggest doing this a different way?
Accepted answer by Chris Lawlor
For easy manipulation of XML in python, I like the Beautiful Soup library. It works something like this:
Sample XML File:
<root>
<level1>leaf1</level1>
<level2>leaf2</level2>
</root>
Python code:
from BeautifulSoup import BeautifulStoneSoup, Tag, NavigableString
soup = BeautifulStoneSoup(open('config-template.xml').read()) # parse the file's contents, not the file name string
soup.contents[0].name
# u'root'
You can use the node names as methods:
soup.root.contents[0].name
# u'level1'
It is also possible to use regexes:
import re
tags_starting_with_level = soup.findAll(re.compile('^level'))
for tag in tags_starting_with_level: print tag.name
# level1
# level2
Adding and inserting new nodes is pretty straightforward:
# build and insert a new level with a new leaf
level3 = Tag(soup, 'level3')
level3.insert(0, NavigableString('leaf3'))
soup.root.insert(2, level3)
print soup.prettify()
# <root>
# <level1>
# leaf1
# </level1>
# <level2>
# leaf2
# </level2>
# <level3>
# leaf3
# </level3>
# </root>
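For comparison, roughly the same insertion can be sketched with the standard library's xml.etree.ElementTree. This is a swapped-in technique, not the Beautiful Soup API used above; Python 3 is assumed, and the inline XML string is a stand-in for the sample file:

```python
import xml.etree.ElementTree as ET

# Inline copy of the sample XML (assumption: stands in for config-template.xml)
root = ET.fromstring("<root><level1>leaf1</level1><level2>leaf2</level2></root>")

# Build a new element with text and insert it as the third child of the root
level3 = ET.Element("level3")
level3.text = "leaf3"
root.insert(2, level3)

# Serialize the modified tree back to a string
result = ET.tostring(root, encoding="unicode")
print(result)
```

Unlike prettify(), tostring() does not re-indent the output, so the result is a single line.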
Answer by
This'll get you a dict minus attributes... dunno if this is useful to anyone. I was looking for an xml to dict solution myself when I came up with this.
import xml.etree.ElementTree as etree
tree = etree.parse('test.xml')
root = tree.getroot()
def xml_to_dict(el):
    d = {}
    if el.text:
        d[el.tag] = el.text
    else:
        d[el.tag] = {}
    # list(el) replaces getchildren(), which was removed in Python 3.9
    children = list(el)
    if children:
        # child elements take precedence over any text
        d[el.tag] = [xml_to_dict(child) for child in children]
    return d
This: http://www.w3schools.com/XML/note.xml
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
Would equal this:
{'note': [{'to': 'Tove'},
{'from': 'Jani'},
{'heading': 'Reminder'},
{'body': "Don't forget me this weekend!"}]}
Answer by Torsten Marek
I'm not sure if converting the info set to nested dicts first is easier. Using ElementTree, you can do this:
import xml.etree.ElementTree as ET
doc = ET.parse("template.xml")
lvl1 = doc.findall("level1-name")[0]
lvl1.remove(lvl1.find("leaf1"))
lvl1.remove(lvl1.find("leaf2"))
# or use del lvl1[idx]
doc.write("config-new.xml")
ElementTree was designed so that you don't have to convert your XML trees to lists and attributes first, since it uses exactly that internally.
It also supports a small subset of XPath.
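A self-contained sketch of the approach above, assuming Python 3. The template string is hypothetical (it only mirrors the element names from the question), and a temporary directory stands in for the real output location:

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical template mirroring the names used in the question
template = """<root-name>
  <level1-name>
    <leaf1>a</leaf1>
    <leaf2>b</leaf2>
    <leaf3>c</leaf3>
  </level1-name>
</root-name>"""

doc = ET.ElementTree(ET.fromstring(template))
lvl1 = doc.find("level1-name")  # the path is relative to the root element

# Remove the unwanted leaves, skipping any that are absent
for name in ("leaf1", "leaf2"):
    leaf = lvl1.find(name)
    if leaf is not None:
        lvl1.remove(leaf)

# Limited XPath: './/leaf3' matches leaf3 elements at any depth below the root
remaining = [el.tag for el in doc.findall(".//leaf3")]

# Write the modified tree to a file in a temporary directory
out_path = os.path.join(tempfile.mkdtemp(), "config-new.xml")
doc.write(out_path)
```

Checking find() for None before calling remove() avoids an error when the template lacks one of the leaves.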
Answer by Mark
My modification of Daniel's answer, to give a marginally neater dictionary:
def xml_to_dictionary(element):
    # `namespace` is a module-level string (see the note after this function)
    l = len(namespace)
    dictionary = {}
    tag = element.tag[l:]
    if element.text:
        if element.text == ' ':
            dictionary[tag] = {}
        else:
            dictionary[tag] = element.text
    # list(element) replaces getchildren(), which was removed in Python 3.9
    children = list(element)
    if children:
        subdictionary = {}
        for child in children:
            for k, v in xml_to_dictionary(child).items():
                if k in subdictionary:
                    # repeated subtags are collected into a list
                    if isinstance(subdictionary[k], list):
                        subdictionary[k].append(v)
                    else:
                        subdictionary[k] = [subdictionary[k], v]
                else:
                    subdictionary[k] = v
        if dictionary[tag] == {}:
            dictionary[tag] = subdictionary
        else:
            dictionary[tag] = [dictionary[tag], subdictionary]
    if element.attrib:
        attribs = {}
        for k, v in element.attrib.items():
            attribs[k] = v
        if dictionary[tag] == {}:
            dictionary[tag] = attribs
        else:
            dictionary[tag] = [dictionary[tag], attribs]
    return dictionary
namespace is the xmlns string, including braces, that ElementTree prepends to all tags. Here I strip it, as there is one namespace for the entire document.
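The answer does not show how namespace is obtained. One possible way, sketched here under the assumption that the whole document uses a single namespace, is to slice it off the root tag. The sample document and its URI are hypothetical:

```python
import xml.etree.ElementTree as ET

# Hypothetical document with a single default namespace
content = '<note xmlns="http://example.com/ns"><to>Tove</to></note>'
root = ET.fromstring(content)

# ElementTree stores tags as '{uri}local'; keep the '{uri}' prefix, braces included
if root.tag.startswith("{"):
    namespace = root.tag[: root.tag.index("}") + 1]
else:
    namespace = ""

# Stripping the prefix leaves the plain local names
local_names = [child.tag[len(namespace):] for child in root]
print(namespace, local_names)
```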
NB: I adjusted the raw XML too, so that 'empty' tags produce at most a ' ' text property in the ElementTree representation:
import re
from xml.etree import ElementTree
spacepattern = re.compile(r'\s+')
mydictionary = xml_to_dictionary(ElementTree.XML(spacepattern.sub(' ', content))) # content: the raw XML string
This would give, for instance:
{'note': {'to': 'Tove',
'from': 'Jani',
'heading': 'Reminder',
'body': "Don't forget me this weekend!"}}
It's designed for specific XML that is basically equivalent to JSON. It should also handle element attributes such as
<elementName attributeName='attributeContent'>elementContent</elementName>
The attribute dictionary and the subtag dictionary could be merged, similarly to how repeated subtags are merged, although nesting the lists seems kind of appropriate :-)
Answer by Robbo
Adding this line
d.update(('@' + k, v) for k, v in el.attrib.items())
to user247686's code lets you capture node attributes too.
Found it in this post: https://stackoverflow.com/a/7684581/1395962
Example:
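import xml.etree.ElementTree as etree
try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib import urlopen  # Python 2
xml_file = "http://your_xml_url"
tree = etree.parse(urlopen(xml_file))
root = tree.getroot()
def xml_to_dict(el):
    d = {}
    if el.text:
        d[el.tag] = el.text
    else:
        d[el.tag] = {}
    # list(el) replaces getchildren(), which was removed in Python 3.9
    children = list(el)
    if children:
        d[el.tag] = [xml_to_dict(child) for child in children]
    # attributes are stored next to the tag, under '@'-prefixed keys
    d.update(('@' + k, v) for k, v in el.attrib.items())
    return d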
Call as
xml_to_dict(root)
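To see what the '@' keys look like without fetching a URL, here is a minimal Python 3 sketch of the same idea. The input string is a made-up sample, not taken from the answer, and the function is a condensed variant rather than the exact code above:

```python
import xml.etree.ElementTree as ET

def xml_to_dict(el):
    """Convert an element to a dict; attributes go under '@name' keys."""
    children = list(el)
    if children:
        # child elements take precedence over any text
        d = {el.tag: [xml_to_dict(child) for child in children]}
    elif el.text:
        d = {el.tag: el.text}
    else:
        d = {el.tag: {}}
    d.update(("@" + k, v) for k, v in el.attrib.items())
    return d

# Hypothetical sample input
root = ET.fromstring('<note lang="en"><to id="1">Tove</to><from>Jani</from></note>')
result = xml_to_dict(root)
print(result)
```

Note that an attribute sits beside the tag key rather than inside it, so an element named like an '@'-prefixed attribute key would collide; for typical XML names this should not occur.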
Answer by nealmcb
XML has a rich infoset, and it takes some special tricks to represent that in a Python dictionary. Elements are ordered, attributes are distinguished from element bodies, etc.
One project that handles round-trips between XML and Python dictionaries, with configuration options for handling the tradeoffs in different ways, is XML Support in Pickling Tools. Version 1.3 or newer is required. It isn't pure Python (and in fact is designed to make C++ / Python interaction easier), but it might be appropriate for various use cases.
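A small illustration of why the infoset does not map cleanly onto a plain dictionary, using only the standard library (this is not the Pickling Tools API, and the sample document is hypothetical). Repeated sibling tags overwrite each other in a flat dict, and attributes are lost unless they are stored explicitly:

```python
import xml.etree.ElementTree as ET

# Hypothetical document with repeated, ordered, attributed elements
root = ET.fromstring('<items><item id="1">a</item><item id="2">b</item></items>')

# Naive flat mapping: the second <item> overwrites the first
flat = {}
for child in root:
    flat[child.tag] = child.text

# A list of (tag, attributes, text) tuples keeps order, repeats, and attributes
records = [(child.tag, dict(child.attrib), child.text) for child in root]
print(flat)
print(records)
```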
Answer by Loooo
The most direct way, for me:
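import xml.etree.ElementTree as ET
tree = ET.parse(xh)  # xh: path or open file handle of the XML document
data = tree.getroot()
xdic = {}
if data is not None:
    # map each direct child's tag to its text (nested elements are ignored)
    for part in data:
        xdic[part.tag] = part.text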
Answer by S.Lott
Have you tried this?
print xml.etree.ElementTree.tostring( conf_new )
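If conf_new is a bare Element, one way to reach write() is to wrap it in an ElementTree first. A sketch assuming Python 3, with a hypothetical stand-in for conf_new and an in-memory buffer in place of a real file:

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical stand-in for conf_new: a bare Element, not an ElementTree
conf_new = ET.fromstring(
    "<root-name><level1-name><leaf3>c</leaf3></level1-name></root-name>"
)

# tostring() serializes an Element directly
text = ET.tostring(conf_new, encoding="unicode")

# write() is a method of ElementTree, so wrap the Element first
buffer = io.BytesIO()
ET.ElementTree(conf_new).write(buffer, encoding="utf-8")
written = buffer.getvalue()
print(text)
```

write() also accepts a file name, so passing 'config-new.xml' instead of the buffer would write the document to disk.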