Convert XML to a Python dict
Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/17177109/
Asked by Alfredo Solís
I'm trying to write a dict subclass to process XML, but I'm stuck and have run out of ideas. If someone could offer guidance on this, that would be great.
Code developed so far:
class XMLResponse(dict):
    def __init__(self, xml):
        # The xml argument is not parsed yet; this is where the author got stuck.
        self.result = True
        self.message = ''

    def __setattr__(self, name, val):
        # Store attributes as dict items.
        self[name] = val

    def __getattr__(self, name):
        # Look attributes up as dict items; missing names give None.
        if name in self:
            return self[name]
        return None

message = """<?xml version="1.0"?><note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"""
XMLResponse(message)
Accepted answer by alecxe
You can use the xmltodict module:
import xmltodict
message = """<?xml version="1.0"?><note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"""
print(xmltodict.parse(message)['note'])
which produces an OrderedDict:
OrderedDict([(u'to', u'Tove'), (u'from', u'Jani'), (u'heading', u'Reminder'), (u'body', u"Don't forget me this weekend!")])
which can be converted to a dict if order doesn't matter:
print(dict(xmltodict.parse(message)['note']))
Prints:
{u'body': u"Don't forget me this weekend!", u'to': u'Tove', u'from': u'Jani', u'heading': u'Reminder'}
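Note that dict() only converts the outermost mapping, so with nested XML the inner levels would stay as OrderedDict. A minimal sketch of a recursive conversion, using only the standard library. The helper name to_plain_dict is hypothetical and not part of xmltodict, and the sample data is hand-built to mimic what a nested parse might return:

```python
from collections import OrderedDict


def to_plain_dict(value):
    """Recursively convert OrderedDict (and lists of them) to plain dicts."""
    if isinstance(value, dict):
        return {key: to_plain_dict(item) for key, item in value.items()}
    if isinstance(value, list):
        return [to_plain_dict(item) for item in value]
    return value


# Hand-built stand-in for a nested parse result (an assumption, not real xmltodict output)
nested = OrderedDict([
    ("note", OrderedDict([
        ("to", "Tove"),
        ("recipients", OrderedDict([("name", ["Jani", "Ola"])])),
    ])),
])

shallow = dict(nested)        # only the top level becomes a plain dict
deep = to_plain_dict(nested)  # every level becomes a plain dict
```

Here shallow["note"] is still an OrderedDict, while deep contains plain dicts at every level.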
Answer by dusual
You should check out
https://github.com/martinblech/xmltodict
I think it is one of the best standard XML-to-dict handlers I have seen.
However, I should warn you that XML and dict are not fully compatible data structures.
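To illustrate the mismatch: an element can carry attributes, and the same tag can repeat under one parent, and a flat dict has no natural place for either. A minimal sketch using the standard library's xml.etree.ElementTree (not xmltodict); the naive_dict helper and the sample XML are made up for illustration:

```python
import xml.etree.ElementTree as ET

xml_text = '<note id="7"><to>Tove</to><to>Ola</to><from>Jani</from></note>'


def naive_dict(element):
    """Map each child tag to its text; a deliberately naive conversion."""
    return {child.tag: child.text for child in element}


root = ET.fromstring(xml_text)
result = naive_dict(root)

# The second <to> overwrites the first, and the id attribute is dropped.
print(result)  # {'to': 'Ola', 'from': 'Jani'}
```

Any real converter has to pick a convention for these cases, for example turning repeated tags into lists and storing attributes under specially prefixed keys.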
Answer by radtek
You can use the lxml library. Convert the string to an XML object using objectify.fromstring and then read the object's __dict__ attribute. For example:
from lxml import objectify
xml_string = """<?xml version="1.0" encoding="UTF-8"?><NewOrderResp><IndustryType></IndustryType><MessageType>R</MessageType><MerchantID>700000005894</MerchantID><TerminalID>0031</TerminalID><CardBrand>AMEX</CardBrand><AccountNum>3456732800000010</AccountNum><OrderID>TESTORDER1</OrderID><TxRefNum>55A69B278025130CD36B3A95435AA84DC45363</TxRefNum><TxRefIdx>10</TxRefIdx><ProcStatus>0</ProcStatus><ApprovalStatus>1</ApprovalStatus><RespCode></RespCode><AVSRespCode></AVSRespCode><CVV2RespCode></CVV2RespCode><AuthCode></AuthCode><RecurringAdviceCd></RecurringAdviceCd><CAVVRespCode></CAVVRespCode><StatusMsg></StatusMsg><RespMsg></RespMsg><HostRespCode></HostRespCode><HostAVSRespCode></HostAVSRespCode><HostCVV2RespCode></HostCVV2RespCode><CustomerRefNum>A51C5B2B1811E5991208</CustomerRefNum><CustomerName>BOB STEVEN</CustomerName><ProfileProcStatus>0</ProfileProcStatus><CustomerProfileMessage>Profile Created</CustomerProfileMessage><RespTime>13055</RespTime><PartialAuthOccurred></PartialAuthOccurred><RequestedAmount></RequestedAmount><RedeemedAmount></RedeemedAmount><RemainingBalance></RemainingBalance><CountryFraudFilterStatus></CountryFraudFilterStatus><IsoCountryCode></IsoCountryCode></NewOrderResp>"""
xml_object = objectify.fromstring(xml_string)
print(xml_object.__dict__)
Reading the XML object's __dict__ returns a dict:
{'RemainingBalance': u'', 'AVSRespCode': u'', 'RequestedAmount': u'', 'AccountNum': 3456732800000010, 'IsoCountryCode': u'', 'HostCVV2RespCode': u'', 'TerminalID': 31, 'CVV2RespCode': u'', 'RespMsg': u'', 'CardBrand': 'AMEX', 'MerchantID': 700000005894, 'RespCode': u'', 'ProfileProcStatus': 0, 'CustomerName': 'BOB STEVEN', 'PartialAuthOccurred': u'', 'MessageType': 'R', 'ProcStatus': 0, 'TxRefIdx': 10, 'RecurringAdviceCd': u'', 'IndustryType': u'', 'OrderID': 'TESTORDER1', 'StatusMsg': u'', 'ApprovalStatus': 1, 'RedeemedAmount': u'', 'CountryFraudFilterStatus': u'', 'TxRefNum': '55A69B278025130CD36B3A95435AA84DC45363', 'CustomerRefNum': 'A51C5B2B1811E5991208', 'CustomerProfileMessage': 'Profile Created', 'AuthCode': u'', 'RespTime': 13055, 'HostAVSRespCode': u'', 'CAVVRespCode': u'', 'HostRespCode': u''}
The XML string I used is a response from the Paymentech payments gateway, just to show a real-world example.
Also note that the above example is not recursive, so if there are dicts within dicts you have to do some recursion. Here is a recursive function I wrote that you can use:
from lxml import objectify

def xml_to_dict_recursion(xml_object):
    dict_object = xml_object.__dict__
    if not dict_object:
        # No child elements: return the leaf element itself.
        return xml_object
    for key, value in dict_object.items():
        dict_object[key] = xml_to_dict_recursion(value)
    return dict_object

def xml_to_dict(xml_str):
    return xml_to_dict_recursion(objectify.fromstring(xml_str))

xml_string = """<?xml version="1.0" encoding="UTF-8"?><Response><NewOrderResp>
<IndustryType>Test</IndustryType><SomeData><SomeNestedData1>1234</SomeNestedData1>
<SomeNestedData2>3455</SomeNestedData2></SomeData></NewOrderResp></Response>"""

print(xml_to_dict(xml_string))
Here's a variant that preserves the parent key / element:
from lxml import objectify

def xml_to_dict(xml_str):
    """ Convert xml to dict, using lxml v3.4.2 xml processing library, see http://lxml.de/ """
    def xml_to_dict_recursion(xml_object):
        dict_object = xml_object.__dict__
        if not dict_object:  # if empty dict returned
            return xml_object
        for key, value in dict_object.items():
            dict_object[key] = xml_to_dict_recursion(value)
        return dict_object
    xml_obj = objectify.fromstring(xml_str)
    # Wrap the result under the root tag so the parent key is kept.
    return {xml_obj.tag: xml_to_dict_recursion(xml_obj)}
And if you only want to return a subtree and convert it to a dict, you can use Element.find():
xml_obj.find('.//') # lxml.objectify.ObjectifiedElement instance
There are many options to accomplish this, but this one is great if you're already using lxml. In this example lxml-3.4.2 was used. Cheers!
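As a rough illustration of the subtree idea, here is a sketch that uses the standard library's xml.etree.ElementTree instead of lxml.objectify. This is a swapped-in parser, so leaf values come back as plain strings rather than the typed values lxml.objectify produces. The element_to_dict helper is hypothetical, and the path './/SomeData' is chosen to match the sample XML from the recursive example:

```python
import xml.etree.ElementTree as ET

xml_string = """<Response><NewOrderResp>
<IndustryType>Test</IndustryType><SomeData><SomeNestedData1>1234</SomeNestedData1>
<SomeNestedData2>3455</SomeNestedData2></SomeData></NewOrderResp></Response>"""


def element_to_dict(element):
    """Convert an element to nested dicts; leaf elements become their text."""
    children = list(element)
    if not children:
        return element.text
    return {child.tag: element_to_dict(child) for child in children}


root = ET.fromstring(xml_string)
subtree = root.find(".//SomeData")  # first SomeData element anywhere below root
subtree_dict = {subtree.tag: element_to_dict(subtree)}
print(subtree_dict)
# {'SomeData': {'SomeNestedData1': '1234', 'SomeNestedData2': '3455'}}
```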
Answer by Fred
You'd think that by now we'd have a good answer to this one, but apparently we don't. After reviewing half a dozen similar questions on Stack Overflow, here is what worked for me:
from lxml import etree
# arrow is an awesome lib for dealing with dates in python
import arrow

# converts an etree to dict, useful to convert xml to dict
def etree2dict(tree):
    root, contents = recursive_dict(tree)
    return {root: contents}

def recursive_dict(element):
    # Elements marked type="array" become a list of their converted children.
    if element.attrib and 'type' in element.attrib and element.attrib['type'] == "array":
        return element.tag, [(dict(map(recursive_dict, child)) or getElementValue(child)) for child in element]
    else:
        return element.tag, dict(map(recursive_dict, element)) or getElementValue(element)

def getElementValue(element):
    if element.text:
        if element.attrib and 'type' in element.attrib:
            attr_type = element.attrib.get('type')
            if attr_type == 'integer':
                return int(element.text.strip())
            if attr_type == 'float':
                return float(element.text.strip())
            if attr_type == 'boolean':
                return element.text.lower().strip() == 'true'
            if attr_type == 'datetime':
                # .timestamp is a property in older arrow releases;
                # arrow 1.0+ exposes it as a method, .timestamp()
                return arrow.get(element.text.strip()).timestamp
        else:
            return element.text
    elif element.attrib:
        if 'nil' in element.attrib:
            return None
        else:
            return element.attrib
    else:
        return None
And this is how you use it:
from lxml import etree

message = """<?xml version="1.0"?><note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"""
tree = etree.fromstring(message)
etree2dict(tree)
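The note example has no type attributes, so the type handling is never exercised by it. For reference, a hedged sketch of the kind of input the code above appears to target (elements annotated with type="integer", type="boolean" and so on), using the standard library's xml.etree.ElementTree rather than lxml and leaving out arrow. The convert_typed helper, the CONVERTERS table and the order XML are invented for illustration:

```python
import xml.etree.ElementTree as ET

typed_xml = """<order>
  <id type="integer">42</id>
  <total type="float">19.5</total>
  <paid type="boolean">true</paid>
  <customer>Tove</customer>
</order>"""

# Maps the value of the type attribute to a converter function.
CONVERTERS = {
    "integer": int,
    "float": float,
    "boolean": lambda text: text.lower() == "true",
}


def convert_typed(element):
    """Return the element text converted according to its type attribute."""
    text = (element.text or "").strip()
    # Elements without a known type attribute are kept as strings.
    converter = CONVERTERS.get(element.attrib.get("type"), str)
    return converter(text)


root = ET.fromstring(typed_xml)
order = {child.tag: convert_typed(child) for child in root}
print(order)  # {'id': 42, 'total': 19.5, 'paid': True, 'customer': 'Tove'}
```

Unlike the answer's getElementValue, this sketch falls back to the raw string for unknown type values instead of returning None.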
Hope it helps :-)

