How should I parse this XML string in Python?
Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/14153988/
Asked by Hussain
My XML string is:
xmlData = """<SMSResponse xmlns="http://example.com" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<Cancelled>false</Cancelled>
<MessageID>00000000-0000-0000-0000-000000000000</MessageID>
<Queued>false</Queued>
<SMSError>NoError</SMSError>
<SMSIncomingMessages i:nil="true"/>
<Sent>false</Sent>
<SentDateTime>0001-01-01T00:00:00</SentDateTime>
</SMSResponse>"""
I am trying to parse it and get the values of the tags Cancelled, MessageID, SMSError, etc. I am using Python's ElementTree library. So far, I have tried things like:
import xml.etree.ElementTree as ET

root = ET.fromstring(xmlData)
print(root.find('Sent'))  # gives None
for child in root:
    print(child.find('MessageId'))  # also gives None
However, I am able to print the tags with:
for child in root:
    print(child.tag)
# child.tag for the Cancelled element is '{http://example.com}Cancelled'
and their respective values with:
for child in root:
    print(child.text)
How do I get something like this?
print(child.Queued)  # should print false
In PHP, we can access them through the root like this:
$xml = simplexml_load_string($data);
$status = $xml->SMSError;
Accepted answer by Martijn Pieters
Your document declares a default namespace, so you need to include that namespace when searching:
root = ET.fromstring(xmlData)
print(root.find('{http://example.com}Sent'))
print(root.find('{http://example.com}MessageID'))
Output:
<Element '{http://example.com}Sent' at 0x1043e0690>
<Element '{http://example.com}MessageID' at 0x1043e0350>
The find() and findall() methods also take a namespace map. You can search with an arbitrary prefix, and the prefix will be looked up in that map, which saves typing:
nsmap = {'n': 'http://example.com'}
print(root.find('n:Sent', namespaces=nsmap))
print(root.find('n:MessageID', namespaces=nsmap))
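Since the question asks for the values rather than the Element objects, here is a minimal self-contained sketch of the same namespace-map lookup that reads the text directly with findtext(). The variable names (xml_data, sent, message_id, missing) and the NoSuchTag lookup are only illustrative and not part of the original answer.

```python
import xml.etree.ElementTree as ET

xml_data = """<SMSResponse xmlns="http://example.com" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<Cancelled>false</Cancelled>
<MessageID>00000000-0000-0000-0000-000000000000</MessageID>
<Queued>false</Queued>
<SMSError>NoError</SMSError>
<SMSIncomingMessages i:nil="true"/>
<Sent>false</Sent>
<SentDateTime>0001-01-01T00:00:00</SentDateTime>
</SMSResponse>"""

root = ET.fromstring(xml_data)

# The prefix 'n' is arbitrary; only the URI has to match the document.
nsmap = {'n': 'http://example.com'}

# findtext() returns the text of the first matching child,
# or the given default when nothing matches.
sent = root.findtext('n:Sent', namespaces=nsmap)
message_id = root.findtext('n:MessageID', namespaces=nsmap)
missing = root.findtext('n:NoSuchTag', default='(absent)', namespaces=nsmap)  # illustrative tag name

print(sent)
print(message_id)
print(missing)
```

Note that the values come back as strings (for example 'false'), so any conversion to bool or datetime is up to the caller.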
Answer by ATOzTOA
You can create a dictionary and get the values directly out of it:
tree = ET.fromstring(xmlData)
root = {}
for child in tree:
    # Strip the '{namespace}' prefix so the key is the bare tag name
    root[child.tag.split("}")[1]] = child.text
print(root["Queued"])
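If you want something closer to the PHP-style attribute access from the question, one possible variation on the dictionary idea is to wrap it in types.SimpleNamespace. This is only a sketch: local_name and to_namespace are hypothetical helper names, the XML is shortened for brevity, and the approach assumes flat, uniquely named children whose names are valid Python identifiers.

```python
import xml.etree.ElementTree as ET
from types import SimpleNamespace

xml_data = """<SMSResponse xmlns="http://example.com">
<Queued>false</Queued>
<SMSError>NoError</SMSError>
</SMSResponse>"""


def local_name(tag):
    # '{http://example.com}Queued' -> 'Queued'.
    # Tags without a namespace pass through unchanged.
    return tag.rpartition('}')[2]


def to_namespace(element):
    # Hypothetical helper: map each direct child's local name to its text.
    # Repeated child names would overwrite each other; nesting is ignored.
    return SimpleNamespace(**{local_name(child.tag): child.text for child in element})


response = to_namespace(ET.fromstring(xml_data))
print(response.Queued)
print(response.SMSError)
```

Using rpartition instead of split("}")[1] avoids an IndexError when an element has no namespace at all.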
Answer by tuomur
If you're set on using Python's standard XML libraries, you could use something like this:
root = ET.fromstring(xmlData)
namespace = 'http://example.com'

def query(tree, nodename):
    # Build the fully qualified '{namespace}tag' name that find() expects
    return tree.find('{{{ex}}}{nodename}'.format(ex=namespace, nodename=nodename))

queued = query(root, 'Queued')
print(queued.text)
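As a side note, on Python 3.8 or newer the standard library's ElementTree also accepts a '{*}' wildcard in place of the namespace, which may make a helper like the one above unnecessary when you don't care which namespace the tag is in. A small sketch, assuming Python 3.8+ and a shortened version of the XML:

```python
import xml.etree.ElementTree as ET

xml_data = """<SMSResponse xmlns="http://example.com">
<Queued>false</Queued>
</SMSResponse>"""

root = ET.fromstring(xml_data)

# '{*}Queued' matches a Queued element in any namespace (or none).
# This wildcard syntax requires Python 3.8 or newer.
queued = root.find('{*}Queued')
print(queued.text)
```

The trade-off is that the wildcard would also match a Queued element from an unrelated namespace, so the explicit namespace is the safer choice when a document mixes vocabularies.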
Answer by root
With lxml.etree:
In [8]: import lxml.etree as et
In [9]: doc=et.fromstring(xmlData)
In [10]: ns={'n':'http://example.com'}
In [11]: doc.xpath('n:Queued/text()',namespaces=ns)
Out[11]: ['false']
With ElementTree you can do:
import xml.etree.ElementTree as ET
root=ET.fromstring(xmlData)
ns={'n':'http://example.com'}
root.find('n:Queued',namespaces=ns).text
Out[13]: 'false'

