How to use xmltodict to get items out of an xml file
Original question: http://stackoverflow.com/questions/40154727/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must attribute it to the original authors on StackOverflow.
Asked by Sam Vruggink
I am trying to easily access values from an xml file.
<artikelen>
    <artikel nummer="121">
        <code>ABC123</code>
        <naam>Highlight pen</naam>
        <voorraad>231</voorraad>
        <prijs>0.56</prijs>
    </artikel>
    <artikel nummer="123">
        <code>PQR678</code>
        <naam>Nietmachine</naam>
        <voorraad>587</voorraad>
        <prijs>9.99</prijs>
    </artikel>
    ..... etc
If I want to access the value ABC123, how do I get it?
import xmltodict
with open('8_1.html') as fd:
    doc = xmltodict.parse(fd.read())
    print(doc[fd]['code'])
Answered by Paul
Using your example:
import xmltodict
with open('artikelen.xml') as fd:
    doc = xmltodict.parse(fd.read())
If you examine doc, you'll see it's an OrderedDict, ordered by tag:
>>> doc
OrderedDict([('artikelen',
              OrderedDict([('artikel',
                            [OrderedDict([('@nummer', '121'),
                                          ('code', 'ABC123'),
                                          ('naam', 'Highlight pen'),
                                          ('voorraad', '231'),
                                          ('prijs', '0.56')]),
                             OrderedDict([('@nummer', '123'),
                                          ('code', 'PQR678'),
                                          ('naam', 'Nietmachine'),
                                          ('voorraad', '587'),
                                          ('prijs', '9.99')])])]))])
The root node is called artikelen, and it has a subnode artikel, which is a list of OrderedDict objects. So if you want the code for every article, you would do:
codes = []
for artikel in doc['artikelen']['artikel']:
    codes.append(artikel['code'])
# >>> codes
# ['ABC123', 'PQR678']
If you specifically want the code only when nummer is 121, you could do this:
code = None
for artikel in doc['artikelen']['artikel']:
    if artikel['@nummer'] == '121':
        code = artikel['code']
        break
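If several lookups by nummer are needed, the loop above could be replaced by a dictionary built once. This is only a minimal sketch: the literal dict below is a hand-written stand-in for the parsed doc (so the snippet runs without xmltodict or the XML file), and code_by_nummer is a name made up for illustration.

```python
# Hand-written stand-in for the structure xmltodict returns for the sample file.
doc = {
    "artikelen": {
        "artikel": [
            {"@nummer": "121", "code": "ABC123", "naam": "Highlight pen",
             "voorraad": "231", "prijs": "0.56"},
            {"@nummer": "123", "code": "PQR678", "naam": "Nietmachine",
             "voorraad": "587", "prijs": "9.99"},
        ]
    }
}

# Map each nummer attribute to its code so later lookups need no loop.
code_by_nummer = {
    artikel["@nummer"]: artikel["code"]
    for artikel in doc["artikelen"]["artikel"]
}

print(code_by_nummer["121"])       # ABC123
print(code_by_nummer.get("999"))   # None, because no such artikel exists
```

Note that the attribute values are strings ('121', not 121), just as in the parsed document.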
That said, if you're parsing XML documents and want to search for a specific value like that, I would consider using XPath expressions, which are supported by ElementTree.
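As a rough sketch of that suggestion, the same lookup might look like this with the standard library's xml.etree.ElementTree, which supports a limited subset of XPath. The XML is inlined as a string (an assumption, so the snippet runs without a file), and xml_text and code_element are illustrative names.

```python
import xml.etree.ElementTree as ET

# Inline copy of the sample data, closed off so that it is well-formed.
xml_text = """
<artikelen>
    <artikel nummer="121">
        <code>ABC123</code>
        <naam>Highlight pen</naam>
    </artikel>
    <artikel nummer="123">
        <code>PQR678</code>
        <naam>Nietmachine</naam>
    </artikel>
</artikelen>
"""

root = ET.fromstring(xml_text)  # root is the <artikelen> element

# The predicate [@nummer='121'] selects the artikel with that attribute value;
# the path is relative to root, so it starts at its artikel children.
code_element = root.find("artikel[@nummer='121']/code")

# find() returns None when nothing matches, so guard before reading .text.
code = code_element.text if code_element is not None else None
print(code)  # ABC123
```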
Answered by Chaitanya Sama
This uses xml.etree. You can try this:
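import xml.etree.ElementTree as ET
root = ET.parse('artikelen.xml').getroot()  # file name taken from the answer above
for artikelobj in root.findall('artikel'):
    print(artikelobj.find('code').text)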
If you want to extract a specific code based on the 'nummer' attribute of artikel, then you can try this:
for artikelobj in root.findall('artikel'):
    # Attribute values are strings, so compare against '121', not 121.
    if artikelobj.get('nummer') == '121':
        print(artikelobj.find('code').text)
This will print only the code you want.
Answered by Chr
To read .xml files:
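from lxml import objectify
# Attribute-style access to child elements comes from lxml.objectify, not lxml.etree.
# node1, node2 and variable_name are placeholders for your own tag names.
root = objectify.parse(filename).getroot()
value = root.node1.node2.variable_name.text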
Answered by pseudo
You can use the lxml package with an XPath expression.
from lxml import etree
tree = etree.parse("8_1.html")  # parse() accepts a file name directly
expression = "/artikelen/artikel[1]/code"
matches = tree.xpath(expression)  # list of matching elements
code = next(element.text for element in matches)
print(code)
# ABC123
The thing to notice here is the expression. /artikelen is the root element. /artikel[1] chooses the first artikel element under the root (note that XPath positions start at 1, not 0). /code is the child element under artikel[1]. You can read more in the lxml documentation and the XPath syntax reference.
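To illustrate the 1-based indexing, here is a small sketch that compares an XPath position with a Python list index. It uses the standard library's xml.etree.ElementTree instead of lxml (a substitution, so the snippet needs no third-party package), with a trimmed inline copy of the sample data. ElementTree paths are relative to the element they are called on, so the leading /artikelen is dropped.

```python
import xml.etree.ElementTree as ET

# Trimmed inline copy of the sample data; root is the <artikelen> element.
root = ET.fromstring(
    "<artikelen>"
    "<artikel nummer='121'><code>ABC123</code></artikel>"
    "<artikel nummer='123'><code>PQR678</code></artikel>"
    "</artikelen>"
)

# XPath positions are 1-based: artikel[1] is the first artikel.
first_by_xpath = root.find("artikel[1]/code").text

# Python lists are 0-based: index 0 is the same first artikel.
first_by_index = root.findall("artikel")[0].find("code").text

# artikel[2] is therefore the second artikel, not the third.
second_by_xpath = root.find("artikel[2]/code").text

print(first_by_xpath, first_by_index, second_by_xpath)  # ABC123 ABC123 PQR678
```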