Python: How to get items from an XML file using xmltodict

Question

Asked by Sam Vruggink

I am trying to easily access values from an xml file.

<artikelen>
    <artikel nummer="121">
        <code>ABC123</code>
        <naam>Highlight pen</naam>
        <voorraad>231</voorraad>
        <prijs>0.56</prijs>
    </artikel>
    <artikel nummer="123">
        <code>PQR678</code>
        <naam>Nietmachine</naam>
        <voorraad>587</voorraad>
        <prijs>9.99</prijs>
    </artikel>
..... etc

If I want to access the value ABC123, how do I get it?

import xmltodict

with open('8_1.html') as fd:
    doc = xmltodict.parse(fd.read())
    print(doc[fd]['code'])

Answer 1

Answered by Paul

Using your example:

import xmltodict

with open('artikelen.xml') as fd:
    doc = xmltodict.parse(fd.read())

If you examine doc, you'll see it's an OrderedDict, keyed by tag name:

>>> doc
OrderedDict([('artikelen',
              OrderedDict([('artikel',
                            [OrderedDict([('@nummer', '121'),
                                          ('code', 'ABC123'),
                                          ('naam', 'Highlight pen'),
                                          ('voorraad', '231'),
                                          ('prijs', '0.56')]),
                             OrderedDict([('@nummer', '123'),
                                          ('code', 'PQR678'),
                                          ('naam', 'Nietmachine'),
                                          ('voorraad', '587'),
                                          ('prijs', '9.99')])])]))])

The root node is called artikelen, and it has a subnode artikel, which is a list of OrderedDict objects. So if you want the code for every article, you would do:

codes = []
for artikel in doc['artikelen']['artikel']:
    codes.append(artikel['code'])

# >>> codes
# ['ABC123', 'PQR678']

If you specifically want the code only when nummer is 121, you could do this:

code = None
for artikel in doc['artikelen']['artikel']:
    if artikel['@nummer'] == '121':
        code = artikel['code']
        break
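
If you need several lookups, the loop above can be turned into an index built once. The following is a minimal sketch, not part of the original answer: it uses a plain dict literal that mirrors the parsed structure shown earlier, so it runs without the XML file, and as_list is a hypothetical helper name. The helper is there because, as far as I know, xmltodict gives you a single dict rather than a list when the file contains only one artikel.

```python
# Plain-dict stand-in for the parsed document shown above
# (an assumption for illustration; xmltodict produces the same nesting).
doc = {
    "artikelen": {
        "artikel": [
            {"@nummer": "121", "code": "ABC123", "naam": "Highlight pen",
             "voorraad": "231", "prijs": "0.56"},
            {"@nummer": "123", "code": "PQR678", "naam": "Nietmachine",
             "voorraad": "587", "prijs": "9.99"},
        ]
    }
}


def as_list(value):
    """Hypothetical helper: wrap a lone item so callers can always iterate."""
    return value if isinstance(value, list) else [value]


# Map each nummer attribute to its code for direct lookup.
code_by_nummer = {
    artikel["@nummer"]: artikel["code"]
    for artikel in as_list(doc["artikelen"]["artikel"])
}

print(code_by_nummer["121"])      # ABC123
print(code_by_nummer.get("999"))  # None, because no artikel has that nummer
```

Note that the keys are strings, because attribute values come out of the parser as strings.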

That said, if you're parsing XML documents and want to search for a specific value like that, I would consider using XPath expressions, which are supported by ElementTree.
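
As a rough illustration of that suggestion, here is a sketch using the standard library's xml.etree.ElementTree, which supports a limited subset of XPath. The XML string is the sample from the question, closed off so that it parses on its own; the variable names are my own.

```python
import xml.etree.ElementTree as ET

# Sample data from the question, with the closing tag added.
XML = """<artikelen>
    <artikel nummer="121">
        <code>ABC123</code>
        <naam>Highlight pen</naam>
        <voorraad>231</voorraad>
        <prijs>0.56</prijs>
    </artikel>
    <artikel nummer="123">
        <code>PQR678</code>
        <naam>Nietmachine</naam>
        <voorraad>587</voorraad>
        <prijs>9.99</prijs>
    </artikel>
</artikelen>"""

root = ET.fromstring(XML)  # root is the <artikelen> element

# The predicate [@nummer='121'] keeps only the artikel with that attribute value.
code_element = root.find("artikel[@nummer='121']/code")

# find() returns None when nothing matches, so guard before reading .text.
code = code_element.text if code_element is not None else None
print(code)  # ABC123
```

For a file on disk, ET.parse(path).getroot() would give the same root element.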

Answer 2

Answered by Chaitanya Sama

This uses xml.etree. You can try this:

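import xml.etree.ElementTree as ET
root = ET.parse('artikelen.xml').getroot()  # root is the <artikelen> element
for artikelobj in root.findall('artikel'):
    print(artikelobj.find('code').text)  # .text gives the value, not the element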

If you want to extract a specific code based on the 'nummer' attribute of artikel, then you can try this:

for artikelobj in root.findall('artikel'):
    if artikelobj.get('nummer') == '121':  # attribute values are strings
        print(artikelobj.find('code').text)

This will print only the code you want.

Answer 3

Answered by Chr

To read .xml files:

# Attribute-style access to child elements needs lxml.objectify, not lxml.etree
from lxml import objectify
root = objectify.parse(filename).getroot()
value = root.node1.node2.variable_name.text

Answer 4

Answered by pseudo

You can use the lxml package with an XPath expression.

from lxml import etree

tree = etree.parse("8_1.html")  # parse() accepts a filename directly
expression = "/artikelen/artikel[1]/code"
matches = tree.xpath(expression)
code = next(m.text for m in matches)
print(code)

# ABC123

The thing to notice here is the expression. /artikelen is the root element. /artikel[1] chooses the first artikel element under the root (notice that the first element is at index 1, not 0). /code is the child element under artikel[1]. You can read more in the lxml documentation and the XPath syntax reference.
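
To make the 1-based indexing concrete, here is a small sketch that compares an XPath position with a Python list index. It swaps in the standard library's xml.etree.ElementTree instead of lxml so that it runs without extra packages, and the shortened XML sample is my own reduction of the data in the question. ElementTree paths are relative to the element you call them on, so the leading /artikelen is left out.

```python
import xml.etree.ElementTree as ET

# Shortened version of the question's data (only the code elements are kept).
XML = """<artikelen>
    <artikel nummer="121"><code>ABC123</code></artikel>
    <artikel nummer="123"><code>PQR678</code></artikel>
</artikelen>"""

root = ET.fromstring(XML)

# XPath positions start at 1, while Python list indices start at 0.
first_by_xpath = root.find("artikel[1]/code").text
first_by_index = root.findall("artikel")[0].find("code").text
second_by_xpath = root.find("artikel[2]/code").text

print(first_by_xpath, first_by_index, second_by_xpath)  # ABC123 ABC123 PQR678
```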
