How to recursively iterate over XML tags in Python using ElementTree?

Question

Asked by kloop

I am trying to iterate over all nodes in a tree using ElementTree.

I do something like:

import xml.etree.ElementTree as ET

tree = ET.parse("/tmp/test.xml")
root = tree.getroot()

for child in root:
    pass  # do something with child

The problem is that child is an Element object and not an ElementTree object, so I can't look further into it and recurse over its elements. Is there a way to iterate over "root" differently, so that it iterates over the top-level nodes in the tree (the immediate children) and returns objects of the same class as root itself?

Answer 1

Accepted answer by Robert Christie

To iterate over all nodes, use the iter method on the ElementTree rather than looping directly over the root Element.

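The root is an Element, just like every other element in the tree. Looping over it directly (for child in root) yields only its immediate children, each of which is again an Element that can be looped over in the same way. ElementTree.iter() walks every element in the tree in document order; Element.iter() does the same for the subtree starting at that element, so root.iter() gives the same result here.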

For example, given this XML:

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

you can do the following:

>>> import xml.etree.ElementTree as ET
>>> tree = ET.parse('test.xml')
>>> for elem in tree.iter():
...     print(elem)
... 
<Element 'data' at 0x10b2d7b50>
<Element 'country' at 0x10b2d7b90>
<Element 'rank' at 0x10b2d7bd0>
<Element 'year' at 0x10b2d7c50>
<Element 'gdppc' at 0x10b2d7d10>
<Element 'neighbor' at 0x10b2d7e90>
<Element 'neighbor' at 0x10b2d7ed0>
<Element 'country' at 0x10b2d7f10>
<Element 'rank' at 0x10b2d7f50>
<Element 'year' at 0x10b2d7f90>
<Element 'gdppc' at 0x10b2d7fd0>
<Element 'neighbor' at 0x10b2db050>
<Element 'country' at 0x10b2db090>
<Element 'rank' at 0x10b2db0d0>
<Element 'year' at 0x10b2db110>
<Element 'gdppc' at 0x10b2db150>
<Element 'neighbor' at 0x10b2db190>
<Element 'neighbor' at 0x10b2db1d0>
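
Since every child is itself an Element, the explicit recursion the question asks about is also possible. The following is only a minimal sketch under stated assumptions: walk is a hypothetical helper name, and the XML is a cut-down inline copy of the example data so that the snippet runs on its own.

```python
import xml.etree.ElementTree as ET

# Cut-down inline copy of the example data (an assumption made so the
# sketch is self-contained; it is not the asker's /tmp/test.xml).
XML = """<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <neighbor name="Austria" direction="E"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
    </country>
</data>"""


def walk(element, depth=0):
    """Yield (depth, element) for element and all of its descendants."""
    yield depth, element
    for child in element:  # each child is again an Element
        yield from walk(child, depth + 1)


root = ET.fromstring(XML)
visited = [(depth, elem.tag) for depth, elem in walk(root)]

# Print the tags indented by their nesting depth.
for depth, tag in visited:
    print("  " * depth + tag)
```

The recursion visits elements in the same document order as root.iter(); the only thing it adds is the nesting depth, which iter() does not report.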

Answer 2

Answer by trustory

You can also access specific elements like this:

country = tree.findall('.//country')

Then loop over range(len(country)) and access each element by index, or simply iterate over the returned list directly.
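
A rough sketch of that loop, iterating over the findall() result directly instead of by index. It assumes a cut-down inline copy of the example data, and the summary list is only there for illustration.

```python
import xml.etree.ElementTree as ET

# Cut-down inline copy of the example data so the sketch runs on its own.
XML = """<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""

root = ET.fromstring(XML)

summary = []
# './/country' matches <country> elements at any depth below root.
for country in root.findall('.//country'):
    name = country.get('name')              # attribute lookup
    rank = int(country.find('rank').text)   # text of a direct child
    neighbors = [n.get('name') for n in country.findall('neighbor')]
    summary.append((name, rank, neighbors))

for row in summary:
    print(row)
```

findall() returns a plain list, so indexing with range(len(...)) works as well; direct iteration is just shorter.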

Answer 3

Answer by ssjadon

Adding to Robert Christie's answer: when the XML is held in a string, it is possible to iterate over all nodes by parsing it with fromstring(), which returns the root Element, and converting that Element to an ElementTree:

import xml.etree.ElementTree as ET

# xml_string is assumed to hold the XML document as a string
e = ET.ElementTree(ET.fromstring(xml_string))
for elt in e.iter():
    print("%s: '%s'" % (elt.tag, elt.text))
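
For comparison, here is a small sketch suggesting that the wrapping step is optional, because the Element returned by fromstring() has its own iter() method. The sample string is a made-up stand-in for xml_string.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for xml_string.
xml_string = "<data><country name='Panama'><rank>68</rank></country></data>"

root = ET.fromstring(xml_string)  # returns the root Element

# Element.iter() walks root and every descendant in document order.
from_element = [(elt.tag, elt.text) for elt in root.iter()]

# Wrapping the root in an ElementTree visits the same elements.
from_tree = [(elt.tag, elt.text) for elt in ET.ElementTree(root).iter()]

for tag, text in from_element:
    print("%s: %r" % (tag, text))
```

Wrapping in an ElementTree is mainly useful when a tree-level method such as write() is needed afterwards.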

Answer 4

Answer by FatihAkici

In addition to Robert Christie's accepted answer, printing the values and tags separately is very easy:

import xml.etree.ElementTree as ET

tree = ET.parse('test.xml')
for elem in tree.iter():
    print(elem.tag, elem.text)
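
Two refinements that may be handy here, shown as a sketch on a cut-down inline copy of the example data: iter() accepts a tag name to restrict the walk, and elem.text can be None or whitespace-only for elements that only contain children, so it is normalised before use.

```python
import xml.etree.ElementTree as ET

# Cut-down inline copy of the example data so the sketch runs on its own.
XML = """<data>
    <country name="Panama">
        <rank>68</rank>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>"""

tree = ET.ElementTree(ET.fromstring(XML))

# Passing a tag name to iter() yields only the matching elements.
neighbors = [(n.get('name'), n.get('direction')) for n in tree.iter('neighbor')]

# elem.text is None for empty elements and whitespace for containers,
# so fall back to '' and strip before using it.
texts = [(elem.tag, (elem.text or '').strip()) for elem in tree.iter()]

for tag, text in texts:
    print(tag, repr(text))
```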
