Python: checking whether an XML element has children in ElementTree

Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/25950635/

Check if XML Element has children or not, in ElementTree

Tags: python, xml, elementtree, children

Asked by アレックス

I retrieve an XML document this way:

import urllib2  # Python 2; on Python 3 use urllib.request.urlopen instead
import xml.etree.ElementTree as ET

root = ET.parse(urllib2.urlopen(url))
for child in root.findall("item"):
  a1 = child[0].text # ok
  a2 = child[1].text # ok
  a3 = child[2].text # ok
  a4 = child[3].text # BOOM
  # ...

The XML looks like this:

<item>
  <a1>value1</a1>
  <a2>value2</a2>
  <a3>value3</a3>
  <a4>
    <a11>value222</a11>
    <a22>value22</a22>
  </a4>
</item>

How do I check whether a4 (in this particular case, but it could be any other element) has children?

Accepted answer by jlr

You could try the list function on the element:

>>> import xml.etree.ElementTree as ET
>>> xml = """<item>
  <a1>value1</a1>
  <a2>value2</a2>
  <a3>value3</a3>
  <a4>
    <a11>value222</a11>
    <a22>value22</a22>
  </a4>
</item>"""
>>> root = ET.fromstring(xml)
>>> list(root[0])
[]
>>> list(root[3])
[<Element 'a11' at 0x2321e10>, <Element 'a22' at 0x2321e48>]
>>> len(list(root[3]))
2
>>> print("has children" if len(list(root[3])) else "no child")
has children
>>> print("has children" if len(list(root[2])) else "no child")
no child
>>> # Or simpler, without a call to list within len, it also works:
>>> print("has children" if len(root[3]) else "no child")
has children
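
The check above can be wrapped in a small helper. This is only a minimal sketch: the name has_children is hypothetical and not part of ElementTree, and it relies solely on len(element), which counts the direct child elements.

```python
import xml.etree.ElementTree as ET

XML = """<item>
  <a1>value1</a1>
  <a2>value2</a2>
  <a3>value3</a3>
  <a4>
    <a11>value222</a11>
    <a22>value22</a22>
  </a4>
</item>"""


def has_children(element):
    """Return True if the element has at least one direct child element."""
    # has_children is a hypothetical helper name, not an ElementTree API.
    return len(element) > 0


root = ET.fromstring(XML)
# Map each direct child of <item> to whether it has children of its own.
summary = {child.tag: has_children(child) for child in root}
print(summary)
```

Text, tail, and attributes play no part in this test; only child elements are counted.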

I modified your sample because the findall call on the item root did not work (findall searches the direct children of the current element, not the element itself). If you want to access the text of the subchildren afterward in your working program, you could do:

for child in root.findall("item"):
  # if there are children, get their text content as well.
  if len(child):
    for subchild in child:
      print(subchild.text)
  # else just get the current child text.
  else:
    print(child.text)

This would be a good fit for a recursive function, though.
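
A recursive version might look like the sketch below. The function name collect_text is hypothetical, and the sketch assumes that only the text of leaf elements (those without children) is wanted.

```python
import xml.etree.ElementTree as ET

XML = """<item>
  <a1>value1</a1>
  <a2>value2</a2>
  <a3>value3</a3>
  <a4>
    <a11>value222</a11>
    <a22>value22</a22>
  </a4>
</item>"""


def collect_text(element):
    """Recursively gather the text of every leaf element under `element`."""
    if len(element) == 0:
        # Leaf: no child elements, so its own text is the value we want.
        text = (element.text or "").strip()
        return [text] if text else []
    texts = []
    for child in element:
        # Element with children: descend instead of reading its text.
        texts.extend(collect_text(child))
    return texts


root = ET.fromstring(XML)
print(collect_text(root))
# ['value1', 'value2', 'value3', 'value222', 'value22']
```

Because the recursion stops only at elements where len(element) == 0, it handles any nesting depth, not just the two levels in the sample.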

Answer by marscher

The Element class has a getchildren() method, but it was deprecated in Python 3.2 and removed in Python 3.9; list(child) or len(child) does the same job. So you could use something like this to check whether there are children and store the result in a dictionary keyed by tag name:

result = {}
for child in root.findall("item"):
    # list(child) replaces the removed child.getchildren()
    if list(child) == []:
        result[child.tag] = child.text
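
Applied to the sample document from the question, the same idea can be sketched as follows. This assumes root is the <item> element itself, so the loop iterates over it directly instead of calling findall("item").

```python
import xml.etree.ElementTree as ET

XML = """<item>
  <a1>value1</a1>
  <a2>value2</a2>
  <a3>value3</a3>
  <a4>
    <a11>value222</a11>
    <a22>value22</a22>
  </a4>
</item>"""

root = ET.fromstring(XML)  # assumption: root is the <item> element

result = {}
for child in root:
    # Keep only the children that have no child elements of their own.
    if len(child) == 0:
        result[child.tag] = child.text

print(result)
# {'a1': 'value1', 'a2': 'value2', 'a3': 'value3'}
```

Here a4 is skipped because it has children; its text would only be the whitespace between its tags.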

Answer by roippi

I would personally recommend that you use an XML parser that fully supports XPath expressions. The subset supported by xml.etree is insufficient for tasks like this.

For example, in lxml I can do:

"give me all children of the children of the <item> node":

doc.xpath('//item/*/child::*') #equivalent to '//item/*/*', if you're being terse
Out[18]: [<Element a11 at 0x7f60ec1c1348>, <Element a22 at 0x7f60ec1c1888>]

or,

"give me all of <item>'s children that have no children themselves":

doc.xpath('/item/*[count(child::*) = 0]')
Out[20]: 
[<Element a1 at 0x7f60ec1c1588>,
 <Element a2 at 0x7f60ec1c15c8>,
 <Element a3 at 0x7f60ec1c1608>]

or,

"give me ALL of the elements that don't have any children":

doc.xpath('//*[count(child::*) = 0]')
Out[29]: 
[<Element a1 at 0x7f60ec1c1588>,
 <Element a2 at 0x7f60ec1c15c8>,
 <Element a3 at 0x7f60ec1c1608>,
 <Element a11 at 0x7f60ec1c1348>,
 <Element a22 at 0x7f60ec1c1888>]

# and if I only care about the text from those nodes...
doc.xpath('//*[count(child::*) = 0]/text()')
Out[30]: ['value1', 'value2', 'value3', 'value222', 'value22']
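
For comparison, if installing lxml is not an option, the queries above can be approximated with the standard library by doing the "no children" filtering in Python with len() rather than inside the XPath expression. This is a rough sketch, assuming root is the <item> element; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

XML = """<item>
  <a1>value1</a1>
  <a2>value2</a2>
  <a3>value3</a3>
  <a4>
    <a11>value222</a11>
    <a22>value22</a22>
  </a4>
</item>"""

root = ET.fromstring(XML)  # assumption: root is the <item> element

# All children of the children of <item>.
grandchildren = root.findall("./*/*")

# Children of <item> that have no children themselves.
childless_children = [child for child in root if len(child) == 0]

# Every element in the tree that has no children (root.iter() walks the
# whole tree in document order, starting with root itself).
all_leaves = [elem for elem in root.iter() if len(elem) == 0]

# Only the text of those leaf elements.
leaf_texts = [elem.text for elem in all_leaves]

print([e.tag for e in grandchildren])
print([e.tag for e in childless_children])
print(leaf_texts)
```

The trade-off is that the predicate lives in Python code instead of a single XPath string, which is more verbose but avoids the extra dependency.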

Answer by Mad Physicist

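The simplest way I have been able to find is to use the bool value of the element directly. This means you can use a4 in a conditional statement as-is. Note, however, that the ElementTree documentation recommends an explicit len(elem) test instead, and since Python 3.12 testing the truth value of an Element emits a DeprecationWarning: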

from xml.etree.ElementTree import Element

a4 = Element('a4')
if a4:
    print('Has kids')
else:
    print('No kids yet')

a4.append(Element('x'))
if a4:
    print('Has kids now')
else:
    print('Still no kids')

Running this code will print:

No kids yet
Has kids now

The boolean value of an element does not say anything about text, tail, or attributes. It only indicates the presence or absence of children, which is what the original question was asking.
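
One practical consequence: find() returns None when nothing matches, while a matched element without children has traditionally been falsy, so a bare truth test cannot tell "missing" from "present but childless". The sketch below keeps the two questions separate with explicit checks; the helper name describe and the shortened sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<item><a1>value1</a1><a4><a11>value222</a11></a4></item>"
)


def describe(parent, tag):
    """Report whether `tag` exists under `parent` and whether it has children."""
    # describe is a hypothetical helper, not part of ElementTree.
    element = parent.find(tag)
    if element is None:
        return "missing"        # no such child element at all
    if len(element) > 0:
        return "has children"   # explicit child count, no truth testing
    return "no children"


for tag in ("a1", "a4", "a5"):
    print(tag, describe(root, tag))
```

Using "is None" and len() this way does not depend on how a given Python version evaluates the truth value of an Element.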

Answer by David Córdoba Ruiz

You can use the iter method:

import xml.etree.ElementTree as ET

etree = ET.parse('file.xml')
root = etree.getroot()
a = []
for child in root.iter():
    if child.text:
        if len(child.text.split()) > 0:
            a.append(child.text)
print(a)