Python: getting a list of XML tags in a file using xml.etree.ElementTree

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/29596584/

Date: 2020-08-19 04:46:51  Source: igfitidea

Getting a list of XML tags in file, using xml.etree.ElementTree

Tags: python, xml, tags, elementtree, tagname

Asked by FanaticD

As the title says, I need to get a list of the XML tags in a file using the xml.etree.ElementTree library.

I am aware that there are properties and methods like ETVar.child, ETVar.getroot(), ETVar.tag, ETVar.attrib.

But to use them and get at least the names of the tags on level 2, I had to use nested for loops.

At the moment I have something like this:

for xmlChild in xmlRootTag:
    if xmlChild.tag:
        print(xmlChild.tag)
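
For illustration, here is a minimal sketch of why this loop is not enough: iterating over an element visits only its direct children. The inline XML string and the use of ET.fromstring to obtain xmlRootTag are assumptions made so the snippet runs without a file.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline document, used so the sketch runs without a file.
SAMPLE = "<root><a><b><c>text</c></b></a><d/></root>"

xmlRootTag = ET.fromstring(SAMPLE)

# Iterating over an Element yields its direct children only,
# so the nested tags 'b' and 'c' are never reached.
direct_children = [xmlChild.tag for xmlChild in xmlRootTag]
print(direct_children)  # ['a', 'd']
```

Each extra level of nesting would need another inner loop, which is what makes this approach impractical for documents of unknown depth.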

The goal is to get a list of ALL XML tags in the file, even deeply nested ones, with duplicates eliminated.

To give a better idea, here is a possible example of the XML:

<root>
 <firstlevel>
  <secondlevel level="2">
    <thirdlevel>
      <fourth>text</fourth>
      <fourth2>text</fourth2>
    </thirdlevel>
  </secondlevel>
 </firstlevel>
</root>

Accepted answer by FanaticD

I did more research on the subject and found a suitable solution. Since this could be a common task, I'll answer it myself, as I believe it could help others.

What I was looking for was the ElementTree method iter().

import xml.etree.ElementTree as ET
# load and parse the file
xmlTree = ET.parse('myXMLFile.xml')

elemList = []

for elem in xmlTree.iter():
    elemList.append(elem.tag)

# remove duplicates by converting to a set and back to a list
elemList = list(set(elemList))

# Just printing out the result
print(elemList)
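
One caveat: a set does not keep the order in which the tags were encountered, so the printed list can come out in an arbitrary order. If document order matters, one possible variant (a sketch, not part of the original answer) is dict.fromkeys, which relies on dicts preserving insertion order in Python 3.7+. The inline XML below is a hypothetical stand-in for myXMLFile.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the contents of myXMLFile.xml.
SAMPLE = """<root>
  <firstlevel>
    <secondlevel level="2">
      <thirdlevel>
        <fourth>text</fourth>
        <fourth2>text</fourth2>
        <fourth>again</fourth>
      </thirdlevel>
    </secondlevel>
  </firstlevel>
</root>"""

root = ET.fromstring(SAMPLE)

# dict keys are unique and keep insertion order (Python 3.7+),
# so this removes duplicates while keeping document order.
ordered_tags = list(dict.fromkeys(elem.tag for elem in root.iter()))
print(ordered_tags)
# ['root', 'firstlevel', 'secondlevel', 'thirdlevel', 'fourth', 'fourth2']
```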

Important notes

  • xml.etree.ElementTree is part of the Python standard library
  • the sample was written for Python v3.2.3
  • duplicates are removed by converting the list to a set, which allows only unique values, and then converting it back to a list

Answer by Jonne Kleijer

You could use the built-in Python set comprehension:

import xml.etree.ElementTree as ET

xmlTree = ET.parse('myXMLFile.xml')
tags = {elem.tag for elem in xmlTree.iter()}

If you specifically need a list, you can convert the set to a list:

import xml.etree.ElementTree as ET

xmlTree = ET.parse('myXMLFile.xml')
tags = list({elem.tag for elem in xmlTree.iter()})
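
If a reproducible order is wanted, the set can be passed through sorted(), which also returns a list. The sketch below wraps this in a small helper; the function name unique_tags and the in-memory demo document are assumptions for illustration, not part of the original answer.

```python
import io
import xml.etree.ElementTree as ET


def unique_tags(source):
    """Return the distinct tag names in an XML document, sorted by name.

    `source` is a filename or a file object, as accepted by ET.parse.
    The function name is hypothetical.
    """
    tree = ET.parse(source)
    return sorted({elem.tag for elem in tree.iter()})


# In-memory document standing in for 'myXMLFile.xml'.
demo = io.BytesIO(b"<root><item><name>x</name></item><item/></root>")
tags = unique_tags(demo)
print(tags)  # ['item', 'name', 'root']
```

With a real file, the call would be unique_tags('myXMLFile.xml').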