Python: getting a list of XML tags in a file using xml.etree.ElementTree

Question

Asked by FanaticD

As mentioned, I need to get the list of XML tags in a file, using the xml.etree.ElementTree library.

I am aware that there are properties and methods like ETVar.child, ETVar.getroot(), ETVar.tag, ETVar.attrib.

But to use them and get even just the tag names at level 2, I had to use nested for loops.

At the moment I have something like this:

for xmlChild in xmlRootTag:
    if xmlChild.tag:
        print(xmlChild.tag)

The goal is to get a list of ALL XML tags in the file, even deeply nested ones, with duplicates eliminated.

To give a better idea, here is a possible example of the XML:

<root>
 <firstLevel>
  <secondlevel level="2">
    <thirdlevel>
      <fourth>text</fourth>
      <fourth2>text</fourth2>
    </thirdlevel>
  </secondlevel>
 </firstLevel>
</root>

Answer 1

Accepted answer by FanaticD

I did more research on the subject and found a suitable solution. Since this could be a common task, I'll answer it myself in the hope that it helps others.

What I was looking for was the ElementTree method iter().

import xml.etree.ElementTree as ET
# load and parse the file
xmlTree = ET.parse('myXMLFile.xml')

elemList = []

for elem in xmlTree.iter():
    elemList.append(elem.tag)

# remove duplicates by converting to a set and back to a list
elemList = list(set(elemList))

# Just printing out the result
print(elemList)

Important notes

xml.etree.ElementTree is part of the Python standard library
the sample was written for Python v3.2.3
duplicates are removed by converting the list to a set, which allows only unique values, and then converting it back to a list
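One caveat about the set-based approach: a set does not preserve the order in which the tags were first encountered. If document order matters, a minimal sketch could look like the following. It assumes Python 3.7+ (where dict keeps insertion order) and uses a hypothetical inline XML string as a stand-in for the file, so the variable names here are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline sample standing in for 'myXMLFile.xml'
xml_text = """
<root>
  <firstLevel>
    <secondlevel level="2">
      <thirdlevel>
        <fourth>text</fourth>
        <fourth>more text</fourth>
      </thirdlevel>
    </secondlevel>
  </firstLevel>
</root>
"""

root = ET.fromstring(xml_text)

# dict.fromkeys keeps only the first occurrence of each key,
# in the order the keys were inserted (document order here)
ordered_tags = list(dict.fromkeys(elem.tag for elem in root.iter()))
print(ordered_tags)
# ['root', 'firstLevel', 'secondlevel', 'thirdlevel', 'fourth']
```

With a real file, ET.parse('myXMLFile.xml').iter() can be used in place of root.iter().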

Answer 2

Answered by Jonne Kleijer

You could use a Python set comprehension:

import xml.etree.ElementTree as ET

xmlTree = ET.parse('myXMLFile.xml')
tags = {elem.tag for elem in xmlTree.iter()}

If you specifically need a list, you can convert the set to a list:

import xml.etree.ElementTree as ET

xmlTree = ET.parse('myXMLFile.xml')
tags = list({elem.tag for elem in xmlTree.iter()})
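Since the iteration order of a set is arbitrary, the resulting list may come out in a different order from run to run. If reproducible output is wanted, one option is to sort the tags. The sketch below wraps the set comprehension in a hypothetical helper named unique_tags (not part of ElementTree) and feeds it an in-memory sample instead of a real file:

```python
import io
import xml.etree.ElementTree as ET

def unique_tags(source):
    """Return the sorted, de-duplicated tag names found in an XML source.

    `source` may be a filename or a file-like object, as accepted by ET.parse.
    """
    tree = ET.parse(source)
    return sorted({elem.tag for elem in tree.iter()})

# In-memory stand-in for a file such as 'myXMLFile.xml'
sample = io.BytesIO(b"<root><a><b/><b/></a><c/></root>")
print(unique_tags(sample))  # ['a', 'b', 'c', 'root']
```

Calling unique_tags('myXMLFile.xml') would work the same way for a file on disk.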
