Notice: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/29596584/
Getting a list of XML tags in file, using xml.etree.ElementTree
Asked by FanaticD
As the title says, I need to get the list of XML tags in a file using the xml.etree.ElementTree library.
I am aware that there are properties and methods like ETVar.child, ETVar.getroot(), ETVar.tag, ETVar.attrib.
But to use them and get at least the names of the tags on level 2, I had to use nested for loops.
At the moment I have something like
for xmlChild in xmlRootTag:
    if xmlChild.tag:
        print(xmlChild.tag)
The goal is to get a list of ALL XML tags in the file, even deeply nested ones, with duplicates eliminated.
To give a better idea, here is a possible example of the XML:
<root>
  <firstlevel>
    <secondlevel level="2">
      <thirdlevel>
        <fourth>text</fourth>
        <fourth2>text</fourth2>
      </thirdlevel>
    </secondlevel>
  </firstlevel>
</root>
Accepted answer by FanaticD
I did some more research on the subject and found a suitable solution. Since this could be a common task, I'll answer it myself in the hope that it helps others.
What I was looking for was the ElementTree method iter().
import xml.etree.ElementTree as ET

# load and parse the file
xmlTree = ET.parse('myXMLFile.xml')

# collect the tag of every element in the tree
elemList = []
for elem in xmlTree.iter():
    elemList.append(elem.tag)

# remove duplicates by converting to a set and back to a list
elemList = list(set(elemList))

# print the result
print(elemList)
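To see why iter() removes the need for nested loops, it may help to compare it with iterating an element directly. The following is only a small sketch; the inline XML string and the variable names are made up for this illustration and are not part of the original answer.

```python
import xml.etree.ElementTree as ET

# Small hypothetical document, used only for this illustration.
root = ET.fromstring("<root><a><b><c/></b></a><d/></root>")

# Iterating an element directly yields only its immediate children.
direct_children = [child.tag for child in root]

# iter() yields the element itself, then every descendant in document order.
all_elements = [elem.tag for elem in root.iter()]

print(direct_children)  # ['a', 'd']
print(all_elements)     # ['root', 'a', 'b', 'c', 'd']
```

Direct iteration stops at the first level, which is why each additional level would need another nested loop, while iter() walks the whole tree in one pass.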
Important notes
- xml.etree.ElementTree is part of the Python standard library.
- The sample is written for Python v3.2.3.
- Duplicates are removed by converting the list to a set, which holds only unique values, and then converting it back to a list.
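Converting through a set does not keep the tags in the order they appear in the file. If that order matters, one possible alternative is sketched below. It assumes Python 3.7 or later, where dict insertion order is guaranteed (so it goes beyond the v3.2.3 the sample targets). SAMPLE_XML and unique_tags_in_order are hypothetical names introduced for this illustration, and the XML is a copy of the question's sample with matching open and close tags.

```python
import xml.etree.ElementTree as ET

# Copy of the question's sample with matching open/close tags (assumption).
SAMPLE_XML = """
<root>
  <firstlevel>
    <secondlevel level="2">
      <thirdlevel>
        <fourth>text</fourth>
        <fourth2>text</fourth2>
      </thirdlevel>
    </secondlevel>
  </firstlevel>
</root>
"""


def unique_tags_in_order(root):
    """Return each distinct tag once, in first-seen document order."""
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(elem.tag for elem in root.iter()))


root = ET.fromstring(SAMPLE_XML)
ordered_tags = unique_tags_in_order(root)
print(ordered_tags)
# ['root', 'firstlevel', 'secondlevel', 'thirdlevel', 'fourth', 'fourth2']
```

For a file on disk, ET.parse('myXMLFile.xml').getroot() could be passed to the same function instead of the parsed string.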
Answer by Jonne Kleijer
You could use a set comprehension, which is built into Python:
import xml.etree.ElementTree as ET
xmlTree = ET.parse('myXMLFile.xml')
tags = {elem.tag for elem in xmlTree.iter()}
If you specifically need a list, you can convert the set to a list:
import xml.etree.ElementTree as ET
xmlTree = ET.parse('myXMLFile.xml')
tags = list({elem.tag for elem in xmlTree.iter()})
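The order of a list built from a set is arbitrary. If a stable, reproducible order is useful (for example when comparing output between runs), sorted() can replace list(). This is only a sketch: the inline XML string stands in for 'myXMLFile.xml' and is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline document standing in for 'myXMLFile.xml'.
root = ET.fromstring("<root><b/><a><b/></a></root>")

# sorted() accepts any iterable and always returns a new list.
sorted_tags = sorted({elem.tag for elem in root.iter()})
print(sorted_tags)  # ['a', 'b', 'root']
```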

