Getting a list of XML tags in a file, using xml.etree.ElementTree
Original question: http://stackoverflow.com/questions/29596584/
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute the content to the original authors on Stack Overflow (not to this page).
Asked by FanaticD
As mentioned, I need to get the list of XML tags in a file, using the library xml.etree.ElementTree.
I am aware that there are properties and methods like ETVar.child, ETVar.getroot(), ETVar.tag, ETVar.attrib.
But to be able to use them and get at least the names of the tags on level 2, I had to use a nested for loop.
At the moment I have something like:
for xmlChild in xmlRootTag:
    if xmlChild.tag:
        print(xmlChild.tag)
The goal would be to get a list of ALL XML tags in the file, even deeply nested ones, eliminating duplicates.
To give a better idea, here is a possible example of the XML:
<root>
  <firstLevel>
    <secondlevel level="2">
      <thirdlevel>
        <fourth>text</fourth>
        <fourth2>text</fourth2>
      </thirdlevel>
    </secondlevel>
  </firstLevel>
</root>
Accepted answer by FanaticD
I've done more research on the subject and found a suitable solution. Since this could be a common task, I'll answer it myself, as I believe it could help others.
What I was looking for was the etree method iter.
import xml.etree.ElementTree as ET

# load and parse the file
xmlTree = ET.parse('myXMLFile.xml')

# collect the tag of every element, at any depth
elemList = []
for elem in xmlTree.iter():
    elemList.append(elem.tag)

# remove duplicates by converting to a set and back to a list
elemList = list(set(elemList))

# print the result
print(elemList)
Important notes
- xml.etree.ElementTree is part of the Python standard library
- the sample is written for Python v3.2.3
- the mechanism used to remove duplicates is based on converting to a set, which allows only unique values, and then converting back to a list
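One side effect of the set round trip is that the order of the tags is lost. If the order of first appearance matters, a possible variation is sketched below. It assumes Python 3.7 or later (where dicts keep insertion order), embeds a sample document modelled on the question's XML so it runs without a file, and uses a hypothetical helper name, unique_tags_in_order, that is not part of the original answer.

```python
import xml.etree.ElementTree as ET

# Sample modelled on the question's XML, embedded so the sketch needs no file.
# A second <fourth> element is added to show that duplicates are dropped.
SAMPLE_XML = """
<root>
  <firstLevel>
    <secondlevel level="2">
      <thirdlevel>
        <fourth>text</fourth>
        <fourth2>text</fourth2>
        <fourth>more text</fourth>
      </thirdlevel>
    </secondlevel>
  </firstLevel>
</root>
"""


def unique_tags_in_order(root):
    """Return each tag once, in the order it first appears in the document."""
    # dict.fromkeys keeps only the first occurrence of each key, and dicts
    # preserve insertion order in Python 3.7+ (assumption of this sketch).
    return list(dict.fromkeys(elem.tag for elem in root.iter()))


root = ET.fromstring(SAMPLE_XML.strip())
tags = unique_tags_in_order(root)
print(tags)
# ['root', 'firstLevel', 'secondlevel', 'thirdlevel', 'fourth', 'fourth2']
```

For a file on disk, the same helper should work with ET.parse('myXMLFile.xml').getroot() in place of ET.fromstring(...).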
Answer by Jonne Kleijer
You could use a set comprehension, which is built into Python:
import xml.etree.ElementTree as ET
xmlTree = ET.parse('myXMLFile.xml')
tags = {elem.tag for elem in xmlTree.iter()}
If you specifically need a list, you can convert the set to a list:
import xml.etree.ElementTree as ET
xmlTree = ET.parse('myXMLFile.xml')
tags = list({elem.tag for elem in xmlTree.iter()})
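Because a set has no defined order, the resulting list can come out in a different order from one run to the next. If a stable result is wanted, one option is to sort the set instead of passing it to list(). The snippet below is a small sketch of that idea; the inline XML string and the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Tiny inline document (illustrative only) so the sketch runs without a file.
xml_text = "<root><b/><a><b/></a></root>"
root = ET.fromstring(xml_text)

# sorted() accepts any iterable and always returns a list,
# so no separate list() conversion is needed.
sorted_tags = sorted({elem.tag for elem in root.iter()})
print(sorted_tags)  # ['a', 'b', 'root']
```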