Python: How do you get an XML element's text content using xml.dom.minidom?

Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/4485132/

Date: 2020-08-18 15:57:57. Source: igfitidea

Tags: python, xml, minidom

Asked by mindthief

I've called elems = xmldoc.getElementsByTagName('myTagName') on an XML document that I parsed with minidom.parse(xmlObj). Now I'm trying to get the text content of this element, and although I spent a while looking through dir() and trying things out, I haven't found the right call yet. As an example of what I want to accomplish, in:

<myTagName> Hello there </myTagName>

I would like to extract just "Hello there". (Obviously I could parse this myself, but I expect there is some built-in functionality.)

Thanks

Accepted answer by ismail

Try this:

xmldoc.getElementsByTagName('myTagName')[0].firstChild.nodeValue
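
For anyone who wants to run the line above end to end, here is a minimal sketch. The wrapping <root> element and the variable names are assumptions made for illustration; the question only shows the single <myTagName> element. Note that nodeValue keeps the surrounding whitespace, so a strip() is probably needed to get exactly "Hello there".

```python
from xml.dom import minidom

# Hypothetical sample document; the question only shows the <myTagName> element.
xml_source = "<root><myTagName> Hello there </myTagName></root>"
xmldoc = minidom.parseString(xml_source)

# The first child of the element is its text node.
first = xmldoc.getElementsByTagName("myTagName")[0]
raw_text = first.firstChild.nodeValue  # whitespace inside the tag is preserved
text = raw_text.strip()                # drop the leading/trailing spaces

print(repr(raw_text))
print(text)
```

This only reads the first child, so it fits elements that contain plain text and nothing else.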

Answer by James Thompson

for elem in elems:
    # firstChild is the element's text node; nodeValue holds its string
    print(elem.firstChild.nodeValue)

That will print out each myTagName's text.

James
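
One thing to watch with a loop like this: an empty element such as <myTagName/> has no children, so firstChild is None and reading nodeValue from it raises AttributeError. A small sketch of a guarded loop follows; the sample document and the texts list are assumptions made for illustration, not part of the answer.

```python
from xml.dom import minidom

# Hypothetical sample with one empty element in the middle.
xml_source = (
    "<root>"
    "<myTagName>first</myTagName>"
    "<myTagName/>"
    "<myTagName>third</myTagName>"
    "</root>"
)
xmldoc = minidom.parseString(xml_source)

texts = []
for elem in xmldoc.getElementsByTagName("myTagName"):
    child = elem.firstChild
    # Only read nodeValue when the first child exists and is a text node.
    if child is not None and child.nodeType == child.TEXT_NODE:
        texts.append(child.nodeValue)
    else:
        texts.append("")

print(texts)
```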

Answer by mike rodent

Wait a mo... do you want ALL the text under a given node? Then it has to involve a subtree traversal of some kind. It doesn't have to be recursive, but this works fine:

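    def get_all_text(node):
        """Return the concatenated text of node and all of its descendants."""
        if node.nodeType == node.TEXT_NODE:
            # Text nodes carry their string in .data
            return node.data
        else:
            # Any other node: gather text from each child, in document order
            text_string = ""
            for child_node in node.childNodes:
                text_string += get_all_text(child_node)
            return text_string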