Python XML File Open
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/18834393/
Asked by Trying_hard
I am trying to open an XML file and parse it, but the file never seems to open; the script just keeps running. Any ideas?
from xml.dom import minidom
Test_file = open('C::/test_file.xml','r')
xmldoc = minidom.parse(Test_file)
Test_file.close()
for i in xmldoc:
    print('test')
The file is 180.288 KB, why does it never make it to the print portion?
Accepted answer by kjhughes
Running your Python code with a few adjustments:
from xml.dom import minidom

Test_file = open('C:/test_file.xml', 'r')
xmldoc = minidom.parse(Test_file)
Test_file.close()

def printNode(node):
    # Print this node, then recurse into each of its children.
    print(node)
    for child in node.childNodes:
        printNode(child)

printNode(xmldoc.documentElement)
With this sample input as test_file.xml:
<a>
  <b>testing 1</b>
  <c>testing 2</c>
</a>
Yields this output:
<DOM Element: a at 0xbc56e8>
<DOM Text node "u'\n '">
<DOM Element: b at 0xbc5788>
<DOM Text node "u'testing 1'">
<DOM Text node "u'\n '">
<DOM Element: c at 0xbc5828>
<DOM Text node "u'testing 2'">
<DOM Text node "u'\n'">
Notes:
- As @LukeWoodward mentioned, avoid DOM-based libraries for large inputs; 180K should be fine, however. For 180M, minidom.parse() may run out of memory (MemoryError) before it ever returns.
- As @alecxe mentioned, you should remove the extraneous ':' from the file path. You should have seen error output along the lines of IOError: [Errno 22] invalid mode ('r') or filename: 'C::/test_file.xml'.
- As @mzjn mentioned, xml.dom.minidom.Document is not iterable. You should have seen error output along the lines of TypeError: iteration over non-sequence.
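To make the notes above concrete, here is a minimal sketch, assuming Python 3 and using the sample XML inline instead of C:/test_file.xml so it runs anywhere. The names SAMPLE, element_names, streamed_texts, and document_is_iterable are hypothetical and not part of the original answer. The streaming function uses xml.etree.ElementTree.iterparse, which is one possible non-DOM option for large inputs; the answer itself does not name a specific alternative.

```python
import io
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Inline copy of the sample input (assumption: two-space indentation).
SAMPLE = "<a>\n  <b>testing 1</b>\n  <c>testing 2</c>\n</a>\n"


def element_names(xml_text):
    """Parse with minidom and list the tag names of the root's child elements."""
    doc = minidom.parseString(xml_text)
    # A Document is not iterable; walk documentElement.childNodes instead.
    return [
        node.tagName
        for node in doc.documentElement.childNodes
        if node.nodeType == node.ELEMENT_NODE  # skip whitespace-only text nodes
    ]


def document_is_iterable(xml_text):
    """Return False if iterating over a minidom Document raises TypeError."""
    try:
        iter(minidom.parseString(xml_text))
    except TypeError:
        return False
    return True


def streamed_texts(source):
    """Collect (tag, text) pairs with iterparse rather than building a DOM."""
    texts = []
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.text and elem.text.strip():
            texts.append((elem.tag, elem.text.strip()))
        elem.clear()  # drop the element's content once it has been handled
    return texts


if __name__ == "__main__":
    print(element_names(SAMPLE))
    print(document_is_iterable(SAMPLE))
    print(streamed_texts(io.BytesIO(SAMPLE.encode("utf-8"))))
```

For a real file, streamed_texts also accepts a path such as 'C:/test_file.xml' (single colon), since iterparse takes either a filename or a file object.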