Looping through a text file, readline() construction fails on large files
Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/4568171/
Asked by WombatPM
In Python 2.6 and 2.7 I would have thought that these two constructs would be identical:
Method A
    i=0
    f=open('fred.txt','r')
    for line in f.readline():
        i+=1
    print i
Method B
    i=0
    f=open('fred.txt','r')
    for line in f:
        i+=1
    print i
However, when fred.txt grew to 74,000 lines, each 2,684 characters long, Method A prints 2685 while Method B prints 74000. Obviously, Method B is preferred, but why does Method A work for small files but fail for large files?
Accepted answer by Josh Lee
There's a typo: it should be f.readlines(). You're reading one line and looping through each character in that line. That would also explain the 2685 that Method A prints: the 2,684 characters of the first line plus its trailing newline.
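To make the point about iterating over characters concrete, here is a minimal sketch. It assumes an in-memory `io.StringIO` object as a stand-in for fred.txt (the three short lines are made up for illustration), and it uses `print()` as a function so it runs on Python 3 as well as 2.6/2.7.

```python
import io

# Hypothetical stand-in for fred.txt: 3 lines of 5 characters each.
f = io.StringIO(u"aaaaa\nbbbbb\nccccc\n")

# readline() returns ONE line as a single string, newline included.
first_line = f.readline()

char_count = 0
for ch in first_line:   # iterating over a string yields one character at a time
    char_count += 1

print(char_count)  # 6: five characters plus the trailing "\n"
```

The loop counts characters of the first line, not lines of the file, so the result depends only on the length of that first line.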
Both methods (readlines vs. iterating over the file directly) ought to give the same results, but readlines will store the entire contents in memory.
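The comparison between readlines() and direct iteration can be sketched as follows. This is an illustration under stated assumptions, not the original poster's code: the helper names `count_with_readlines` and `count_by_iteration` are hypothetical, and a small temporary file of 1000 lines stands in for fred.txt.

```python
import os
import tempfile


def count_with_readlines(path):
    # readlines() builds a list holding every line of the file at once.
    with open(path, "r") as f:
        return len(f.readlines())


def count_by_iteration(path):
    # Iterating over the file object reads one line at a time.
    count = 0
    with open(path, "r") as f:
        for _line in f:
            count += 1
    return count


# Hypothetical test file standing in for fred.txt.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    for n in range(1000):
        f.write("line %d\n" % n)

a = count_with_readlines(path)
b = count_by_iteration(path)
os.remove(path)

print("%d %d" % (a, b))  # both counts are 1000
```

Both helpers return the same count; the difference is that the first holds the whole file in a list while counting, which matters for a file the size of the one in the question.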

