Python fastest way to read a large text file (several GB)

Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me): StackOverflow.

Original URL: http://stackoverflow.com/questions/14944183/
Asked by Gianni Spear
I have a large text file (~7 GB). I am looking for the fastest way to read it. I have been reading about several approaches, such as reading the file chunk by chunk, in order to speed up the process.
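(As an editorial aside, not part of the original question: "chunk by chunk" usually means calling read() with a fixed byte size instead of loading the whole file at once. A minimal sketch follows; the file name and the 1 MiB chunk size are illustrative assumptions.)

def read_in_chunks(path, chunk_size=1024 * 1024):
    # Yield successive byte chunks of at most chunk_size bytes.
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk

for chunk in read_in_chunks("sample.txt"):  # "sample.txt" is a placeholder
    pass  # process each chunk here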
For example, effbot suggests:
# File: readline-example-3.py
file = open("sample.txt")
while 1:
    lines = file.readlines(100000)  # read whole lines totalling roughly 100000 bytes per call
    if not lines:
        break
    for line in lines:
        pass  # do something
in order to process 96,900 lines of text per second. Other authors suggest using islice():
from itertools import islice

with open(...) as f:
    while True:
        next_n_lines = list(islice(f, n))
        if not next_n_lines:
            break
        # process next_n_lines
list(islice(f, n)) will return a list of the next n lines of the file f. Using this inside a loop will give you the file in chunks of n lines.
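As a concrete version of that pattern, here is a minimal sketch with the placeholders filled in; the file name, the chunk size n = 100000, and the process_chunk helper are illustrative assumptions, not part of the original question:

from itertools import islice

def process_chunk(lines):
    pass  # stand-in for whatever per-chunk work is needed

n = 100000  # illustrative chunk size (number of lines per chunk)
with open("sample.txt") as f:  # "sample.txt" is a placeholder path
    while True:
        next_n_lines = list(islice(f, n))
        if not next_n_lines:
            break
        process_chunk(next_n_lines)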
Answered by Morten Larsen
with open(<FILE>) as FileObj:
    for lines in FileObj:
        print(lines)  # or do some other thing with the line...
will read one line at a time into memory, and close the file when done...
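To check which approach is actually fastest on a given machine, a small timing harness can help. The following is a minimal editorial sketch, not from the original answer; the file path and the line-counting stand-in for real work are assumptions:

import time

path = "sample.txt"  # placeholder path to the large file

start = time.perf_counter()
count = 0
with open(path) as file_obj:
    for line in file_obj:  # one line at a time; the file is closed automatically
        count += 1         # stand-in for real per-line processing
elapsed = time.perf_counter() - start
print("read %d lines in %.2f s" % (count, elapsed))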

