Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/17373118/
Read previous line in a file python
Asked by Lim H.
I need to get the value of the previous line in a file and compare it with the current line as I'm iterating through the file. The file is HUGE, so I can't read it whole, nor can I randomly access a line number with linecache, because that library function still reads the whole file into memory anyway.
EDIT: I'm so sorry, I forgot to mention that I have to read the file backwards.
EDIT2
I have tried the following:
import linecache

f = open("filename", "r")
# This doesn't work: there are too many lines to read into memory.
for line in reversed(f.readlines()):
    ...
# This doesn't work either, due to the same problem as above.
line = linecache.getline("filename", num_line)
Accepted answer by Stephan
Just save the previous line when you iterate to the next one:
prevLine = ""
for line in file:
    # do some work here
    prevLine = line
This will store the previous line in prevLine while you are looping.
Edit: apparently the OP needs to read this file backwards.
Aaand after about an hour of research, I failed multiple times to do it within the memory constraints.
Here you go, Lim. That guy knows what he's doing; here is his best idea:
General approach #2: Read the entire file, store position of lines
With this approach, you also read through the entire file once, but instead of storing the entire file (all the text) in memory, you only store the binary positions inside the file where each line started. You can store these positions in a data structure similar to the one storing the lines in the first approach.
Whenever you want to read line X, you have to re-read the line from the file, starting at the position you stored for the start of that line.
Pros: Almost as easy to implement as the first approach
Cons: Can take a while to read large files
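The approach quoted above could be sketched roughly as follows. This is only an illustration under some assumptions, not the linked author's code: the function names build_line_index and read_lines_backwards are made up for this example, and the file is assumed to be UTF-8 text.

```python
def build_line_index(path):
    """Return the byte offset at which each line of the file starts."""
    offsets = []
    position = 0
    # Binary mode keeps the arithmetic in bytes, so the offsets are valid for seek().
    with open(path, "rb") as handle:
        for raw_line in handle:
            offsets.append(position)
            position += len(raw_line)
    return offsets


def read_lines_backwards(path, offsets):
    """Yield lines from last to first by seeking to each stored offset."""
    with open(path, "rb") as handle:
        for start in reversed(offsets):
            handle.seek(start)
            # Assumption: the file is UTF-8 encoded text.
            yield handle.readline().decode("utf-8").rstrip("\r\n")
```

Only one line of text is held in memory at a time, so the "save the previous line" loop from the top of this answer can be run over read_lines_backwards(...) unchanged. The list of offsets still grows with the number of lines, so for an extremely large file even that index may need a more compact container.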
Answer by mgilson
I'd write a simple generator for the task:
def pairwise(fname):
    with open(fname) as fin:
        # An empty file has no first line, so there are no pairs to yield.
        prev = next(fin, None)
        if prev is None:
            return
        for line in fin:
            yield prev, line
            prev = line
Or, you can use the pairwise recipe from itertools:
import itertools

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = itertools.tee(iterable)
    next(b, None)
    return zip(a, b)  # on Python 2, use itertools.izip(a, b) instead
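As a rough usage sketch for the recipe (not part of the original answer): an open file object is itself an iterable of lines, so it can be passed to pairwise directly. The helper changed_lines is a hypothetical name introduced here for illustration, and io.StringIO stands in for a real file.

```python
import io
import itertools


def pairwise(iterable):
    """s -> (s0, s1), (s1, s2), (s2, s3), ..."""
    a, b = itertools.tee(iterable)
    next(b, None)
    return zip(a, b)


def changed_lines(lines):
    """Yield each line that differs from the line just before it."""
    for previous_line, current_line in pairwise(lines):
        if current_line != previous_line:
            yield current_line.rstrip("\n")


# io.StringIO stands in for an open file; a real file object is iterated the same way.
sample = io.StringIO("a\na\nb\nb\nc\n")
result = list(changed_lines(sample))
```

Because the two iterators returned by tee stay one item apart, only a single buffered line is kept at any time. On Python 3.10 and later, itertools.pairwise provides the same pairing without the hand-written recipe.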
Answer by Diana
@Lim, here's how I would write it (reply to the comments)
def do_stuff_with_two_lines(previous_line, current_line):
    print("--------------")
    print(previous_line)
    print(current_line)

with open('my_file.txt', 'r') as my_file:
    # Read the first line up front so the loop always has a previous line.
    current_line = my_file.readline()
    for line in my_file:
        previous_line = current_line
        current_line = line
        do_stuff_with_two_lines(previous_line, current_line)