Original URL: http://stackoverflow.com/questions/4864361/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Using Python, how to read a file starting at the seventh line?
Asked by Merlin
I have a text file structured as:
date
downland
user
date data1 date2
201102 foo bar 200 50
201101 foo bar 300 35
So the first six lines of the file are not needed. Filename: dnw.txt
f = open('dwn.txt', 'rb')
How do I "split" this file starting at line 7 to EOF?
Accepted answer by John Machin
with open('dwn.txt') as f:
    for i in xrange(6):
        f.next()
    for line in f:
        process(line)
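For readers on Python 3, where xrange() and the file method next() no longer exist, the same skip-then-iterate idea might look like the sketch below. The function name lines_after_header and its header_lines parameter are illustrative names, not part of the answer. next(f, None) is used so that a file shorter than six lines simply yields nothing instead of raising StopIteration.

```python
import io


def lines_after_header(f, header_lines=6):
    """Yield the lines of an open file that follow the first header_lines lines."""
    for _ in range(header_lines):
        next(f, None)  # discard one line; the default avoids StopIteration at EOF
    for line in f:
        yield line


# Demonstration on an in-memory file: six header lines, then two data lines.
sample = io.StringIO(
    "h1\nh2\nh3\nh4\nh5\nh6\n201102 foo bar 200 50\n201101 foo bar 300 35\n"
)
data_lines = list(lines_after_header(sample))
```

With a real file, the generator would be wrapped in a with open(...) block so the file stays open while the lines are consumed.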
Answer by Spacedman
Just do f.readline() six times. Ignore the returned value.
Answer by Convolution
You can read the entire file into a list and then start at the index of the line you wish to start reading at.
f = open('dwn.txt', 'rb')
fileAsList = f.readlines()
f.close()
fileAsList[0]   # first line
fileAsList[1]   # second line
fileAsList[6:]  # seventh line through the end of the file
Answer by Josh Lee
Itertools answer!
from itertools import islice
with open('foo') as f:
    for line in islice(f, 6, None):
        print line
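In Python 3, print is a function, so the islice() approach could be written roughly as follows. The helper name from_line and the in-memory sample are assumptions made for illustration. islice(f, 6, None) skips the first six items of the iterator and yields the rest lazily, so the skipped lines are never stored.

```python
import io
from itertools import islice


def from_line(f, start):
    """Return an iterator over the lines of f, beginning at 1-based line `start`."""
    return islice(f, start - 1, None)


# Nine numbered lines stand in for a real file.
sample = io.StringIO("".join("line %d\n" % n for n in range(1, 10)))
kept = [line.rstrip("\n") for line in from_line(sample, 7)]
for line in kept:
    print(line)
```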
Answer by systempuntoout
with open('test.txt', 'r') as fo:
    for i in xrange(6):
        fo.next()
    for line in fo:
        print "%s" % line.strip()
Answer by Cuga
#!/usr/bin/python
with open('dnw.txt', 'r') as f:
    lines_7_through_end = f.readlines()[6:]
print "Lines 7+:"
i = 7
for line in lines_7_through_end:
    print " Line %s: %s" % (i, line)
    i += 1
Prints:
Lines 7+:
 Line 7: 201102 foo bar 200 50
 Line 8: 201101 foo bar 300 35
Edit:
To rebuild dnw.txt without the first six lines, do this after the above code:
with open('dnw.txt', 'w') as f:
    for line in lines_7_through_end:
        f.write(line)
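Opening the file with 'w' empties it at once, so the rewrite above only works because the wanted lines were already read into memory. A possible variant for Python 3 that avoids holding the lines in a list is to stream them to a temporary file and then swap that file into place. This is a sketch under assumptions, not the answer's method; strip_header and its parameters are hypothetical names.

```python
import os
import tempfile
from itertools import islice


def strip_header(path, n=6):
    """Rewrite the file at `path` without its first n lines."""
    directory = os.path.dirname(os.path.abspath(path))
    # Stream the kept lines to a temporary file in the same directory...
    with open(path, "rb") as src, tempfile.NamedTemporaryFile(
        "wb", dir=directory, delete=False
    ) as dst:
        dst.writelines(islice(src, n, None))
        tmp_name = dst.name
    # ...then replace the original once both files are closed.
    os.replace(tmp_name, path)
```

Calling strip_header('dnw.txt') would leave only lines 7 onward in the file. If something fails before os.replace() runs, the original file is left untouched.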
Answer by eyquem
Solutions with readlines() are not satisfactory in my opinion, because readlines() reads the entire file. The user then has to go over the lines again (in the file or in the resulting list) to process what he wants, whereas the processing could have been done without first reading the lines of interest once already. Moreover, if the file is big, its whole content weighs on memory, while a for line in file loop would have been lighter.
Repeating readline() can be done like this:
nb = 6
exec( nb * 'f.readline()\n')
It's a short piece of code, and nb is programmatically adjustable.
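Building a source string for exec() works, but the same adjustable skip can be had without executing generated code. The sketch below is one such alternative, based on the consume recipe from the itertools documentation; the function name skip_lines is an illustrative choice, and Python 3 is assumed.

```python
import io
from collections import deque
from itertools import islice


def skip_lines(f, nb):
    """Advance the open file f past its next nb lines, keeping none of them."""
    # A zero-length deque consumes the islice without storing anything.
    deque(islice(f, nb), maxlen=0)


sample = io.StringIO("a\nb\nc\nd\ne\nf\ng\nh\n")
skip_lines(sample, 6)
rest = sample.read()
```

A plain for loop calling f.readline() nb times would do just as well. Either way nb stays an ordinary integer.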
Answer by eyquem
In fact, to answer precisely the question as it was written,
How do I "split" this file starting at line 7 to EOF?
you can do the following.
In case the file is not big:
with open('dwn.txt','rb+') as f:
    for i in xrange(6):
        print f.readline()
    content = f.read()
    f.seek(0,0)
    f.write(content)
    f.truncate()
In case the file is very big:
with open('dwn.txt','rb+') as ahead, open('dwn.txt','rb+') as back:
    for i in xrange(6):
        print ahead.readline()
    x = 100000
    chunk = ahead.read(x)
    while chunk:
        print repr(chunk)
        back.write(chunk)
        chunk = ahead.read(x)
    back.truncate()
The truncate() function is essential to put the EOF you asked for. Without executing truncate(), the tail of the file, corresponding to the offset of 6 lines, would remain.
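The two-handle technique and the role of truncate() can be sketched for Python 3 as a reusable function. The name drop_first_lines, its parameters, and the temporary demo file are assumptions made for this illustration. The debugging prints are left out, and the chunk size is kept small in the demo only to exercise the loop.

```python
import os
import tempfile


def drop_first_lines(path, n=6, chunk_size=100000):
    """Remove the first n lines of the file at `path` in place."""
    with open(path, "rb+") as ahead, open(path, "rb+") as back:
        for _ in range(n):
            ahead.readline()        # move the read position past the header
        while True:
            chunk = ahead.read(chunk_size)
            if not chunk:
                break
            back.write(chunk)       # copy the data toward the start of the file
        back.truncate()             # cut the file at the current write position


# Demo on a throwaway file: six header lines followed by two data lines.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"1\n2\n3\n4\n5\n6\n201102 foo bar 200 50\n201101 foo bar 300 35\n")
drop_first_lines(demo_path, chunk_size=4)
with open(demo_path, "rb") as f:
    remaining = f.read()
os.remove(demo_path)
```

The write position always trails the read position by the size of the removed header, so data is only overwritten after it has been read.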
The file must be opened in binary mode to prevent any problem from happening.
When Python reads '\r\n' in text mode, it can transform it into '\n' (newline translation; Universal Newline Support is the default in Python 3), that is to say there are only '\n' in the chunk strings even if there were '\r\n' in the file.
If the file is of Macintosh origin, it contains only CR = '\r' newlines before the treatment, but they will be changed to '\n' or '\r\n' (according to the platform) during the rewriting on a non-Macintosh machine.
If it is a file of Linux origin, it contains only LF = '\n' newlines which, on a Windows OS, will be changed to '\r\n' (I don't know what happens for a Linux file processed on a Macintosh). The reason is that, in text mode, Windows writes '\r\n' for every '\n' it is ordered to write. Consequently, more characters would be written than were read, and the offset between the file's pointers ahead and back would diminish and cause a messy rewriting.
In HTML sources, there are also various newlines.
That's why it's always preferable to open files in binary mode when they are processed this way.
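The effect of newline translation can be observed directly in Python 3. The snippet below is a small demonstration, assuming a throwaway temporary file. It reads the same bytes in default text mode, in binary mode, and in text mode with newline='' (which disables translation on reading).

```python
import os
import tempfile

# Create a small file that uses Windows-style line endings.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"one\r\ntwo\r\n")

with open(path, "r") as f:              # text mode: '\r\n' is read as '\n'
    translated = f.read()
with open(path, "rb") as f:             # binary mode: bytes are returned unchanged
    raw = f.read()
with open(path, "r", newline="") as f:  # text mode, newline translation disabled
    untranslated = f.read()

os.remove(path)
```

Here translated has two characters fewer than the file has bytes. That is exactly the kind of mismatch that would upset the offset between a read position and a write position.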
Answer by strpeter
Alternative version
You can use read() directly if you know the character position pos of the line break (e.g. a \n) that separates the header from the part of interest, i.e. the position in the text at which you want to split your input:
with open('input.txt', 'r') as txt_in:
    txt_in.seek(pos)
    second_half = txt_in.read()
If you are interested in both halves, you could also use the following method:
with open('input.txt', 'r') as txt_in:
    all_contents = txt_in.read()
first_half = all_contents[:pos]
second_half = all_contents[pos:]
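Both snippets assume that pos is already known. One way to obtain it, sketched here for Python 3, is to read past the header once in binary mode and ask the file for its position with tell(). The function names offset_after_lines and read_from and the temporary demo file are hypothetical. Binary mode is used so that the offset is a plain byte count.

```python
import os
import tempfile


def offset_after_lines(path, n):
    """Return the byte offset at which line n + 1 begins."""
    with open(path, "rb") as f:
        for _ in range(n):
            f.readline()
        return f.tell()


def read_from(path, pos):
    """Return everything in the file from byte offset pos onward."""
    with open(path, "rb") as f:
        f.seek(pos)
        return f.read()


# Demo on a throwaway file: six two-byte header lines, then one data line.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"1\n2\n3\n4\n5\n6\n201102 foo bar 200 50\n")
pos = offset_after_lines(demo_path, 6)
second_half = read_from(demo_path, pos)
os.remove(demo_path)
```

Once computed, the offset can be stored and reused for later reads, as long as the header of the file does not change.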
Answer by KiteCoder
Python 3:
with open("file.txt", "r") as f:
    for i in range(6):
        f.readline()
    for line in f:
        pass  # process lines 7 through the end here

