Original question: http://stackoverflow.com/questions/27307385/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Best way to access the Nth line of csv file
Asked by Gabriel L'Heureux
I have to access the Nth line in a CSV file.
Here's what I did:
import csv
the_file = open('path', 'r')
reader = csv.reader(the_file)
N = int(input('What line do you need? > '))  # input() returns a string in Python 3
i = 0
for row in reader:
    if i == N:
        print("This is the line.")
        print(row)
        break
    i += 1
the_file.close()
...but this does not feel optimal. Edit for precision: If the file is huge, I do not want to go through all the lines and I do not want to have to load the whole file into memory.
I was hoping that something like reader[N] exists, but I have not found it.
Edit for answer: This line (coming from chosen answer) is what I was looking for:
next(itertools.islice(csv.reader(f), N, None))
Accepted answer by Stuart
It makes little difference, but it is slightly cleaner to use enumerate rather than making your own counter variable.
for i, row in enumerate(reader):
    if i == N:
        print("This is the line.")
        print(row)
        break
You can also use itertools.islice, which is designed for this type of scenario: accessing a particular slice of an iterable without reading the whole thing into memory. It should be a bit more efficient than looping through the unwanted rows.
import csv
import itertools

with open(path, 'r') as f:  # path: the location of your CSV file
    N = int(input('What line do you need? > '))
    print("This is the line.")
    print(next(itertools.islice(csv.reader(f), N, None)))
But if your CSV file is small, just read the entire thing into a list, which you can then access with an index in the normal way. This also has the advantage that you can access several different rows in random order without having to reset the csv reader.
my_csv_data = list(reader)
print(my_csv_data[N])
Answer by Marcin
You could reduce your for loop to a comprehension expression, e.g.
row = [row for i,row in enumerate(reader) if i == N][0]
# or even nicer as seen in iCodez code with next and generator expression
row = next(row for i,row in enumerate(reader) if i == N)
Answer by Marcin
Your solution is actually not that bad. Advancing the file iterator to the line you want is a good approach and is used in many situations like this.
If you want it more concise though, you can use next and enumerate with a generator expression:
import csv
the_file = open('path', 'r')
reader = csv.reader(the_file)
N = int(input('What line do you need? > '))
line = next((x for i, x in enumerate(reader) if i == N), None)
print(line)
the_file.close()
The None in there is what will be returned if the line is not found (N is too large). You can pick any other value though.
You could also open the file with a with-statement to have it be automatically closed:
import csv
with open('path', 'r') as the_file:
    reader = csv.reader(the_file)
    N = int(input('What line do you need? > '))
    line = next((x for i, x in enumerate(reader) if i == N), None)
    print(line)
If you really want to cut down on size, you could do:
from csv import reader
N = int(input('What line do you need? > '))
with open('path') as f:
    print(next((x for i, x in enumerate(reader(f)) if i == N), None))
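The next-with-a-default pattern above can also be wrapped in a small reusable function. The following is only a minimal sketch: the name get_row and its parameters are made up for illustration, it uses itertools.islice in place of enumerate, and it assumes rows are counted from 0.

```python
import csv
import itertools


def get_row(path, n, default=None):
    """Return the n-th CSV record (0-based) of the file at `path`, or `default`.

    `get_row` is a hypothetical helper name, not part of the csv module.
    """
    # newline='' is what the csv module documentation recommends for files
    # that are handed to csv.reader.
    with open(path, newline='') as f:
        # islice skips the first n records lazily; next() returns `default`
        # instead of raising StopIteration when the file is too short.
        return next(itertools.islice(csv.reader(f), n, None), default)
```

The rows before the requested one are still read and parsed, just not stored, so the cost grows with n while memory use stays small.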
Answer by ajmartin
You can simply do:
n = 2  # line to print (1-based)
fd = open('foo.csv', 'r')
lines = fd.readlines()  # reads every line of the file into a list
print(lines[n-1])  # prints 2nd line
fd.close()
Or, more concisely, use the linecache module. Note that linecache also reads the whole file and keeps it cached, so it does not reduce memory use for a large file:
import linecache
n = 2  # line numbers start at 1
print(linecache.getline('foo.csv', n))
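linecache.getline returns the raw text of a physical line, not a parsed row. A minimal sketch of turning that text into a CSV row follows, assuming that no field spans several lines; the helper name row_from_linecache is made up for illustration.

```python
import csv
import linecache


def row_from_linecache(path, n):
    """Return physical line n (1-based) of `path` parsed as a CSV row, or None.

    Hypothetical helper; assumes every CSV record fits on one physical line.
    """
    line = linecache.getline(path, n)  # returns '' if the line does not exist
    if not line:
        return None
    # csv.reader accepts any iterable of strings, so a one-item list works.
    return next(csv.reader([line]))
```

Returning None for a missing line mirrors the default used in the next(...) examples above; any other sentinel would work as well.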
Answer by Tanveer Alam
import csv
with open('cvs_file.csv', 'r') as inFile:
    reader = csv.reader(inFile)
    my_content = list(reader)

# input() returns a string in Python 3, so convert it before comparing
line_no = int(input('What line do you need(line number begins from 0)? > '))
if line_no < len(my_content):
    print(my_content[line_no])
else:
    print('This line does not exists')
As a result, you can now get any line directly by its index:
What line do you need? > 2
['101', '0.19', '1']
What line do you need? > 100
This line does not exists
Answer by martineau
The itertools module has a number of functions for creating specialized iterators, and its islice() function could be used to easily solve this problem:
import csv
import itertools
N = 5  # desired line number
with open('path.csv', newline='') as the_file:
    row = next(csv.reader(itertools.islice(the_file, N, N+1)))
print("This is the line.")
print(row)
P.S. For the curious, my initial response — which also works (arguably better) — was:
row = next(itertools.islice(csv.reader(the_file), N, N+1))
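One reason the two variants can differ: slicing the file object counts physical lines, while slicing the csv.reader counts parsed records. The sketch below illustrates this with made-up sample data (held in an io.StringIO rather than a real file) in which one quoted field contains a newline.

```python
import csv
import io
import itertools

# Hypothetical sample data: record 1 has a quoted field with an embedded newline,
# so the three records occupy four physical lines.
data = 'id,comment\n1,"first line\nsecond line"\n2,plain\n'

N = 2  # 0-based index of the record we want

# Slicing the reader counts CSV records.
by_record = next(itertools.islice(csv.reader(io.StringIO(data, newline='')), N, N + 1))

# Slicing the file counts physical lines, so the embedded newline shifts the result
# and the parser only sees a fragment of record 1.
by_line = next(csv.reader(itertools.islice(io.StringIO(data, newline=''), N, N + 1)))

print(by_record)  # ['2', 'plain']
print(by_line)    # a fragment of record 1, not the record we asked for
```

When no field contains a line break, both variants return the same row, and slicing the raw lines avoids parsing the skipped ones.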