Reading specific columns from a text file in Python
Source: http://stackoverflow.com/questions/30216573/ (StackOverflow). Content is licensed under CC BY-SA 4.0; if you use or share it, you must attribute it to the original authors.
Asked by Jethro
I have a text file containing a table of numbers, e.g.:
5 10 6
6 20 1
7 30 4
8 40 3
9 23 1
4 13 6
For example, if I want only the numbers in the second column, how do I extract that column into a list?
Accepted answer by ForceBru
f=open(file,"r")
lines=f.readlines()
result=[]
for x in lines:
    result.append(x.split(' ')[1])
f.close()
You can do the same using a list comprehension:
print([x.split(' ')[1] for x in open(file).readlines()])
Docs on split():
string.split(s[, sep[, maxsplit]])
Return a list of the words of the string s. If the optional second argument sep is absent or None, the words are separated by arbitrary strings of whitespace characters (space, tab, newline, return, formfeed). If the second argument sep is present and not None, it specifies a string to be used as the word separator. The returned list will then have one more item than the number of non-overlapping occurrences of the separator in the string.
So you can omit the space I used and just call x.split(), but be aware that this also splits on tabs and newlines and discards them.
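To make the difference between the two calls concrete, here is a small sketch. The sample line, with its doubled space and trailing newline, is made up for illustration and does not come from the question's file.

```python
# Hypothetical line: two spaces between the first two fields, newline at the end.
line = "5  10 6\n"

# Splitting on a single space yields an empty string for the doubled space
# and leaves the newline attached to the last field.
by_space = line.split(' ')

# Splitting with no argument treats any run of whitespace as one separator
# and drops leading/trailing whitespace, including the newline.
by_whitespace = line.split()

print(by_space)       # ['5', '', '10', '6\n']
print(by_whitespace)  # ['5', '10', '6']
```

With irregular spacing, index 1 of the space-split result is the empty string rather than the second column, which is one reason the argument-free form is often the safer choice for whitespace-aligned tables.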
Answer by ZdaR
First we open the file as datafile. The .read() method then reads the whole file contents, and splitting that data returns something like: ['5', '10', '6', '6', '20', '1', '7', '30', '4', '8', '40', '3', '9', '23', '1', '4', '13', '6']
We then apply list slicing to this list, starting from the element at index 1 and taking every third element until the end of the list is reached.
with open("sample.txt", "r") as datafile:
    print(datafile.read().split()[1::3])
Output:
['10', '20', '30', '40', '23', '13']
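The slice works because every row has exactly three fields, so the step equals the row width. A minimal sketch of the same idea with the column index and row width pulled out into variables follows; the names n_cols and col are illustrative, and the data is held in a string instead of sample.txt so the snippet is self-contained.

```python
# Sample data copied from the question; in practice it would come from a file.
text = "5 10 6\n6 20 1\n7 30 4\n8 40 3\n9 23 1\n4 13 6\n"

tokens = text.split()  # flat list of every field in the file
n_cols = 3             # assumed number of columns in every row
col = 1                # zero-based index of the wanted column

# Start at the wanted column and step by the row width.
second_column = tokens[col::n_cols]
print(second_column)  # ['10', '20', '30', '40', '23', '13']
```

If any row has a missing or extra field, the slice silently drifts into the wrong column, so this approach only suits strictly rectangular data.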
Answer by Kasramvd
You can use the zip function with a list comprehension:
with open('ex.txt') as f:
    print(list(zip(*[line.split() for line in f]))[1])
Result:
('10', '20', '30', '40', '23', '13')
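The result above is a tuple of strings, while the question asks for a list of numbers. One possible way to get there is sketched below, assuming every field is an integer; io.StringIO stands in for the opened file so the snippet runs on its own.

```python
import io

# io.StringIO stands in for the opened file (an assumption for this sketch).
f = io.StringIO("5 10 6\n6 20 1\n7 30 4\n")

# Transpose rows into columns, then convert the second column to integers.
columns = list(zip(*[line.split() for line in f]))
second_column = [int(value) for value in columns[1]]
print(second_column)  # [10, 20, 30]
```

For data containing decimals, float would be the natural replacement for int.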
Answer by Adam Smith
You have a space-delimited file, so use the module designed for reading delimited-value files, csv.
import csv

with open('path/to/file.txt') as inf:
    reader = csv.reader(inf, delimiter=" ")
    second_col = list(zip(*reader))[1]
    # In Python2, you can omit the `list(...)` cast
The zip(*iterable) pattern is useful for converting rows to columns or vice versa. If you're reading a file row-wise...
>>> testdata = [[1, 2, 3],
...             [4, 5, 6],
...             [7, 8, 9]]
>>> for line in testdata:
...     print(line)
[1, 2, 3]
[4, 5, 6]
[7, 8, 9]
...but need columns, you can pass each row to the zip function:
>>> testdata_columns = zip(*testdata)
>>> # this is equivalent to zip([1, 2, 3], [4, 5, 6], [7, 8, 9])
>>> for line in testdata_columns:
...     print(line)
(1, 4, 7)
(2, 5, 8)
(3, 6, 9)
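One caveat with this pattern is that zip stops at the shortest input, so a row with a missing value silently shortens the result. The sketch below uses made-up ragged data to show the effect, with itertools.zip_longest as one possible way to pad instead of truncate.

```python
from itertools import zip_longest

# Hypothetical ragged data: the second row is missing its last value.
rows = [[1, 2, 3], [4, 5], [7, 8, 9]]

# zip drops the third column entirely because one row is too short.
truncated = list(zip(*rows))

# zip_longest keeps every column and fills the gaps with fillvalue.
padded = list(zip_longest(*rows, fillvalue=None))

print(truncated)  # [(1, 4, 7), (2, 5, 8)]
print(padded)     # [(1, 4, 7), (2, 5, 8), (3, None, 9)]
```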
Answer by aerobiomat
Answer by StephanSchrodinger
This may help:
import csv

with open('csv_file', 'r') as f:
    # Read every row; skipinitialspace ignores extra spaces after a delimiter
    lines = list(csv.reader(f, delimiter=' ', skipinitialspace=True))

    # Print the second column of the last line
    print(lines[-1][1])

    # Print the second column of every row except the last 10
    for i in range(len(lines) - 10):
        print(lines[i][1])