Note: this page is a bilingual Chinese/English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/30216573/

Date: 2020-08-19 08:05:00  Source: igfitidea

Reading specific columns from a text file in python

Tags: python, list, text-files

Asked by Jethro

I have a text file that contains a table of numbers, e.g.:

5 10 6
6 20 1
7 30 4
8 40 3
9 23 1
4 13 6

If, for example, I want only the numbers in the second column, how do I extract that column into a list?

Accepted answer by ForceBru

# `file` holds the path to the text file
result = []
with open(file, "r") as f:
    for x in f:
        # split on single spaces and keep the second field
        result.append(x.split(' ')[1])

You can do the same using a list comprehension:

print([x.split(' ')[1] for x in open(file).readlines()])

Docs on split():

string.split(s[, sep[, maxsplit]])

Return a list of the words of the string s. If the optional second argument sep is absent or None, the words are separated by arbitrary strings of whitespace characters (space, tab, newline, return, formfeed). If the second argument sep is present and not None, it specifies a string to be used as the word separator. The returned list will then have one more item than the number of non-overlapping occurrences of the separator in the string.

So you can omit the space I used and just call x.split(), but be aware that this also splits on tabs and newlines and drops them from the result.
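
To make the difference concrete, here is a small sketch comparing the two calls. The sample line is made up for illustration (note the double space and the trailing newline); it is not taken from the question's file.

```python
# Hypothetical sample line: two spaces after the 5, newline at the end.
line = "5  10 6\n"

# An explicit single-space separator keeps empty strings and the newline.
by_space = line.split(' ')
print(by_space)        # ['5', '', '10', '6\n']

# With no argument, runs of whitespace collapse and the newline is dropped.
by_whitespace = line.split()
print(by_whitespace)   # ['5', '10', '6']
```

With split(' '), index 1 would be an empty string for this line, so x.split() is usually the safer choice when the spacing in the file is not perfectly regular.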

Answer by ZdaR

First we open the file as datafile, then call .read() to read the file contents, and then split the data, which returns something like: ['5', '10', '6', '6', '20', '1', '7', '30', '4', '8', '40', '3', '9', '23', '1', '4', '13', '6']. We then apply list slicing to this list, starting from the element at index 1 and taking every third element until the end of the list.

with open("sample.txt", "r") as datafile:
    print(datafile.read().split()[1::3])

Output:

['10', '20', '30', '40', '23', '13']
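
The slice works only because every row has exactly three fields. A minimal sketch of the same idea with the column index and row width as parameters follows; the helper name column_by_slicing and the in-memory sample string are assumptions made for illustration.

```python
def column_by_slicing(text, col, ncols):
    """Return column `col` (0-based) of whitespace-separated `text`,
    assuming every row has exactly `ncols` fields."""
    tokens = text.split()
    # Start at the column's offset and step one full row at a time.
    return tokens[col::ncols]

# In-memory stand-in for the file contents from the question.
sample = "5 10 6\n6 20 1\n7 30 4\n8 40 3\n9 23 1\n4 13 6\n"
second = column_by_slicing(sample, 1, 3)
print(second)  # ['10', '20', '30', '40', '23', '13']
```

If any row has a missing or extra field, every later value shifts, so this approach is best kept for strictly regular tables.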

Answer by Kasramvd

You can use the zip function with a list comprehension:

with open('ex.txt') as f:
    # list(...) is needed in Python 3, where zip returns an iterator
    print(list(zip(*[line.split() for line in f]))[1])

Result:

('10', '20', '30', '40', '23', '13')
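
The values in this result are strings. Since the question asks for numbers, here is a minimal sketch that applies the same zip approach and then converts the column to integers. The io.StringIO object is only a stand-in for the open file, and its contents mirror the table in the question.

```python
import io

# Stand-in for an open file; contents mirror the question's table.
f = io.StringIO("5 10 6\n6 20 1\n7 30 4\n8 40 3\n9 23 1\n4 13 6\n")

# Transpose rows into columns; list(...) is required in Python 3.
columns = list(zip(*[line.split() for line in f]))

# Convert the second column from strings to integers.
second_col = [int(value) for value in columns[1]]
print(second_col)  # [10, 20, 30, 40, 23, 13]
```

int() raises ValueError on non-numeric text, so float() or extra validation may be more appropriate for other data.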

Answer by Adam Smith

You have a space-delimited file, so use csv, the module designed for reading delimited-value files.

import csv

with open('path/to/file.txt') as inf:
    reader = csv.reader(inf, delimiter=" ")
    second_col = list(zip(*reader))[1]
    # In Python2, you can omit the `list(...)` cast

The zip(*iterable) pattern is useful for converting rows to columns or vice versa. If you're reading a file row-wise...

>>> testdata = [[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]]

>>> for line in testdata:
...     print(line)

[1, 2, 3]
[4, 5, 6]
[7, 8, 9]

...but need columns, you can pass each row to the zip function:

>>> testdata_columns = zip(*testdata)
# this is equivalent to zip([1,2,3], [4,5,6], [7,8,9])

>>> for line in testdata_columns:
...     print(line)

(1, 4, 7)
(2, 5, 8)
(3, 6, 9)
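
One caveat with this pattern: zip stops at the shortest row, so a row with a missing field silently drops the later columns. The sketch below uses made-up ragged data to show this, with itertools.zip_longest as one possible alternative that pads short rows instead.

```python
from itertools import zip_longest

# Hypothetical data: the middle row is missing its third field.
ragged = [[1, 2, 3],
          [4, 5],
          [7, 8, 9]]

# zip truncates to the shortest row, so the third column disappears.
truncated = list(zip(*ragged))
print(truncated)  # [(1, 4, 7), (2, 5, 8)]

# zip_longest pads the short row with a fill value instead.
padded = list(zip_longest(*ragged, fillvalue=None))
print(padded)     # [(1, 4, 7), (2, 5, 8), (3, None, 9)]
```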

Answer by aerobiomat

I know this is an old question, but nobody mentioned that when your data looks like an array, numpy's loadtxt comes in handy:

>>> import numpy as np
>>> np.loadtxt("myfile.txt")[:, 1]
array([10., 20., 30., 40., 23., 13.])

Answer by StephanSchrodinger

This may help:

import csv
with open('csv_file','r') as f:
    # Read every row of the space-delimited file into a list of lists
    lines = list(csv.reader(f, delimiter = ' ', skipinitialspace = True))
    # Print the second column of the last row
    print(lines[-1][1])
    # Print the second column of every row except the last 10
    for i in range(len(lines)-10):
        print(lines[i][1])
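
The same idea can be wrapped in a small function that returns the values instead of printing them. This is only a sketch: the name second_column and the skip_last parameter are made up for illustration, and io.StringIO stands in for the real file.

```python
import csv
import io

def second_column(f, skip_last=0):
    """Return the second field of every row except the last `skip_last` rows."""
    rows = list(csv.reader(f, delimiter=' ', skipinitialspace=True))
    # Clamp at zero so that skipping more rows than exist yields an empty list.
    keep = max(0, len(rows) - skip_last)
    return [row[1] for row in rows[:keep]]

# In-memory stand-in for the file; contents mirror the question's table.
data = "5 10 6\n6 20 1\n7 30 4\n8 40 3\n9 23 1\n4 13 6\n"
all_values = second_column(io.StringIO(data))
print(all_values)   # ['10', '20', '30', '40', '23', '13']
most_values = second_column(io.StringIO(data), skip_last=2)
print(most_values)  # ['10', '20', '30', '40']
```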