Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/24662571/

Date: 2020-08-19 04:58:59  Source: igfitidea

Python import csv to list

Tags: python, csv

Asked by MorganTN

I have a CSV file with about 2000 records.

Each record has a string, and a category to it:

This is the first line,Line1
This is the second line,Line2
This is the third line,Line3

I need to read this file into a list that looks like this:

data = [('This is the first line', 'Line1'),
        ('This is the second line', 'Line2'),
        ('This is the third line', 'Line3')]

How can I import this CSV into the list I need using Python?

Answered by Miquel

If you are sure there are no commas in your input other than the one separating the category, you can read the file line by line, split each line on ",", and append the result to a list.

That said, it looks like you are dealing with a CSV file, so you might consider using a module built for it, such as the standard csv module.
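To illustrate the caveat about commas, here is a minimal sketch comparing a plain split with the csv module on a record whose text contains a quoted comma. The sample data is made up for illustration, and io.StringIO stands in for a real file.

```python
import csv
import io

# Hypothetical sample: the second record has a comma inside a quoted field
sample = 'This is the first line,Line1\n"Hello, world",Line2\n'

# Naive approach: split every line on each comma
naive = [tuple(line.split(',')) for line in sample.splitlines()]

# csv module: applies CSV quoting rules, so the quoted comma stays in the field
parsed = [tuple(row) for row in csv.reader(io.StringIO(sample))]

print(naive)
print(parsed)
```

The naive version breaks the second record into three pieces and keeps the quote characters, while csv.reader returns two clean fields.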

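Answered by Acid_Snake

# Assumes `text` already holds the file contents as a single string,
# e.g. text = open('file.csv').read()
result = []
for line in text.splitlines():
    result.append(tuple(line.split(",")))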

Answered by Hunter McMillen

A simple loop would suffice:

lines = []
with open('test.txt', 'r') as f:
    for line in f:
        # Split each line into its text and its category
        text, name = line.strip().split(',')
        lines.append((text, name))

print(lines)

Answered by Maciej Gol

Using the csv module:

import csv

with open('file.csv', newline='') as f:
    reader = csv.reader(f)
    data = list(reader)

print(data)

Output:

[['This is the first line', 'Line1'], ['This is the second line', 'Line2'], ['This is the third line', 'Line3']]


If you need tuples:

import csv

with open('file.csv', newline='') as f:
    reader = csv.reader(f)
    data = [tuple(row) for row in reader]

print(data)

Output:

[('This is the first line', 'Line1'), ('This is the second line', 'Line2'), ('This is the third line', 'Line3')]


Old Python 2 answer, also using the csv module:

import csv
with open('file.csv', 'rb') as f:
    reader = csv.reader(f)
    your_list = list(reader)

print your_list
# [['This is the first line', 'Line1'],
#  ['This is the second line', 'Line2'],
#  ['This is the third line', 'Line3']]

Answered by Jan Vlcinsky

Extending your requirements a bit, and assuming you do not care about the order of lines and want them grouped under categories, the following solution may work for you:

>>> fname = "lines.txt"
>>> from collections import defaultdict
>>> dct = defaultdict(list)
>>> with open(fname) as f:
...     for line in f:
...         text, cat = line.rstrip("\n").split(",", 1)
...         dct[cat].append(text)
...
>>> dct
defaultdict(<type 'list'>, {' CatA': ['This is the first line', 'This is the another line'], ' CatC': ['This is the third line'], ' CatB': ['This is the second line', 'This is the last line']})

This way, all lines belonging to a category are available in the dictionary under that category's key.
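Note that the keys in the session above keep the space that follows each comma in the input (' CatA'). A minimal sketch of the same grouping with the csv module, which can drop that space via skipinitialspace=True, might look like this. The sample records are hypothetical and io.StringIO stands in for the file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample resembling the session above (a space follows each comma)
sample = (
    "This is the first line, CatA\n"
    "This is the second line, CatB\n"
    "This is the another line, CatA\n"
)

groups = defaultdict(list)
# skipinitialspace=True ignores whitespace directly after the delimiter
for text, cat in csv.reader(io.StringIO(sample), skipinitialspace=True):
    groups[cat].append(text)

print(dict(groups))
```

With a real file, io.StringIO(sample) would be replaced by a file object opened with newline=''.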

Answered by seokhoonlee

Updated for Python 3:

import csv

with open('file.csv', newline='') as f:
    reader = csv.reader(f)
    your_list = list(reader)

print(your_list)

Output:

[['This is the first line', 'Line1'], ['This is the second line', 'Line2'], ['This is the third line', 'Line3']]

Answered by Martin Thoma

Pandas is pretty good at dealing with data. Here is one example of how to use it:

import pandas as pd

# Read the CSV into a pandas data frame (df)
#   With a df you can do many things
#   most important: visualize data with Seaborn
df = pd.read_csv('filename.csv', delimiter=',')

# Or export it in many ways, e.g. a list of tuples
tuples = [tuple(x) for x in df.values]

# or export it as a list of dicts (one dict per row)
dicts = df.to_dict('records')

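One big advantage is that pandas deals automatically with header rows. For a file without a header row, such as the one in the question, pass header=None to read_csv so that the first record is not consumed as column names.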

If you haven't heard of Seaborn, I recommend having a look at it.

See also: How do I read and write CSV files with Python?

Pandas #2

import pandas as pd

# Get data - reading the CSV file
import mpu.pd
df = mpu.pd.example_df()

# Convert
dicts = df.to_dict('records')

The content of df is:

     country   population population_time    EUR
0    Germany   82521653.0      2016-12-01   True
1     France   66991000.0      2017-01-01   True
2  Indonesia  255461700.0      2017-01-01  False
3    Ireland    4761865.0             NaT   True
4      Spain   46549045.0      2017-06-01   True
5    Vatican          NaN             NaT   True

The content of dicts is:

[{'country': 'Germany', 'population': 82521653.0, 'population_time': Timestamp('2016-12-01 00:00:00'), 'EUR': True},
 {'country': 'France', 'population': 66991000.0, 'population_time': Timestamp('2017-01-01 00:00:00'), 'EUR': True},
 {'country': 'Indonesia', 'population': 255461700.0, 'population_time': Timestamp('2017-01-01 00:00:00'), 'EUR': False},
 {'country': 'Ireland', 'population': 4761865.0, 'population_time': NaT, 'EUR': True},
 {'country': 'Spain', 'population': 46549045.0, 'population_time': Timestamp('2017-06-01 00:00:00'), 'EUR': True},
 {'country': 'Vatican', 'population': nan, 'population_time': NaT, 'EUR': True}]

Pandas #3

import pandas as pd

# Get data - reading the CSV file
import mpu.pd
df = mpu.pd.example_df()

# Convert
lists = [[row[col] for col in df.columns] for row in df.to_dict('records')]

The content of lists is:

[['Germany', 82521653.0, Timestamp('2016-12-01 00:00:00'), True],
 ['France', 66991000.0, Timestamp('2017-01-01 00:00:00'), True],
 ['Indonesia', 255461700.0, Timestamp('2017-01-01 00:00:00'), False],
 ['Ireland', 4761865.0, NaT, True],
 ['Spain', 46549045.0, Timestamp('2017-06-01 00:00:00'), True],
 ['Vatican', nan, NaT, True]]

Answered by Alexey Antonenko

Next is a piece of code that uses the csv module but extracts the contents of file.csv into a list of dicts, using the first line as the header of the CSV table:

import csv

def csv2dicts(filename):
    # newline='' lets the csv module handle line endings itself (Python 3)
    with open(filename, newline='') as f:
        lines = list(csv.reader(f))
    # Need a header row plus at least one data row
    if len(lines) < 2:
        return None
    names = lines[0]
    if len(names) < 1:
        return None
    dicts = []
    for values in lines[1:]:
        # Reject the file if any row does not match the header length
        if len(values) != len(names):
            return None
        dicts.append(dict(zip(names, values)))
    return dicts

if __name__ == '__main__':
    your_list = csv2dicts('file.csv')
    print(your_list)
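For comparison, the standard library's csv.DictReader also uses the first line as field names. The following is a minimal sketch rather than a drop-in replacement for the function above: the header names and records are hypothetical, and io.StringIO stands in for a file opened with newline=''.

```python
import csv
import io

# Hypothetical sample with a header row
sample = (
    "text,category\n"
    "This is the first line,Line1\n"
    "This is the second line,Line2\n"
)

# DictReader takes the field names from the first row
reader = csv.DictReader(io.StringIO(sample))
your_list = [dict(row) for row in reader]

print(your_list)
```

Unlike the hand-written function, DictReader does not reject rows whose length differs from the header; it fills missing values or collects extras according to its restval and restkey parameters.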

Answered by Calculus

Update for Python 3:

import csv
from pprint import pprint

with open('text.csv', newline='') as file:
    reader = csv.reader(file)
    res = list(map(tuple, reader))

pprint(res)

Output:

[('This is the first line', ' Line1'),
 ('This is the second line', ' Line2'),
 ('This is the third line', ' Line3')]

If csvfile is a file object, it should be opened with newline=''. (Source: csv module documentation.)
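As a rough illustration of why newline='' matters, the sketch below writes a hypothetical record whose quoted field contains a "\r\n" line break to a temporary file, then reads it back with and without newline=''. The file name and contents are assumptions made for the example.

```python
import csv
import os
import tempfile

# Hypothetical record whose quoted text field contains a "\r\n" line break
raw = '"line one\r\nline two",Line1\r\n'

path = os.path.join(tempfile.mkdtemp(), 'demo.csv')
with open(path, 'w', newline='') as f:   # newline='' writes the text unchanged
    f.write(raw)

with open(path, newline='') as f:        # recommended: csv sees the raw line endings
    with_newline = list(csv.reader(f))

with open(path) as f:                    # default mode translates "\r\n" to "\n" first
    without_newline = list(csv.reader(f))

print(with_newline)
print(without_newline)
```

With newline='' the embedded line break comes back exactly as stored; in the default mode it is silently rewritten before the csv module sees it.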

Answered by Francesco Boi

As already said in the comments, you can use the csv library in Python. CSV stands for comma-separated values, which seems to be exactly your case: a label and a value separated by a comma.

Since each record is a category and a value, I would rather use a dictionary than a list of tuples.

Anyway, in the code below I show both ways: d is the dictionary and l is the list of tuples.

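import csv

file_name = "test.txt"
d = dict()
l = list()
try:
    # The with block closes the file; newline='' is what the csv module expects
    with open(file_name, 'rt', newline='') as csvfile:
        csvReader = csv.reader(csvfile, delimiter=",")
        for row in csvReader:
            d[row[1]] = row[0]          # category -> text
            l.append((row[0], row[1]))  # (text, category)
except FileNotFoundError:
    print("File not found")
print(d)
print(l)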