Python: import CSV to list

Question

Asked by MorganTN

I have a CSV file with about 2000 records.

Each record has a string, and a category to it:

This is the first line,Line1
This is the second line,Line2
This is the third line,Line3

I need to read this file into a list that looks like this:

data = [('This is the first line', 'Line1'),
        ('This is the second line', 'Line2'),
        ('This is the third line', 'Line3')]

How can I import this CSV into the list I need using Python?

Answer 1

Answered by Miquel

If you are sure there are no commas in your input other than the one separating the category, you can read the file line by line, split each line on ",", and append the result to a list.

That said, it looks like you are working with a CSV file, so you might consider using the csv module for it.
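
To illustrate the caveat about extra commas, here is a small sketch comparing a plain split(",") with csv.reader. The sample text and the variable names are made up for illustration; the second record contains a comma inside a quoted field.

```python
import csv
import io

# Hypothetical sample: the second record has a comma inside a quoted field.
sample = 'This is the first line,Line1\n"Hello, world",Line2\n'

# Naive approach: split every line on commas, ignoring quoting.
naive = [tuple(line.split(",")) for line in sample.splitlines()]

# csv.reader understands quoting, so the quoted comma stays inside the field.
parsed = [tuple(row) for row in csv.reader(io.StringIO(sample))]

print(naive)
print(parsed)
```

The naive version breaks the quoted record into three pieces, while csv.reader returns two fields per record.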

Answer 2

Answered by Acid_Snake

# text is assumed to hold the whole file contents as a single string
result = []
for line in text.splitlines():
    result.append(tuple(line.split(",")))

Answer 3

Answered by Hunter McMillen

A simple loop would suffice:

lines = []
with open('test.txt', 'r') as f:
    # Iterate over the file directly instead of loading it with readlines()
    for line in f:
        l, name = line.strip().split(',')
        lines.append((l, name))

print(lines)

Answer 4

Answered by Maciej Gol

Using the csv module:

import csv

with open('file.csv', newline='') as f:
    reader = csv.reader(f)
    data = list(reader)

print(data)

Output:

[['This is the first line', 'Line1'], ['This is the second line', 'Line2'], ['This is the third line', 'Line3']]

If you need tuples:

import csv

with open('file.csv', newline='') as f:
    reader = csv.reader(f)
    data = [tuple(row) for row in reader]

print(data)

Output:

[('This is the first line', 'Line1'), ('This is the second line', 'Line2'), ('This is the third line', 'Line3')]

Old Python 2 answer, also using the csv module:

import csv
with open('file.csv', 'rb') as f:
    reader = csv.reader(f)
    your_list = list(reader)

print your_list
# [['This is the first line', 'Line1'],
#  ['This is the second line', 'Line2'],
#  ['This is the third line', 'Line3']]

Answer 5

Answered by Jan Vlcinsky

Extending your requirements a bit and assuming you do not care about the order of lines and want to get them grouped under categories, the following solution may work for you:

>>> fname = "lines.txt"
>>> from collections import defaultdict
>>> dct = defaultdict(list)
>>> with open(fname) as f:
...     for line in f:
...         text, cat = line.rstrip("\n").split(",", 1)
...         dct[cat].append(text)
...
>>> dct
defaultdict(<type 'list'>, {' CatA': ['This is the first line', 'This is the another line'], ' CatC': ['This is the third line'], ' CatB': ['This is the second line', 'This is the last line']})

This way, all lines belonging to a category are available in the dictionary under that category's key.
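
Note that the keys in the output above keep the space that follows the comma in the input (' CatA'). A possible variant, sketched here with made-up sample data standing in for lines.txt, lets csv.reader drop that space via skipinitialspace=True:

```python
import csv
import io
from collections import defaultdict

# Made-up sample standing in for the contents of lines.txt.
sample = (
    "This is the first line, CatA\n"
    "This is the second line, CatB\n"
    "This is the another line, CatA\n"
)

groups = defaultdict(list)
# skipinitialspace=True ignores whitespace right after each delimiter.
for text, cat in csv.reader(io.StringIO(sample), skipinitialspace=True):
    groups[cat].append(text)

print(dict(groups))
```

Unlike split(",", 1), this assumes every record has exactly two fields, so text containing unquoted commas would need quoting in the file.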

Answer 6

Answered by seokhoonlee

Updated for Python 3:

import csv

with open('file.csv', newline='') as f:
    reader = csv.reader(f)
    your_list = list(reader)

print(your_list)

Output:

[['This is the first line', 'Line1'], ['This is the second line', 'Line2'], ['This is the third line', 'Line3']]

Answer 7

Answered by Martin Thoma

Pandas is pretty good at dealing with data. Here is one example of how to use it:

import pandas as pd

# Read the CSV into a pandas data frame (df)
#   With a df you can do many things
#   most important: visualize data with Seaborn
df = pd.read_csv('filename.csv', delimiter=',')

# Or export it in many ways, e.g. a list of tuples
tuples = [tuple(x) for x in df.values]

# or export it as a list of dicts (one dict per row)
dicts = df.to_dict('records')

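One big advantage is that pandas deals automatically with header rows. By default, read_csv treats the first row as the header; pass header=None if the file, like the one in the question, has none.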

If you haven't heard of Seaborn, I recommend having a look at it.

See also: How do I read and write CSV files with Python?

Pandas #2

import pandas as pd

# Get data - reading the CSV file
import mpu.pd
df = mpu.pd.example_df()

# Convert
dicts = df.to_dict('records')

The content of df is:

     country   population population_time    EUR
0    Germany   82521653.0      2016-12-01   True
1     France   66991000.0      2017-01-01   True
2  Indonesia  255461700.0      2017-01-01  False
3    Ireland    4761865.0             NaT   True
4      Spain   46549045.0      2017-06-01   True
5    Vatican          NaN             NaT   True

The content of dicts is:

[{'country': 'Germany', 'population': 82521653.0, 'population_time': Timestamp('2016-12-01 00:00:00'), 'EUR': True},
 {'country': 'France', 'population': 66991000.0, 'population_time': Timestamp('2017-01-01 00:00:00'), 'EUR': True},
 {'country': 'Indonesia', 'population': 255461700.0, 'population_time': Timestamp('2017-01-01 00:00:00'), 'EUR': False},
 {'country': 'Ireland', 'population': 4761865.0, 'population_time': NaT, 'EUR': True},
 {'country': 'Spain', 'population': 46549045.0, 'population_time': Timestamp('2017-06-01 00:00:00'), 'EUR': True},
 {'country': 'Vatican', 'population': nan, 'population_time': NaT, 'EUR': True}]

Pandas #3

import pandas as pd

# Get data - reading the CSV file
import mpu.pd
df = mpu.pd.example_df()

# Convert
lists = [[row[col] for col in df.columns] for row in df.to_dict('records')]

The content of lists is:

[['Germany', 82521653.0, Timestamp('2016-12-01 00:00:00'), True],
 ['France', 66991000.0, Timestamp('2017-01-01 00:00:00'), True],
 ['Indonesia', 255461700.0, Timestamp('2017-01-01 00:00:00'), False],
 ['Ireland', 4761865.0, NaT, True],
 ['Spain', 46549045.0, Timestamp('2017-06-01 00:00:00'), True],
 ['Vatican', nan, NaT, True]]

Answer 8

Answered by Alexey Antonenko

The following code also uses the csv module, but extracts the contents of file.csv into a list of dicts, using the first line as the header of the CSV table:

import csv

def csv2dicts(filename):
  # newline='' lets the csv module handle line endings itself (Python 3)
  with open(filename, newline='') as f:
    lines = list(csv.reader(f))
  # Need a header row plus at least one data row
  if len(lines) < 2: return None
  names = lines[0]
  if len(names) < 1: return None
  dicts = []
  for values in lines[1:]:
    # Reject the file if any row does not match the header length
    if len(values) != len(names): return None
    dicts.append(dict(zip(names, values)))
  return dicts

if __name__ == '__main__':
  your_list = csv2dicts('file.csv')
  print(your_list)
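
As a shorter alternative (a sketch, not part of the original answer), the standard library's csv.DictReader already uses the first row as field names. The helper name csv_to_dicts and the sample data are made up for illustration.

```python
import csv
import io

def csv_to_dicts(lines):
    """Return one dict per data row, keyed by the header row."""
    # csv.DictReader takes the first row as field names by default.
    return [dict(row) for row in csv.DictReader(lines)]

# Made-up sample with a header row; with a real file, pass
# open('file.csv', newline='') instead of the StringIO object.
sample = io.StringIO("text,category\nThis is the first line,Line1\n")
rows = csv_to_dicts(sample)
print(rows)
```

Unlike the function above, DictReader does not reject rows whose length differs from the header: missing values become None, and extra values are collected in a list under the key None.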

Answer 9

Answered by Calculus

Update for Python 3:

import csv
from pprint import pprint

with open('text.csv', newline='') as file:
    reader = csv.reader(file)
    res = list(map(tuple, reader))

pprint(res)

Output:

[('This is the first line', ' Line1'),
 ('This is the second line', ' Line2'),
 ('This is the third line', ' Line3')]

From the csv module documentation: If csvfile is a file object, it should be opened with newline=''.
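
A small sketch of what newline='' changes, assuming a temporary file and a made-up record whose first field contains an embedded "\r\n". With newline='', the csv module sees the line endings untranslated; in default text mode, Python converts "\r\n" to "\n" before the reader gets the data.

```python
import csv
import os
import tempfile

# Write one record whose first field contains an embedded "\r\n".
path = os.path.join(tempfile.mkdtemp(), "demo.csv")
with open(path, "w", newline="") as f:
    csv.writer(f).writerow(["a\r\nb", "x"])

# Recommended: newline='' hands line endings to the csv module untranslated.
with open(path, newline="") as f:
    with_newline = list(csv.reader(f))

# Default text mode translates "\r\n" to "\n" before csv sees it.
with open(path) as f:
    without_newline = list(csv.reader(f))

print(with_newline)
print(without_newline)
```

For files like the one in the question, which have no quoted line breaks, both variants return the same rows; the difference only shows up when fields contain embedded newlines.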

Answer 10

Answered by Francesco Boi

As already said in the comments, you can use the csv library in Python. CSV means comma-separated values, which seems to be exactly your case: a label and a value separated by a comma.

Since the data consists of a category and a value, I would rather use a dictionary than a list of tuples.

Anyway, the code below shows both ways: d is the dictionary and l is the list of tuples.

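import csv

file_name = "test.txt"
d = dict()
l = list()
try:
    # The with block closes the file even if reading fails
    with open(file_name, 'rt', newline='') as csvfile:
        csvReader = csv.reader(csvfile, delimiter=",")
        for row in csvReader:
            d[row[1]] = row[0]            # category -> text
            l.append((row[0], row[1]))    # (text, category)
except FileNotFoundError:
    print("File not found")
print(d)
print(l)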
