Python import CSV to list
Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors on Stack Overflow.
Original question: http://stackoverflow.com/questions/24662571/
Asked by MorganTN
I have a CSV file with about 2000 records.
Each record has a string and a category:
This is the first line,Line1
This is the second line,Line2
This is the third line,Line3
I need to read this file into a list that looks like this:
data = [('This is the first line', 'Line1'),
        ('This is the second line', 'Line2'),
        ('This is the third line', 'Line3')]
How can I import this CSV into the list I need using Python?
Answer by Miquel
If you are sure there are no commas in your input other than the one separating the category, you can read the file line by line, split each line on "," and append the result to a list.
That said, it looks like you are dealing with a CSV file, so you might consider using a module designed for it.
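The caveat about commas matters in practice. As a minimal sketch (the sample row with a quoted comma is made up for illustration, and io.StringIO stands in for a real file), a plain split can be compared with the standard csv module:

```python
import csv
import io

# Hypothetical sample: the second row contains a comma inside quotes.
sample = 'This is the first line,Line1\n"Hello, world",Line2\n'

# Naive approach: split every line on ",".
naive = [tuple(line.split(",")) for line in sample.splitlines()]

# csv module: honours the quoting rules of the CSV format.
parsed = [tuple(row) for row in csv.reader(io.StringIO(sample, newline=""))]

print(naive)
print(parsed)
```

The naive split breaks the quoted field into two pieces and keeps the quote characters, while csv.reader returns two fields per row.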
Answer by Acid_Snake
result = []
# text is assumed to hold the whole file content as a single string
for line in text.splitlines():
    result.append(tuple(line.split(",")))
Answer by Hunter McMillen
A simple loop would suffice:
lines = []
with open('test.txt', 'r') as f:
    for line in f:
        # Each line is expected to contain exactly one comma
        l, name = line.strip().split(',')
        lines.append((l, name))
print(lines)
Answer by Maciej Gol
Using the csv module:
import csv
with open('file.csv', newline='') as f:
    reader = csv.reader(f)
    data = list(reader)
print(data)
Output:
[['This is the first line', 'Line1'], ['This is the second line', 'Line2'], ['This is the third line', 'Line3']]
If you need tuples:
import csv
with open('file.csv', newline='') as f:
    reader = csv.reader(f)
    data = [tuple(row) for row in reader]
print(data)
Output:
[('This is the first line', 'Line1'), ('This is the second line', 'Line2'), ('This is the third line', 'Line3')]
Old Python 2 answer, also using the csv module:
import csv
with open('file.csv', 'rb') as f:
    reader = csv.reader(f)
    your_list = list(reader)
print your_list
# [['This is the first line', 'Line1'],
#  ['This is the second line', 'Line2'],
#  ['This is the third line', 'Line3']]
Answer by Jan Vlcinsky
Extending your requirements a bit, and assuming you do not care about the order of lines and want them grouped by category, the following solution may work for you:
>>> fname = "lines.txt"
>>> from collections import defaultdict
>>> dct = defaultdict(list)
>>> with open(fname) as f:
...     for line in f:
...         text, cat = line.rstrip("\n").split(",", 1)
...         dct[cat].append(text)
...
>>> dct
defaultdict(<type 'list'>, {' CatA': ['This is the first line', 'This is the another line'], ' CatC': ['This is the third line'], ' CatB': ['This is the second line', 'This is the last line']})
This way, all lines belonging to a category are available in the dictionary under that category's key.
Answer by seokhoonlee
Updated for Python 3:
import csv
with open('file.csv', newline='') as f:
    reader = csv.reader(f)
    your_list = list(reader)
print(your_list)
Output:
[['This is the first line', 'Line1'], ['This is the second line', 'Line2'], ['This is the third line', 'Line3']]
Answer by Martin Thoma
Pandas is pretty good at dealing with data. Here is one example of how to use it:
import pandas as pd
# Read the CSV into a pandas data frame (df)
# With a df you can do many things
# most important: visualize data with Seaborn
df = pd.read_csv('filename.csv', delimiter=',')
# Or export it in many ways, e.g. a list of tuples
tuples = [tuple(x) for x in df.values]
# or export it as a list of dicts (one dict per row)
dicts = df.to_dict('records')
One big advantage is that pandas deals automatically with header rows.
If you haven't heard of Seaborn, I recommend having a look at it.
See also: How do I read and write CSV files with Python?
Pandas #2
import pandas as pd
# Get data - reading the CSV file
import mpu.pd
df = mpu.pd.example_df()
# Convert
dicts = df.to_dict('records')
The content of df is:
     country   population population_time    EUR
0    Germany   82521653.0      2016-12-01   True
1     France   66991000.0      2017-01-01   True
2  Indonesia  255461700.0      2017-01-01  False
3    Ireland    4761865.0             NaT   True
4      Spain   46549045.0      2017-06-01   True
5    Vatican          NaN             NaT   True
The content of dicts is:
[{'country': 'Germany', 'population': 82521653.0, 'population_time': Timestamp('2016-12-01 00:00:00'), 'EUR': True},
{'country': 'France', 'population': 66991000.0, 'population_time': Timestamp('2017-01-01 00:00:00'), 'EUR': True},
{'country': 'Indonesia', 'population': 255461700.0, 'population_time': Timestamp('2017-01-01 00:00:00'), 'EUR': False},
{'country': 'Ireland', 'population': 4761865.0, 'population_time': NaT, 'EUR': True},
{'country': 'Spain', 'population': 46549045.0, 'population_time': Timestamp('2017-06-01 00:00:00'), 'EUR': True},
{'country': 'Vatican', 'population': nan, 'population_time': NaT, 'EUR': True}]
Pandas #3
import pandas as pd
# Get data - reading the CSV file
import mpu.pd
df = mpu.pd.example_df()
# Convert
lists = [[row[col] for col in df.columns] for row in df.to_dict('records')]
The content of lists is:
[['Germany', 82521653.0, Timestamp('2016-12-01 00:00:00'), True],
['France', 66991000.0, Timestamp('2017-01-01 00:00:00'), True],
['Indonesia', 255461700.0, Timestamp('2017-01-01 00:00:00'), False],
['Ireland', 4761865.0, NaT, True],
['Spain', 46549045.0, Timestamp('2017-06-01 00:00:00'), True],
['Vatican', nan, NaT, True]]
Answer by Alexey Antonenko
The following code also uses the csv module, but reads the contents of file.csv into a list of dicts, taking the keys from the first line, which is the header of the CSV table:
import csv
def csv2dicts(filename):
    # newline='' is the mode the csv module expects in Python 3
    with open(filename, newline='') as f:
        reader = csv.reader(f)
        lines = list(reader)
        # Need a header row and at least one data row
        if len(lines) < 2: return None
        names = lines[0]
        if len(names) < 1: return None
        dicts = []
        for values in lines[1:]:
            # Reject the file if a row does not match the header length
            if len(values) != len(names): return None
            d = {}
            for i, _ in enumerate(names):
                d[names[i]] = values[i]
            dicts.append(d)
        return dicts
if __name__ == '__main__':
    your_list = csv2dicts('file.csv')
    print(your_list)
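For comparison, the standard library's csv.DictReader also builds one dict per row from the header line. This is a minimal sketch, not the answer's own method: the column names and sample content are made up for illustration, and io.StringIO stands in for open('file.csv', newline='').

```python
import csv
import io

# Hypothetical content with a header row (column names are assumptions).
sample = (
    "text,category\n"
    "This is the first line,Line1\n"
    "This is the second line,Line2\n"
)

with io.StringIO(sample, newline="") as f:
    # DictReader uses the first row as the keys for every following row.
    dicts = [dict(row) for row in csv.DictReader(f)]

print(dicts)
```

Unlike csv2dicts, this does not reject rows whose length differs from the header: by default DictReader fills missing values with None and collects surplus values under the key None.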
Answer by Calculus
Update for Python 3:
import csv
from pprint import pprint
with open('text.csv', newline='') as file:
    reader = csv.reader(file)
    res = list(map(tuple, reader))
pprint(res)
Output:
[('This is the first line', ' Line1'),
 ('This is the second line', ' Line2'),
 ('This is the third line', ' Line3')]
From the csv module documentation: "If csvfile is a file object, it should be opened with newline=''."
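The leading space in ' Line1' in the output above suggests that the file has a space after each comma. A minimal sketch, assuming that is the case (the sample content is made up, and io.StringIO stands in for the real file): the csv module's skipinitialspace option drops whitespace that immediately follows the delimiter.

```python
import csv
import io

# Hypothetical content with a space after each comma.
sample = "This is the first line, Line1\nThis is the second line, Line2\n"

# skipinitialspace=True ignores whitespace directly after the delimiter.
reader = csv.reader(io.StringIO(sample, newline=""), skipinitialspace=True)
res = [tuple(row) for row in reader]

print(res)
```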
Answer by Francesco Boi
As already said in the comments, you can use the csv library in Python. CSV stands for comma-separated values, which seems to be exactly your case: a label and a value separated by a comma.
Since the data is a category and a value, I would rather use a dictionary than a list of tuples.
Anyway, the code below shows both ways: d is the dictionary and l is the list of tuples.
import csv
file_name = "test.txt"
d = dict()
l = list()
try:
    with open(file_name, 'rt', newline='') as csvfile:
        csvReader = csv.reader(csvfile, delimiter=",")
        for row in csvReader:
            # Category is the key, text is the value
            d[row[1]] = row[0]
            l.append((row[0], row[1]))
except FileNotFoundError:
    print("File not found")
print(d)
print(l)