Reading csv containing a list in Pandas
Original question: http://stackoverflow.com/questions/20799593/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by Finger twist
I'm trying to read this csv into pandas:
HK,"[u'5328.1', u'5329.3', '2013-12-27 13:58:57.973614']"
HK,"[u'5328.1', u'5329.3', '2013-12-27 13:58:59.237387']"
HK,"[u'5328.1', u'5329.3', '2013-12-27 13:59:00.346325']"
As you can see, there are only 2 columns and the second one is a list. Is there a way to interpret it correctly (meaning reading the values in the list as columns) when using pd.read_csv() with arguments?
Thank you.
Answered by alko
One option is to use ast.literal_eval as a converter:
>>> import ast
>>> import pandas as pd
>>> df = pd.read_clipboard(header=None, quotechar='"', sep=',',
...                        converters={1: ast.literal_eval})
>>> df
    0                                             1
0  HK  [5328.1, 5329.3, 2013-12-27 13:58:57.973614]
1  HK  [5328.1, 5329.3, 2013-12-27 13:58:59.237387]
2  HK  [5328.1, 5329.3, 2013-12-27 13:59:00.346325]
Then convert those lists to a DataFrame if needed, for example with:
>>> df = pd.DataFrame.from_records(df[1].tolist(), index=df[0],
...                                columns=list('ABC')).reset_index()
>>> df['C'] = pd.to_datetime(df['C'])
>>> df
    0       A       B                          C
0  HK  5328.1  5329.3 2013-12-27 13:58:57.973614
1  HK  5328.1  5329.3 2013-12-27 13:58:59.237387
2  HK  5328.1  5329.3 2013-12-27 13:59:00.346325
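Since the question asks about pd.read_csv() rather than the clipboard, here is a minimal sketch of the same idea with read_csv. It assumes the rows look exactly like the sample in the question; io.StringIO stands in for a real file path, and the column names code, A, B and C are illustrative, not part of the original data.

```python
import ast
import io

import pandas as pd

# Sample rows copied from the question; io.StringIO stands in for a real file.
raw = '''HK,"[u'5328.1', u'5329.3', '2013-12-27 13:58:57.973614']"
HK,"[u'5328.1', u'5329.3', '2013-12-27 13:58:59.237387']"
HK,"[u'5328.1', u'5329.3', '2013-12-27 13:59:00.346325']"
'''

# Parse column 1 into a real Python list while the file is being read.
df = pd.read_csv(io.StringIO(raw), header=None,
                 converters={1: ast.literal_eval})

# Expand each list into its own columns (A, B, C are illustrative names).
values = pd.DataFrame(df[1].tolist(), columns=list('ABC'), index=df.index)

# The list items are still strings, so convert them explicitly.
values[['A', 'B']] = values[['A', 'B']].astype(float)
values['C'] = pd.to_datetime(values['C'])

# Put the first csv column back next to the expanded values.
result = pd.concat([df[[0]].rename(columns={0: 'code'}), values], axis=1)
print(result)
```

The explicit astype(float) step is there because literal_eval returns the prices as strings, exactly as they were quoted in the file.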
Answered by Superstar
Based on alko's answer, you can use the apply() method on the column for the first part, to turn the list string into the actual data:
>>> df = pd.read_clipboard(header=None, sep=',')
>>> df
    0                                                  1
0  HK  [u'5328.1', u'5329.3', '2013-12-27 13:58:57.97...
1  HK  [u'5328.1', u'5329.3', '2013-12-27 13:58:59.23...
2  HK  [u'5328.1', u'5329.3', '2013-12-27 13:59:00.34...
>>> df[1] = df[1].apply(eval)
>>> df
    0                                             1
0  HK  [5328.1, 5329.3, 2013-12-27 13:58:57.973614]
1  HK  [5328.1, 5329.3, 2013-12-27 13:58:59.237387]
2  HK  [5328.1, 5329.3, 2013-12-27 13:59:00.346325]
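Because eval executes whatever expression the string contains, a variant that may be preferable for files you do not fully control is to pass ast.literal_eval to apply() instead. The sketch below assumes the column already holds the list as a plain string; the small DataFrame is built by hand here only to keep the example self-contained.

```python
import ast

import pandas as pd

# Column 1 holds each list as a plain string, as after a plain read.
df = pd.DataFrame({
    0: ['HK', 'HK'],
    1: ["[u'5328.1', u'5329.3', '2013-12-27 13:58:57.973614']",
        "[u'5328.1', u'5329.3', '2013-12-27 13:58:59.237387']"],
})

# ast.literal_eval only accepts Python literals (lists, strings, numbers...),
# so it parses the list without evaluating arbitrary expressions.
df[1] = df[1].apply(ast.literal_eval)

print(type(df.loc[0, 1]))
print(df.loc[0, 1])
```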
Answered by krishna keshav
Use .strip() in Python.
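import csv

# csvfile is assumed to hold the path to the csv file
with open(csvfile, 'r') as infile:
    reader = csv.reader(infile)
    for row in reader:
        col1 = row[0]
        # row[1] is the whole quoted list as a single string; drop the brackets
        col2 = row[1].strip("[]")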
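Stripping the brackets still leaves one comma-separated string, so a further split is needed to get individual values. The following is a rough sketch of that extra step, assuming none of the values contain a comma or bracket themselves; io.StringIO stands in for the opened file and the variable names are illustrative.

```python
import csv
import io

# Two sample rows from the question; io.StringIO stands in for an open file.
raw = '''HK,"[u'5328.1', u'5329.3', '2013-12-27 13:58:57.973614']"
HK,"[u'5328.1', u'5329.3', '2013-12-27 13:58:59.237387']"
'''

rows = []
for row in csv.reader(io.StringIO(raw)):
    code = row[0]
    # row[1] is the whole bracketed list as one string.
    items = row[1].strip("[]").split(", ")
    # Drop the optional u prefix and the surrounding quotes from each item.
    cleaned = [item.lstrip("u").strip("'") for item in items]
    rows.append([code] + cleaned)

for parsed in rows:
    print(parsed)
```

This hand-rolled parsing is more fragile than ast.literal_eval, so it is mainly useful when the format is known to be this simple.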

