Reading a csv containing lists in Pandas

Question

Asked by Finger twist

I'm trying to read this csv into pandas:

HK,"[u'5328.1', u'5329.3', '2013-12-27 13:58:57.973614']"
HK,"[u'5328.1', u'5329.3', '2013-12-27 13:58:59.237387']"
HK,"[u'5328.1', u'5329.3', '2013-12-27 13:59:00.346325']"

As you can see, there are only 2 columns and the second one is a list. Is there a way to have it interpreted correctly (meaning the values in the list are read as columns) by passing arguments to pd.read_csv()?

Thank you.

Answer 1

Answered by alko

One option is to use ast.literal_eval as a converter:

>>> import ast
>>> import pandas as pd
>>> df = pd.read_clipboard(header=None, quotechar='"', sep=',', 
...                   converters={1:ast.literal_eval})
>>> df
    0                                             1
0  HK  [5328.1, 5329.3, 2013-12-27 13:58:57.973614]
1  HK  [5328.1, 5329.3, 2013-12-27 13:58:59.237387]
2  HK  [5328.1, 5329.3, 2013-12-27 13:59:00.346325]

If needed, you can then convert those lists to a DataFrame, for example with:

>>> df = pd.DataFrame.from_records(df[1].tolist(), index=df[0],
...                           columns=list('ABC')).reset_index()
>>> df['C'] = pd.to_datetime(df['C'])
>>> df
    0       A       B                          C
0  HK  5328.1  5329.3 2013-12-27 13:58:57.973614
1  HK  5328.1  5329.3 2013-12-27 13:58:59.237387
2  HK  5328.1  5329.3 2013-12-27 13:59:00.346325
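
The same converter can also be passed to pd.read_csv(), which is what the question asks about. Below is a minimal sketch, assuming the three sample rows are held in an in-memory string (io.StringIO stands in for the real file path) and that the column names A, B and C are picked for illustration only, as above:

```python
import ast
import io

import pandas as pd

# Sample rows from the question; io.StringIO stands in for a real file path.
raw = '''HK,"[u'5328.1', u'5329.3', '2013-12-27 13:58:57.973614']"
HK,"[u'5328.1', u'5329.3', '2013-12-27 13:58:59.237387']"
HK,"[u'5328.1', u'5329.3', '2013-12-27 13:59:00.346325']"
'''

# Column 1 is parsed from its string form into a Python list while reading.
df = pd.read_csv(io.StringIO(raw), header=None,
                 converters={1: ast.literal_eval})

# Expand each list into its own columns (names A, B, C are illustrative).
values = pd.DataFrame(df[1].tolist(), columns=list('ABC'), index=df.index)
values[['A', 'B']] = values[['A', 'B']].astype(float)
values['C'] = pd.to_datetime(values['C'])

# Put the first csv column back next to the expanded values.
result = pd.concat([df[[0]], values], axis=1)
print(result)
```

Here the numeric strings are also cast to float, which the answer above leaves as strings; drop that line if the original text should be kept.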

Answer 2

Answered by Superstar

Building on alko's answer, you can also do the first part after reading: call apply() on the column to evaluate each list string into an actual list:

>>> df = pd.read_clipboard(header=None, sep=',')
>>> df
    0                                                   1
0  HK  [u'5328.1', u'5329.3', '2013-12-27 13:58:57.97...
1  HK  [u'5328.1', u'5329.3', '2013-12-27 13:58:59.23...
2  HK  [u'5328.1', u'5329.3', '2013-12-27 13:59:00.34...
>>> df[1] = df[1].apply(eval)
>>> df
    0                                             1
0  HK  [5328.1, 5329.3, 2013-12-27 13:58:57.973614]
1  HK  [5328.1, 5329.3, 2013-12-27 13:58:59.237387]
2  HK  [5328.1, 5329.3, 2013-12-27 13:59:00.346325]
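
Note that eval runs whatever expression the string contains, so for data coming from a file ast.literal_eval may be the safer choice here, since it only accepts Python literals. A minimal sketch, assuming the frame has already been read with column 1 still holding the raw list strings (the two-row frame below is built by hand for illustration):

```python
import ast

import pandas as pd

# Hand-built stand-in for the frame returned by the read call above.
df = pd.DataFrame({
    0: ['HK', 'HK'],
    1: ["[u'5328.1', u'5329.3', '2013-12-27 13:58:57.973614']",
        "[u'5328.1', u'5329.3', '2013-12-27 13:58:59.237387']"],
})

# Parse each string into a real list without executing arbitrary code.
df[1] = df[1].apply(ast.literal_eval)
print(df[1].iloc[0])
```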

Answer 3

Answered by krishna keshav

Use .strip() in Python.

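import csv

with open(csvfile, 'r') as infile:  # csvfile: path to the csv file
    reader = csv.reader(infile)
    for row in reader:
        col1 = row[0]
        # row[1] holds the whole quoted list as one string; drop the brackets
        col2 = row[1].strip("[]")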
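
Stripping the brackets still leaves a single string per row. Below is a minimal sketch of one way to finish the job with the standard library alone, assuming the values never contain commas themselves; parse_list_field is a hypothetical helper name, and io.StringIO stands in for the opened file:

```python
import csv
import io

# Sample rows from the question; io.StringIO stands in for the opened file.
raw = '''HK,"[u'5328.1', u'5329.3', '2013-12-27 13:58:57.973614']"
HK,"[u'5328.1', u'5329.3', '2013-12-27 13:58:59.237387']"
HK,"[u'5328.1', u'5329.3', '2013-12-27 13:59:00.346325']"
'''


def parse_list_field(field):
    """Split "[u'a', 'b']" into ['a', 'b'] without evaluating it.

    Hypothetical helper; assumes no value contains a comma.
    """
    items = []
    for item in field.strip("[]").split(","):
        item = item.strip()
        # Drop the Python 2 unicode prefix if the value is written as u'...'.
        if item[:2] in ("u'", 'u"'):
            item = item[1:]
        items.append(item.strip("'\""))
    return items


rows = []
for row in csv.reader(io.StringIO(raw)):
    # First csv column followed by the individual list values.
    rows.append([row[0]] + parse_list_field(row[1]))

print(rows[0])
```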
