pandas: Series containing arrays

Question

Asked by toast

I have a pandas dataframe column which looks a little like:

Out[67]:
0      ["cheese", "milk...
1      ["yogurt", "cheese...
2      ["cheese", "cream"...
3      ["milk", "cheese"...

Ultimately I would like this as a flat list, but while attempting to flatten it, I noticed that pandas treats ["cheese", "milk", "cream"] as a str rather than a list.

How would I go about flattening this so that I end up with:

["cheese", "milk", "yogurt", "cheese", "cheese"...]

[EDIT] So the answer given below appears to be:

s = pd.Series(["['cheese', 'milk']", "['yogurt', 'cheese']", "['cheese', 'cream']"])

s = s.str.strip("[]")
df = s.str.split(',', expand=True)
df = df.applymap(lambda x: x.replace("'", '').strip())
l = df.values.flatten()
print (l.tolist())

Which is great: question answered, answer accepted. But it strikes me as a rather inelegant solution.

Answer 1

Accepted answer by jezrael

You can use numpy.ndarray.flatten and then flatten the nested lists:

print(df)
                  a
0    [cheese, milk]
1  [yogurt, cheese]
2   [cheese, cream]

print(df.a.values)
[[['cheese', 'milk']]
 [['yogurt', 'cheese']]
 [['cheese', 'cream']]]

l = df.a.values.flatten()
print(l)
[['cheese', 'milk'] ['yogurt', 'cheese'] ['cheese', 'cream']]

print([item for sublist in l for item in sublist])
['cheese', 'milk', 'yogurt', 'cheese', 'cheese', 'cream']
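
As a possible alternative to the nested comprehension, itertools.chain.from_iterable from the standard library performs the same one-level flattening. This is only a sketch: the frame below is a made-up stand-in for the one above, and it assumes each cell holds a real Python list rather than a string.

```python
from itertools import chain

import pandas as pd

# Hypothetical frame: every cell in column "a" is an actual Python list.
df = pd.DataFrame({"a": [["cheese", "milk"], ["yogurt", "cheese"], ["cheese", "cream"]]})

# chain.from_iterable visits each sub-list in row order and yields its items.
flat = list(chain.from_iterable(df["a"]))
print(flat)
```

Iterating over the column directly also avoids the intermediate call to values.flatten().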

EDIT:

You can try:

import pandas as pd

s = pd.Series(["['cheese', 'milk']", "['yogurt', 'cheese']", "['cheese', 'cream']"])

# remove the surrounding [ and ] from each string
s = s.str.strip('[]')
print(s)
0      'cheese', 'milk'
1    'yogurt', 'cheese'
2     'cheese', 'cream'
dtype: object

df = s.str.split(',', expand=True)
# remove the quote characters and strip surrounding whitespace
df = df.applymap(lambda x: x.replace("'", '').strip())
print(df)
        0       1
0  cheese    milk
1  yogurt  cheese
2  cheese   cream

l = df.values.flatten()
print(l.tolist())
['cheese', 'milk', 'yogurt', 'cheese', 'cheese', 'cream']
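
If each string is a valid Python list literal, as in the sample Series above, one arguably tidier route is to parse it with ast.literal_eval from the standard library instead of stripping and splitting by hand. The following is a sketch under that assumption, not the method used in this answer; the names parsed and flat are illustrative.

```python
import ast
from itertools import chain

import pandas as pd

s = pd.Series(["['cheese', 'milk']", "['yogurt', 'cheese']", "['cheese', 'cream']"])

# Parse every string as a Python literal, giving a Series of real lists.
# literal_eval raises ValueError or SyntaxError if a string is not a valid literal.
parsed = s.apply(ast.literal_eval)

# Flatten the lists one level, keeping row order.
flat = list(chain.from_iterable(parsed))
print(flat)
```

Unlike splitting on ',', this keeps items that themselves contain a comma intact, and it accepts either single or double quotes around the items.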

Answer 2

Answered by Colin

You can convert the Series into a DataFrame and then call stack:

s.apply(pd.Series).stack().tolist()
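
A small sketch of the one-liner in use, assuming s holds real Python lists (the sample data here is made up to match the question). It also shows Series.explode, available in pandas 0.25 and later, as a possible alternative that skips the intermediate wide DataFrame.

```python
import pandas as pd

# Hypothetical Series whose elements are actual lists, not strings.
s = pd.Series([["cheese", "milk"], ["yogurt", "cheese"], ["cheese", "cream"]])

# Approach from this answer: one column per list position, then stack into one column.
stacked = s.apply(pd.Series).stack().tolist()

# Alternative: explode gives every list element its own row.
exploded = s.explode().tolist()

print(stacked)
print(exploded)
```

Both calls expect list elements; if the elements are still strings, they have to be parsed into lists first.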

Answer 3

Answered by Colin

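To convert the column to a Python list you could use df.columnName.tolist(), and for flattening you could use df.columnName.values.flatten(). Note that neither call parses the elements themselves: if a cell is the string "['cheese', 'milk']", it stays a string and has to be converted to a list separately.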
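
A quick check of what these two calls do when the cells are strings. The column name items and the sample data are made up for illustration, and dtype=object is passed explicitly so that values is a plain NumPy array.

```python
import pandas as pd

# Hypothetical column whose cells are strings that merely look like lists.
df = pd.DataFrame({"items": ["['cheese', 'milk']", "['yogurt', 'cheese']"]}, dtype=object)

# tolist() turns the column into a Python list of its cells.
as_list = df["items"].tolist()

# flatten() on a 1-D array returns a 1-D copy; the cells are not split up.
flattened = df["items"].values.flatten()

print(type(as_list[0]))
print(flattened.shape)
```

Each element is still a single str, so a parsing step is needed before flattening can produce individual items.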
