Original question: http://stackoverflow.com/questions/31243352/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
concatenate row values for the same index in pandas
Asked by alkamid
My initial DataFrame looks as follows:
   A    B  quantity
0  1  foo         1
1  1  baz         2
2  1  bar         2
3  1  faz         1
4  2  foo         2
5  2  bar         1
6  3  foo         3
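For anyone who wants to reproduce the examples, the frame above can be rebuilt roughly as follows. This is a minimal sketch: the question only shows the printed values, so the construction via a dict of lists is an assumption.

```python
import pandas as pd

# Rebuild the DataFrame shown above (assumed construction; the question
# only shows the printed frame). The default RangeIndex gives rows 0..6.
df = pd.DataFrame({
    'A': [1, 1, 1, 1, 2, 2, 3],
    'B': ['foo', 'baz', 'bar', 'faz', 'foo', 'bar', 'foo'],
    'quantity': [1, 2, 2, 1, 2, 1, 3],
})
print(df)
```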
I need to group it by 'A' and build a list of the 'B' values, each repeated 'quantity' times:
   A                               B
0  1  [foo, baz, baz, bar, bar, faz]
1  2                 [foo, foo, bar]
2  3                 [foo, foo, foo]
Currently I'm using groupby() and then apply():
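import pandas as pd

def itemsToList(tdf, column):
    # Build one list per group: each value of `column` repeated `quantity` times.
    collist = []
    # Series.items() replaces iteritems(), which was removed in pandas 2.0.
    for idx, value in tdf[column].items():
        collist = collist + int(tdf['quantity'][idx]) * [value]
    return pd.Series({column: collist})

gb = df.groupby('A').apply(itemsToList, 'B')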
I doubt this is an efficient approach, so I'm looking for a good, "pandaic" method to achieve this.
Accepted answer by EdChum
This can be done in two steps: first generate a new column that holds the expanded str values, then groupby on 'A' and apply list to this new column:
In [62]:
df['expand'] = df.apply(lambda x: ','.join([x['B']] * x['quantity']), axis=1)
df.groupby('A')['expand'].apply(list)
Out[62]:
A
1    [foo, baz,baz, bar,bar, faz]
2                  [foo,foo, bar]
3                   [foo,foo,foo]
Name: expand, dtype: object
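Note that the lists in the output above contain comma-joined strings such as 'baz,baz' rather than separate items. If separate items are wanted, one possible variant is to store a list per row and flatten the lists within each group. This is a sketch only and not part of the original answer: the sample frame is rebuilt so the snippet runs on its own, the `expand` column name follows the answer, and flattening with `itertools.chain` is an assumption about a reasonable way to do it.

```python
from itertools import chain

import pandas as pd

# Sample frame rebuilt from the question (assumed construction).
df = pd.DataFrame({
    'A': [1, 1, 1, 1, 2, 2, 3],
    'B': ['foo', 'baz', 'bar', 'faz', 'foo', 'bar', 'foo'],
    'quantity': [1, 2, 2, 1, 2, 1, 3],
})

# Step 1: one list per row, e.g. ['baz', 'baz'] for B='baz', quantity=2.
df['expand'] = pd.Series(
    [[b] * q for b, q in zip(df['B'], df['quantity'])],
    index=df.index,
)

# Step 2: flatten the per-row lists inside each group of 'A'.
result = df.groupby('A')['expand'].apply(
    lambda lists: list(chain.from_iterable(lists))
)
print(result)
```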
EDIT
OK, after taking inspiration from @Jianxun Li's answer:
In [130]:
df.groupby('A').apply(lambda x: np.repeat(x['B'].values, x['quantity']).tolist())
Out[130]:
A
1    [foo, baz, baz, bar, bar, faz]
2                   [foo, foo, bar]
3                   [foo, foo, foo]
dtype: object
Also this works:
In [131]:
df.groupby('A').apply(lambda x: list(np.repeat(x['B'].values, x['quantity'])))
Out[131]:
A
1    [foo, baz, baz, bar, bar, faz]
2                   [foo, foo, bar]
3                   [foo, foo, foo]
dtype: object
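The np.repeat approach can be tried end to end with the sketch below. The sample frame is rebuilt from the question. Selecting `[['B', 'quantity']]` before `apply` and using `to_numpy()` instead of `.values` are additions of this sketch, not part of the original answer; they are meant to reduce dependence on how a given pandas version handles the grouping column and string dtypes.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question (assumed construction).
df = pd.DataFrame({
    'A': [1, 1, 1, 1, 2, 2, 3],
    'B': ['foo', 'baz', 'bar', 'faz', 'foo', 'bar', 'foo'],
    'quantity': [1, 2, 2, 1, 2, 1, 3],
})

# For each group of 'A', repeat every 'B' value by its 'quantity'
# and convert the resulting array to a plain Python list.
result = (
    df.groupby('A')[['B', 'quantity']]
      .apply(lambda g: np.repeat(g['B'].to_numpy(), g['quantity']).tolist())
)
print(result)
```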
Answer by Jianxun Li
Another way to do it: first reshape the df using pivot_table (the code below uses the closely related pivot), then apply np.repeat().tolist().
import pandas as pd
import numpy as np
df
Out[52]:
   A    B  quantity
0  1  foo         1
1  1  baz         2
2  1  bar         2
3  1  faz         1
4  2  foo         2
5  2  bar         1
6  3  foo         3
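# Keyword arguments are used because recent pandas versions no longer accept positional arguments to pivot.
df.pivot(index='A', columns='B', values='quantity').fillna(0).apply(lambda row: np.repeat(row.index.values, row.values.astype(int)).tolist(), axis=1)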
Out[53]:
A
1    [bar, bar, baz, baz, faz, foo]
2                   [bar, foo, foo]
3                   [foo, foo, foo]
dtype: object
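Note that pivot orders the 'B' columns alphabetically, so each list above is sorted by value rather than following the original row order, and pivot raises an error if an ('A', 'B') pair occurs more than once. As a further variant that keeps the row order, the rows can be repeated with `Index.repeat` before grouping. This is a sketch only and is not taken from either answer; the sample frame is rebuilt from the question.

```python
import pandas as pd

# Sample frame rebuilt from the question (assumed construction).
df = pd.DataFrame({
    'A': [1, 1, 1, 1, 2, 2, 3],
    'B': ['foo', 'baz', 'bar', 'faz', 'foo', 'bar', 'foo'],
    'quantity': [1, 2, 2, 1, 2, 1, 3],
})

# Repeat each row label `quantity` times, then select the repeated rows.
expanded = df.loc[df.index.repeat(df['quantity'])]

# Collect the repeated 'B' values into one list per group of 'A'.
result = expanded.groupby('A')['B'].apply(list)
print(result)
```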

