Pandas groupby multiple columns, list of multiple columns

Question

Asked by GrandmasLove

I have the following data:

Invoice NoStockCode Description                         Quantity    CustomerID  Country
536365  85123A      WHITE HANGING HEART T-LIGHT HOLDER  6           17850       United Kingdom
536365  71053       WHITE METAL LANTERN                 6           17850       United Kingdom
536365  84406B      CREAM CUPID HEARTS COAT HANGER      8           17850       United Kingdom

I am trying to do a groupby, so I have the following operation:

df.groupby(['InvoiceNo','CustomerID','Country'])['NoStockCode','Description','Quantity'].apply(list)

I want to get the output

|Invoice |CustomerID |Country        |NoStockCode              |Description                                                                                 |Quantity       
|536365  |17850      |United Kingdom |85123A, 71053, 84406B    |WHITE HANGING HEART T-LIGHT HOLDER, WHITE METAL LANTERN, CREAM CUPID HEARTS COAT HANGER     |6, 6, 8

Instead I get:

|Invoice |CustomerID |Country        |0         
|536365  |17850      |United Kingdom |['NoStockCode','Description','Quantity']

I have tried agg and other methods, but I haven't been able to get all of the columns to join as a list. I don't need to use the list function, but in the end I want the different columns to be lists.
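A minimal sketch of why this result appears, assuming the sample rows are loaded into a DataFrame with the column names used in the code above (the DataFrame construction and the variable names are assumptions for the example). apply hands each group to list as a DataFrame, and iterating over a DataFrame yields its column labels rather than its rows:

```python
import pandas as pd

# Sample rows from the question; column names follow the question's code.
df = pd.DataFrame({
    "InvoiceNo": [536365, 536365, 536365],
    "NoStockCode": ["85123A", "71053", "84406B"],
    "Description": ["WHITE HANGING HEART T-LIGHT HOLDER",
                    "WHITE METAL LANTERN",
                    "CREAM CUPID HEARTS COAT HANGER"],
    "Quantity": [6, 6, 8],
    "CustomerID": [17850, 17850, 17850],
    "Country": ["United Kingdom"] * 3,
})
cols = ["NoStockCode", "Description", "Quantity"]

# Iterating over a DataFrame yields column labels, so list() returns the labels.
labels = list(df[cols])
print(labels)

# apply(list) calls list() once per group, on that group's sub-DataFrame,
# so each group collapses to the same list of column labels.
result = df.groupby(["InvoiceNo", "CustomerID", "Country"])[cols].apply(list)
print(result)
```

To get one list per column instead, the function has to be applied column by column, which is what agg does.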

Answer 1

Answered by Ben.T

I can't reproduce your code right now, but I think that:

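# Select the value columns with a list (double brackets), then
# collect each column's values into one list per group.
print(df.groupby(['InvoiceNo', 'CustomerID', 'Country'],
                 as_index=False)[['NoStockCode', 'Description', 'Quantity']]
        .agg(list))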

would give you the expected output.
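As a hedged illustration of this answer, the sketch below rebuilds the three sample rows and aggregates each value column into a list. The DataFrame construction and the names keys and value_cols are assumptions for the example. The value columns are selected with a list, since pandas 2.x rejects a bare tuple of column names after groupby:

```python
import pandas as pd

# Sample rows from the question; column names follow the question's code.
df = pd.DataFrame({
    "InvoiceNo": [536365, 536365, 536365],
    "NoStockCode": ["85123A", "71053", "84406B"],
    "Description": ["WHITE HANGING HEART T-LIGHT HOLDER",
                    "WHITE METAL LANTERN",
                    "CREAM CUPID HEARTS COAT HANGER"],
    "Quantity": [6, 6, 8],
    "CustomerID": [17850, 17850, 17850],
    "Country": ["United Kingdom"] * 3,
})

keys = ["InvoiceNo", "CustomerID", "Country"]
value_cols = ["NoStockCode", "Description", "Quantity"]

# agg(list) is applied to each selected column separately,
# producing one list per column for every group.
result = df.groupby(keys, as_index=False)[value_cols].agg(list)
print(result)
```

Each cell of the value columns then holds a Python list, so the original dtypes (for example the integer quantities) are preserved inside the lists.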

Answer 2

Answered by YOBEN_S

IIUC

df.groupby(['Invoice','CustomerID'],as_index=False)[['Description','NoStockCode']].agg(','.join)
Out[47]: 
   Invoice  CustomerID                                        Description  \
0   536365       17850  WHITEHANGINGHEARTT-LIGHTHOLDER,WHITEMETALANTER...   
           NoStockCode  
0  85123A,71053,84406B

Answer 3

Answered by Syed

Try using a variation of the following:

df.groupby('company')['product'].agg([('count', 'count'), ('NoStockCode', ', '.join), ('Description', ', '.join), ('Quantity', ', '.join)])
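One possible variation for the data in the question, sketched with pandas named aggregation (keyword = (column, function)), which allows a different function per column. The output names n_items and total_quantity are made up for the example, and the DataFrame construction is an assumption based on the sample rows:

```python
import pandas as pd

# Sample rows from the question (construction assumed for the example).
df = pd.DataFrame({
    "Invoice": [536365, 536365, 536365],
    "NoStockCode": ["85123A", "71053", "84406B"],
    "Description": ["WHITE HANGING HEART T-LIGHT HOLDER",
                    "WHITE METAL LANTERN",
                    "CREAM CUPID HEARTS COAT HANGER"],
    "Quantity": [6, 6, 8],
    "CustomerID": [17850, 17850, 17850],
    "Country": ["United Kingdom"] * 3,
})

summary = (
    df.groupby(["Invoice", "CustomerID", "Country"])
      .agg(
          n_items=("NoStockCode", "count"),         # rows per group
          NoStockCode=("NoStockCode", ", ".join),   # string columns can be joined directly
          Description=("Description", ", ".join),
          total_quantity=("Quantity", "sum"),       # numeric column: sum instead of join
      )
      .reset_index()
)
print(summary)
```

Joining is only used on the string columns here; an integer column such as Quantity would need converting to strings before ', '.join could be applied to it.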

Answer 4

Answered by unutbu

You could use pd.pivot_table with aggfunc=list, or with a function that joins each group's values into a single string, as here:

import pandas as pd
df = pd.DataFrame({'Country': ['United Kingdom', 'United Kingdom', 'United Kingdom'],
                   'CustomerID': [17850, 17850, 17850],
                   'Description': ['WHITE HANGING HEART T-LIGHT HOLDER',
                                   'WHITE METAL LANTERN',
                                   'CREAM CUPID HEARTS COAT HANGER'],
                   'Invoice': [536365, 536365, 536365],
                   'NoStockCode': ['85123A', '71053', '84406B'],
                   'Quantity': [6, 6, 8]})

result = pd.pivot_table(df, index=['Invoice','CustomerID','Country'], 
                        values=['NoStockCode','Description','Quantity'], 
                        aggfunc=lambda x: ', '.join(map(str, x)))
print(result)

yields

                                                                         Description            NoStockCode Quantity
Invoice CustomerID Country                                                                                          
536365  17850      United Kingdom  WHITE HANGING HEART T-LIGHT HOLDER, WHITE META...  85123A, 71053, 84406B  6, 6, 8

Note that if the Quantity values are ints, you will need to convert them to strs before calling ', '.join. That is why map(str, x) was used above.
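The point about ints can be checked in isolation, and the same join function also works with a plain groupby. The following is a rough sketch under the same sample data; the groupby form is an alternative shown for comparison, not part of the answer above:

```python
import pandas as pd

quantities = [6, 6, 8]

# str.join only accepts strings, so joining ints directly raises TypeError.
try:
    ", ".join(quantities)
except TypeError as exc:
    print("TypeError:", exc)

# Converting each element with str first makes the join work.
joined = ", ".join(map(str, quantities))
print(joined)

df = pd.DataFrame({
    "Invoice": [536365, 536365, 536365],
    "NoStockCode": ["85123A", "71053", "84406B"],
    "Description": ["WHITE HANGING HEART T-LIGHT HOLDER",
                    "WHITE METAL LANTERN",
                    "CREAM CUPID HEARTS COAT HANGER"],
    "Quantity": [6, 6, 8],
    "CustomerID": [17850, 17850, 17850],
    "Country": ["United Kingdom"] * 3,
})

# The same aggregation function applied column by column through groupby.
result = (
    df.groupby(["Invoice", "CustomerID", "Country"])[
        ["NoStockCode", "Description", "Quantity"]
    ]
    .agg(lambda s: ", ".join(map(str, s)))
    .reset_index()
)
print(result)
```

With groupby the value columns stay in the order they were selected.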
