Pandas groupby multiple columns, list of multiple columns

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/51584363/

Date: 2020-09-14 05:51:20  Source: igfitidea

Tags: python, pandas, pandas-groupby

Asked by GrandmasLove

I have the following data:

Invoice  NoStockCode  Description                         Quantity  CustomerID  Country
536365   85123A       WHITE HANGING HEART T-LIGHT HOLDER  6         17850       United Kingdom
536365   71053        WHITE METAL LANTERN                 6         17850       United Kingdom
536365   84406B       CREAM CUPID HEARTS COAT HANGER      8         17850       United Kingdom

I am trying to do a groupby, so I have the following operation:

df.groupby(['InvoiceNo','CustomerID','Country'])['NoStockCode','Description','Quantity'].apply(list)

I want to get this output:

|Invoice |CustomerID |Country        |NoStockCode              |Description                                                                                 |Quantity
|536365  |17850      |United Kingdom |85123A, 71053, 84406B    |WHITE HANGING HEART T-LIGHT HOLDER, WHITE METAL LANTERN, CREAM CUPID HEARTS COAT HANGER     |6, 6, 8

Instead I get:

|Invoice |CustomerID |Country        |0
|536365  |17850      |United Kingdom |['NoStockCode','Description','Quantity']

I have tried agg and other methods, but I haven't been able to get all of the columns to join as a list. I don't need to use the list function, but in the end I want the different columns to be lists.

Answered by Ben.T

I can't reproduce your code right now, but I think that:

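# Select the value columns with a list (double brackets); recent pandas
# versions reject tuple-style ['a','b','c'] selection on a groupby.
print(df.groupby(['InvoiceNo','CustomerID','Country'],
                 as_index=False)[['NoStockCode','Description','Quantity']]
        .agg(lambda x: list(x)))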

would give you the expected output.
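Since the answer above was posted untested, here is a minimal self-contained sketch of the same idea. It assumes the column names shown in the question's table header (Invoice, NoStockCode, and so on) rather than the InvoiceNo spelling used in the snippet, and the variable names keys, value_cols and lists_per_group are only illustrative.

```python
import pandas as pd

# Sample rows copied from the question; column names follow its table header.
df = pd.DataFrame({
    'Invoice': [536365, 536365, 536365],
    'NoStockCode': ['85123A', '71053', '84406B'],
    'Description': ['WHITE HANGING HEART T-LIGHT HOLDER',
                    'WHITE METAL LANTERN',
                    'CREAM CUPID HEARTS COAT HANGER'],
    'Quantity': [6, 6, 8],
    'CustomerID': [17850, 17850, 17850],
    'Country': ['United Kingdom'] * 3,
})

keys = ['Invoice', 'CustomerID', 'Country']             # grouping columns
value_cols = ['NoStockCode', 'Description', 'Quantity']  # columns to collect

# Pick the value columns with a list, then gather each group's values into a list.
lists_per_group = df.groupby(keys, as_index=False)[value_cols].agg(list)
print(lists_per_group)
```

With as_index=False the grouping keys stay as ordinary columns, so the result has one row per invoice with a list in each of the three value columns.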

Answered by YOBEN_S

IIUC (if I understand correctly):

df.groupby(['Invoice','CustomerID'],as_index=False)[['Description','NoStockCode']].agg(','.join)
Out[47]: 
   Invoice  CustomerID                                        Description  \
0   536365       17850  WHITEHANGINGHEARTT-LIGHTHOLDER,WHITEMETALANTER...   
           NoStockCode  
0  85123A,71053,84406B  

Answered by Syed

Try using a variation of the following:

df.groupby('company')['product'].agg([('count', 'count'), ('NoStockCode', ', '.join), ('Description', ', '.join), ('Quantity', ', '.join)])
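One possible variation for the question's data is sketched below. Note that it swaps in pandas named aggregation (keyword arguments of the form name=(column, function)) instead of the list-of-tuples form above, and it assumes the column names from the question's table header. The integer Quantity column is converted to strings before joining.

```python
import pandas as pd

# Sample rows copied from the question; column names follow its table header.
df = pd.DataFrame({
    'Invoice': [536365, 536365, 536365],
    'NoStockCode': ['85123A', '71053', '84406B'],
    'Description': ['WHITE HANGING HEART T-LIGHT HOLDER',
                    'WHITE METAL LANTERN',
                    'CREAM CUPID HEARTS COAT HANGER'],
    'Quantity': [6, 6, 8],
})

# Named aggregation: each keyword is an output column,
# each value is (input column, aggregation function).
summary = df.groupby('Invoice').agg(
    count=('NoStockCode', 'count'),
    NoStockCode=('NoStockCode', ', '.join),
    Description=('Description', ', '.join),
    # Quantity holds ints, so convert to str before joining.
    Quantity=('Quantity', lambda s: ', '.join(map(str, s))),
)
print(summary)
```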

Answered by unutbu

You could use pd.pivot_table with aggfunc=list, or, as below, with an aggfunc that joins each group's values into a single string:

import pandas as pd
df = pd.DataFrame({'Country': ['United Kingdom', 'United Kingdom', 'United Kingdom'],
                   'CustomerID': [17850, 17850, 17850],
                   'Description': ['WHITE HANGING HEART T-LIGHT HOLDER',
                                   'WHITE METAL LANTERN',
                                   'CREAM CUPID HEARTS COAT HANGER'],
                   'Invoice': [536365, 536365, 536365],
                   'NoStockCode': ['85123A', '71053', '84406B'],
                   'Quantity': [6, 6, 8]})

result = pd.pivot_table(df, index=['Invoice','CustomerID','Country'], 
                        values=['NoStockCode','Description','Quantity'], 
                        aggfunc=lambda x: ', '.join(map(str, x)))
print(result)

yields

                                                                         Description            NoStockCode Quantity
Invoice CustomerID Country                                                                                          
536365  17850      United Kingdom  WHITE HANGING HEART T-LIGHT HOLDER, WHITE META...  85123A, 71053, 84406B  6, 6, 8

Note that if the Quantity values are ints, you will need to convert them to strs before calling ', '.join. That is why map(str, x) was used above.
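The point about ints can be checked with plain Python, independent of pandas. The short sketch below uses the three quantities from the question; the names quantities, join_failed and joined are only illustrative.

```python
quantities = [6, 6, 8]

# str.join only accepts strings, so joining ints directly raises TypeError.
try:
    ', '.join(quantities)
    join_failed = False
except TypeError:
    join_failed = True

# Converting each value with str first makes the join work.
joined = ', '.join(map(str, quantities))
print(join_failed, joined)
```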