Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/34193862/

Time: 2020-08-19 14:35:49  Source: igfitidea

Pandas Pivot Table List of Aggfunc

Tags: python, pandas, pivot-table

Asked by Felix

Pandas Pivot Table Dictionary of Agg function

I am trying to calculate 3 aggregate functions while pivoting:

  1. Count
  2. Mean
  3. StDev

This is the code:

n_page = (pd.pivot_table(Main_DF, 
                         values='SPC_RAW_VALUE',  
                         index=['ALIAS', 'SPC_PRODUCT', 'LABLE', 'RAW_PARAMETER_NAME'], 
                         columns=['LOT_VIRTUAL_LINE'],
                         aggfunc={'N': 'count', 'Mean': np.mean, 'Sigma': np.std})
          .reset_index()
         )

The error I am getting is: KeyError: 'Mean'

How can I calculate those 3 functions?

Accepted answer by Happy001

The aggfunc argument of pivot_table takes a function or a list of functions, but not a dict. From the documentation:

aggfunc : function, default numpy.mean, or list of functions. If list of functions passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves).

So try:

n_page = (pd.pivot_table(Main_DF, 
                         values='SPC_RAW_VALUE',  
                         index=['ALIAS', 'SPC_PRODUCT', 'LABLE', 'RAW_PARAMETER_NAME'], 
                         columns=['LOT_VIRTUAL_LINE'],
                         aggfunc=[len, np.mean, np.std])
          .reset_index()
         )

You may want to rename the hierarchical columns afterwards.
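
As a rough illustration of that renaming step, here is a minimal sketch on a made-up toy frame (the data and the reduced set of index columns are assumptions, not the asker's real Main_DF). It passes the aggregations as strings and then relabels the top column level with the names from the question. Note that "count" skips missing values, whereas len counts every row, so the two can differ when the data contains NaN.

```python
import pandas as pd

# Toy stand-in for Main_DF; the values are invented for illustration only.
df = pd.DataFrame({
    "ALIAS": ["a", "a", "a", "b", "b", "b"],
    "LOT_VIRTUAL_LINE": ["L1", "L1", "L2", "L1", "L2", "L2"],
    "SPC_RAW_VALUE": [1.0, 3.0, 5.0, 2.0, 4.0, 8.0],
})

# A list of aggregations yields hierarchical columns:
# top level = function name, second level = LOT_VIRTUAL_LINE value.
wide = pd.pivot_table(
    df,
    values="SPC_RAW_VALUE",
    index="ALIAS",
    columns="LOT_VIRTUAL_LINE",
    aggfunc=["count", "mean", "std"],
)

# Relabel only the top level with the names used in the question.
wide = wide.rename(columns={"count": "N", "mean": "Mean", "std": "Sigma"}, level=0)
print(wide)
```

Calling .reset_index() afterwards, as in the code above, would turn ALIAS back into an ordinary column.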

Answer by Alexander

Try using groupby:

df = (Main_DF
      .groupby(['ALIAS', 'SPC_PRODUCT', 'LABLE', 'RAW_PARAMETER_NAME'], as_index=False)
      .LOT_VIRTUAL_LINE
      .agg({'N': 'count', 'Mean': np.mean, 'Sigma': np.std})
     )

Setting as_index=False just leaves these as columns in your dataframe, so you don't have to reset the index afterwards.
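
One caveat worth hedging: as far as I know, from pandas 1.0 onward passing a renaming dict to .agg on a single selected column raises SpecificationError ("nested renamer is not supported"). A sketch of the same idea using named aggregation (available since pandas 0.25) follows. It runs on a made-up toy frame, and it assumes the intent is to aggregate SPC_RAW_VALUE per LOT_VIRTUAL_LINE; it uses reset_index() rather than as_index=False to stay version-agnostic.

```python
import pandas as pd

# Toy stand-in for Main_DF; the values are invented for illustration only.
df = pd.DataFrame({
    "ALIAS": ["a", "a", "a", "b", "b", "b"],
    "LOT_VIRTUAL_LINE": ["L1", "L1", "L2", "L1", "L2", "L2"],
    "SPC_RAW_VALUE": [1.0, 3.0, 5.0, 2.0, 4.0, 8.0],
})

# Named aggregation: each keyword becomes an output column name.
stats = (
    df.groupby(["ALIAS", "LOT_VIRTUAL_LINE"])["SPC_RAW_VALUE"]
      .agg(N="count", Mean="mean", Sigma="std")
      .reset_index()  # move the group keys back into ordinary columns
)
print(stats)
```

This gives a long table with one row per group; pivoting it on LOT_VIRTUAL_LINE would recover the wide layout from the question.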

Answer by Ganesh_

The claim in the accepted answer by @Happy001 that aggfunc can't take a dict is false. We can actually pass a dict to aggfunc.

A really handy feature is the ability to pass a dictionary to aggfunc so you can perform different functions on each of the values you select. For example:

import pandas as pd
import numpy as np

df = pd.read_excel('sales-funnel.xlsx')  #loading xlsx file

table = pd.pivot_table(df, index=['Manager', 'Status'], columns=['Product'], values=['Quantity','Price'],
           aggfunc={'Quantity':len,'Price':[np.sum, np.mean]},fill_value=0)
table

In the above code, I am passing a dictionary to aggfunc, performing a len operation on Quantity and mean and sum operations on Price.
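
Since sales-funnel.xlsx is not available here, the following is a small self-contained sketch of the same pattern on invented data (the managers, products and numbers are assumptions). The point it tries to show is that the dict keys must be names of columns listed in values, not output labels, which would explain the KeyError: 'Mean' in the question, where no column called 'Mean' exists.

```python
import pandas as pd

# Invented stand-in for the sales-funnel spreadsheet.
sales = pd.DataFrame({
    "Manager": ["Ann", "Ann", "Bob", "Bob"],
    "Product": ["CPU", "CPU", "CPU", "Monitor"],
    "Quantity": [1, 2, 1, 3],
    "Price": [100, 200, 150, 50],
})

# Dict keys are existing column names; each maps to one function or a list of them.
table = pd.pivot_table(
    sales,
    index="Manager",
    columns="Product",
    values=["Quantity", "Price"],
    aggfunc={"Quantity": "count", "Price": ["sum", "mean"]},
    fill_value=0,
)
print(table)
```

The resulting columns have three levels: value column, function name, and Product.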

The output was attached to the original answer as an image.


The example is taken from "pivot table explained".