Python: define aggfunc for each values column in a pandas pivot table

Question

Asked by VIKASH JAISWAL

I was trying to generate a pivot table with multiple "values" columns. I know I can use aggfunc to aggregate values the way I want, but what if I don't want to sum or average both columns, and instead want the sum of one column and the mean of the other? Is it possible to do this with pandas?

import numpy as np
import pandas as pd

df = pd.DataFrame({
          'A' : ['one', 'one', 'two', 'three'] * 6,
          'B' : ['A', 'B', 'C'] * 8,
          'C' : ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 4,
          'D' : np.random.randn(24),
          'E' : np.random.randn(24)
})

This gives a pivot table with the sum:

pd.pivot_table(df, values=['D','E'], index=['B'], aggfunc=np.sum)

And this gives the mean:

pd.pivot_table(df, values=['D','E'], index=['B'], aggfunc=np.mean)

How can I get the sum for D and the mean for E?

Hope my question is clear enough.


Answer 1

Accepted answer by Roman Pekar

You can concatenate two DataFrames:

>>> df1 = pd.pivot_table(df, values=['D'], index=['B'], aggfunc=np.sum)
>>> df2 = pd.pivot_table(df, values=['E'], index=['B'], aggfunc=np.mean)
>>> pd.concat((df1, df2), axis=1)
          D         E
B                    
A  1.810847 -0.524178
B  2.762190 -0.443031
C  0.867519  0.078460

Or you can pass a list of functions as the aggfunc parameter and then reindex:

>>> df3 = pd.pivot_table(df, values=['D','E'], index=['B'], aggfunc=[np.sum, np.mean])
>>> df3
        sum                mean          
          D         E         D         E
B                                        
A  1.810847 -4.193425  0.226356 -0.524178
B  2.762190 -3.544245  0.345274 -0.443031
C  0.867519  0.627677  0.108440  0.078460
>>> df3 = df3.loc[:, [('sum', 'D'), ('mean','E')]]
>>> df3.columns = ['D', 'E']
>>> df3
          D         E
B                    
A  1.810847 -0.524178
B  2.762190 -0.443031
C  0.867519  0.078460

Although, it would be nice to have an option to define aggfunc for each column individually. I don't know how it could be done; maybe by passing a dict-like parameter into aggfunc, like {'D':np.mean, 'E':np.sum}.

Update: Actually, in your case you can pivot by hand:

>>> df.groupby('B').aggregate({'D':np.sum, 'E':np.mean})
          E         D
B                    
A -0.524178  1.810847
B -0.443031  2.762190
C  0.078460  0.867519
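
The groupby route above can be sketched as a self-contained script. This is only an illustration under a few assumptions: a recent pandas version, a seeded random generator (my choice, so the numbers are reproducible), and the string names 'sum' and 'mean' in place of the NumPy functions. The explicit column selection at the end is also an addition; it pins the column order, which in the output above came out as E before D.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)  # fixed seed: an assumption, not in the original
df = pd.DataFrame({
    'B': ['A', 'B', 'C'] * 8,
    'D': rng.standard_normal(24),
    'E': rng.standard_normal(24),
})

# Group on the would-be index column and aggregate each column differently.
by_hand = df.groupby('B').agg({'D': 'sum', 'E': 'mean'})

# Select the columns explicitly so their order does not depend on dict ordering.
by_hand = by_hand[['D', 'E']]
print(by_hand)
```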

Answer 2

Answered by DataSwede

You can apply a specific function to a specific column by passing in a dict.

pd.pivot_table(df, values=['D','E'], index=['B'], aggfunc={'D':np.sum, 'E':np.mean})
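
A minimal, self-contained sketch of the dict form, assuming a recent pandas version where the keyword is `index` (older releases called it `rows`). The seeded generator and the string names 'sum' and 'mean' are my choices for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed: an assumption, for reproducibility
df = pd.DataFrame({
    'A': ['one', 'one', 'two', 'three'] * 6,
    'B': ['A', 'B', 'C'] * 8,
    'C': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 4,
    'D': rng.standard_normal(24),
    'E': rng.standard_normal(24),
})

# One aggregation per value column: sum for D, mean for E.
table = pd.pivot_table(df, values=['D', 'E'], index=['B'],
                       aggfunc={'D': 'sum', 'E': 'mean'})
print(table)
```

Each key of the dict names a value column and each value names the aggregation applied to it, so the result has one column per key.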

Answer 3

Answered by user10987461

table = pd.pivot_table(df, values=['D', 'E'], index=['A', 'C'],
                aggfunc={'D': np.mean,'E': np.sum})

table
                  D         E
               mean       sum
A   C
bar large  5.500000  7.500000
    small  5.500000  8.500000
foo large  2.000000  4.500000
    small  2.333333  4.333333
