How can I compute the absolute sum with a groupby in pandas?

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original question on Stack Overflow: http://stackoverflow.com/questions/45405192/

Time: 2020-09-14 04:08:11  Source: igfitidea

Tags: python, pandas, dataframe, pandas-groupby

Asked by Franck Dernoncourt

How can I compute the absolute sum with a groupby in pandas?

For example, given the DataFrame:

  Player  Score
0      A    100
1      B   -150
2      A   -110
3      B    180
4      B    125

I would like to have the total score for player A (100+110=210) as well as the total score for player B (150+180+125=455), ignoring the sign of each score.

I can use the following code to compute the sum:

import pandas as pd
import numpy as np

frame = pd.DataFrame({'Player' : ['A', 'B', 'A', 'B', 'B'], 
                      'Score'  : [100, -150, -110, 180, 125]})

print('frame: {0}'.format(frame))

total_scores = frame[['Player','Score']].groupby(['Player']).agg(['sum'])

print('total_scores: {0}'.format(total_scores))

But how can I compute the absolute sum with a groupby?

frame[['Player','Score']].abs().groupby(['Player']).agg(['sum']) unsurprisingly fails with:

Traceback (most recent call last):
  File "O:\tests\absolute_count.py", line 10, in <module>
    total_scores = frame[['Player','Score']].abs().groupby(['Player']).agg(['sum'])
  File "C:\Users\dernoncourt\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\generic.py", line 5518, in abs
    return np.abs(self)
TypeError: bad operand type for abs(): 'str'

I don't want to alter the DataFrame.

Answered by BrenBarn

You could apply a function that takes the absolute value and then sums it:

>>> frame.groupby('Player').Score.apply(lambda c: c.abs().sum())
Player
A    210
B    455
Name: Score, dtype: int64

You could also create a new column with the absolute values and then sum that:

>>> frame.assign(AbsScore=frame.Score.abs()).groupby('Player').AbsScore.sum()
Player
A    210
B    455
Name: AbsScore, dtype: int64
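
As for the TypeError in the question: abs() was called on a selection that still included the string-valued Player column. A minimal sketch of one way to sidestep that, assuming it is enough to take the absolute value of Score alone and then group the result by the Player column. The names abs_scores and abs_totals are illustrative and not from the question:

```python
import pandas as pd

frame = pd.DataFrame({'Player': ['A', 'B', 'A', 'B', 'B'],
                      'Score': [100, -150, -110, 180, 125]})

# abs() touches only the numeric Score column, so the strings in Player
# never reach it. The result is a new Series; frame itself is not modified.
abs_scores = frame['Score'].abs()

# Group that Series by the Player column (matched on the shared index).
abs_totals = abs_scores.groupby(frame['Player']).sum()

print(abs_totals)
```

This avoids both the lambda and the temporary column, and frame still holds the original signed scores afterwards.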

Answered by cs95

You can use DataFrameGroupBy.apply with a lambda (df below is the question's frame):

In [326]: df.groupby('Player').Score.apply(lambda x: np.sum(np.abs(x)))
Out[326]: 
Player
A    210
B    455
Name: Score, dtype: int64

To get back the Player column, use reset_index:

In [371]: df.groupby('Player').Score.apply(lambda x: np.sum(np.abs(x))).reset_index()
Out[371]: 
  Player  Score
0      A    210
1      B    455
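
If both the signed sum and the absolute sum are wanted side by side, named aggregation may be a convenient variant. This is only a sketch, assuming a pandas version that supports named aggregation in agg (0.25 or later); the output column names Total and AbsTotal are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({'Player': ['A', 'B', 'A', 'B', 'B'],
                   'Score': [100, -150, -110, 180, 125]})

summary = (
    df.groupby('Player')
      .agg(
          # Plain signed sum, as in the question's original code.
          Total=('Score', 'sum'),
          # Absolute sum: take abs() within each group, then sum.
          AbsTotal=('Score', lambda s: s.abs().sum()),
      )
      .reset_index()  # turn the Player index back into a regular column
)

print(summary)
```

As with the apply-based versions, df is left untouched; summary is a new DataFrame with one row per player.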