Pandas dataframe groupby: calculating the population standard deviation

Question

Asked by neelshiv

I am trying to use groupby and np.std to calculate a standard deviation, but it seems to be calculating a sample standard deviation (with delta degrees of freedom, ddof, equal to 1).

Here is a sample.

#create dataframe
>>> df = pd.DataFrame({'A':[1,1,2,2],'B':[1,2,1,2],'values':np.arange(10,30,5)})
>>> df
   A  B  values
0  1  1      10
1  1  2      15
2  2  1      20
3  2  2      25

#calculate standard deviation using groupby
>>> df.groupby('A').agg(np.std)
      B    values
A                    
1  0.707107  3.535534
2  0.707107  3.535534

#Calculate using numpy (np.std)
>>> np.std([10,15],ddof=0)
2.5
>>> np.std([10,15],ddof=1)
3.5355339059327378

Is there a way to use the population std calculation (ddof=0) with the groupby statement? The records I am actually working with (not the example table above) are not samples, so I am only interested in population standard deviations.

Answer 1

Answered by EdChum

You can pass additional args to np.std in the agg function:

In [202]:

df.groupby('A').agg(np.std, ddof=0)

Out[202]:
     B  values
A             
1  0.5     2.5
2  0.5     2.5

In [203]:

df.groupby('A').agg(np.std, ddof=1)

Out[203]:
          B    values
A                    
1  0.707107  3.535534
2  0.707107  3.535534
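How agg treats np.std and forwarded keyword arguments has varied between pandas versions, so as a hedged alternative, here is a minimal sketch that avoids np.std altogether. It assumes the same df as in the question and relies on the ddof parameter of GroupBy.std and Series.std; the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Same example frame as in the question
df = pd.DataFrame({'A': [1, 1, 2, 2],
                   'B': [1, 2, 1, 2],
                   'values': np.arange(10, 30, 5)})

# GroupBy.std accepts ddof directly; ddof=0 gives the population std
pop_std = df.groupby('A').std(ddof=0)

# Per-column lambda that calls Series.std instead of np.std
pop_std_lambda = df.groupby('A').agg(lambda s: s.std(ddof=0))

print(pop_std)
print(pop_std_lambda)
```

Both results should match the Out[202] table above (0.5 for B and 2.5 for values in each group).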

Answer 2

Answered by Giorgos Myrianthous

For delta degrees of freedom (ddof) = 0

(This means that bins with a single value will end up with std=0 instead of NaN)

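import numpy as np


def std(x):
    # np.std defaults to ddof=0 (population std); passed explicitly for clarity
    return np.std(x, ddof=0)


# mix built-in aggregations with the custom population std
df.groupby('A').agg(['mean', 'max', std])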
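To illustrate the note about single-value bins, here is a small sketch on made-up data (df_single, key and x are hypothetical names, not from the question). It puts the default sample std next to a population std computed with Series.std(ddof=0), using named aggregation, which assumes pandas 0.25 or later.

```python
import pandas as pd

# Hypothetical data: group 'b' contains only one row
df_single = pd.DataFrame({'key': ['a', 'a', 'b'],
                          'x': [10.0, 15.0, 7.0]})


def pop_std(s):
    # Population standard deviation of one group's values (ddof=0)
    return s.std(ddof=0)


result = df_single.groupby('key')['x'].agg(
    sample_std='std',        # pandas default, ddof=1
    population_std=pop_std,  # custom function, ddof=0
)
print(result)
```

For group 'b' the sample std is NaN, because dividing by n - 1 = 0 is undefined, while the population std is 0.0.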
