Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/29799043/

Time: 2020-09-13 23:14:46  Source: igfitidea

Standard deviation for DF, pandas

python pandas dataframe

Asked by Guforu

For example, I have a pandas DataFrame that looks like this:

a b c
1 2 3
4 5 6
7 8 9

I want to calculate the standard deviation over all values in this DataFrame. The function df.std() gives me back the values per column.

Of course, I can write the following code:

import numpy

# Collect each column as a list, then take the std over all values at once
sd = []
sd.append(list(df['a']))
sd.append(list(df['b']))
sd.append(list(df['c']))
numpy.std(sd)

Is it possible to make this code simpler and use a pandas function for this DataFrame?

Accepted answer by unutbu

df.values returns a NumPy array containing the values in df. You could then apply np.std to that array:

In [52]: np.std(sd)
Out[52]: 2.5819888974716112

In [53]: np.std(df.values)
Out[53]: 2.5819888974716112
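
For a self-contained version of this comparison, here is a minimal sketch. The way the frame is built is an assumption (the question only shows the printed table), and it uses `DataFrame.to_numpy()`, which newer pandas versions offer as an alternative spelling of `df.values`:

```python
import numpy as np
import pandas as pd

# Assumed construction of the example frame from the question
df = pd.DataFrame({"a": [1, 4, 7], "b": [2, 5, 8], "c": [3, 6, 9]})

# pandas: one standard deviation per column (default ddof=1)
per_column = df.std()

# NumPy: a single standard deviation over all nine values (default ddof=0)
overall = np.std(df.to_numpy())

print(per_column)
print(overall)
```

Here `per_column` holds one value per column, while `overall` is the single number the question asks for.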

Answer by 8one6

An alternative, if you like the idea of "making a vector of all your values" and then taking its standard deviation:

df.stack().std()

But a big note here: remember that pandas' std functions assume a different denominator (degrees of freedom) than NumPy's std functions. By default, pandas divides by N - 1 (ddof=1) and NumPy divides by N (ddof=0), so that:

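import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 10).reshape(3, 3), columns=list('abc'))
print(f"{np.std(df.values):.12g}")                         # NumPy, ddof=0
print(f"{df.stack().std():.12g}")                          # pandas, ddof=1
print(f"{df.stack().std() * np.sqrt(8. / 9.):.12g}")       # rescaled to ddof=0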

yields:

2.58198889747
2.73861278753
2.58198889747

The middle number is different! Not a typo!
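
Rather than rescaling by hand, both libraries accept a `ddof` argument, so the two can be made to agree explicitly. A minimal sketch, assuming the same 3x3 frame as above (the variable names are only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 10).reshape(3, 3), columns=list("abc"))

# Population standard deviation (divide by N): NumPy's default
population_np = np.std(df.to_numpy())
population_pd = df.stack().std(ddof=0)   # override pandas' default ddof=1

# Sample standard deviation (divide by N - 1): pandas' default
sample_np = np.std(df.to_numpy(), ddof=1)  # override NumPy's default ddof=0
sample_pd = df.stack().std()

print(population_np, population_pd)
print(sample_np, sample_pd)
```

Which convention is appropriate depends on whether the nine values are treated as the whole population or as a sample from a larger one.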
