Original source: http://stackoverflow.com/questions/29799043/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Standard deviation for DF, pandas
Asked by Guforu
For example, I have a pandas DataFrame that looks like this:
a b c
1 2 3
4 5 6
7 8 9
I want to calculate the standard deviation of all values in this DataFrame. The function df.std() gives me back the values per column.
Of course, I can write the following code:
sd = []
sd.append(list(df['a']))
sd.append(list(df['b']))
sd.append(list(df['c']))
numpy.std(sd)
Is it possible to make this code simpler and use some pandas function for this DataFrame?
Accepted answer by unutbu
df.values returns a NumPy array containing the values in df. You could then apply np.std to that array:
In [52]: np.std(sd)
Out[52]: 2.5819888974716112
In [53]: np.std(df.values)
Out[53]: 2.5819888974716112
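A self-contained version of this comparison might look like the sketch below. The DataFrame construction is an assumption based on the example table in the question, and `df.to_numpy()` is used here as the newer spelling of `df.values`.

```python
import numpy as np
import pandas as pd

# Rebuild the 3x3 example from the question (assumed construction).
df = pd.DataFrame({"a": [1, 4, 7], "b": [2, 5, 8], "c": [3, 6, 9]})

# The question's approach: collect each column into a list of lists.
sd = [list(df[col]) for col in df.columns]

# np.std flattens its input by default (axis=None), so both calls
# compute a single standard deviation over all nine values.
std_from_lists = np.std(sd)
std_from_values = np.std(df.to_numpy())

print(std_from_lists, std_from_values)
```

Because `np.std` works on the flattened array, the column layout does not matter for the result.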
Answer by 8one6
An alternative, if you like the idea of "making a vector of all your values" and then taking its standard deviation:
df.stack().std()
But a big note here: please remember that the pandas std functions assume a different denominator (degrees of freedom) than the numpy std functions, so that:
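import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(1, 10).reshape(3, 3), columns=list('abc'))
print(f"{np.std(df.values):.11f}")                        # NumPy: divides by n = 9
print(f"{df.stack().std():.11f}")                         # pandas: divides by n - 1 = 8
print(f"{df.stack().std() * np.sqrt(8. / 9.):.11f}")      # pandas result rescaled to match NumPy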
yields:
2.58198889747
2.73861278753
2.58198889747
The middle number is different! Not a typo!
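The difference comes from the `ddof` argument: the pandas `std` methods default to `ddof=1` (divide by n - 1), while `np.std` defaults to `ddof=0` (divide by n). A minimal sketch, assuming the same 3x3 DataFrame as above, that makes the two libraries agree by passing `ddof` explicitly instead of rescaling by hand:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 10).reshape(3, 3), columns=list("abc"))

# Population standard deviation (divide by n): NumPy's default.
population_np = np.std(df.to_numpy())
population_pd = df.stack().std(ddof=0)

# Sample standard deviation (divide by n - 1): pandas' default.
sample_pd = df.stack().std()
sample_np = np.std(df.to_numpy(), ddof=1)

print(population_np, population_pd)
print(sample_pd, sample_np)
```

Which convention is appropriate depends on whether the nine values are treated as the whole population or as a sample from a larger one.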

