Standard deviation of a DataFrame in pandas

Question

Asked by Guforu

For example, I have a pandas DataFrame that looks like this:

I want to calculate the standard deviation over all values in this DataFrame. The function df.std() gives me back one value per column instead.

Of course, I could write the following code:

import numpy

# Collect the values of each column, then take one std over all of them
sd = []
sd.append(list(df['a']))
sd.append(list(df['b']))
sd.append(list(df['c']))
numpy.std(sd)

Is it possible to make this code simpler by using a pandas function on this DataFrame?

Answer 1

Accepted answer by unutbu

df.values returns a NumPy array containing the values in df. You could then apply np.std to that array:

In [52]: np.std(sd)
Out[52]: 2.5819888974716112

In [53]: np.std(df.values)
Out[53]: 2.5819888974716112
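As a minimal, self-contained sketch of the same idea: the question's DataFrame is not shown here, so the 3x3 frame below (columns a, b, c holding 1 through 9) is an assumption borrowed from the other answer. It uses df.to_numpy(), which returns the same array as df.values on recent pandas versions.

```python
import numpy as np
import pandas as pd

# Assumed sample data (not from the question): a 3x3 frame with columns a, b, c
df = pd.DataFrame(np.arange(1, 10).reshape(3, 3), columns=list("abc"))

# Pull every cell into one NumPy array and take a single std over all of it
all_values = df.to_numpy()
overall_std = np.std(all_values)   # NumPy default: population std (ddof=0)

# For comparison, df.std() returns one value per column (sample std, ddof=1)
per_column_std = df.std()

print(overall_std)
print(per_column_std)
```

Note that the two results use different denominators by default, so they are not directly comparable without setting ddof explicitly.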

Answer 2

Answer by 8one6

An alternative, if you like the idea of "making a vector of all your values" and then taking its standard deviation:

df.stack().std()

One important caveat: pandas std functions use a different denominator (degrees of freedom) than NumPy std functions. pandas defaults to ddof=1 (divide by N - 1), while NumPy defaults to ddof=0 (divide by N), so that:

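import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 10).reshape(3, 3), columns=list('abc'))
print(np.std(df.values))                    # NumPy default: ddof=0
print(df.stack().std())                     # pandas default: ddof=1
print(df.stack().std() * np.sqrt(8. / 9.))  # rescaled to match NumPy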

yields:

2.58198889747
2.73861278753
2.58198889747

The middle number is different! Not a typo!
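Rather than rescaling by hand, you can probably just pass ddof explicitly, since both Series.std and np.std accept it. A short sketch using the same 3x3 example frame (the variable names are only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 10).reshape(3, 3), columns=list("abc"))

numpy_default = np.std(df.values)             # ddof=0: divide by N
pandas_default = df.stack().std()             # ddof=1: divide by N - 1

# Ask each library for the other one's convention
pandas_as_numpy = df.stack().std(ddof=0)      # matches numpy_default
numpy_as_pandas = np.std(df.values, ddof=1)   # matches pandas_default

print(numpy_default, pandas_as_numpy)
print(pandas_default, numpy_as_pandas)
```

Which convention is right depends on whether the values are the whole population (ddof=0) or a sample from a larger one (ddof=1).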
