pandas: how to calculate the standard deviation of DataFrame rows
Source: http://stackoverflow.com/questions/38361022/
This page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL, and attribute it to the original authors on Stack Overflow.
How can I calculate the standard deviation for rows of a DataFrame?
Asked by NamAshena
df:
name  group  S1  S2  S3
A     mn      1   2   8
B     mn      4   3   5
C     kl      5   8   2
D     kl      6   5   5
E     fh      7   1   3
output:
std (S1,S2,S3)
3.78
1
3
0.57
3.05
This works for getting the std of a column:
numpy.std(df['S1'])
I want to do the same for rows.
Answered by jezrael
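You can use DataFrame.std, which omits non-numeric columns (in pandas 2.0 and later this is no longer the default, so you may need to pass numeric_only=True or select the numeric columns first):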
print (df.std())
S1 2.302173
S2 2.774887
S3 2.302173
dtype: float64
If you need the std of each row (computed across the columns), pass axis=1:
print (df.std(axis=1))
0 3.785939
1 1.000000
2 3.000000
3 0.577350
4 3.055050
dtype: float64
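To reproduce the row-wise result, here is a minimal sketch that rebuilds the question's DataFrame. Selecting S1, S2 and S3 explicitly is my own choice rather than part of the original answer; it avoids depending on how a given pandas version treats the string columns.

```python
import pandas as pd

# Rebuild the DataFrame shown in the question.
df = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E"],
    "group": ["mn", "mn", "kl", "kl", "fh"],
    "S1": [1, 4, 5, 6, 7],
    "S2": [2, 3, 8, 5, 1],
    "S3": [8, 5, 2, 5, 3],
})

# axis=1 reduces across the columns, giving one value per row.
# Only the numeric columns are selected, so no string data is involved.
row_std = df[["S1", "S2", "S3"]].std(axis=1)
print(row_std)
```

The result is a Series indexed like the original rows, so it lines up with df without any further work.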
If you only need some of the numeric columns, select a subset first:
print (df[['S1','S2']].std())
S1 2.302173
S2 2.774887
dtype: float64
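The same subset selection can be combined with axis=1 and stored back on the DataFrame, which is close to the output layout the question asks for. This is only a sketch: the column name row_std_S1_S2 is a hypothetical name chosen for illustration.

```python
import pandas as pd

# Rebuild the DataFrame shown in the question.
df = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E"],
    "group": ["mn", "mn", "kl", "kl", "fh"],
    "S1": [1, 4, 5, 6, 7],
    "S2": [2, 3, 8, 5, 1],
    "S3": [8, 5, 2, 5, 3],
})

# Row-wise std over a subset of columns, stored as a new column.
# "row_std_S1_S2" is a made-up column name, not something from the answer.
df["row_std_S1_S2"] = df[["S1", "S2"]].std(axis=1)
print(df)
```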
Note that the default value of the ddof (Delta Degrees of Freedom) parameter differs from numpy.std:
- pandas: ddof=1 by default
- numpy: ddof=0 by default
So the outputs differ:
#ddof=1
print (df.std(axis=1))
0 3.785939
1 1.000000
2 3.000000
3 0.577350
4 3.055050
dtype: float64
#ddof=0
print (np.std(df, axis=1))
0 3.091206
1 0.816497
2 2.449490
3 0.471405
4 2.494438
dtype: float64
But you can change it easily:
#same output as pandas function
print (np.std(df, ddof=1, axis=1))
0 3.785939
1 1.000000
2 3.000000
3 0.577350
4 3.055050
dtype: float64
#same output as numpy function
print (df.std(ddof=0, axis=1))
0 3.091206
1 0.816497
2 2.449490
3 0.471405
4 2.494438
dtype: float64
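To see where the two sets of numbers come from, the sketch below computes the standard deviation of the first row by hand. The divisor is n - ddof, so ddof=1 divides by n - 1 and ddof=0 divides by n. The helper std_with_ddof is a hypothetical function written for illustration, not part of pandas or numpy.

```python
import numpy as np

def std_with_ddof(values, ddof):
    """Standard deviation with divisor n - ddof (illustrative helper)."""
    x = np.asarray(values, dtype=float)
    squared_dev = (x - x.mean()) ** 2
    return np.sqrt(squared_dev.sum() / (len(x) - ddof))

first_row = [1, 2, 8]  # S1, S2, S3 for row 0 in the question

sample_std = std_with_ddof(first_row, ddof=1)      # divisor n - 1 (pandas default)
population_std = std_with_ddof(first_row, ddof=0)  # divisor n (numpy default)
print(sample_std, population_std)
```

For this row the two values should match the first entries of the ddof=1 and ddof=0 outputs above.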
Answered by Stefano Fedele
When something works on columns but not on rows, you can transpose the DataFrame so that the rows become columns:
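# after transposing, the columns are the original row labels (0..4), not 'S1'
np.std( df[['S1','S2','S3']].transpose()[0] )  # std of the first row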
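A sketch of the transpose idea, assuming the default integer row labels 0..4 seen in the outputs above. After transposing, each original row becomes a column labelled by its row index, so column-wise calls apply to it. Restricting the frame to the numeric columns first is my addition.

```python
import numpy as np
import pandas as pd

# Rebuild the DataFrame shown in the question.
df = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E"],
    "group": ["mn", "mn", "kl", "kl", "fh"],
    "S1": [1, 4, 5, 6, 7],
    "S2": [2, 3, 8, 5, 1],
    "S3": [8, 5, 2, 5, 3],
})

# Transpose only the numeric part: rows S1..S3, columns 0..4.
transposed = df[["S1", "S2", "S3"]].transpose()

# Column 0 of the transposed frame holds the values of the original first row.
# np.std on a plain array uses numpy's default ddof=0.
first_row_std = np.std(transposed[0].to_numpy())

# The column-wise pandas std (ddof=1) now yields one value per original row.
all_rows_std = transposed.std()
print(first_row_std)
print(all_rows_std)
```

In practice df.std(axis=1) does the same job without the transpose, so this is mostly useful for functions that have no axis argument.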