Standard deviation of a Python list
Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/15389768/
Standard deviation of a list
Asked by physics_for_all
I want to find the mean and standard deviation of the 1st, 2nd, ... digits of several (Z) lists. For example, I have
A_rank=[0.8,0.4,1.2,3.7,2.6,5.8]
B_rank=[0.1,2.8,3.7,2.6,5,3.4]
C_rank=[1.2,3.4,0.5,0.1,2.5,6.1]
# etc (up to Z_rank )...
Now I want to take the mean and std of *_rank[0], the mean and std of *_rank[1], etc.
(i.e., the mean and std of the 1st digit from all the (A..Z)_rank lists;
the mean and std of the 2nd digit from all the (A..Z)_rank lists;
the mean and std of the 3rd digit...; etc.).
Answer by NPE
I would put A_rank et al. into a 2D NumPy array, and then use numpy.mean() and numpy.std() to compute the means and the standard deviations:
In [17]: import numpy
In [18]: arr = numpy.array([A_rank, B_rank, C_rank])
In [20]: numpy.mean(arr, axis=0)
Out[20]:
array([ 0.7 , 2.2 , 1.8 , 2.13333333, 3.36666667,
5.1 ])
In [21]: numpy.std(arr, axis=0)
Out[21]:
array([ 0.45460606, 1.29614814, 1.37355985, 1.50628314, 1.15566239,
1.2083046 ])
Answer by Bengt
Since Python 3.4 / PEP 450 there is a statistics module in the standard library, which has a function stdev for calculating the standard deviation of iterables like yours:
>>> A_rank = [0.8, 0.4, 1.2, 3.7, 2.6, 5.8]
>>> import statistics
>>> statistics.stdev(A_rank)
2.0634114147853952
Answer by B.Kocis
In Python 2.7 you can use NumPy's numpy.std(), which gives the population standard deviation.
In Python 3.4, statistics.stdev() returns the sample standard deviation. The pstdev() function is the same as numpy.std().
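The difference between the sample and population versions can be seen with the standard library alone. The following is a minimal sketch, assuming the A_rank list from the question; the variable names sample_sd and population_sd are introduced here only for illustration.

```python
import math
import statistics

A_rank = [0.8, 0.4, 1.2, 3.7, 2.6, 5.8]
n = len(A_rank)

sample_sd = statistics.stdev(A_rank)       # divides by n - 1
population_sd = statistics.pstdev(A_rank)  # divides by n, like numpy.std's default

# The two differ only in the divisor, so they are related by sqrt(n / (n - 1)).
print(math.isclose(sample_sd, population_sd * math.sqrt(n / (n - 1))))
print(sample_sd > population_sd)
```

For small lists the gap between the two is noticeable, so it is worth deciding up front whether the data is the whole population or a sample from it.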
Answer by Alex Riley
Here's some pure-Python code you can use to calculate the mean and standard deviation.
All code below is based on the statistics module in Python 3.4+.
def mean(data):
    """Return the sample arithmetic mean of data."""
    n = len(data)
    if n < 1:
        raise ValueError('mean requires at least one data point')
    return sum(data)/n # in Python 2 use sum(data)/float(n)
def _ss(data):
    """Return sum of square deviations of sequence data."""
    c = mean(data)
    ss = sum((x-c)**2 for x in data)
    return ss
def stddev(data, ddof=0):
    """Calculates the population standard deviation
    by default; specify ddof=1 to compute the sample
    standard deviation."""
    n = len(data)
    if n < 2:
        raise ValueError('variance requires at least two data points')
    ss = _ss(data)
    pvar = ss/(n-ddof)
    return pvar**0.5
Note: for improved accuracy when summing floats, the statistics module uses a custom function _sum rather than the built-in sum which I've used in its place.
Now we have, for example:
>>> mean([1, 2, 3])
2.0
>>> stddev([1, 2, 3]) # population standard deviation
0.816496580927726
>>> stddev([1, 2, 3], ddof=1) # sample standard deviation
1.0
Answer by Ome
In Python 2.7.1, you may calculate the standard deviation using numpy.std() for:
- Population std: Just use numpy.std() with no additional arguments besides your data list.
- Sample std: You need to pass ddof (i.e. Delta Degrees of Freedom) set to 1, as in the following example:
numpy.std(< your-list >, ddof=1)
The divisor used in calculations is N - ddof, where N represents the number of elements. By default ddof is zero.
With ddof=1, it calculates the sample std rather than the population std.
Answer by Samy Bencherif
The other answers cover how to do std dev in Python sufficiently, but no one explains how to do the bizarre traversal you've described.
I'm going to assume A-Z is the entire population. If not, see Ome's answer on how to make inferences from a sample.
So to get the standard deviation/mean of the first digit of every list you would need something like this:
#standard deviation
numpy.std([A_rank[0], B_rank[0], C_rank[0], ..., Z_rank[0]])
#mean
numpy.mean([A_rank[0], B_rank[0], C_rank[0], ..., Z_rank[0]])
To shorten the code and generalize this to any nth digit, use the following function I generated for you:
def getAllNthRanks(n):
    return [A_rank[n], B_rank[n], C_rank[n], D_rank[n], E_rank[n], F_rank[n], G_rank[n], H_rank[n], I_rank[n], J_rank[n], K_rank[n], L_rank[n], M_rank[n], N_rank[n], O_rank[n], P_rank[n], Q_rank[n], R_rank[n], S_rank[n], T_rank[n], U_rank[n], V_rank[n], W_rank[n], X_rank[n], Y_rank[n], Z_rank[n]]
Now you can simply get the std and mean of all the nth places from A-Z like this:
#standard deviation
numpy.std(getAllNthRanks(n))
#mean
numpy.mean(getAllNthRanks(n))
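If the lists are collected into one container instead of 26 separate names, the same traversal can be written with zip and no NumPy. This is only a sketch under that assumption: the container name ranks is hypothetical, and only the three lists shown in the question are included.

```python
import statistics

# Hypothetical container: the rank lists gathered into one list of lists
# (only A, B, and C from the question are shown here).
ranks = [
    [0.8, 0.4, 1.2, 3.7, 2.6, 5.8],  # A_rank
    [0.1, 2.8, 3.7, 2.6, 5, 3.4],    # B_rank
    [1.2, 3.4, 0.5, 0.1, 2.5, 6.1],  # C_rank
]

# zip(*ranks) yields one tuple per position: (A[0], B[0], C[0]), (A[1], ...), ...
columns = list(zip(*ranks))

means = [statistics.mean(col) for col in columns]
# pstdev treats the lists as the entire population, matching numpy.std's default.
stds = [statistics.pstdev(col) for col in columns]

print(means)
print(stds)
```

Swapping statistics.pstdev for statistics.stdev would give the sample standard deviation for each position instead.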
Answer by Elad Yehezkel
Pure Python code:
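from functools import reduce  # built in on Python 2; must be imported on Python 3
from math import sqrt
def stddev(lst):
    # Population standard deviation: the sum of squared deviations is divided by len(lst).
    mean = float(sum(lst)) / len(lst)
    return sqrt(float(reduce(lambda x, y: x + y, map(lambda x: (x - mean) ** 2, lst))) / len(lst))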
Answer by pankaj
Using Python, here are a few methods:
import math
import statistics as st
n = int(input())
data = list(map(int, input().split()))
Approach 1: using a function
stdev = st.pstdev(data)
Approach 2: calculate the variance and take its square root
variance = st.pvariance(data)
devia = math.sqrt(variance)
Approach 3: using basic math
mean = sum(data)/n
variance = sum([((x - mean) ** 2) for x in data]) / n
stddev = variance ** 0.5
print("{0:0.1f}".format(stddev))
Note:
- variance calculates the variance of a sample of the population
- pvariance calculates the variance of the entire population
- similar differences between stdev and pstdev
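A quick way to check that the three approaches above agree is to run them on a fixed list. This is a minimal sketch: the hard-coded data list is an assumption made here so the code runs without input(), and the names sd_function, sd_from_variance, and sd_basic are illustrative only.

```python
import math
import statistics as st

# Assumed sample data, used in place of the values read from input().
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

# Approach 1: population standard deviation directly.
sd_function = st.pstdev(data)

# Approach 2: population variance, then its square root.
sd_from_variance = math.sqrt(st.pvariance(data))

# Approach 3: basic math, dividing the sum of squared deviations by n.
mean = sum(data) / n
sd_basic = (sum((x - mean) ** 2 for x in data) / n) ** 0.5

print(math.isclose(sd_function, sd_from_variance))
print(math.isclose(sd_function, sd_basic))
```

All three divide by n, so they are population measures; replacing pstdev and pvariance with stdev and variance (and n with n - 1 in the third approach) would give the sample versions.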

