Original source: http://stackoverflow.com/questions/14893912/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
SQL - STDEVP or STDEV and how to use it?
Asked by DtotheG
I have a table:
LocationId  OriginalValue  Mean
1           0.45           3.99
2           0.33           3.99
3           16.74          3.99
4           3.31           3.99
and so forth...
How would I work out the standard deviation using this table, and which would you recommend: STDEVP or STDEV?
Answered by Bernhard Barker
To use it, simply:
SELECT STDEVP(OriginalValue)
FROM yourTable
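To see how far apart the two results are on the same data, both aggregates can be selected in one query. This is a minimal sketch, assuming SQL Server (T-SQL) and the placeholder table name yourTable used above; the column aliases are illustrative and not part of the original answer.

```sql
-- Assumes SQL Server (T-SQL); yourTable is the placeholder name from the answer.
-- The aliases SampleStdDev and PopulationStdDev are illustrative.
SELECT
    STDEV(OriginalValue)  AS SampleStdDev,      -- divides by N - 1
    STDEVP(OriginalValue) AS PopulationStdDev   -- divides by N
FROM yourTable;
```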
Based on the definitions quoted below, you probably want STDEVP.
From here:
STDEV is used when the group of numbers being evaluated is only a partial sampling of the whole population. The denominator for dividing the sum of squared deviations is N-1, where N is the number of observations (a count of items in the data set). Technically, subtracting the 1 is referred to as "non-biased."
STDEVP is used when the group of numbers being evaluated is complete - it's the entire population of values. In this case, the 1 is NOT subtracted and the denominator for dividing the sum of squared deviations is simply N itself, the number of observations (a count of items in the data set). Technically, this is referred to as "biased." Remembering that the P in STDEVP stands for "population" may be helpful. Since the data set is not a mere sample, but constituted of ALL the actual values, this standard deviation function can return a more precise result.
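The two definitions quoted above can be written out as formulas. The notation here is an assumption for illustration: x_i are the observed values, \bar{x} is their mean, N is the number of observations, s corresponds to STDEV and \sigma to STDEVP.

```latex
% Sample standard deviation (STDEV): denominator N - 1
s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}

% Population standard deviation (STDEVP): denominator N
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}
```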
Answered by Vortex
Generally, you should use STDEV when you have to estimate the standard deviation based on a sample. But if the entire column of data is given as the argument, then use STDEVP.
In general, if your data represents the entire population, use STDEVP; otherwise, use STDEV.
Note that for large samples the two functions return nearly the same value, so it is better to use STDEV in that case.
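Why the two values converge for large samples follows from the denominators alone: on the same data, the results differ only by a factor that depends on N. A short worked sketch, using s for the STDEV result and \sigma for the STDEVP result (notation assumed here, not from the original answer):

```latex
\frac{s}{\sigma} = \sqrt{\frac{N}{N-1}}

% N = 30:   \sqrt{30/29}     \approx 1.017   (about 1.7% apart)
% N = 1000: \sqrt{1000/999}  \approx 1.0005  (about 0.05% apart)
```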
Answered by Bill Qualls
In statistics there are two types of standard deviation: one for a sample and one for a population. The sample standard deviation, generally denoted by the letter s, is used as an estimate of the population standard deviation. The population standard deviation, generally denoted by the lowercase Greek letter sigma, is used when the data constitutes the complete population.
It is difficult to answer your question directly -- sample or population -- because it is difficult to tell what you are working with: a sample or a population. It often depends on context. Consider the following example. If I want to know the standard deviation of the age of students in my class, then I would use STDEVP because the class is my population. But if I want to use my class as a sample of the population of all students in the school (this would be what is known as a convenience sample, and would likely be biased, but I digress), then I would use STDEV because my class is a sample. The resulting value would be my best estimate of STDEVP.
As mentioned above, (1) for large sample sizes (say, more than thirty), the difference between the two becomes trivial, and (2) generally you should use STDEV, not STDEVP, because in practice we usually don't have access to the population. Indeed, one could argue that if we always had access to populations, then we wouldn't need statistics. The entire point of inferential statistics is to be able to make inferences about a population based on the sample.
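The classroom example above can be sketched in SQL. The table Students(ClassId, Age) and the value ClassId = 1 are hypothetical names introduced only for illustration, and the queries assume SQL Server (T-SQL). The same rows are used in both queries; only the question being asked changes.

```sql
-- Hypothetical table: Students(ClassId INT, Age INT). Assumes SQL Server (T-SQL).

-- The class itself is the population of interest: use STDEVP.
SELECT STDEVP(Age) AS ClassAgeStdDev
FROM Students
WHERE ClassId = 1;

-- The class is treated as a sample of all students in the school: use STDEV.
SELECT STDEV(Age) AS EstimatedSchoolAgeStdDev
FROM Students
WHERE ClassId = 1;
```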