Notice: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/22980085/

Time: 2020-09-19 02:19:11  Source: igfitidea

How to calculate Standard Deviation in Oracle SQL Developer?

Tags: sql, oracle, math, group-by, aggregate-functions

Asked by ronilp

I have a table employees,

CREATE TABLE employees (
  employeeid NUMERIC(9) NOT NULL,
  firstname  VARCHAR(10),
  lastname   VARCHAR(20),
  deptcode   CHAR(5),
  salary     NUMERIC(9, 2),
  PRIMARY KEY (employeeid)
);

and I want to calculate the standard deviation of salary.

This is the code I am using:

select avg(salary) as mean, sqrt(sum((salary-avg(salary))*(salary-avg(salary)))/count(employeeid)) as SD 
from employees
group by employeeid;

I am getting this error:

ORA-00979: not a GROUP BY expression
00979. 00000 -  "not a GROUP BY expression"
*Cause:    
*Action:
Error at Line: 260 Column: 12

Line 260 Column 12 is avg(salary)

How can I sort this out?

Answered by APC

Oracle has a built-in function to calculate standard deviation: STDDEV.

The usage is as you'd expect for any aggregate function.

select stddev(salary) 
from employees;

Answered by Justin Cave

I'd just use the stddev function:

SELECT avg(salary) as mean, 
       stddev(salary) as sd
  FROM employees

It doesn't make sense to group by employeeid, since that is presumably unique. Nor does it make sense to talk about the average salary per employee; you want the average salary across all employees (or per department, or some other aggregatable unit).
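As a sketch of the per-department variant mentioned above, assuming deptcode from the question's table is the unit you want to aggregate over (the ORDER BY is only there for readable output):

```sql
-- One row per department: mean and standard deviation of salary.
-- Grouping by deptcode (not employeeid) gives each group more than one row.
SELECT deptcode,
       AVG(salary)    AS mean,
       STDDEV(salary) AS sd
  FROM employees
 GROUP BY deptcode
 ORDER BY deptcode;
```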

Answered by StanislavL

The expression salary-avg(salary) can't be evaluated row by row: avg(salary) is not available while the query is scanning rows, only after all records have been retrieved.

I would suggest computing the AVG in a subquery and joining it to the main query:

-- The subquery returns a single row holding the overall mean, so a CROSS JOIN
-- attaches that mean to every employee row (no GROUP BY on employeeid needed).
select avg(e.salary) as mean,
       sqrt(sum((e.salary - avg_res.avg_salary) * (e.salary - avg_res.avg_salary)) / count(e.employeeid)) as SD
from employees e
     CROSS JOIN (select avg(salary) as avg_salary
                 from employees) avg_res;
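An alternative sketch of the same idea, not taken from the original answer: compute the overall mean once with the analytic form AVG(...) OVER () in an inline view, then aggregate the squared differences in the outer query. The alias mean_salary is made up for illustration.

```sql
-- Inner query: attach the overall mean to every row via an analytic AVG.
-- Outer query: population standard deviation =
--   sqrt(sum of squared deviations / number of non-null salaries).
SELECT AVG(salary) AS mean,
       SQRT(SUM((salary - mean_salary) * (salary - mean_salary)) / COUNT(salary)) AS sd
  FROM (SELECT salary,
               AVG(salary) OVER () AS mean_salary
          FROM employees);
```

This hand-rolled version should agree with the built-in STDDEV_POP(salary) up to rounding, so in practice the built-in is the simpler choice.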

Answered by duffymo

I thought you had to include the column in the GROUP BY in the SELECT:

select employeeid, avg(salary) as mean, sqrt(sum((salary-avg(salary))*(salary-avg(salary)))/count(employeeid)) as SD 
from employees
group by employeeid;

But on further reflection the query doesn't make much sense unless it's historical data. An employee id ought to be unique to a single employee. Unless this is an average over time there should be only one salary per employee. Your mean will be the salary and the standard deviation will be zero.

A better query might be average of all salaries. In that case, remove the GROUP BY.

One more nitpick: the formula you're using is more properly called the population standard deviation. The sample standard deviation divides by (n-1) instead of n.
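For reference, Oracle exposes both variants as built-in aggregates. A minimal sketch against the employees table from the question; the column aliases are illustrative only:

```sql
-- STDDEV_POP divides by n (population); STDDEV_SAMP divides by n - 1 (sample).
-- Oracle's plain STDDEV is also the sample form, but it returns 0 rather than
-- NULL when there is only one input row.
SELECT STDDEV_POP(salary)  AS sd_population,
       STDDEV_SAMP(salary) AS sd_sample,
       STDDEV(salary)      AS sd_default
  FROM employees;
```

If the goal is to reproduce the hand-written formula from the question (division by the row count), STDDEV_POP is the matching function.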
