Oracle Analytic functions for min value in grouping

Question

Asked by Travis Heseman

I'm new to working with analytic functions.

DEPT EMP   SALARY
---- ----- ------
  10 MARY  100000
  10 JOHN  200000
  10 SCOTT 300000
  20 BOB   100000
  20 BETTY 200000
  30 ALAN  100000
  30 TOM   200000
  30 JEFF  300000

I want the department and employee with minimum salary.

Results should look like:

DEPT EMP   SALARY
---- ----- ------
  10 MARY  100000
  20 BOB   100000
  30 ALAN  100000

EDIT: Here's the SQL I have (of course, it doesn't work, because Oracle wants emp in the GROUP BY clause as well):

SELECT dept, 
  emp,
  MIN(salary) KEEP (DENSE_RANK FIRST ORDER BY salary)
FROM mytable
GROUP BY dept

Answer 1

Answered by David Aldridge

I think that the Rank() function is not the way to go with this, for two reasons.

Firstly, it is probably less efficient than a Min()-based method.

The reason for this is that the query has to maintain an ordered list of all salaries per department as it scans the data, and the rank will then be assigned later by re-reading this list. Obviously in the absence of indexes that can be leveraged for this, you cannot assign a rank until the last data item has been read, and maintenance of the list is expensive.

So the performance of the Rank() function depends on the total number of elements to be scanned, and if that number is large enough for the sort to spill to disk, performance will collapse.

This is probably more efficient:

select dept,
       emp,
       salary
from
       (
       SELECT dept, 
              emp,
              salary,
              Min(salary) Over (Partition By dept) min_salary
       FROM   mytable
       )
where salary = min_salary
/

This method only requires the query to keep one value per department: the minimum salary encountered so far. If a new minimum is encountered, the stored value is replaced; otherwise the new value is discarded. The number of elements that have to be held in memory is therefore related to the number of departments, not the number of rows scanned.

It could be that Oracle has a code path to recognise that the Rank does not really need to be computed in this case, but I wouldn't bet on it.
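
One way to check this rather than bet on it is to compare the execution plans of the two forms. The following is a minimal sketch, assuming the table is called mytable as in the question; the aliases min_salary and salary_rank are only illustrative, and the operation names that appear in the plan differ between Oracle versions, so the output is something to inspect rather than a fixed result.

```sql
-- Plan for the MIN() OVER form
EXPLAIN PLAN FOR
SELECT dept, emp, salary
FROM  (SELECT dept, emp, salary,
              MIN(salary) OVER (PARTITION BY dept) AS min_salary
       FROM   mytable)
WHERE  salary = min_salary;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Plan for the RANK() form
EXPLAIN PLAN FOR
SELECT dept, emp, salary
FROM  (SELECT dept, emp, salary,
              RANK() OVER (PARTITION BY dept ORDER BY salary) AS salary_rank
       FROM   mytable)
WHERE  salary_rank = 1;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

The window operation and any sort-related steps reported for each statement are the parts worth comparing, ideally on a table of realistic size.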

The second reason for disliking Rank() is that it just answers the wrong question. The question is not "Which records have the salary that ranks first when the salaries per department are ordered ascending", it is "Which records have the salary that is the minimum per department". That makes a big difference to me, at least.

Answer 2

Answered by William Rose

I think you were pretty close with your original query. The following runs and matches your test case:

SELECT dept, 
  MIN(emp) KEEP(DENSE_RANK FIRST ORDER BY salary, ROWID) AS emp,
  MIN(salary) KEEP (DENSE_RANK FIRST ORDER BY salary, ROWID) AS salary
FROM mytable
GROUP BY dept

In contrast to the RANK() solutions, this one guarantees at most one row per department. But that hints at a problem: what happens in a department where there are two employees on the lowest salary? The RANK() solutions will return both employees -- more than one row for the department. This answer will pick one arbitrarily and make sure there's only one for the department.
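
To make the tie case concrete, here is a small sketch with made-up data: the employee ANN and the column tied_count are hypothetical and not part of the question. It breaks ties by employee name rather than ROWID (an inline data set like this has no ROWID to order by), and also reports how many employees share the lowest salary, so that a tie is visible instead of silently collapsed.

```sql
-- Hypothetical data: dept 10 has two employees tied on the lowest salary.
WITH mytable AS (
  SELECT 10 AS dept, 'MARY' AS emp, 100000 AS salary FROM dual UNION ALL
  SELECT 10, 'ANN',  100000 FROM dual UNION ALL
  SELECT 10, 'JOHN', 200000 FROM dual UNION ALL
  SELECT 20, 'BOB',  100000 FROM dual
)
SELECT dept,
       -- among the rows with the lowest salary, pick the alphabetically first name
       MIN(emp)    KEEP (DENSE_RANK FIRST ORDER BY salary) AS emp,
       MIN(salary) AS salary,
       -- number of rows sharing the lowest salary in this department
       COUNT(*)    KEEP (DENSE_RANK FIRST ORDER BY salary) AS tied_count
FROM   mytable
GROUP  BY dept
ORDER  BY dept;
```

With this data, dept 10 comes back as a single row for ANN with tied_count = 2, and dept 20 as BOB with tied_count = 1. A RANK()-based query over the same data would return both MARY and ANN for dept 10.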

Answer 3

Answered by Adam Paynter

You can use the RANK() syntax. For example, this query will tell you where an employee ranks within their department by salary:

SELECT
  dept,
  emp,
  salary,
  (RANK() OVER (PARTITION BY dept ORDER BY salary)) salary_rank_within_dept
FROM EMPLOYEES

You could then query from this where salary_rank_within_dept = 1:

SELECT * FROM
  (
    SELECT
      dept,
      emp,
      salary,
      (RANK() OVER (PARTITION BY dept ORDER BY salary)) salary_rank_within_dept
    FROM EMPLOYEES
  )
WHERE salary_rank_within_dept = 1

Answer 4

Answered by Chris R

select e2.dept, e2.emp, e2.salary
from employee e2
where e2.salary = (select min(e1.salary)
                   from employee e1
                   where e1.dept = e2.dept)  -- correlate on dept so the minimum is per department, not global
