oracle 我可以使用窗口函数在 SQL 查询中分组吗?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/40594466/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-19 03:19:54  来源:igfitidea点击:

Can I group by in SQL query with window function?

sqloracleoracle11g

提问by E.Saraf

I need to get employees with smallest salary in their departmentsI did it using anti join.

我需要在他们的部门中获得最低薪水的员工,我使用反加入做到了。

     select emp.employee_id,emp.last_name,emp.salary,emp.department_id
     from employees emp
     left join employees sml 
     on sml.department_id = emp.department_id and sml.salary < emp.salary
     where sml.employee_id is null and emp.department_id is not null

But I've been told that it's possible to do it using window function using one select. However I can't group it by department_id and use it at the same time. Is that a bug or me being stupid?

但是有人告诉我,可以使用一个 select使用 window 函数来做到这一点。但是我不能按部门 ID 对其进行分组并同时使用它。这是一个错误还是我愚蠢?

     SELECT  department_id,
     min(salary) OVER (partition by department_id)  as minsalary
     FROM employees;
     GROUP BY department_id

SQL Developer says 00979. 00000 - "not a GROUP BY expression"

SQL Developer 说 00979. 00000 - “不是 GROUP BY 表达式”

采纳答案by Alex Poole

If you run your second query without the group by- which you may have already tried, from the extra semicolon in what you posted - you'll see that you get one row for every employee, each showing the minimum salary in their department. That minimum is the analytic min()because it has a window clause. The PARTITION BYis the equivalent of a GROUP BY, but without the aggregation over the whole result set.

如果你运行你的第二个查询没有group by- 你可能已经尝试过,从你发布的额外分号中 - 你会看到你为每个员工获得一行,每个人都显示他们部门的最低工资。这个最小值是解析的,min()因为它有一个窗口子句。ThePARTITION BY相当于 a GROUP BY,但没有对整个结果集进行聚合。

The simplest way to get the same result (almost) is to use the RANK()analytic function instead, which ranks the values based on the partition and order you supply, while allowing for ties:

获得相同结果(几乎)的最简单方法是使用RANK()解析函数,它根据您提供的分区和顺序对值进行排名,同时允许绑定:

SELECT employee_id, last_name, salary, department_id,
  RANK() OVER (PARTITION BY department_id ORDER BY salary) AS rnk
FROM employees
ORDER BY department_id, rnk;

EMPLOYEE_ID LAST_NAME                     SALARY DEPARTMENT_ID        RNK
----------- ------------------------- ---------- ------------- ----------
        200 Whalen                          4400            10          1
        202 Fay                             6000            20          1
        201 Hartstein                      13000            20          2
        119 Colmenares                      2500            30          1
        118 Himuro                          2600            30          2
        117 Tobias                          2800            30          3
        116 Baida                           2900            30          4
        115 Khoo                            3100            30          5
        114 Raphaely                       11000            30          6
...
        102 De Haan                        17000            90          1
        101 Kochhar                        17000            90          1
        100 King                           24000            90          3
...

For departments 20 and 30 you can see the row ranked 1 is the lowest salary. For department 90 there are two employees ranked 1, because they have the same lowest salary.

对于部门 20 和 30,您可以看到排名 1 的行是最低工资。对于部门 90,有两名员工排名 1,因为他们的工资最低。

You can use that as an inline view and select just those rows ranked number 1:

您可以将其用作内联视图并仅选择排名第 1 的行:

SELECT employee_id, last_name, salary, department_id
FROM (
  SELECT employee_id, last_name, salary, department_id,
    RANK() OVER (PARTITION BY department_id ORDER BY salary) AS rnk
  FROM employees
)
WHERE rnk = 1
ORDER BY department_id;

EMPLOYEE_ID LAST_NAME                     SALARY DEPARTMENT_ID
----------- ------------------------- ---------- -------------
        200 Whalen                          4400            10
        202 Fay                             6000            20
        119 Colmenares                      2500            30
        203 Mavris                          6500            40
        132 Olson                           2100            50
        107 Lorentz                         4200            60
        204 Baer                           10000            70
        173 Kumar                           6100            80
        101 Kochhar                        17000            90
        102 De Haan                        17000            90
        113 Popp                            6900           100
        206 Gietz                           8300           110
        178 Grant                           7000              

13 rows selected. 

If you didn't have to worry about ties there is an even simpler alternative, but it ins't appropriate here.

如果您不必担心关系,还有一个更简单的选择,但在这里不合适。

Notice that this gives you one more row than your original query. You are joining on sml.department_id = emp.department_id. If the department ID is null, as it is for employee 178, that join fails because you can't compare null to null with equality tests. Because this solution doesn't have a join, that doesn't apply, and you see that employee in the results.

请注意,这比原始查询多出一行。你正在加入on sml.department_id = emp.department_id。如果部门 ID 为 null,就像员工 178 的情况一样,则该连接将失败,因为您无法使用相等测试将 null 与 null 进行比较。由于此解决方案没有联接,因此不适用,并且您会在结果中看到该员工。

回答by am2

you do not need window function in this case, cause a simple group bywould work too.

在这种情况下你不需要窗口函数,因为一个简单的group by也可以。

And the error is correct, cause the window function isn't an aggregat function. And a window function can't be a Group by- member.

错误是正确的,因为窗口函数不是聚合函数。并且窗口函数不能是 Group by- 成员。

But you could use "distinct" instead.

但是您可以改用“distinct”。

SELECT DISTINCT department_id,
     min(salary) OVER (partition by department_id)  as minsalary
FROM employees;

In your Special case all this is oversized, of course. But I think understanding is the name of the game.

当然,在您的特殊情况下,所有这些都是超大的。但我认为理解就是游戏的名称。

回答by Matt

WITH cte AS (
    SELECT
       emp.*
       ,ROW_NUMBER() OVER (PARTITION BY emp.department_id ORDER BY emp.salary) as RowNumber
    FROM
       employees emp
)

SELECT c.*
FROM
    cte c
WHERE
    c.RowNumber = 1

You can use ROW_NUMBER()to get 1 row of lowest salary by department as above. If you want all rows in the case of ties switch it to RANK()

您可以使用ROW_NUMBER()按部门获得 1 行最低工资,如上。如果您希望在关系的情况下所有行都将其切换为RANK()

Otherwise you can do it with MIN() OVERbut this will give you ties

否则你可以这样做,MIN() OVER但这会给你带来联系

WITH cte AS (
    SELECT
       emp.*
       ,MIN(emp.salary) OVER (PARTITION BY emp.department_id) as DeptMinSalary
    FROM
       employees emp
)

SELECT c.*
FROM
    cte c
WHERE
    c.salary = c.DeptMinSalary

As a derived table instead of a Common Table Expression:

作为派生表而不是公用表表达式:

SELECT t.*
FROM
    (SELECT
       emp.*
       ,ROW_NUMBER() OVER (PARTITION BY emp.department_id ORDER BY emp.salary) as RowNumber
    FROM
       employees emp) t
WHERE
    t.RowNumber = 1

One last thought on the subject because you ask "Can I group by in a SQL query with a window function?" Alex covers that the PARTITION BYis like a sub grouping within the Window Function. But to use a GROUP BYgrouping with a Window function means that the GROUP BYresult set would be evaluated PRIOR to the Window Function being evaluated.

关于这个主题的最后一个想法是因为您问“我可以使用窗口函数在 SQL 查询中分组吗?” Alex 介绍了这PARTITION BY就像窗口函数中的一个子分组。但是使用GROUP BY带有 Window 函数的分组意味着GROUP BY将在评估窗口函数之前评估结果集。

回答by AcePL

First thing to remember is that windowed functions (like OVER()clause) work on the result of the query. That is: Server first executes the query and only then applies the windowed function as defined by you.

首先要记住的是,窗口函数(如OVER()子句)对查询结果起作用。即:服务器首先执行查询,然后才应用您定义的窗口函数。

This means that you can actually use windowed function and group by clause in the same query, but you need to encapsulate it, like this:

这意味着你实际上可以在同一个查询中使用窗口函数和 group by 子句,但你需要封装它,像这样:

     SELECT  department_id,
     min(min(salary)) OVER (partition by department_id)  as minsalary
     FROM employees;
     GROUP BY department_id

However, I agree that this is not a good place to use windowed function. Matt's proposition is best here (ROW_NUMBER()in CTEor subquery, then selecting only the desired rows in main SELECT).

但是,我同意这不是使用窗口函数的好地方。马特的提议在这里是最好的(ROW_NUMBER()CTEor 中subquery,然后只在 main 中选择所需的行SELECT)。