SQL 为什么 where 子句中没有窗口函数?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/13997177/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 12:46:35  来源:igfitidea点击:

Why no windowed functions in where clauses?

sqlsql-servertsqlwindow-functions

提问by Chris Pfohl

Title says it all, why can't I use a windowed function in a where clause in SQL Server?

标题说明了一切,为什么我不能在 SQL Server 的 where 子句中使用窗口函数?

This query makes perfect sense:

这个查询非常有意义:

select id, sales_person_id, product_type, product_id, sale_amount
from Sales_Log
where 1 = row_number() over(partition by sales_person_id, product_type, product_id order by sale_amount desc)

But it doesn't work. Is there a better way than a CTE/Subquery?

但它不起作用。有没有比 CTE/子查询更好的方法?

EDIT

编辑

For what its worth this is the query with a CTE:

对于它的价值,这是带有 CTE 的查询:

with Best_Sales as (
    select id, sales_person_id, product_type, product_id, sale_amount, row_number() over (partition by sales_person_id, product_type, product_id order by sales_amount desc) rank
    from Sales_log
)
select id, sales_person_id, product_type, product_id, sale_amount
from Best_Sales
where rank = 1

EDIT

编辑

+1 for the answers showing with a subquery, but really I'm looking for the reasoning behind not being able to use windowing functions in where clauses.

+1 用于显示子查询的答案,但实际上我正在寻找无法在 where 子句中使用窗口函数的原因。

回答by Martin Smith

why can't I use a windowed function in a where clause in SQL Server?

为什么我不能在 SQL Server 的 where 子句中使用窗口函数?

One answer, though not particularly informative, is because the spec says that you can't.

一个答案,虽然不是特别有用,是因为规范说你不能。

See the article by Itzik Ben Gan - Logical Query Processing: What It Is And What It Means to Youand in particular the image here. Window functions are evaluated at the time of the SELECTon the result set remaining after all the WHERE/JOIN/GROUP BY/HAVINGclauses have been dealt with (step 5.1).

请参阅 Itzik Ben Gan 撰写的文章 -逻辑查询处理:它是什么以及它对您意味着什么,特别是此处的图像。窗函数在的时间评估SELECT剩余之后的所有结果集WHERE/ JOIN/ GROUP BY/HAVING子句进行了处理(步骤5.1)。

really I'm looking for the reasoning behind not being able to use windowing functions in where clauses.

我真的在寻找无法在 where 子句中使用窗口函数的原因。

The reason that they are not allowed in the WHEREclause is that it would create ambiguity. Stealing Itzik Ben Gan's example from High-Performance T-SQL Using Window Functions(p.25)

WHERE条款中不允许使用它们的原因是它会造成歧义。从使用窗口函数的高性能 T-SQL 中窃取 Itzik Ben Gan 的示例(p.25)

Suppose your table was

假设你的桌子是

CREATE TABLE T1
(
col1 CHAR(1) PRIMARY KEY
)

INSERT INTO T1 VALUES('A'),('B'),('C'),('D'),('E'),('F')

And your query

还有你的查询

SELECT col1
FROM T1
WHERE ROW_NUMBER() OVER (ORDER BY col1) <= 3
AND col1 > 'B'

What would be the right result? Would you expect that the col1 > 'B'predicate ran before or after the row numbering?

什么是正确的结果?您希望col1 > 'B'谓词在行编号之前还是之后运行?

回答by Taryn

There is no need for CTE, just use the windowing function in a subquery:

不需要 CTE,只需在子查询中使用窗口函数:

select id, sales_person_id, product_type, product_id, sale_amount
from
(
  select id, sales_person_id, product_type, product_id, sale_amount,
    row_number() over(partition by sales_person_id, product_type, product_id order by sale_amount desc) rn
  from Sales_Log
) sl
where rn = 1

Edit, moving my comment to the answer.

编辑,将我的评论移至答案。

Windowing functions are not performed until the data is actually selected which is after the WHEREclause. So if you try to use a row_numberin a WHEREclause the value is not yet assigned.

直到WHERE子句之后的数据被实际选择后,才会执行加窗函数。因此,如果您尝试row_numberWHERE子句中使用 a ,则尚未分配该值。

回答by Lukasz Szozda

First of all it something called all-at-once operation

首先它叫做 all-at-once operation

"All-at-Once Operations" means that all expressions in the same logical query process phase are evaluated logically at the same time.

“一次性操作”是指同一逻辑查询过程阶段中的所有表达式同时进行逻辑求值。

And great chapter Impact on Window Functions:

和伟大的一章对窗口函数的影响

Suppose you have:

假设你有:

CREATE TABLE #Test ( Id INT) ;

INSERT  INTO #Test VALUES  ( 1001 ), ( 1002 ) ;

SELECT Id
FROM #Test
WHERE Id = 1002
  AND ROW_NUMBER() OVER(ORDER BY Id) = 1;

All-at-Once operations tell us these two conditions evaluated logically at the same point of time.Therefore, SQL Server can evaluate conditions in WHERE clause in arbitrary order, based on estimated execution plan. So the main question here is which condition evaluates first.

All-at-Once 操作告诉我们这两个条件在同一时间点进行逻辑评估。因此,SQL Server 可以根据估计的执行计划以任意顺序评估 WHERE 子句中的条件。所以这里的主要问题是先评估哪个条件。

Case 1:

情况1:

If ( Id = 1002 ) is first, then if ( ROW_NUMBER() OVER(ORDER BY Id) = 1 )

If ( Id = 1002 ) is first, then if ( ROW_NUMBER() OVER(ORDER BY Id) = 1 )

Result: 1002

结果:1002

Case 2:

案例2:

If ( ROW_NUMBER() OVER(ORDER BY Id) = 1 ), then check if ( Id = 1002 )

If ( ROW_NUMBER() OVER(ORDER BY Id) = 1 ), then check if ( Id = 1002 )

Result: empty

结果:空

So we have a paradox.

This example shows why we cannot use Window Functions in WHERE clause. You can think more about this and find why Window Functions are allowed to be used just in SELECTand ORDER BYclauses!

所以我们有一个悖论。

这个例子说明了为什么我们不能在 WHERE 子句中使用窗口函数。您可以对此进行更多思考,并找出为什么只允许在SELECTORDER BY子句中使用窗口函数!



Addendum

附录

Teradata supports QUALIFYclause:

Teradata 支持QUALIFY条款:

Filters results of a previously computed ordered analytical function according to user?specified search conditions.

根据用户指定的搜索条件过滤先前计算的有序分析函数的结果。

SELECT Id
FROM #Test
WHERE Id = 1002
QUALIFY ROW_NUMBER() OVER(ORDER BY Id) = 1;

Addendum 2:

附录 2:

Snowflake - Qualify

QUALIFY does with window functions what HAVING does with aggregate functions and GROUP BY clauses.

In the execution order of a query, QUALIFY is therefore evaluated after window functions are computed. Typically, a SELECT statement's clauses are evaluated in the order shown below:

From

    Where

    Group by

    Having

    Window

    QUALIFY

    Distinct

    Order by

    Limit

雪花 - 合格

QUALIFY 处理窗口函数就像 HAVING 处理聚合函数和 GROUP BY 子句一样。

在查询的执行顺序中, QUALIFY 因此在计算窗口函数之后进行评估。通常,SELECT 语句的子句按如下所示的顺序进行评估:

    Where

    Group by

    Having

    Window

    QUALIFY

    Distinct

    Order by

    Limit

回答by Khan

You don't necessarily need to use a CTE, you can query the result set after using row_number()

不一定需要使用CTE,使用row_number()后可以查询结果集

select row, id, sales_person_id, product_type, product_id, sale_amount
from (
    select
        row_number() over(partition by sales_person_id, 
            product_type, product_id order by sale_amount desc) AS row,
        id, sales_person_id, product_type, product_id, sale_amount
    from Sales_Log 
    ) a
where row = 1

回答by drumsta

It's an old thread, but I'll try to answer specifically the question expressed in the topic.

这是一个旧线程,但我会尝试具体回答主题中表达的问题。

Why no windowed functions in where clauses?

为什么 where 子句中没有窗口函数?

SELECTstatement has following main clauses specified in keyed-in order:

SELECT语句具有以下按键入顺序指定的主要子句:

SELECT DISTINCT TOP list
FROM  JOIN ON / APPLY / PIVOT / UNPIVOT
WHERE
GROUP BY  WITH CUBE / WITH ROLLUP
HAVING
ORDER BY
OFFSET-FETCH

Logical Query Processing Order, or Binding Order, is conceptual interpretation order, it defines the correctness of the query. This order determines when the objects defined in one step are made available to the clauses in subsequent steps.

逻辑查询处理顺序或绑定顺序是概念解释顺序,它定义了查询的正确性。此顺序决定了在一个步骤中定义的对象何时可用于后续步骤中的子句。

----- Relational result
  1. FROM
    1.1. ON JOIN / APPLY / PIVOT / UNPIVOT
  2. WHERE
  3. GROUP BY
    3.1. WITH CUBE / WITH ROLLUP
  4. HAVING
  ---- After the HAVING step the Underlying Query Result is ready
  5. SELECT
    5.1. SELECT list
    5.2. DISTINCT
----- Relational result

----- Non-relational result (a cursor)
  6. ORDER BY
  7. TOP / OFFSET-FETCH
----- Non-relational result (a cursor)

For example, if the query processor can bind to (access) the tables or views defined in the FROMclause, these objects and their columns are made available to all subsequent steps.

例如,如果查询处理器可以绑定到(访问)FROM子句中定义的表或视图,则这些对象及其列可用于所有后续步骤。

Conversely, all clauses preceding the SELECTclause cannot reference any column aliases or derived columns defined in SELECTclause. However, those columns can be referenced by subsequent clauses such as the ORDER BYclause.

相反,子句之前的所有SELECT子句都不能引用SELECT子句中定义的任何列别名或派生列。但是,这些列可以被后续子句(例如ORDER BY子句)引用。

OVERclause determines the partitioning and ordering of a row set before the associated window function is applied. That is, the OVERclause defines a window or user-specified set of rows within an Underlying Query Resultset and window function computes result against that window.

OVER子句在应用关联的窗口函数之前确定行集的分区和排序。也就是说,该OVER子句在底层查询结果集中定义了一个窗口或用户指定的一组行,并且窗口函数针对该窗口计算结果。

Msg 4108, Level 15, State 1, …
Windowed functions can only appear in the SELECT or ORDER BY clauses.

The reason behind is because the way how Logical Query Processingworks in T-SQL. Since the underlying query resultis established only when logical query processing reaches the SELECTstep 5.1. (that is, after processing the FROM, WHERE, GROUP BYand HAVINGsteps), window functions are allowed only in the SELECTand ORDER BYclauses of the query.

背后的原因是逻辑查询处理T-SQL. 由于底层查询结果只有在逻辑查询处理达到SELECT步骤5.1时才建立。(也就是,在处理后FROMWHEREGROUP BYHAVING步骤),窗口函数也可以只在SELECTORDER BY该查询的条款。

Note to mention, window functions are still part of relational layer even Relational Model doesn't deal with ordered data. The result after the SELECTstep 5.1. with any window function is still relational.

请注意,即使关系模型不处理有序数据,窗口函数仍然是关系层的一部分。SELECT步骤 5.1后的结果。与任何窗口函数仍然是相关的。

Also, speaking strictly, the reason why window function are not allowed in the WHEREclause is not because it would create ambiguity, but because the order how Logical Query Processingprocesses SELECTstatement in T-SQL.

此外,严格的说,为什么窗口函数是不是在允许的理由WHERE条款是不是因为它会造成不确定性,但由于如何顺序查询处理逻辑处理SELECT的语句T-SQL

Links: here, hereand here

链接:这里这里这里

回答by Ayush Garg

Basically first "WHERE" clause condition is read by sql and the same column/value id looked into the table but in table row_num=1 is not there still. Hence it will not work. Thats the reason we will use parentheses first and after that we will write the WHERE clause.

基本上第一个“WHERE”子句条件是由 sql 读取的,并且相同的列/值 id 查看了表,但在表中 row_num=1 仍然不存在。因此它不会工作。这就是我们首先使用括号的原因,然后我们将编写 WHERE 子句。

回答by Ann L.

Finally, there's the old-fashioned, pre-SQL Server 2005 way, with a correlated subquery:

最后,还有老式的、pre-SQL Server 2005 方式,带有相关子查询:

select *
from   Sales_Log sl
where  sl.id = (
    Select Top 1 id
    from   Sales_Log sl2
    where  sales_person_id = sl.sales_person_id
       and product_type = sl.product_type
       and product_id = sl.product_id
    order by sale_amount desc
)

I give you this for completeness, merely.

我给你这个只是为了完整性。

回答by djangojazz

Yes unfortunately when you do a windowed function SQL gets mad at you even if your where predicate is legitimate. You make a cte or nested select having the value in your select statement, then reference your CTE or nested select with that value later. Simple example that should be self explanatory. If you really HATE cte's for some performance issue on doing a large data set you can always drop to temp table or table variable.

是的,不幸的是,当您执行窗口函数时,即使您的 where 谓词是合法的,SQL 也会生您的气。您在 select 语句中创建一个具有值的 cte 或嵌套选择,然后稍后使用该值引用您的 CTE 或嵌套选择。应该是不言自明的简单示例。如果您真的讨厌 cte 在处理大型数据集时遇到一些性能问题,您可以随时删除临时表或表变量。

declare @Person table ( PersonID int identity, PersonName varchar(8));

insert into @Person values ('Brett'),('John');

declare @Orders table ( OrderID int identity, PersonID int, OrderName varchar(8));

insert into @Orders values (1, 'Hat'),(1,'Shirt'),(1, 'Shoes'),(2,'Shirt'),(2, 'Shoes');

--Select
--  p.PersonName
--, o.OrderName
--, row_number() over(partition by o.PersonID order by o.OrderID)
--from @Person p 
--  join @Orders o on p.PersonID = o.PersonID
--where row_number() over(partition by o.PersonID order by o.orderID) = 2

-- yields:
--Msg 4108, Level 15, State 1, Line 15
--Windowed functions can only appear in the SELECT or ORDER BY clauses.
;

with a as 
    (
    Select
    p.PersonName
,   o.OrderName
,   row_number() over(partition by o.PersonID order by o.OrderID) as rnk
from @Person p 
    join @Orders o on p.PersonID = o.PersonID
    )
select *
from a 
where rnk >= 2 -- only orders after the first one.