SQL: Why aggregate functions are not allowed in the WHERE clause
Original question: http://stackoverflow.com/questions/42470849/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Why are aggregate functions not allowed in where clause
Asked by Nishant_Singh
I am looking for clarification on this. I am writing two queries below:
We have a table named employee with the columns ID, name, and salary.
1. Select name from employee
where sum(salary) > 1000 ;
2. Select name from employee
where substring_index(name,' ',1) = 'nishant' ;
Query 1 doesn't work, but Query 2 does. From my development experience, I think the likely explanation is this:
sum() works on the set of values named in its argument. Here the 'salary' column is passed, so it must add up all the values of that column. But inside the where clause the records are checked one by one: record 1 is tested first, then the next, and so on. So sum(salary) cannot be computed there, because it needs access to all the column values before it can return a value.
Query 2 works because substring_index() operates on a single value, so here it simply processes the value supplied to it for each row.
Can you please validate my understanding?
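The behaviour described in the question can be reproduced with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for the asker's database, which is an assumption: the question appears to target MySQL, and SQLite has no substring_index(), so substr()/instr() is substituted for it. The sample rows are hypothetical, since the post lists none.

```python
import sqlite3

# Hypothetical sample data; the original post does not list any rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee (name, salary) VALUES (?, ?)",
    [("nishant singh", 800), ("nishant singh", 700), ("asha rao", 400)],
)

# Query 1: an aggregate in WHERE is rejected by the engine.
try:
    conn.execute("SELECT name FROM employee WHERE sum(salary) > 1000").fetchall()
    query1_error = None
except sqlite3.Error as exc:
    query1_error = str(exc)

# Query 2: a scalar function is evaluated once per row, so WHERE accepts it.
# substr/instr stands in for MySQL's substring_index(name, ' ', 1).
query2_rows = conn.execute(
    "SELECT name FROM employee "
    "WHERE substr(name, 1, instr(name || ' ', ' ') - 1) = 'nishant'"
).fetchall()

print("Query 1 error:", query1_error)
print("Query 2 rows:", query2_rows)
```

The exact error text differs between engines and versions, so the sketch only records that an error occurred.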
Answered by mathguy
The reason you can't use SUM() in the WHERE clause is the order of evaluation of clauses.
FROM tells you where to read rows from. Right as rows are read from disk to memory, they are checked for the WHERE conditions. (Actually, in many cases rows that fail the WHERE clause will not even be read from disk. "Conditions" are formally known as predicates, and some predicates are used by the query execution engine to decide which rows are read from the base tables. These are called access predicates.) As you can see, the WHERE clause is applied to each row as it is presented to the engine.
On the other hand, aggregation is done only after all rows (those that satisfy all the predicates) have been read.
Think about this: SUM() applies ONLY to the rows that satisfy the WHERE conditions. If you put SUM() in the WHERE clause, you are asking for circular logic. Does a new row pass the WHERE clause? How would I know? If it will pass, then I must include it in the SUM, but if not, it should not be included in the SUM. So how do I even evaluate the SUM condition?
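A minimal sketch of that ordering, again assuming SQLite (via Python's sqlite3 module) in place of the asker's database and using made-up rows: the WHERE predicate is tested against each row, and SUM() then sees only the rows that passed. The list comprehension is just an informal model of the logical order, not a description of how any real engine is implemented.

```python
import sqlite3

# Hypothetical rows, chosen only to make the arithmetic easy to follow.
rows = [("a", 300), ("b", 600), ("c", 900)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany("INSERT INTO employee (name, salary) VALUES (?, ?)", rows)

# No WHERE clause: every row reaches the aggregate.
total_all = conn.execute("SELECT sum(salary) FROM employee").fetchone()[0]

# With WHERE: the predicate is checked per row first, then SUM() runs
# over the surviving rows only.
total_filtered = conn.execute(
    "SELECT sum(salary) FROM employee WHERE salary > 500"
).fetchone()[0]

# Informal model of the same logical order: filter row by row, then aggregate.
kept = [salary for _name, salary in rows if salary > 500]
model_total = sum(kept)

print(total_all, total_filtered, model_total)
```

Because the aggregate is defined in terms of the rows that survive the filter, the filter itself cannot depend on the aggregate's value.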
Answered by Gurwinder Singh
Why can't we use an aggregate function in the where clause
Aggregate functions work on sets of data. A WHERE clause doesn't have access to the entire set, but only to the row that it is currently working on.
You can of course use a HAVING clause:
select name from employee
group by name having sum(salary) > 1000;
If you must use WHERE, you can use a subquery:
select name from (
select name, sum(salary) total_salary from employee
group by name
) t where total_salary > 1000;
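As a quick check that the two forms above select the same names, here is a sketch that runs both against an in-memory SQLite database through Python's sqlite3 module. SQLite and the sample rows are assumptions made for illustration; the queries themselves are the ones from this answer, with an explicit AS and ORDER BY added so the results compare cleanly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
# Hypothetical rows: nishant totals 1500, asha 400, ravi 1200.
conn.executemany(
    "INSERT INTO employee (name, salary) VALUES (?, ?)",
    [("nishant", 800), ("nishant", 700), ("asha", 400), ("ravi", 1200)],
)

# Form 1: filter the groups with HAVING.
having_rows = conn.execute(
    "SELECT name FROM employee "
    "GROUP BY name HAVING sum(salary) > 1000 "
    "ORDER BY name"
).fetchall()

# Form 2: aggregate in a subquery, then filter its result with WHERE.
subquery_rows = conn.execute(
    "SELECT name FROM ("
    "  SELECT name, sum(salary) AS total_salary FROM employee GROUP BY name"
    ") t WHERE total_salary > 1000 "
    "ORDER BY name"
).fetchall()

print(having_rows)
print(subquery_rows)
```

In the subquery form the outer WHERE is legal because total_salary is an ordinary column of the derived table t by the time the outer query reads it.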
Answered by Gordon Linoff
sum() is an aggregation function. In general, you would expect it to work with group by. Hence, your first query is missing a group by. In a group by query, having is used for filtering after the aggregation:
Select name
from employee
group by name
having sum(salary) > 1000 ;
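To make "filtering after the aggregation" concrete, the sketch below combines WHERE and HAVING in one query. It assumes SQLite through Python's sqlite3 module and uses hypothetical rows; the salary >= 500 condition is added only for illustration and is not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
# Hypothetical rows: asha totals 1100, but only 700 once rows below 500 are dropped.
conn.executemany(
    "INSERT INTO employee (name, salary) VALUES (?, ?)",
    [
        ("nishant", 800),
        ("nishant", 700),
        ("asha", 400),
        ("asha", 700),
        ("ravi", 1200),
    ],
)

# HAVING alone: groups are built from every row, then filtered by their sum.
having_only = conn.execute(
    "SELECT name FROM employee "
    "GROUP BY name HAVING sum(salary) > 1000 ORDER BY name"
).fetchall()

# WHERE + HAVING: rows are filtered first, then grouped, then groups are filtered.
where_then_having = conn.execute(
    "SELECT name FROM employee WHERE salary >= 500 "
    "GROUP BY name HAVING sum(salary) > 1000 ORDER BY name"
).fetchall()

print(having_only)
print(where_then_having)
```

The group for asha passes HAVING in the first query but not in the second, because WHERE removed one of its rows before the sum was computed.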
Answered by Paul Kiarie
Using HAVING works since the query goes directly to the rows in that column, while WHERE fails since the query keeps looping back and forth whenever the condition is not met.