Original URL: http://stackoverflow.com/questions/14006290/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Understanding how WHERE works with GROUP BY and Aggregation
Asked by david blaine
My query -
select cu.CustomerID,cu.FirstName,cu.LastName, COUNT(si.InvoiceID)as inv --1
from Customer as cu inner join SalesInvoice as si --2
on cu.CustomerID = si.CustomerID -- 3
-- put the WHERE clause here ! --4
group by cu.CustomerID,cu.FirstName,cu.LastName -- 5
where cu.FirstName = 'mark' -- 6
Output with correct code -
Error I get - Incorrect syntax near the keyword 'where'.
Can you tell me why I get this error? I want to know why WHERE comes before GROUP BY and not after.
Answered by Taryn
You have the order wrong. The WHERE clause goes before the GROUP BY:
select cu.CustomerID,cu.FirstName,cu.LastName, COUNT(si.InvoiceID)as inv
from Customer as cu
inner join SalesInvoice as si
on cu.CustomerID = si.CustomerID
where cu.FirstName = 'mark'
group by cu.CustomerID,cu.FirstName,cu.LastName
If you want to perform a filter after the GROUP BY, then you will use a HAVING clause:
select cu.CustomerID,cu.FirstName,cu.LastName, COUNT(si.InvoiceID)as inv
from Customer as cu
inner join SalesInvoice as si
on cu.CustomerID = si.CustomerID
group by cu.CustomerID,cu.FirstName,cu.LastName
having cu.FirstName = 'mark'
A HAVING clause is typically used for aggregate function filtering, so it makes sense that this would be applied after the GROUP BY.
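The two queries above can be compared side by side. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for the asker's SQL Server and a few made-up sample rows (the table contents are hypothetical, not from the question). For a filter on a grouped column such as FirstName, the WHERE form and the HAVING form return the same rows; they differ only in whether the filter runs before or after grouping.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE SalesInvoice (InvoiceID INTEGER PRIMARY KEY, CustomerID INTEGER);
    INSERT INTO Customer VALUES (1, 'mark', 'smith'), (2, 'anna', 'jones');
    INSERT INTO SalesInvoice VALUES (10, 1), (11, 1), (12, 2);
""")

base = """
    select cu.CustomerID, cu.FirstName, cu.LastName, COUNT(si.InvoiceID) as inv
    from Customer as cu
    inner join SalesInvoice as si on cu.CustomerID = si.CustomerID
"""
group_cols = "cu.CustomerID, cu.FirstName, cu.LastName"

# Filter before grouping (WHERE) versus after grouping (HAVING).
where_rows = conn.execute(
    base + " where cu.FirstName = 'mark' group by " + group_cols
).fetchall()
having_rows = conn.execute(
    base + " group by " + group_cols + " having cu.FirstName = 'mark'"
).fetchall()

print(where_rows)
print(having_rows)
```

Both statements print the single row for customer 1 with an invoice count of 2.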
To learn about the order of operations, here is an article explaining the order. According to the article, the order of operations in SQL is:
To start out, I thought it would be good to look up the order in which SQL directives get executed as this will change the way I can optimize:
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
Using this order, you apply the filter in the WHERE prior to the GROUP BY. The WHERE is used to limit the number of records.
Think of it this way: if you applied the WHERE afterwards, you would return more records than you would want to group on. Applying it first reduces the record set, then applies the grouping.
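The clause order can be checked with a small experiment. This is a sketch only, assuming SQLite (through Python's sqlite3 module) instead of the asker's SQL Server, with a hypothetical three-row SalesInvoice table. The exact error text differs between database engines, but the misplaced WHERE is rejected as a syntax error here as well.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SalesInvoice (InvoiceID INTEGER PRIMARY KEY, CustomerID INTEGER);
    INSERT INTO SalesInvoice VALUES (10, 1), (11, 1), (12, 2);
""")

# WHERE placed after GROUP BY, as in the question: the parser rejects it.
bad = "select CustomerID, COUNT(*) from SalesInvoice group by CustomerID where CustomerID = 1"
try:
    conn.execute(bad)
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)
print(error)

# WHERE placed before GROUP BY: rows are filtered first, then grouped.
good = "select CustomerID, COUNT(*) from SalesInvoice where CustomerID = 1 group by CustomerID"
rows = conn.execute(good).fetchall()
print(rows)
```

Only the two invoices of customer 1 reach the grouping step, so the second query returns one group with a count of 2.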
Answered by tvanfosson
The where clause comes before the group by because conceptually you filter before you group, not after. You want to restrict the rows that are grouped to only those that match, rather than perform the grouping on items that you will, potentially, throw away due to the filter.
Answered by GolezTrol
The WHERE clause is used before GROUP BY, because it makes more sense. The filter specified in the WHERE clause is applied before grouping. After grouping, you can have a HAVING clause, which is similar to WHERE, except that you can filter by aggregate values as well.
Compare:
-- Select the number of invoices per customer (for Customer 1 only)
SELECT
si.CustomerID,
COUNT(*) as InvoiceCount
FROM
SalesInvoice as si
WHERE
si.CustomerID = 1
-- You cannot filter by count(*) here, because grouping hasn't taken place yet.
GROUP BY
si.CustomerID -- (Not needed in this case, because of only 1 customer)
against
-- Select the number of invoices per customer, for customers that have more than three invoices
SELECT
si.CustomerID,
COUNT(*) as InvoiceCount
FROM
SalesInvoice as si
GROUP BY
si.CustomerId
HAVING
-- You can filter by aggregates, like count, here.
COUNT(*) > 3
Answered by ExactaBox
SQL does allow you to filter on the results of a GROUP BY -- it's called the HAVING clause.
If you want to filter on something that could be determined prior to the grouping (i.e. everyone with FirstName = 'Mark'), that's done via WHERE.
However, if you want to filter on everyone with 4 or more invoices (i.e., something you wouldn't know until after doing the COUNT), then you use HAVING.
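Both kinds of filter can appear in one query. The sketch below assumes SQLite via Python's sqlite3 module as a stand-in for SQL Server, and the sample customers and invoices are invented for illustration. WHERE keeps only the customers named mark before grouping, HAVING keeps only the groups with 4 or more invoices, and an aggregate placed in WHERE is rejected because the count does not exist yet at that stage.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE SalesInvoice (InvoiceID INTEGER PRIMARY KEY, CustomerID INTEGER);
    INSERT INTO Customer VALUES (1, 'mark', 'smith'), (2, 'mark', 'brown'), (3, 'anna', 'jones');
    INSERT INTO SalesInvoice VALUES
        (1, 1), (2, 1), (3, 1), (4, 1),          -- mark smith: 4 invoices
        (5, 2),                                  -- mark brown: 1 invoice
        (6, 3), (7, 3), (8, 3), (9, 3), (10, 3); -- anna jones: 5 invoices
""")

# WHERE filters rows before grouping; HAVING filters groups by their count.
rows = conn.execute("""
    select cu.CustomerID, cu.FirstName, COUNT(si.InvoiceID) as inv
    from Customer as cu
    inner join SalesInvoice as si on cu.CustomerID = si.CustomerID
    where cu.FirstName = 'mark'
    group by cu.CustomerID, cu.FirstName
    having COUNT(si.InvoiceID) >= 4
""").fetchall()
print(rows)

# An aggregate inside WHERE is an error: grouping has not happened yet.
try:
    conn.execute(
        "select CustomerID from SalesInvoice where COUNT(*) >= 4 group by CustomerID"
    )
    aggregate_in_where_failed = False
except sqlite3.OperationalError:
    aggregate_in_where_failed = True
print(aggregate_in_where_failed)
```

Only mark smith passes both filters: anna jones is dropped by WHERE and mark brown by HAVING.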
Answered by ExactaBox
Let's say you have 100,000 people in your database, 9 of whom are named Mark. Why should the database do a Count operation on all 100,000, then throw out the 99,991 not named Mark? Doesn't it seem smarter to filter down to the Marks first, then do the Count only 9 times? That makes the operation a whole lot faster.