GROUP BY behavior in MySQL when no aggregate functions are present in the SELECT clause
Source: http://stackoverflow.com/questions/1591909/
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the source URL, and attribute it to the original authors (not me): StackOverflow
Asked by Harish Shetty
I have a table emp with the following structure and data:
name   dept  salary
-----  ----  ------
Hyman  a     2
Jill   a     1
Tom    b     2
Fred   b     1
When I execute the following SQL:
SELECT * FROM emp GROUP BY dept
I get the following result:
name   dept  salary
-----  ----  ------
Jill   a     1
Fred   b     1
On what basis did the server decide to return Jill and Fred and exclude Hyman and Tom?
I am running this query in MySQL.
Note 1: I know the query doesn't make sense on its own. I am trying to debug a problem with a 'GROUP BY' scenario. I am trying to understand the default behavior for this purpose.
Note 2: I am used to writing the SELECT clause the same as the GROUP BY clause (minus the aggregate fields). When I came across the behavior described above, I started wondering whether I can rely on it for scenarios such as: select the rows from the emp table where the salary is the lowest/highest in the dept. For example, SQL statements like this work on MySQL:
SELECT A.*, MIN(A.salary) AS min_salary FROM emp AS A GROUP BY A.dept
I didn't find any material describing why such SQL works and, more importantly, whether I can rely on such behavior consistently. If this is reliable behavior, then I can avoid queries like:
SELECT A.* FROM emp AS A WHERE A.salary = (
SELECT MAX(B.salary) FROM emp B WHERE B.dept = A.dept)
Accepted answer by mjv
Read the MySQL documentation on this particular point.
In a nutshell, MySQL allows omitting some columns from the GROUP BY for performance purposes. However, this works only if the omitted columns all have the same value within a grouping; otherwise, the values returned by the query are indeed indeterminate, as others in this post correctly guessed. To be sure, adding an ORDER BY clause would not reintroduce any form of deterministic behavior.
Although not at the core of the issue, this example shows how using * rather than an explicit enumeration of desired columns is often a bad idea.
Excerpt from MySQL 5.0 documentation:
When using this feature, all rows in each group should have the same values for the columns that are omitted from the GROUP BY part. The server is free to return any value from the group, so the results are indeterminate unless all values are the same.
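As a rough illustration of the rule quoted above, the first query below is the kind the excerpt warns about, and the second is one standard way to get a well-defined answer to the original question (the lowest-paid employee per department). This is only a sketch against the emp table from the question; the alias names low and min_salary are made up for illustration.

```sql
-- Indeterminate: name and salary are not in GROUP BY and differ within each dept,
-- so the server may return any row's values for them.
SELECT name, dept, salary
FROM emp
GROUP BY dept;

-- Well-defined: compute the per-department minimum first,
-- then join back to fetch the rows that match it.
SELECT e.name, e.dept, e.salary
FROM emp AS e
JOIN (
    SELECT dept, MIN(salary) AS min_salary
    FROM emp
    GROUP BY dept
) AS low
  ON low.dept = e.dept
 AND low.min_salary = e.salary;
```

With the sample data, the second query returns Jill (a, 1) and Fred (b, 1). If two employees in a department tie for the lowest salary, both rows come back.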
Answer by Samuel Hodge
This is a bit late, but I'll put this up for future reference.
The GROUP BY takes the first row that has a duplicate and discards any rows that match after it in the result set. So if Hyman and Tom have the same department, whoever appears first in a normal SELECT will be the resulting row in the GROUP BY.
If you want to control what appears first in the list, you need to do an ORDER BY. However, SQL does not allow ORDER BY to come before GROUP BY; doing so raises a syntax error. The best workaround for this issue is to do the ORDER BY in a subquery and then a GROUP BY in the outer query. Here's an example:
SELECT * FROM (SELECT * FROM emp ORDER BY name) as foo GROUP BY dept
This is the best performing technique I've found. I hope this helps someone out.
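Whether the ordering of a derived table carries over into the outer GROUP BY is not something the documentation quoted in the accepted answer guarantees, so the following is offered as a hedged alternative rather than a restatement of this answer. It assumes MySQL 8.0 or later, where window functions are available, and picks the alphabetically first name in each department; the alias names ranked and rn are made up for illustration.

```sql
-- Number the rows inside each department by name,
-- then keep only the first row of every department.
SELECT name, dept, salary
FROM (
    SELECT name, dept, salary,
           ROW_NUMBER() OVER (PARTITION BY dept ORDER BY name) AS rn
    FROM emp
) AS ranked
WHERE rn = 1;
```

Changing the ORDER BY inside the OVER clause (for example to salary DESC) changes which row is kept, and the choice is stated in the query instead of depending on read order.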
Answer by Joel L
As far as I know, for your purposes the specific rows returned can be considered to be random.
Ordering only takes place after GROUP BY is done.
Answer by cube.head
You can put a:
SET GLOBAL sql_mode=(SELECT REPLACE(@@sql_mode,'ONLY_FULL_GROUP_BY',''));
before your query. Note that this removes ONLY_FULL_GROUP_BY from the SQL mode, so it relaxes MySQL's enforcement of SQL standard GROUP BY behavior rather than enforcing it.
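SET GLOBAL needs elevated privileges and only affects connections opened afterwards, so for debugging it may be more convenient to change the mode for the current session only. A minimal sketch, assuming a MySQL version that has the ONLY_FULL_GROUP_BY mode:

```sql
-- Inspect the SQL mode currently in effect for this session.
SELECT @@SESSION.sql_mode;

-- Remove ONLY_FULL_GROUP_BY for this session only.
-- Afterwards, queries that select nonaggregated columns outside GROUP BY are
-- accepted again, and the values they return are indeterminate.
SET SESSION sql_mode = (SELECT REPLACE(@@SESSION.sql_mode, 'ONLY_FULL_GROUP_BY', ''));
```

The session-level change disappears when the connection closes, which keeps the relaxed behavior from leaking into other clients.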
Answer by Rob Farley
I find that the best thing to do is to consider this type of query unsupported. In most other database systems, you can't include columns that aren't either in the GROUP BY clause or in an aggregate function in the HAVING, SELECT or ORDER BY clauses.
Instead, consider that your query reads:
SELECT ANY(name), dept, ANY(salary)
FROM emp
GROUP BY dept;
...since this is what's going on.
Hope this helps....
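The ANY(...) spelling in this answer is conceptual. MySQL 5.7 and later provide a function that says the same thing explicitly, ANY_VALUE(). A minimal sketch, assuming such a version and the emp table from the question:

```sql
-- Explicitly ask for an arbitrary value of name and salary per department.
-- This is accepted even when ONLY_FULL_GROUP_BY is enabled.
SELECT ANY_VALUE(name)   AS name,
       dept,
       ANY_VALUE(salary) AS salary
FROM emp
GROUP BY dept;
```

The result is still indeterminate: ANY_VALUE() only makes the arbitrary choice visible in the query text, it does not make it predictable.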
Answer by Petruza
I think ANSI SQL requires that the SELECT include only fields from the GROUP BY clause, plus aggregate functions. This MySQL behaviour looks like it returns some row, possibly the last one the server read, or any row it had at hand, but don't rely on that.
Answer by Paul Chris Jones
This would select the most recent row for each person (assuming emp has an auto-incrementing ID column):
SELECT * FROM emp
WHERE ID IN
(
SELECT
MAX(ID) AS ID
FROM
emp
GROUP BY
name
)
Answer by CSharpAtl
If you are grouping by department, does the other data matter? I know SQL Server will not even allow this query. If this is a possibility, it sounds like there might be other issues.
Answer by Marius
Try using ORDER BY to pick the row that you want.
SELECT * FROM emp GROUP BY dept ORDER BY name ASC;
Will return the following:
name   dept  salary
-----  ----  ------
Hyman  a     2
Fred   b     1
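As other answers in this thread point out, ORDER BY is applied after grouping, so it sorts the grouped rows rather than choosing which row represents each group, and the output above is not guaranteed. If the intent is the alphabetically first name per department, an aggregate states that explicitly. A small sketch against the emp table from the question; the alias first_name is made up for illustration:

```sql
-- MIN(name) picks the alphabetically first name within each department;
-- the ORDER BY then only sorts the resulting groups.
SELECT MIN(name) AS first_name, dept
FROM emp
GROUP BY dept
ORDER BY first_name ASC;
```

This returns only the name and the department. Fetching the rest of the matching row would need a join back to emp or a window function.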