GROUP BY behavior in MySQL when no aggregate functions are present in the SELECT clause

Question

Asked by Harish Shetty

I have a table emp with the following structure and data:

name   dept    salary
-----  -----   -----
Hyman   a       2
Jill   a       1
Tom    b       2
Fred   b       1

When I execute the following SQL:

SELECT * FROM emp GROUP BY dept

I get the following result:

name   dept    salary
-----  -----   -----
Jill   a       1
Fred   b       1

On what basis did the server decide to return Jill and Fred and exclude Hyman and Tom?

I am running this query in MySQL.

Note 1: I know the query doesn't make sense on its own. I am trying to debug a problem with a 'GROUP BY' scenario. I am trying to understand the default behavior for this purpose.

Note 2: I am used to writing the SELECT clause the same as the GROUP BY clause (minus the aggregate fields). When I came across the behavior described above, I started wondering whether I can rely on it for scenarios such as: select the rows from the emp table where the salary is the lowest/highest in the dept. E.g., an SQL statement like this works on MySQL:

SELECT A.*, MIN(A.salary) AS min_salary FROM emp AS A GROUP BY A.dept

I didn't find any material describing why such SQL works or, more importantly, whether I can rely on such behavior consistently. If this is reliable behavior, then I can avoid queries like:

SELECT A.* FROM emp AS A WHERE A.salary = ( 
            SELECT MAX(B.salary) FROM emp B WHERE B.dept = A.dept)

Answer 1

Accepted answer by mjv

Read the MySQL documentation on this particular point.

In a nutshell, MySQL allows omitting some columns from the GROUP BY for performance purposes. However, this works only if the omitted columns all have the same value within each group; otherwise, the values returned by the query are indeed indeterminate, as others in this post correctly guessed. Note that adding an ORDER BY clause would not reintroduce any form of deterministic behavior.

Although not at the core of the issue, this example shows how using * rather than an explicit enumeration of desired columns is often a bad idea.

Excerpt from MySQL 5.0 documentation:

When using this feature, all rows in each group should have the same values
for the columns that are omitted from the GROUP BY part. The server is free
to return any value from the group, so the results are indeterminate unless
all values are the same.
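
For the lowest-salary-per-department case raised in the question, one deterministic alternative is to compute the minimum per dept in a derived table and join back to emp. The sketch below is only an illustration: it uses Python's sqlite3 module with an in-memory database as a runnable stand-in for MySQL (an assumption made for convenience), and the aliases e, m, and min_salary are hypothetical names. The SQL itself avoids non-grouped columns, so it should not depend on which row the server happens to pick.

```python
import sqlite3

# In-memory SQLite database, used only as a runnable stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("Hyman", "a", 2), ("Jill", "a", 1), ("Tom", "b", 2), ("Fred", "b", 1)],
)

# Compute the minimum salary per dept in a derived table, then join back
# to emp so that every selected column comes from a well-defined row.
LOWEST_PAID_SQL = """
SELECT e.name, e.dept, e.salary
FROM emp AS e
JOIN (
    SELECT dept, MIN(salary) AS min_salary
    FROM emp
    GROUP BY dept
) AS m ON m.dept = e.dept AND m.min_salary = e.salary
ORDER BY e.dept, e.name
"""

lowest_paid = conn.execute(LOWEST_PAID_SQL).fetchall()
print(lowest_paid)
```

If several employees in a department tie for the lowest salary, this form returns all of them, which makes the tie visible instead of hiding it behind an arbitrary pick.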

Answer 2

Answered by Samuel Hodge

This is a bit late, but I'll put this up for future reference.

In my experience, the GROUP BY takes the first row that has a duplicate and discards any rows that match after it in the result set, although this is observed behavior rather than something the documentation guarantees. So if Hyman and Tom have the same department, whoever appears first in a normal SELECT will be the resulting row in the GROUP BY.

If you want to control what appears first in the list, you need to do an ORDER BY. However, SQL does not allow ORDER BY to come before GROUP BY; doing so raises a syntax error. The best workaround for this issue is to do the ORDER BY in a subquery and then a GROUP BY in the outer query. Here's an example:

SELECT * FROM (SELECT * FROM emp ORDER BY name) as foo GROUP BY dept

This is the best performing technique I've found. I hope this helps someone out.

Answer 3

Answered by Joel L

As far as I know, for your purposes the specific rows returned can be considered to be random.

Ordering only takes place after the GROUP BY is done.

Answer 4

Answered by cube.head

You can put a:

SET GLOBAL sql_mode=(SELECT REPLACE(@@sql_mode,'ONLY_FULL_GROUP_BY',''));

before your query to allow this nonstandard GROUP BY behavior; removing ONLY_FULL_GROUP_BY from sql_mode turns off the SQL-standard check rather than enforcing it.

Answer 5

Answered by Rob Farley

I find that the best thing to do is to consider this type of query unsupported. In most other database systems, you can't include columns that aren't either in the GROUP BY clause or in an aggregate function in the HAVING, SELECT or ORDER BY clauses.

Instead, consider that your query reads:

SELECT ANY(name), dept, ANY(salary)
FROM emp 
GROUP BY dept;

...since this is what's going on.

Hope this helps....

Answer 6

Answered by Petruza

I think ANSI SQL requires that the SELECT include only fields from the GROUP BY clause, plus aggregate functions. With this behaviour, MySQL appears to return some row, possibly the last one the server read or any row it had at hand, but don't rely on that.
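
As a rough illustration of the standard-conforming shape described above, the sketch below selects only the grouped column plus aggregates. It again uses Python's sqlite3 module with an in-memory database as a stand-in for MySQL (an assumption made so the example can be run locally), and the column aliases are hypothetical.

```python
import sqlite3

# In-memory SQLite database, used only as a runnable stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("Hyman", "a", 2), ("Jill", "a", 1), ("Tom", "b", 2), ("Fred", "b", 1)],
)

# Every selected expression is either the grouped column (dept) or an
# aggregate, so the result does not depend on which row the server reads.
SUMMARY_SQL = """
SELECT dept,
       MIN(salary) AS min_salary,
       MAX(salary) AS max_salary,
       COUNT(*)    AS headcount
FROM emp
GROUP BY dept
ORDER BY dept
"""

summary = conn.execute(SUMMARY_SQL).fetchall()
print(summary)
```

Because each output value is fully determined by the group, a query of this shape should give the same result on any engine that follows the standard.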

Answer 7

Answered by Paul Chris Jones

This would select the most recent row for each person (assuming emp has an auto-incrementing ID column, which the table in the question does not show):

SELECT * FROM emp
WHERE ID IN
(
    SELECT
        MAX(ID) AS ID
    FROM
        emp
    GROUP BY
        name
)

Answer 8

Answered by CSharpAtl

If you are grouping by department, does the other data matter? I know SQL Server will not even allow this query. If the other data does matter, it sounds like there might be other issues.

Answer 9

Answered by Marius

Try using ORDER BY to pick the row that you want.

SELECT * FROM emp GROUP BY dept ORDER BY name ASC;

This will return the following:

name   dept    salary
-----  -----   -----
Hyman   a       2
Fred   b       1
