Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/1225144/

Time: 2020-08-31 13:47:37  Source: igfitidea

Why does MySQL allow "group by" queries WITHOUT aggregate functions?

Tags: mysql, sql, standards-compliance, ansi-sql

Asked by Aaron Fi

Surprise -- this is a perfectly valid query in MySQL:

select X, Y from someTable group by X

If you tried this query in Oracle or SQL Server, you'd get the natural error message:

Column 'Y' is invalid in the select list because it is not contained in 
either an aggregate function or the GROUP BY clause.

So how does MySQL determine which Y to show for each X? It just picks one. From what I can tell, it just picks the first Y it finds. The rationale being, if Y is neither an aggregate function nor in the group by clause, then specifying “select Y” in your query makes no sense to begin with. Therefore, I as the database engine will return whatever I want, and you'll like it.

There's even a MySQL configuration parameter to turn off this “looseness”. http://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sqlmode_only_full_group_by
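
As a rough sketch of what that setting looks like in practice, the mode can be appended to the current session's `sql_mode`, after which the loose query should be rejected. This assumes MySQL 5.7 or later and reuses the `someTable` example from above; it is an illustration, not a tuning recommendation.

```sql
-- Append ONLY_FULL_GROUP_BY to whatever modes the session already has.
-- NULLIF + CONCAT_WS avoids a leading comma when sql_mode is empty.
SET SESSION sql_mode =
  CONCAT_WS(',', NULLIF(@@SESSION.sql_mode, ''), 'ONLY_FULL_GROUP_BY');

-- With the mode enabled, MySQL is expected to reject this query,
-- because Y is neither aggregated nor named in GROUP BY
-- (unless the server can prove Y is functionally dependent on X).
select X, Y from someTable group by X;

-- Inspect the modes currently in effect for this session.
SELECT @@SESSION.sql_mode;
```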

This article even mentions how MySQL has been criticized for being ANSI-SQL non-compliant in this regard. http://www.oreillynet.com/databases/blog/2007/05/debunking_group_by_myths.html

My question is: Why was MySQL designed this way? What was their rationale for breaking with ANSI-SQL?

Accepted answer by Cebjyre

I believe that it was to handle the case where grouping by one field would imply other fields are also being grouped:

SELECT user.id, user.name, COUNT(post.owner_id) AS posts
FROM user
  LEFT OUTER JOIN post ON post.owner_id = user.id
GROUP BY user.id

In this case the user.name will always be unique per user.id, so there is convenience in not requiring user.name in the GROUP BY clause (although, as you say, there is definite scope for problems).
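
For comparison, here is a sketch of two ways to write the same query so that it does not rely on the loose behavior. It reuses the `user` and `post` tables from the answer and counts `post.owner_id`, which is an assumption made so that users without posts come out as 0. `ANY_VALUE()` is assumed to be available, which should be the case from MySQL 5.7.5 onward.

```sql
-- Option 1: name every non-aggregated column in GROUP BY.
SELECT user.id, user.name, COUNT(post.owner_id) AS posts
FROM user
  LEFT OUTER JOIN post ON post.owner_id = user.id
GROUP BY user.id, user.name;

-- Option 2 (MySQL-specific): state explicitly that any value
-- of user.name within the group is acceptable.
SELECT user.id, ANY_VALUE(user.name) AS name, COUNT(post.owner_id) AS posts
FROM user
  LEFT OUTER JOIN post ON post.owner_id = user.id
GROUP BY user.id;
```

Option 1 costs a few extra characters but reads the same on engines that enforce the standard rule; Option 2 keeps the short GROUP BY while making the "any value will do" intent visible.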

Answer by Miroslav Genev

According to this page (the 5.0 online manual), it's for better performance and user convenience.

Answer by Rob Farley

Unfortunately almost all the SQL varieties have situations where they break ANSI and have unpredictable results.

It sounds to me like they intended it to be treated like the "FIRST(Y)" function that many other systems have.

More than likely, this construct is something that the MySQL team regret, but don't want to stop supporting because of the number of applications that would break.

Rob

Answer by GL_Stephen

MySQL treats this as a single-column DISTINCT when you use GROUP BY without an aggregate function. With the other options you either make the whole result distinct or have to use subqueries, etc. The question is whether the results are truly predictable.

Also, good info is in this thread.

Answer by Giancarlo Nebiolo Navidad

From what I have read in the mysql reference page, it says: "You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group."

I suggest you read this page (link to the MySQL reference manual): http://dev.mysql.com/doc/refman/5.5/en//group-by-extensions.html

Answer by Nick Dennies

It's actually a very useful tool that the other fields don't all have to be in an aggregate function when you group by a field. You can manipulate the result that will be returned by simply ordering it first and then grouping it afterwards. For instance, if I wanted to get user login information and see the last time the user logged in, I would do this.

Tables

USER
user_id | name

USER_LOGIN_HISTORY 
user_id | date_logged_in

USER_LOGIN_HISTORY has multiple rows for one user, so if I joined users to it, it would return many rows. As I am only interested in the last entry, I would do this:

select
  user_id,
  name,
  date_logged_in
from (
  select
    u.user_id,
    u.name,
    ulh.date_logged_in
  from users as u
    join user_login_history as ulh
      on u.user_id = ulh.user_id
  where u.user_id = 1234
  order by ulh.date_logged_in desc
) as table1
group by user_id

This would return one row with the name of the user and the last time that user logged in.
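
One caveat: MySQL's manual notes that the server is free to choose any value from each group and that adding an ORDER BY does not control which one, so the ordering trick may not hold on every version or query plan. A minimal sketch of an alternative that states the intent directly, assuming the same `users` and `user_login_history` tables and the example id 1234:

```sql
-- MAX() selects the latest login explicitly instead of relying
-- on the row order produced by the subquery.
select
  u.user_id,
  u.name,
  max(ulh.date_logged_in) as date_logged_in
from users as u
  join user_login_history as ulh
    on u.user_id = ulh.user_id
where u.user_id = 1234
group by u.user_id, u.name;
```

Because every selected column is either aggregated or named in GROUP BY, this form should also be accepted when ONLY_FULL_GROUP_BY is enabled.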
