MySQL: Can I use non-aggregate columns with GROUP BY?

Notice: this page is a translated copy of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/3168042/

Time: 2020-08-31 16:27:56  Source: igfitidea

Can I use non-aggregate columns with group by?

Tags: sql, mysql, group-by, aggregate

Asked by deft_code

You cannot (should not) put non-aggregates in the SELECT line of a GROUP BY query.

However, I would like to access one of the non-aggregates associated with the max. In plain English, I want a table with the oldest id of each kind.

CREATE TABLE stuff (
   id int,
   kind int,
   age int
);

This query gives me the information I'm after:

SELECT kind, MAX(age)
FROM stuff
GROUP BY kind;

But it's not in the most useful form. I really want the id associated with each row so I can use it in later queries.

I'm looking for something like this:

SELECT id, kind, MAX(age)
FROM stuff
GROUP BY kind;

That would output the same as this:

SELECT stuff.*
FROM
   stuff,
   ( SELECT kind, MAX(age) AS age
     FROM stuff
     GROUP BY kind) maxes
WHERE
   stuff.kind = maxes.kind AND
   stuff.age = maxes.age

It really seems like there should be a way to get this information without needing to join. I just need the SQL engine to remember the other columns when it's calculating the max.

Accepted answer by Blorgbeard is out

You can't get the Id of the row that MAX found, because there might not be only one id with the maximum age.
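
To make the tie case concrete, here is a minimal sketch that runs the derived-table join against a few made-up rows. It uses Python's built-in sqlite3 module only so the snippet runs without a server; SQLite standing in for MySQL, the sample rows, and the `max_age` alias are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stuff (id INT, kind INT, age INT)")
# Hypothetical rows: ids 2 and 3 tie for the maximum age of kind 1.
conn.executemany(
    "INSERT INTO stuff (id, kind, age) VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 1, 30), (3, 1, 30), (4, 2, 7)],
)

# Join each row to the per-kind maximum computed in a derived table.
oldest = conn.execute(
    """
    SELECT s.id, s.kind, s.age
    FROM stuff AS s
    INNER JOIN (SELECT kind, MAX(age) AS max_age
                FROM stuff
                GROUP BY kind) AS m
            ON s.kind = m.kind AND s.age = m.max_age
    ORDER BY s.kind, s.id
    """
).fetchall()

for row in oldest:
    print(row)
```

With this data, kind 1 comes back twice (ids 2 and 3), which is why MAX alone cannot name a single id.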

Answer by OMG Ponies

You cannot (should not) put non-aggregates in the SELECT line of a GROUP BY query.

You can, and have to, define what you are grouping by for the aggregate function to return the correct result.

MySQL (and SQLite) decided in their infinite wisdom that they would go against spec, and allow queries to accept GROUP BY clauses missing columns quoted in the SELECT - it effectively makes these queries not portable.
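
As a small illustration of the non-portable form, the sketch below runs the question's `SELECT id, kind, MAX(age)` query on SQLite through Python's built-in sqlite3 module. SQLite documents that a bare column next to a single `MAX()` takes its value from the row holding the maximum. MySQL makes no such promise, so treat this as SQLite-specific behaviour; the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stuff (id INT, kind INT, age INT)")
# Hypothetical rows with no ties, so each kind has exactly one oldest row.
conn.executemany(
    "INSERT INTO stuff (id, kind, age) VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 1, 30), (3, 2, 5), (4, 2, 7)],
)

# id is neither aggregated nor listed in GROUP BY (a "bare" column).
bare = conn.execute(
    "SELECT id, kind, MAX(age) FROM stuff GROUP BY kind ORDER BY kind"
).fetchall()

for row in bare:
    print(row)
```

Most other engines reject this query outright, which is the portability problem described above.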

It really seems like there should be a way to get this information without needing to join.

Without access to analytic/ranking/windowing functions, which MySQL did not support before version 8.0, the self join to a derived table/inline view is the most portable means of getting the result you desire.

Answer by mb14

I think it's tempting indeed to ask the system to solve the problem in one pass rather than having to do the job twice (find the max, and then find the corresponding id). You can do that using CONCAT (as suggested in the article Naktibalda referred to), though I'm not sure it would be more efficient.

SELECT MAX( CONCAT( LPAD(age, 10, '0'), '-', id) )
FROM stuff
GROUP BY kind;

This should work, but you have to split the result to get the age and the id. (That's really ugly, though.)
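
The splitting step might look like the sketch below. It runs on SQLite through Python's built-in sqlite3 module, so `printf('%010d', age) || '-' || id` stands in for MySQL's `CONCAT(LPAD(...))`. That substitution, the `packed` alias, and the sample rows are assumptions for illustration, and the trick as written only holds for non-negative ages of at most 10 digits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stuff (id INT, kind INT, age INT)")
# Hypothetical rows for illustration.
conn.executemany(
    "INSERT INTO stuff (id, kind, age) VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 1, 30), (3, 2, 5), (4, 2, 7)],
)

# Zero-pad age so that string comparison orders values like numbers,
# then append the id so it travels along with the maximum.
packed_rows = conn.execute(
    """
    SELECT kind, MAX(printf('%010d', age) || '-' || id) AS packed
    FROM stuff
    GROUP BY kind
    ORDER BY kind
    """
).fetchall()

result = []
for kind, packed in packed_rows:
    # Split "0000000030-2" back into its age and id parts.
    age_text, id_text = packed.split("-", 1)
    result.append((kind, int(id_text), int(age_text)))

for row in result:
    print(row)
```

If two rows share the maximum age, the larger id wins the string comparison, so the tie is broken by string order rather than by any meaningful rule.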

Answer by Grimaldi

In recent databases you can use max() over (partition by ...) to solve this problem:

select id, kind, age as max_age from (
  select id, kind, age, max(age) over (partition by kind) as mage
    from stuff) as ranked
where age = mage

This can then be done in a single pass.
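
A variation on the same idea, sketched here as an assumption rather than as part of the answer above: `ROW_NUMBER()` with a tie-breaking `ORDER BY` returns exactly one row per kind even when several rows share the maximum age. The snippet uses Python's built-in sqlite3 module and needs SQLite 3.25 or later for window functions; the sample rows and the `rn` and `ranked` names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stuff (id INT, kind INT, age INT)")
# Hypothetical rows: ids 2 and 3 tie for the maximum age of kind 1.
conn.executemany(
    "INSERT INTO stuff (id, kind, age) VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 1, 30), (3, 1, 30), (4, 2, 7)],
)

# Number the rows of each kind from oldest to youngest; the lowest id
# wins a tie. Keeping rn = 1 leaves one row per kind.
picked = conn.execute(
    """
    SELECT id, kind, age
    FROM (SELECT id, kind, age,
                 ROW_NUMBER() OVER (PARTITION BY kind
                                    ORDER BY age DESC, id ASC) AS rn
          FROM stuff) AS ranked
    WHERE rn = 1
    ORDER BY kind
    """
).fetchall()

for row in picked:
    print(row)
```

Choosing the lowest id on a tie is arbitrary; any deterministic column order in the window's `ORDER BY` would do.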

Answer by developer

You have to have a join because the aggregate function max retrieves many rows and chooses the max. So you need a join to choose the one that the aggregate function has found.

To put it a different way, how would you expect the query to behave if you replaced max with sum?

An inner join might be more efficient than your sub query though.

Answer by Aneesh Dash

PostgreSQL's DISTINCT ON will be useful here.

SELECT DISTINCT ON (kind) kind, id, age 
FROM stuff
ORDER BY kind, age DESC;

This groups by kind and returns the first row of each group in the given order. Because we ordered by age in descending order, we get the row with the max age for each kind.

P.S. The columns in DISTINCT ON must appear first in the ORDER BY clause.