Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverFlow, original URL: http://stackoverflow.com/questions/1135997/

Time: 2020-09-19 23:46:23  Source: igfitidea

Why no "SELECT foo.* ... GROUP BY foo.id" in Postgres?

sql postgresql

Asked by Sijmen Mulder

I have a query like this:

select foo.*, count(bar.id)
from foo inner join bar on foo.id = bar.foo_id
group by foo.id

This worked great with SQLite and MySQL. Postgres, however, complains about me not including all columns of foo in the group by clause. Why is this? Isn't it enough that foo.id is unique?

Answered by a_horse_with_no_name

Just in case other people stumble over this question:

Starting with PostgreSQL 9.1 it's sufficient to list the columns of the primary key in the group by clause (so the example from the question would work now).
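
The reason this is safe is that a primary key determines every other column of its row, so grouping by it produces the same groups as grouping by all columns. Here is a minimal sketch of that equivalence, assuming Python's built-in sqlite3 module as a stand-in engine (chosen only so it runs without a server; SQLite accepts the short form, as the question notes). The table contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table foo (id integer primary key, price integer, name text);
    create table bar (id integer primary key, foo_id integer references foo(id));
    insert into foo values (1, 10, 'apple'), (2, 20, 'pear');
    insert into bar values (1, 1), (2, 1), (3, 2);
""")

# Short form from the question: group by the primary key only.
short_form = conn.execute("""
    select foo.*, count(bar.id)
    from foo inner join bar on foo.id = bar.foo_id
    group by foo.id
    order by foo.id
""").fetchall()

# Long form: every selected column of foo is listed in group by.
long_form = conn.execute("""
    select foo.id, foo.price, foo.name, count(bar.id)
    from foo inner join bar on foo.id = bar.foo_id
    group by foo.id, foo.price, foo.name
    order by foo.id
""").fetchall()

print(short_form)
print(long_form)
```

Both queries return the same rows here, because every row in a group shares the same foo.id and therefore the same price and name.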

Answered by Guffa

Some databases are more relaxed about this, for good and bad. The query is unspecific, so the result is equally unspecific. If the database allows the query, it will return one record from each group and it won't care which one. Other databases are more specific, and require you to specify which value you want from the group. They won't let you write a query that has an unspecific result.
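
To make the "unspecific result" concrete, here is a small sketch, again assuming Python's built-in sqlite3 module as one of the relaxed engines. The table and its two rows are hypothetical; the point is only that the grouping column (name) is not unique, so the group contains two different prices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table foo (id integer primary key, name text, price integer);
    insert into foo values (1, 'apple', 10), (2, 'apple', 99);
""")

# name is not unique, so the single group holds two different prices.
# SQLite accepts the query and takes the price from some row of the group.
loose = conn.execute(
    "select name, price from foo group by name"
).fetchall()

# Saying which value you want removes the ambiguity.
strict = conn.execute(
    "select name, min(price), max(price) from foo group by name"
).fetchall()

print(loose)   # one row; the price is either 10 or 99
print(strict)
```

A stricter database rejects the first query instead of choosing a row for you; the second query is accepted everywhere because each output column is either grouped or aggregated.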

The only values that you can select without an aggregate are the ones in the group by clause:

select foo.id, count(bar.id)
from foo inner join bar on foo.id = bar.foo_id
group by foo.id

You can use aggregates to get other values:

select foo.id, min(foo.price), count(bar.id)
from foo inner join bar on foo.id = bar.foo_id
group by foo.id

If you want all the values from the foo table, you can either put them all in the group by clause (if that gives the correct result):

select foo.id, foo.price, foo.name, foo.address, count(bar.id)
from foo inner join bar on foo.id = bar.foo_id
group by foo.id, foo.price, foo.name, foo.address

Or, you can join the table with a subquery:

select foo.id, foo.price, foo.name, foo.address, sub.bar_count
from foo
inner join (
   select foo.id, count(bar.id) as bar_count
   from foo inner join bar on foo.id = bar.foo_id
   group by foo.id
) sub on sub.id = foo.id

Answered by Thomas

What exactly would you have postgresql output? You're using an aggregate function and trying to output "something".

Ah. I see what you may want to do. Use a subselect.

select foo.*, (select count(*) from bar where bar.foo_id=foo.id) from foo;

Check with explain that the plan looks good though. A subselect is not always bad. I just checked with a database I'm using and my execution plan was good for that query.

Yes, in theory grouping by foo.id would be enough (i.e.: your query plus "group by foo.id"). But apparently (I tested it) postgresql will not do that. The other option is to "group by foo.id, foo.foo, foo.bar, foo.baz" and everything else that's in "foo.*".

Another way, that Guffa is on to, is this:

SELECT foo.*, COALESCE(sub.cnt, 0)
FROM foo
LEFT OUTER JOIN (
  SELECT foo_id, count(*) AS cnt
  FROM bar
  GROUP BY foo_id) sub
ON sub.foo_id = foo.id;

This will be two queries though (one subquery, which is run just once), which can matter, but probably won't. If you can just do without "foo.*" you can use the second version that explicitly groups by all columns.
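
One thing worth checking before picking a variant: the LEFT OUTER JOIN version and the subselect version keep foo rows that have no matching bar rows (with a count of 0), while an inner join drops them. A minimal sketch of that difference, assuming Python's built-in sqlite3 module as a convenient engine and some made-up rows (foo 3 deliberately has no bar rows):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table foo (id integer primary key, name text);
    create table bar (id integer primary key, foo_id integer references foo(id));
    insert into foo values (1, 'apple'), (2, 'pear'), (3, 'plum');
    insert into bar values (1, 1), (2, 1), (3, 2);   -- no bar rows for foo 3
""")

# Pre-aggregate bar, then left join so every foo row survives.
left_join = conn.execute("""
    select foo.*, coalesce(sub.cnt, 0)
    from foo
    left outer join (
      select foo_id, count(*) as cnt
      from bar
      group by foo_id) sub
    on sub.foo_id = foo.id
    order by foo.id
""").fetchall()

# Correlated subselect: one count per foo row.
subselect = conn.execute("""
    select foo.*, (select count(*) from bar where bar.foo_id = foo.id)
    from foo
    order by foo.id
""").fetchall()

# Inner join with explicit grouping: foo rows without bars disappear.
inner_join = conn.execute("""
    select foo.id, foo.name, count(bar.id)
    from foo inner join bar on foo.id = bar.foo_id
    group by foo.id, foo.name
    order by foo.id
""").fetchall()

print(left_join)
print(subselect)
print(inner_join)
```

With this data the first two queries agree and include plum with a count of 0, while the inner join returns only apple and pear. Which behavior is right depends on whether you want foo rows with no bars in the result.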

Answered by Martin B

A GROUP BY clause requires that every column that the query returns is either a column contained in the GROUP BY statement or an aggregate function (such as the COUNT in your example). Without seeing what your GROUP BY clause is or what the columns of foo are, it's hard to tell what exactly is going on, but I would guess the problem is that foo.* is trying to return one or several columns that are not in your GROUP BY clause.

This is really a general property of SQL and should not be specific to PostgreSQL. No idea why it worked for you with SQLite or MySQL -- perhaps all the columns in foo.* are actually in your GROUP BY clause but PostgreSQL can't figure that out -- so try listing out all of the columns of foo explicitly.
