Original question: http://stackoverflow.com/questions/1769361/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
PostgreSQL GROUP BY different from MySQL?
Asked by holden
I've been migrating some of my MySQL queries to PostgreSQL to use Heroku. Most of my queries work fine, but I keep having a similar recurring error when I use group by:
ERROR: column "XYZ" must appear in the GROUP BY clause or be used in an aggregate function
Could someone tell me what I'm doing wrong?
MySQL query, which works 100%:
SELECT `availables`.*
FROM `availables`
INNER JOIN `rooms` ON `rooms`.id = `availables`.room_id
WHERE (rooms.hotel_id = 5056 AND availables.bookdate BETWEEN '2009-11-22' AND '2009-11-24')
GROUP BY availables.bookdate
ORDER BY availables.updated_at
PostgreSQL error:
ActiveRecord::StatementInvalid: PGError: ERROR: column "availables.id" must appear in the GROUP BY clause or be used in an aggregate function:
SELECT "availables".* FROM "availables" INNER JOIN "rooms" ON "rooms".id = "availables".room_id WHERE (rooms.hotel_id = 5056 AND availables.bookdate BETWEEN E'2009-10-21' AND E'2009-10-23') GROUP BY availables.bookdate ORDER BY availables.updated_at
Ruby code generating the SQL:
expiration = Available.find(:all,
  :joins => [ :room ],
  :conditions => [ "rooms.hotel_id = ? AND availables.bookdate BETWEEN ? AND ?", hostel_id, date.to_s, (date+days-1).to_s ],
  :group => 'availables.bookdate',
  :order => 'availables.updated_at')
Expected Output (from working MySQL query):
+-----+-------+-------+------------+---------+---------------+---------------+
| id  | price | spots | bookdate   | room_id | created_at    | updated_at    |
+-----+-------+-------+------------+---------+---------------+---------------+
| 414 | 38.0  | 1     | 2009-11-22 | 1762    | 2009-11-20... | 2009-11-20... |
| 415 | 38.0  | 1     | 2009-11-23 | 1762    | 2009-11-20... | 2009-11-20... |
| 416 | 38.0  | 2     | 2009-11-24 | 1762    | 2009-11-20... | 2009-11-20... |
+-----+-------+-------+------------+---------+---------------+---------------+
3 rows in set
Answered by peufeu
MySQL's totally non-standards-compliant GROUP BY can be emulated by Postgres' DISTINCT ON. Consider this:
MySQL:
SELECT a,b,c,d,e FROM table GROUP BY a
This delivers 1 row per value of a (which one, you don't really know). Well, actually you can guess, because MySQL doesn't know about hash aggregates, so it will probably use a sort... but it will only sort on a, so the order of the rows could be random. Unless it uses a multicolumn index instead of sorting. Well, anyway, it's not specified by the query.
Postgres:
SELECT DISTINCT ON (a) a,b,c,d,e FROM table ORDER BY a,b,c
This delivers 1 row per value of a; this row will be the first one in the sort according to the ORDER BY specified by the query. Simple.
Note that here, it's not an aggregate I'm computing. So GROUP BY actually makes no sense. DISTINCT ON makes a lot more sense.
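To make the "first row in the sort" behavior concrete, here is a minimal pure-Ruby model of DISTINCT ON. It does not touch a database: the `distinct_on` helper and the sample rows are hypothetical, invented only to illustrate the semantics described above.

```ruby
# Pure-Ruby model of DISTINCT ON semantics; no database involved.
# SAMPLE_ROWS is hypothetical data, loosely shaped like the question's table.
SAMPLE_ROWS = [
  { id: 414, bookdate: "2009-11-22", price: 38.0 },
  { id: 500, bookdate: "2009-11-22", price: 35.0 },
  { id: 415, bookdate: "2009-11-23", price: 38.0 },
]

# Models: SELECT DISTINCT ON (key) * FROM rows ORDER BY order_keys
def distinct_on(rows, key, order_keys)
  # Sort by the ORDER BY columns, then keep the first row seen per key.
  sorted = rows.sort_by { |row| order_keys.map { |k| row[k] } }
  sorted.uniq { |row| row[key] }
end

picked = distinct_on(SAMPLE_ROWS, :bookdate, [:bookdate, :price])
picked.each { |row| puts row.inspect }
```

With this ordering, the cheapest row per bookdate is kept; changing the sort keys changes which row survives, which is exactly what the ORDER BY controls in the Postgres query.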
Rails is married to MySQL, so I'm not surprised that it generates SQL that doesn't work in Postgres.
Answered by Erlock
PostgreSQL is more SQL compliant than MySQL. All fields in the output, except fields computed with an aggregate function, must be present in the GROUP BY clause.
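The rule can be sketched as a simple set difference. The `ungrouped_columns` helper below is hypothetical (it is not part of Rails or any library), and it is a simplification: it ignores the case where newer PostgreSQL versions accept columns that are functionally dependent on a grouped primary key.

```ruby
# Hypothetical helper, not a library API: lists the selected columns that
# are neither grouped nor aggregated, i.e. the ones that trigger the error.
def ungrouped_columns(selected, grouped, aggregated = [])
  selected - grouped - aggregated
end

# Column names taken from the question's table; grouping only by bookdate.
selected = %w[id price spots bookdate room_id]
offenders = ungrouped_columns(selected, %w[bookdate])
puts offenders.inspect
```

Every name left in `offenders` would need to be added to the GROUP BY clause or wrapped in an aggregate for the query to pass.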
Answered by Bozho
MySQL's GROUP BY can be used without an aggregate function (which is contrary to the SQL standard), and returns the first row in the group (I don't know based on what criteria), while PostgreSQL requires an aggregate function (MAX, SUM, etc.) on every selected column that is not listed in the GROUP BY clause.
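As an illustration of what an aggregated GROUP BY computes, here is a small pure-Ruby model. The rows are hypothetical sample data, and the code only mimics the result shape of the SQL in the comment; it is not how either database executes the query.

```ruby
# Hypothetical sample rows; two of them share a bookdate.
rows = [
  { bookdate: "2009-11-22", price: 38.0, spots: 1 },
  { bookdate: "2009-11-22", price: 35.0, spots: 2 },
  { bookdate: "2009-11-23", price: 38.0, spots: 1 },
]

# Models: SELECT bookdate, MAX(price), SUM(spots) FROM rows GROUP BY bookdate
summary = rows.group_by { |row| row[:bookdate] }.map do |bookdate, group|
  {
    bookdate: bookdate,
    max_price: group.map { |row| row[:price] }.max,    # one value per group
    total_spots: group.map { |row| row[:spots] }.sum,  # one value per group
  }
end

summary.each { |row| puts row.inspect }
```

Each output column is either the grouping key or a value reduced over the whole group, so there is never any ambiguity about which source row a value came from.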
Answered by Omar Qureshi
Correct. The fix is to use :select to pick each field that you want on the resulting object, and to group by those same fields.
Nasty, but it is how GROUP BY should work, as opposed to MySQL, which guesses what you mean if you don't put the fields in your GROUP BY.
Answered by Franz
If I remember correctly, in PostgreSQL you have to add every column you fetch from the table that the GROUP BY clause applies to, to the GROUP BY clause.
Answered by Ilia
Not the prettiest solution, but changing the group parameter to list every column in the model works in PostgreSQL:
expiration = Available.find(:all,
  :joins => [ :room ],
  :conditions => [ "rooms.hotel_id = ? AND availables.bookdate BETWEEN ? AND ?", hostel_id, date.to_s, (date+days-1).to_s ],
  # Join the qualified names so the GROUP BY list is comma-separated
  :group => Available.column_names.collect{|col| "availables.#{col}"}.join(', '),
  :order => 'availables.updated_at')
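To see what that group parameter expands to, here is a standalone sketch. The `column_names` array is a hypothetical stand-in for `Available.column_names`, filled in from the question's table, and `qualified_group_clause` is an illustrative helper rather than a Rails method.

```ruby
# Hypothetical stand-in for Available.column_names (taken from the question's table).
column_names = %w[id price spots bookdate room_id created_at updated_at]

# Illustrative helper: prefixes each column with the table name and joins
# the result into a comma-separated GROUP BY list.
def qualified_group_clause(table, columns)
  columns.map { |col| "#{table}.#{col}" }.join(", ")
end

group_clause = qualified_group_clause("availables", column_names)
puts "GROUP BY #{group_clause}"
```

Note that once the primary key is in the list, each row forms its own group, so this removes the error but does not collapse rows to one per bookdate the way the MySQL query did.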
Answered by riley
For others looking for a way to order by any field, including a joined field, in PostgreSQL, use a subquery:
SELECT * FROM (
  SELECT DISTINCT ON (availables.bookdate) availables.*
  FROM availables INNER JOIN rooms ON rooms.id = availables.room_id
  WHERE (rooms.hotel_id = 5056
  AND availables.bookdate BETWEEN '2009-11-22' AND '2009-11-24')
) AS distinct_selected
ORDER BY distinct_selected.updated_at
Or with Arel:
subquery = SomeRecord.select("distinct on(xx.id) xx.*, jointable.order_field")
  .where("").joins("")
result = SomeRecord.select("*").from("(#{subquery.to_sql}) AS distinct_selected").order(" xx.order_field ASC, jointable.order_field ASC")
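The subquery is needed because PostgreSQL requires the DISTINCT ON expressions to match the leftmost ORDER BY expressions, so a different final ordering has to be applied in an outer query. A pure-Ruby model of that two-step shape, using hypothetical rows and no database, might look like this:

```ruby
# Hypothetical sample rows; timestamps are plain strings that sort correctly.
rows = [
  { id: 1, bookdate: "2009-11-22", updated_at: "2009-11-20 10:00" },
  { id: 2, bookdate: "2009-11-22", updated_at: "2009-11-21 09:00" },
  { id: 3, bookdate: "2009-11-23", updated_at: "2009-11-19 08:00" },
]

# Inner query model: DISTINCT ON (bookdate) ... ORDER BY bookdate, id
inner = rows.sort_by { |r| [r[:bookdate], r[:id]] }.uniq { |r| r[:bookdate] }

# Outer query model: ORDER BY updated_at, applied to the already-distinct rows
result = inner.sort_by { |r| r[:updated_at] }

result.each { |r| puts r.inspect }
```

The inner step decides which row represents each bookdate; the outer step only reorders those representatives.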
Answered by Leonel Galán
According to MySQL's "Debunking GROUP BY Myths" (http://dev.mysql.com/tech-resources/articles/debunking-group-by-myths.html), SQL (the 2003 version of the standard) doesn't require columns referenced in the SELECT list of a query to also appear in the GROUP BY clause.
Answered by Lucas D'Avila
I think that .uniq [1] will solve your problem.
[1] Available.select('...').uniq
Take a look at http://guides.rubyonrails.org/active_record_querying.html#selecting-specific-fields