PostgreSQL - GROUP BY clause
Original question: http://stackoverflow.com/questions/18991625/
Note: this content is from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me) and follow the same license.
Asked by Adrian
I want to search by tags, and then list all articles with that tag, and also how many of given tags they match. So for example I might have:
Page1 - 2 (has css and php tag)
Page2 - 1 (has only css tag)
Query:
SELECT COUNT(t.tag)
FROM a_tags t
JOIN w_articles2tag a2t ON a2t.tag = t.id
JOIN w_article a ON a.id = a2t.article
WHERE t.tag = 'css' OR t.tag = 'php'
GROUP BY t.tag
LIMIT 9
When I only put COUNT(t.tag), the query works and I get okay results. But if I append e.g. the ID of my article, I get the following error:
ERROR: column "a.title" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT COUNT(t.tag), a.title FROM a_tags t
How to add said columns to this query?
Answered by Michael Krebs
When you use a GROUP BY clause, every selected column that is not listed in GROUP BY has to be wrapped in an aggregate function. Try adding title to the GROUP BY list, or select min(a.title) instead.
SELECT COUNT(t.tag), a.title
FROM a_tags t
JOIN w_articles2tag a2t ON a2t.tag = t.id
JOIN w_article a ON a.id = a2t.article
WHERE t.tag = 'css' OR t.tag = 'php'
GROUP BY t.tag, a.title
LIMIT 9
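The second option mentioned above, wrapping the column in an aggregate, might look like the following. This is only a sketch that reuses the table names from the question; the output alias any_title is made up for illustration.

```sql
-- Alternative: keep GROUP BY t.tag and wrap the ungrouped column in an aggregate.
-- min(a.title) returns the smallest title per tag (according to the column's sort order),
-- so only one title per group survives.
SELECT COUNT(t.tag), min(a.title) AS any_title  -- any_title: hypothetical alias
FROM a_tags t
JOIN w_articles2tag a2t ON a2t.tag = t.id
JOIN w_article a ON a.id = a2t.article
WHERE t.tag = 'css' OR t.tag = 'php'
GROUP BY t.tag
LIMIT 9;
```

Note the difference in results: adding a.title to GROUP BY yields one row per (tag, title) pair, while min(a.title) keeps one row per tag.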
Answered by Erwin Brandstetter
First, to clarify, Postgres 9.1 or later (quoting release notes of 9.1) ...
Allow non-GROUP BY columns in the query target list when the primary key is specified in the GROUP BY clause (Peter Eisentraut)
More in this related answer:
Return a grouped list with occurrences using Rails and PostgreSQL
Next, the queries in the question and in @Michael's answer have got the logic backwards. We want to count how many tags match per article, not how many articles have a certain tag. So we need to GROUP BY w_article.id, not by a_tags.id.
list all articles with that tag, and also how many of given tags they match
To fix this:
SELECT COUNT(t.tag) AS ct, a.* -- any column from a allowed ...
FROM a_tags t
JOIN w_articles2tag a2t ON a2t.tag = t.id
JOIN w_article a ON a.id = a2t.article
WHERE t.tag IN ('css', 'php')
GROUP BY a.id -- ... since grouped by pk column of a
LIMIT 9
Assuming id is the primary key of w_article.
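To check the grouped query against the example from the question (Page1 with two matching tags, Page2 with one), here is a minimal sketch. The table definitions and sample rows are assumptions for illustration; only the table and column names that appear in the queries come from the question. An ORDER BY is added so the output is deterministic.

```sql
-- Hypothetical minimal schema (assumption); temporary tables so nothing persists.
CREATE TEMP TABLE a_tags (id int PRIMARY KEY, tag text NOT NULL);
CREATE TEMP TABLE w_article (id int PRIMARY KEY, title text NOT NULL);
CREATE TEMP TABLE w_articles2tag (
    article int REFERENCES w_article (id),
    tag     int REFERENCES a_tags (id),
    PRIMARY KEY (article, tag)
);

-- Made-up sample data: Page1 has css and php, Page2 only css, Page3 only sql.
INSERT INTO a_tags VALUES (1, 'css'), (2, 'php'), (3, 'sql');
INSERT INTO w_article VALUES (1, 'Page1'), (2, 'Page2'), (3, 'Page3');
INSERT INTO w_articles2tag VALUES (1, 1), (1, 2), (2, 1), (3, 3);

SELECT a.title, COUNT(t.tag) AS ct
FROM a_tags t
JOIN w_articles2tag a2t ON a2t.tag = t.id
JOIN w_article a ON a.id = a2t.article
WHERE t.tag IN ('css', 'php')
GROUP BY a.id            -- a.title is allowed because a.id is the primary key
ORDER BY ct DESC, a.id
LIMIT 9;
-- Returns ('Page1', 2) and ('Page2', 1); Page3 matches neither tag and is filtered out.
```

If w_article.id were not declared as the primary key, PostgreSQL would raise the same "must appear in the GROUP BY clause" error for a.title.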
However, this form will be faster while doing the same:
SELECT a.*, ct
FROM (
   SELECT a2t.article AS id, COUNT(*) AS ct
   FROM a_tags t
   JOIN w_articles2tag a2t ON a2t.tag = t.id
   WHERE t.tag IN ('css', 'php') -- same filter as above
   GROUP BY a2t.article -- alias a is not visible inside the subquery
   LIMIT 9 -- LIMIT early - cheaper
   ) sub
JOIN w_article a USING (id); -- attached alias to article in the sub
More in this closely related answer from just yesterday:
Why does the following join increase the query time significantly?
As an aside: It is an anti-pattern to use the generic, non-descriptive id as column name. Call it article_id etc. in both tables. Easier to join and you don't have to use aliases in queries all the time.
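As a sketch of what descriptive names buy you: assuming hypothetical columns a_tags.tag_id and w_article.article_id, with matching tag_id and article_id columns in w_articles2tag (none of these names exist in the question's schema), the joins can use USING and need no ON conditions:

```sql
-- Hypothetical renamed schema (assumption, not the question's actual columns):
--   a_tags(tag_id, tag), w_article(article_id, title), w_articles2tag(article_id, tag_id)
SELECT a.article_id, a.title, COUNT(*) AS ct
FROM a_tags t
JOIN w_articles2tag a2t USING (tag_id)
JOIN w_article a USING (article_id)
WHERE t.tag IN ('css', 'php')
GROUP BY a.article_id    -- still relies on article_id being the primary key of w_article
LIMIT 9;
```

With identical names on both sides, the subquery form also no longer needs the "AS id" alias before joining back to the article table.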