PostgreSQL - GROUP BY clause

Note: this page is an English/Chinese side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/18991625/

Date: 2020-10-21 01:07:26  Source: igfitidea

sql postgresql group-by aggregate-functions

Asked by Adrian

I want to search by tags, and then list all articles with that tag, and also how many of given tags they match. So for example I might have:

 Page1 - 2 (has css and php tag)
 Page2 - 1 (has only css tag)

Query:

SELECT COUNT(t.tag)
FROM a_tags t
JOIN w_articles2tag a2t ON a2t.tag = t.id 
JOIN w_article a ON a.id = a2t.article 
WHERE t.tag = 'css' OR t.tag = 'php'
GROUP BY t.tag
LIMIT 9

When I only put COUNT(t.tag), the query works and I get okay results. But if I append, e.g., the ID of my article, I get the following error:

ERROR: column "a.title" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT COUNT(t.tag), a.title FROM a_tags t

How can I add such columns to this query?

Answered by Michael Krebs

When you use a "GROUP BY" clause, every selected column that is not in the GROUP BY list must be wrapped in an aggregate function. Try adding title to the GROUP BY list, or selecting "min(a.title)" instead.

SELECT COUNT(t.tag), a.title FROM a_tags t
JOIN w_articles2tag a2t ON a2t.tag = t.id 
JOIN w_article a ON a.id = a2t.article 
WHERE t.tag = 'css' OR t.tag = 'php' GROUP BY t.tag, a.title LIMIT 9
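
The other option mentioned above, wrapping the extra column in an aggregate, might look like the following sketch. It reuses the tables and columns from the question; the output aliases ct and title are only illustrative. Note that it returns a single title per group (the smallest one in the column's sort order), which may or may not be what is wanted.

```sql
-- Sketch: keep GROUP BY t.tag and aggregate the title instead of grouping by it.
-- min(a.title) picks one title per tag group; the other titles are not shown.
SELECT COUNT(t.tag) AS ct, min(a.title) AS title
FROM   a_tags         t
JOIN   w_articles2tag a2t ON a2t.tag = t.id
JOIN   w_article      a   ON a.id = a2t.article
WHERE  t.tag IN ('css', 'php')
GROUP  BY t.tag
LIMIT  9;
```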

Answered by Erwin Brandstetter

First, to clarify, Postgres 9.1 or later (quoting release notes of 9.1) ...

Allow non-GROUP BY columns in the query target list when the primary key is specified in the GROUP BY clause (Peter Eisentraut)

More in this related answer:
Return a grouped list with occurrences using Rails and PostgreSQL
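
As a rough illustration of that release note, the sketch below uses a hypothetical table demo_article (not part of the question's schema, and its column types are assumptions). Grouping by the primary key lets other columns of the same table appear in the select list; grouping by a non-key column does not.

```sql
-- Hypothetical table, for illustration only.
CREATE TEMP TABLE demo_article (
   id    int PRIMARY KEY,
   title text NOT NULL
);

-- Accepted in Postgres 9.1 or later: title is functionally dependent on
-- the primary key id, so it does not have to be listed in GROUP BY.
SELECT id, title, count(*) AS ct
FROM   demo_article
GROUP  BY id;

-- Rejected if uncommented: id is not determined by title, so Postgres raises the
-- "must appear in the GROUP BY clause or be used in an aggregate function" error.
-- SELECT id, title, count(*) AS ct FROM demo_article GROUP BY title;
```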

Next, the queries in the question and in @Michael's answer have got the logic backwards. We want to count how many tags match per article, not how many articles have a certain tag. So we need to GROUP BY w_article.id, not by a_tags.id.

list all articles with that tag, and also how many of given tags they match

To fix this:

SELECT COUNT(t.tag) AS ct, a.* -- any column from a allowed ...
FROM   a_tags         t
JOIN   w_articles2tag a2t ON a2t.tag = t.id 
JOIN   w_article      a   ON a.id = a2t.article 
WHERE  t.tag IN ('css', 'php')
GROUP  BY a.id           -- ... since grouped by pk column of a
LIMIT  9

Assuming id is the primary key of w_article.
However, this form will be faster while doing the same:

SELECT a.*, ct
FROM  (
   SELECT a2t.article AS id, COUNT(*) AS ct
   FROM   a_tags         t
   JOIN   w_articles2tag a2t ON a2t.tag = t.id 
   WHERE  t.tag IN ('css', 'php')  -- same filter as in the query above
   GROUP  BY a2t.article           -- alias a is not visible inside the subquery
   LIMIT  9      -- LIMIT early - cheaper
   ) sub
JOIN   w_article a USING (id);  -- attached alias to article in the sub

More in this closely related answer from just yesterday:
Why does the following join increase the query time significantly?
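
To try the per-article count on the example from the question, a minimal setup might look like the sketch below. The column types, the constraints, and the sample rows (including Page3 and the html tag) are assumptions made for illustration; only the table and column names come from the question. An ORDER BY is added so that the result order is deterministic.

```sql
-- Hypothetical schema and data, for illustration only.
CREATE TEMP TABLE w_article (
   id    int PRIMARY KEY,
   title text NOT NULL
);
CREATE TEMP TABLE a_tags (
   id  int PRIMARY KEY,
   tag text NOT NULL UNIQUE
);
CREATE TEMP TABLE w_articles2tag (
   article int REFERENCES w_article,
   tag     int REFERENCES a_tags,
   PRIMARY KEY (article, tag)
);

INSERT INTO w_article      VALUES (1, 'Page1'), (2, 'Page2'), (3, 'Page3');
INSERT INTO a_tags         VALUES (1, 'css'), (2, 'php'), (3, 'html');
INSERT INTO w_articles2tag VALUES (1, 1), (1, 2), (2, 1), (3, 3);

-- Count matching tags per article; a.title is allowed because a.id is the primary key.
SELECT a.title, count(*) AS ct
FROM   a_tags         t
JOIN   w_articles2tag a2t ON a2t.tag = t.id
JOIN   w_article      a   ON a.id = a2t.article
WHERE  t.tag IN ('css', 'php')
GROUP  BY a.id
ORDER  BY ct DESC, a.id
LIMIT  9;
-- Page1 | 2
-- Page2 | 1
```

Keep in mind that LIMIT without ORDER BY returns an arbitrary subset of rows, so if there are more than nine matching articles, the nine that come back are not guaranteed to be the same from run to run.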

As an aside: It is an anti-pattern to use the generic, non-descriptive id as a column name. Call it article_id etc. in both tables. That makes joins easier, and you don't have to use aliases in queries all the time.
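
A sketch of what that naming could look like follows. All table and column names here (article, tag, article_tag, article_id, tag_id) are hypothetical renames chosen for illustration, not the asker's actual schema. With matching column names on both sides, the joins can use USING instead of spelling out each ON condition.

```sql
-- Hypothetical renamed schema, for illustration only.
CREATE TEMP TABLE article (
   article_id int PRIMARY KEY,
   title      text NOT NULL
);
CREATE TEMP TABLE tag (
   tag_id int PRIMARY KEY,
   tag    text NOT NULL UNIQUE
);
CREATE TEMP TABLE article_tag (
   article_id int REFERENCES article,
   tag_id     int REFERENCES tag,
   PRIMARY KEY (article_id, tag_id)
);

-- Descriptive key names let both joins use USING.
SELECT a.title, count(*) AS ct
FROM   tag t
JOIN   article_tag USING (tag_id)
JOIN   article a   USING (article_id)
WHERE  t.tag IN ('css', 'php')
GROUP  BY a.article_id;
```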
