postgresql SQL 计数如果列

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/29020065/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-10-21 01:49:00  来源:igfitidea点击:

SQL count if columns

sqlpostgresqlcountgroup-byaggregate-filter

提问by Erwin Brandstetter

What is the best way to create columns which count the number of occurrences of data in a table? The table needs to be grouped by one column.

创建计算表中数据出现次数的列的最佳方法是什么?该表需要按一列分组。

I have seen

我见过

SELECT
    sum(CASE WHEN question1 = 0 THEN 1 ELSE 0 END) AS ZERO,
    sum(CASE WHEN question1 = 1 THEN 1 ELSE 0 END) AS ONE,
    sum(CASE WHEN question1 = 2 THEN 1 ELSE 0 END) AS TWO,
    category
FROM reviews
    GROUP BY category

where question1 can have a value of either 0, 1 or 2.

其中 question1 的值可以是 0、1 或 2。

I have also seen a version of that using count(CASE WHEN question1 = 0 THEN 1)

我还看到了一个版本的使用 count(CASE WHEN question1 = 0 THEN 1)

However, this becomes more cumbersome to write as the number of possible values for question1 increases. Is there a convenient way to write this query, possibly optimizing performance?

但是,随着 question1 的可能值数量的增加,编写起来会变得更加麻烦。有没有一种方便的方法来编写这个查询,可能会优化性能?

PS. My database is PostgreSQL

附注。我的数据库是 PostgreSQL

回答by Erwin Brandstetter

In Postgres 9.4 there is new, cleaner aggregate FILTERoption:

在 Postgres 9.4 中有一个新的、更干净的聚合FILTER选项:

SELECT category
     , count(*) FILTER (WHERE question1 = 0) AS zero
     , count(*) FILTER (WHERE question1 = 1) AS one
     , count(*) FILTER (WHERE question1 = 2) AS two
FROM   reviews
GROUP  BY 1;

Details for the new FILTERclause:

FILTER条款的详细信息:

If you want it short:

如果你想简短

SELECT category
     , count(question1 = 0 OR NULL) AS zero
     , count(question1 = 1 OR NULL) AS one
     , count(question1 = 2 OR NULL) AS two
FROM   reviews
GROUP  BY 1;

Overview over possible variants:

可能的变体概览:

Proper crosstab query

正确的交叉表查询

crosstab()yields the best performance and is shorter for longer lists of options:

crosstab()产生最佳性能并且对于更长的选项列表更短:

SELECT * FROM crosstab(
     'SELECT category, question1, count(*)::int AS ct
      FROM   reviews
      GROUP  BY 1, 2
      ORDER  BY 1, 2'
   , 'VALUES (0), (1), (2)'
   ) AS ct (category text, zero int, one int, two int);

Detailed explanation:

详细解释:

回答by Igor Romanchenko

The "best" way (for me) is to write a query like:

“最好”的方式(对我来说)是写一个查询,如:

SELECT
    category,
    question1,
    count(*)
FROM reviews
GROUP BY category, question1

Then I use this data to draw a table in application logic.

然后我使用这些数据在应用程序逻辑中绘制一个表格。

Other option is to use one JSON column for all grouping results. This will result in something like:

另一种选择是对所有分组结果使用一个 JSON 列。这将导致类似:

category1 | {"zero": 1, "one": 3, "two": 5}
category2 | {"one": 7, "two": 4}

and so on.

等等。

The query for this option you can build from the previous one with json_build_objectand json_agg. The best thing for this option - you do not need to know number of possible question1values ahead of time.

您可以使用json_build_object和从前一个选项构建此选项的查询json_agg。此选项的最佳之处 - 您不需要提前知道可能question1值的数量。