PostgreSQL: In Redshift/Postgres, how to count rows that meet a condition?
Original question: http://stackoverflow.com/questions/21288458/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
In Redshift/Postgres, how to count rows that meet a condition?
Asked by ILikeTacos
I'm trying to write a query that counts only the rows that meet a condition.
For example, in MySQL I would write it like this:
SELECT
    COUNT(IF(grade < 70, 1, NULL))
FROM
    grades
ORDER BY
    id DESC;
However, when I attempt to do that on Redshift, it returns the following error:
ERROR: function if(boolean, integer, "unknown") does not exist
Hint: No function matches the given name and argument types. You may need to add explicit type casts.
I checked the documentation for conditional statements, and I found
NULLIF(value1, value2)
but it only compares value1 and value2 and if such values are equal, it returns null.
I couldn't find a simple IF statement, and at first glance I couldn't find a way to do what I want to do.
I tried to use the CASE expression, but I'm not getting the results I want:
SELECT
    CASE
        WHEN grade < 70 THEN COUNT(rank)
        ELSE COUNT(rank)
    END
FROM
    grades
This is the way I want to count things:
failed (grade < 70)
average (70 <= grade < 80)
good (80 <= grade < 90)
excellent (90 <= grade <= 100)
and this is how I expect to see the results:
+========+=========+======+===========+
| failed | average | good | excellent |
+========+=========+======+===========+
| 4 | 2 | 1 | 4 |
+========+=========+======+===========+
but I'm getting this:
+========+=========+======+===========+
| failed | average | good | excellent |
+========+=========+======+===========+
| 11 | 11 | 11 | 11 |
+========+=========+======+===========+
I hope someone can point me in the right direction!
If it helps, here's some sample data:
CREATE TABLE grades (
    grade integer DEFAULT 0
);
INSERT INTO grades (grade)
VALUES (69), (50), (55), (60), (75), (70), (87), (100), (100), (98), (94);
Answered by yieldsfalsehood
First, the issue you're having here is that what you're saying is "If the grade is less than 70, the value of this case expression is count(rank). Otherwise, the value of this expression is count(rank)." So, in either case, you're always getting the same value.
SELECT
    CASE
        WHEN grade < 70 THEN COUNT(rank)
        ELSE COUNT(rank)
    END
FROM
    grades
count() only counts non-null values, so typically the pattern you'll see to accomplish what you're trying is this:
SELECT
    count(CASE WHEN grade < 70 THEN 1 END) AS grade_less_than_70,
    count(CASE WHEN grade >= 70 AND grade < 80 THEN 1 END) AS grade_between_70_and_80
FROM
    grades
That way the case expression will only evaluate to 1 when the test expression is true and will be null otherwise. Then the count() will only count the non-null instances, i.e. when the test expression is true, which should give you what you need.
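As a rough illustration of the pattern above, extended to all four buckets from the question, here is a minimal sketch that can be run locally. It assumes an in-memory SQLite database (via Python's sqlite3 module) as a stand-in for Postgres/Redshift; the column aliases failed, average, good and excellent are taken from the expected output in the question, and the variable names are only illustrative.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for Postgres/Redshift.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grades (grade integer DEFAULT 0)")

# Sample grades taken from the question.
sample = [69, 50, 55, 60, 75, 70, 87, 100, 100, 98, 94]
conn.executemany("INSERT INTO grades (grade) VALUES (?)", [(g,) for g in sample])

# Each CASE yields 1 when its test is true and NULL otherwise;
# count() skips the NULLs, so each column counts exactly one bucket.
query = """
SELECT
    count(CASE WHEN grade < 70 THEN 1 END) AS failed,
    count(CASE WHEN grade >= 70 AND grade < 80 THEN 1 END) AS average,
    count(CASE WHEN grade >= 80 AND grade < 90 THEN 1 END) AS good,
    count(CASE WHEN grade >= 90 AND grade <= 100 THEN 1 END) AS excellent
FROM
    grades
"""
buckets = conn.execute(query).fetchone()
print(buckets)  # (4, 2, 1, 4)
```

With the sample data this returns one row matching the table the question was hoping for. The SQL text itself uses only count() and CASE, so it should carry over to Postgres and Redshift unchanged.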
Edit: As a side note, notice that this is exactly the same as how you had originally written this using count(if(test, true-value, false-value)), only rewritten as count(case when test then true-value end) (and null is the stand-in false-value, since an else wasn't supplied to the case).
Edit: postgres 9.4 was released a few months after this original exchange. That version introduced aggregate filters, which can make scenarios like this look a little nicer and clearer. This answer still gets some occasional upvotes, so if you've stumbled upon here and are using a newer postgres (i.e. 9.4+) you might want to consider this equivalent version:
SELECT
    count(*) FILTER (WHERE grade < 70) AS grade_less_than_70,
    count(*) FILTER (WHERE grade >= 70 AND grade < 80) AS grade_between_70_and_80
FROM
    grades
Answered by ILikeTacos
Another method:
SELECT
    sum(CASE WHEN grade < 70 THEN 1 ELSE 0 END) AS grade_less_than_70,
    sum(CASE WHEN grade >= 70 AND grade < 80 THEN 1 ELSE 0 END) AS grade_between_70_and_80
FROM
    grades
Works just fine in case you want to group the counts by a categorical column.
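To make the grouping remark concrete, here is a small sketch of the sum(CASE ...) form combined with GROUP BY. It assumes a hypothetical course column that is not part of the original grades table, uses made-up rows for illustration, and runs against an in-memory SQLite database as a stand-in for Postgres/Redshift.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Postgres/Redshift.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: the `course` column and these rows are invented for illustration.
conn.execute("CREATE TABLE grades (course text, grade integer)")
rows = [
    ("math", 69), ("math", 75), ("math", 50),
    ("art", 70), ("art", 55), ("art", 94),
]
conn.executemany("INSERT INTO grades (course, grade) VALUES (?, ?)", rows)

# sum() adds 1 for every row whose test is true and 0 otherwise,
# and GROUP BY produces one row of counts per course.
query = """
SELECT
    course,
    sum(CASE WHEN grade < 70 THEN 1 ELSE 0 END) AS grade_less_than_70,
    sum(CASE WHEN grade >= 70 AND grade < 80 THEN 1 ELSE 0 END) AS grade_between_70_and_80
FROM
    grades
GROUP BY
    course
ORDER BY
    course
"""
per_course = conn.execute(query).fetchall()
print(per_course)  # [('art', 1, 1), ('math', 2, 1)]
```

One difference from the count() form: over an empty input, sum() returns NULL while count() returns 0, which may matter if the table or a filtered subset can be empty.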
Answered by mysticfyst
The solution given by @yieldsfalsehood works perfectly:
SELECT
    count(*) FILTER (WHERE grade < 70) AS grade_less_than_70,
    count(*) FILTER (WHERE grade >= 70 AND grade < 80) AS grade_between_70_and_80
FROM
    grades
But since you mentioned NULLIF(value1, value2), there's a way to get the same result with nullif:
-- nullif returns NULL when both arguments are equal, so compare against false:
-- rows where grade < 70 is true stay non-null and are the ones count() sees
select count(nullif(grade < 70, false)) as failed from grades;
Answered by TautrimasPajarskas
Redshift only
For lazy typers, here's a "COUNTIF"-style version that sums a boolean cast to an integer, built on top of @user1509107's answer:
SELECT
    SUM((grade < 70)::INT) AS grade_less_than_70,
    SUM((grade >= 70 AND grade < 80)::INT) AS grade_between_70_and_80
FROM
    grades