SQL 如何计算postgres中的重复行?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/8129441/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How to COUNT duplicate rows inside postgres?
提问by fabrizioM
I want to be able to create a histogram out of a tuple containing two integers values. Using postgresql.
我希望能够从包含两个整数值的元组中创建直方图。使用 postgresql。
Here it is the query:
这是查询:
SELECT temp.ad_id, temp.distance as hits FROM (
'UNION ALL .join(cupound_query)' # python
) as temp GROUP BY temp.ad_id,temp.distance
EDIT: Better example of what I want
编辑:我想要的更好的例子
For this input:
对于此输入:
(51, 5)
(51, 0)
(51, 3)
(51, 0)
(88, 2)
(88, 2)
(88, 2)
(84, 1)
(81, 9)
Would be:
将是:
(88,2) : 3
(51,0) : 2
(51,3) : 1
(51,5) : 1
(84,1) : 1
(81,9) : 1
How can I create a histogram of those values ? Put in another way, how can I count how many times a row has a duplicate ?
如何创建这些值的直方图?换句话说,我如何计算一行有多少次重复?
Thanks
谢谢
回答by Erwin Brandstetter
Your question leaves room for interpretation. This testcase shows 2 nested steps:
你的问题留下了解释的空间。此测试用例显示了 2 个嵌套步骤:
CREATE TEMP TABLE x (ad_id int, distance int);
INSERT INTO x VALUES
(510, 0),(956, 3),(823, 3),(880, 2)
,(523, 3),(467, 0),(843, 1),(816, 9)
,(533, 4),(721, 7),(288, 3),(900, 3)
,(526, 9),(750, 7),(302, 8),(463, 6)
,(742, 8),(804, 2),(62, 7),(880, 2)
,(523, 3),(467, 0),(843, 1),(816, 9)
,(533, 4),(721, 7),(288, 3),(900, 3)
,(526, 9),(750, 7),(302, 8),(816, 9)
,(533, 4),(721, 7),(288, 3),(900, 3)
,(533, 4),(721, 7),(288, 3),(396, 5);
How many duplicates per value ?
每个值有多少重复?
SELECT ad_id, count(*) AS ct FROM x GROUP BY 1;
Result:
结果:
ad_id | ct
-------+----
62 | 1
288 | 4
302 | 2
396 | 1
...
Read: ad_id 62
exists 1x, ad_id 288
exists 4 times, ...
阅读:ad_id 62
存在 1ad_id 288
次,存在 4 次,...
"How can I count how many times a row has a duplicate?"
“我如何计算一行有多少次重复?”
SELECT ct
,COUNT (*) AS ct_ct
FROM (SELECT ad_id, COUNT (*) AS ct FROM x GROUP BY 1) a
GROUP BY 1
ORDER BY 1;
Result:
结果:
ct | ct_ct
----+---
1 | 8
2 | 7
3 | 2
4 | 3
Read: 8 occurrences of "ad_id
is unique", 7 occurrences of "2 rows with same ad_id
", ...
阅读:8 次出现“ad_id
是唯一的”,7 次出现“2 行相同ad_id
”,...
回答by Andomar
Just add count(*)
to your select:
只需添加count(*)
到您的选择:
SELECT temp.ad_id, temp.distance as hits, count(*)
....
回答by Jason Buberel
Below is a tutorial I wrote on how to generate histograms directly in SQL using Postgres:
下面是我写的关于如何使用 Postgres 在 SQL 中直接生成直方图的教程:
I think you could easily adapt this example to your table structure.
我认为您可以轻松地将此示例适应您的表结构。