SQL 如何计算postgres中的重复行?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/8129441/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 13:15:06  来源:igfitidea点击:

How to COUNT duplicate rows inside postgres?

sqlpostgresqlhistogram

提问by fabrizioM

I want to be able to create a histogram out of a tuple containing two integers values. Using postgresql.

我希望能够从包含两个整数值的元组中创建直方图。使用 postgresql。

Here it is the query:

这是查询:

 SELECT temp.ad_id, temp.distance  as hits FROM ( 
 'UNION ALL .join(cupound_query)' # python
) as temp GROUP BY temp.ad_id,temp.distance 

EDIT: Better example of what I want

编辑:我想要的更好的例子

For this input:

对于此输入:

(51, 5)
(51, 0)
(51, 3)
(51, 0)
(88, 2)
(88, 2)
(88, 2)
(84, 1)
(81, 9)

Would be:

将是:

 (88,2) : 3
 (51,0) : 2
 (51,3) : 1
 (51,5) : 1
 (84,1) : 1
 (81,9) : 1

How can I create a histogram of those values ? Put in another way, how can I count how many times a row has a duplicate ?

如何创建这些值的直方图?换句话说,我如何计算一行有多少次重复?

Thanks

谢谢

回答by Erwin Brandstetter

Your question leaves room for interpretation. This testcase shows 2 nested steps:

你的问题留下了解释的空间。此测试用例显示了 2 个嵌套步骤:

CREATE TEMP TABLE x (ad_id int, distance int);
INSERT INTO x VALUES
 (510, 0),(956, 3),(823, 3),(880, 2)
,(523, 3),(467, 0),(843, 1),(816, 9)
,(533, 4),(721, 7),(288, 3),(900, 3)
,(526, 9),(750, 7),(302, 8),(463, 6)
,(742, 8),(804, 2),(62,  7),(880, 2)
,(523, 3),(467, 0),(843, 1),(816, 9)
,(533, 4),(721, 7),(288, 3),(900, 3)
,(526, 9),(750, 7),(302, 8),(816, 9)
,(533, 4),(721, 7),(288, 3),(900, 3)
,(533, 4),(721, 7),(288, 3),(396, 5);

How many duplicates per value ?

每个值有多少重复?

SELECT ad_id, count(*) AS ct FROM x GROUP BY 1;

Result:

结果:

ad_id  | ct
-------+----
62     | 1
288    | 4
302    | 2
396    | 1
...

Read: ad_id 62exists 1x, ad_id 288exists 4 times, ...

阅读:ad_id 62存在 1ad_id 288次,存在 4 次,...



"How can I count how many times a row has a duplicate?"

“我如何计算一行有多少次重复?”

SELECT ct
      ,COUNT (*) AS ct_ct
FROM   (SELECT ad_id, COUNT (*) AS ct FROM x GROUP  BY 1) a
GROUP BY 1
ORDER BY 1;

Result:

结果:

 ct | ct_ct
----+---
1   | 8
2   | 7
3   | 2
4   | 3

Read: 8 occurrences of "ad_idis unique", 7 occurrences of "2 rows with same ad_id", ...

阅读:8 次出现“ad_id是唯一的”,7 次出现“2 行相同ad_id”,...

回答by Andomar

Just add count(*)to your select:

只需添加count(*)到您的选择:

SELECT temp.ad_id, temp.distance as hits, count(*)
....

回答by Jason Buberel

Below is a tutorial I wrote on how to generate histograms directly in SQL using Postgres:

下面是我写的关于如何使用 Postgres 在 SQL 中直接生成直方图的教程:

Simple Histograms in SQL

SQL 中的简单直方图

I think you could easily adapt this example to your table structure.

我认为您可以轻松地将此示例适应您的表结构。