PostgreSQL: Count distinct values with OVER(PARTITION BY id)
Note: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/21728991/
Asked by user007
Is it possible to count distinct values in conjunction with window functions like OVER(PARTITION BY id)? Currently my query is as follows:
SELECT congestion.date, congestion.week_nb, congestion.id_congestion,
       congestion.id_element,
       ROW_NUMBER() OVER(
           PARTITION BY congestion.id_element
           ORDER BY congestion.date),
       COUNT(DISTINCT congestion.week_nb) OVER(
           PARTITION BY congestion.id_element
       ) AS week_count
FROM congestion
WHERE congestion.date >= '2014.01.01'
  AND congestion.date <= '2014.12.31'
ORDER BY id_element, date
However, when I try to execute the query I get the following error:
"COUNT(DISTINCT": "DISTINCT is not implemented for window functions"
Answered by Simo Kivistö
No. As the error message states, DISTINCT is not implemented for window functions. Applying the information from this link to your case, you could use something like:
WITH uniques AS (
    SELECT congestion.id_element, COUNT(DISTINCT congestion.week_nb) AS unique_references
    FROM congestion
    WHERE congestion.date >= '2014.01.01'
      AND congestion.date <= '2014.12.31'
    GROUP BY congestion.id_element
)
SELECT congestion.date, congestion.week_nb, congestion.id_congestion,
       congestion.id_element,
       ROW_NUMBER() OVER(
           PARTITION BY congestion.id_element
           ORDER BY congestion.date),
       uniques.unique_references AS week_count
FROM congestion
JOIN uniques USING (id_element)
WHERE congestion.date >= '2014.01.01'
  AND congestion.date <= '2014.12.31'
ORDER BY id_element, date
Depending on the situation, you could also put a subquery straight into the SELECT list:
SELECT congestion.date, congestion.week_nb, congestion.id_congestion,
       congestion.id_element,
       ROW_NUMBER() OVER(
           PARTITION BY congestion.id_element
           ORDER BY congestion.date),
       (SELECT COUNT(DISTINCT dist_con.week_nb)
        FROM congestion AS dist_con
        WHERE dist_con.date >= '2014.01.01'
          AND dist_con.date <= '2014.12.31'
          AND dist_con.id_element = congestion.id_element) AS week_count
FROM congestion
WHERE congestion.date >= '2014.01.01'
  AND congestion.date <= '2014.12.31'
ORDER BY id_element, date
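As a quick check of the two rewrites above, here is a minimal sketch that runs the CTE form and the correlated-subquery form against an in-memory SQLite database from Python. SQLite stands in for PostgreSQL only so the example is self-contained. The sample rows, the ISO date format, and the helper name `run` are made up for illustration, and the ROW_NUMBER() column is left out to keep the output short.

```python
import sqlite3

# Hypothetical sample data: (date, week_nb, id_congestion, id_element).
ROWS = [
    ("2014-01-06", 2, 1, 10),
    ("2014-01-07", 2, 2, 10),  # same week as the row above
    ("2014-01-14", 3, 3, 10),
    ("2014-01-06", 2, 4, 20),
    ("2013-12-30", 1, 5, 20),  # outside the date range, must be ignored
]

# Variant 1: aggregate once per id_element in a CTE, then join back.
CTE_QUERY = """
WITH uniques AS (
    SELECT id_element, COUNT(DISTINCT week_nb) AS unique_references
    FROM congestion
    WHERE date >= '2014-01-01' AND date <= '2014-12-31'
    GROUP BY id_element
)
SELECT c.id_congestion, c.id_element, u.unique_references AS week_count
FROM congestion AS c
JOIN uniques AS u USING (id_element)
WHERE c.date >= '2014-01-01' AND c.date <= '2014-12-31'
ORDER BY c.id_element, c.date
"""

# Variant 2: correlated subquery in the SELECT list.
SUBQUERY_QUERY = """
SELECT c.id_congestion, c.id_element,
       (SELECT COUNT(DISTINCT d.week_nb)
        FROM congestion AS d
        WHERE d.date >= '2014-01-01' AND d.date <= '2014-12-31'
          AND d.id_element = c.id_element) AS week_count
FROM congestion AS c
WHERE c.date >= '2014-01-01' AND c.date <= '2014-12-31'
ORDER BY c.id_element, c.date
"""


def run(query):
    """Load the sample rows into a fresh in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE congestion ("
        "date TEXT, week_nb INTEGER, id_congestion INTEGER, id_element INTEGER)"
    )
    conn.executemany("INSERT INTO congestion VALUES (?, ?, ?, ?)", ROWS)
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


cte_rows = run(CTE_QUERY)
subquery_rows = run(SUBQUERY_QUERY)
print(cte_rows)                   # [(1, 10, 2), (2, 10, 2), (3, 10, 2), (4, 20, 1)]
print(cte_rows == subquery_rows)  # True
```

On this sample both forms return the same week_count for every row. The CTE computes each count once per id_element, while the correlated subquery is, at least conceptually, evaluated once per output row.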
Answered by Agnius Vasiliauskas
Make the partitioned set smaller, to the point where there are no duplicates in the counted field:
SELECT congestion.date, congestion.week_nb, congestion.id_congestion,
       congestion.id_element,
       ROW_NUMBER() OVER(
           PARTITION BY congestion.id_element
           ORDER BY congestion.date),
       COUNT(congestion.week_nb) -- remove distinct
       OVER(
           PARTITION BY congestion.id_element,
           -- add further fields so the counter restarts whenever a duplicate would occur
           congestion.id_congestion
       ) AS week_count
FROM congestion
WHERE congestion.date >= '2014.01.01'
  AND congestion.date <= '2014.12.31'
ORDER BY id_element, date
Answered by Gordon Linoff
I find that the easiest way is to use a subquery/CTE and conditional aggregation:
SELECT c.date, c.week_nb, c.id_congestion, c.id_element,
       ROW_NUMBER() OVER (PARTITION BY c.id_element ORDER BY c.date),
       -- count only the first row of each (id_element, week_nb) pair
       SUM(CASE WHEN c.seqnum = 1 THEN 1 ELSE 0 END)
           OVER (PARTITION BY c.id_element) AS week_count
FROM (SELECT c.*,
             ROW_NUMBER() OVER (PARTITION BY c.id_element, c.week_nb
                                ORDER BY c.date) AS seqnum
      FROM congestion c
      -- filter here so the seqnum = 1 row always falls inside the date range
      WHERE c.date >= '2014.01.01' AND c.date <= '2014.12.31'
     ) c
ORDER BY id_element, date
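One way to read the conditional-aggregation idea above: number the rows within each (id_element, week_nb) pair, flag the first row of each pair, and sum those flags over the id_element partition. The sketch below tries that reading on an in-memory SQLite database from Python. SQLite is only a stand-in for PostgreSQL here. The sample rows, the ISO date format, and the choice to apply the date filter inside the subquery are assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample data: (date, week_nb, id_congestion, id_element).
ROWS = [
    ("2014-01-06", 2, 1, 10),
    ("2014-01-07", 2, 2, 10),  # same week as the row above
    ("2014-01-14", 3, 3, 10),
    ("2014-01-06", 2, 4, 20),
    ("2013-12-30", 1, 5, 20),  # outside the date range, must be ignored
]

# seqnum = 1 marks the first row of each (id_element, week_nb) pair;
# summing that flag per id_element yields the number of distinct weeks.
FLAG_AND_SUM_QUERY = """
SELECT c.id_congestion, c.id_element,
       SUM(CASE WHEN c.seqnum = 1 THEN 1 ELSE 0 END)
           OVER (PARTITION BY c.id_element) AS week_count
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY t.id_element, t.week_nb
                                ORDER BY t.date) AS seqnum
      FROM congestion AS t
      WHERE t.date >= '2014-01-01' AND t.date <= '2014-12-31'
     ) AS c
ORDER BY c.id_element, c.date
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE congestion ("
    "date TEXT, week_nb INTEGER, id_congestion INTEGER, id_element INTEGER)"
)
conn.executemany("INSERT INTO congestion VALUES (?, ?, ?, ?)", ROWS)
flagged_rows = conn.execute(FLAG_AND_SUM_QUERY).fetchall()
conn.close()

print(flagged_rows)  # [(1, 10, 2), (2, 10, 2), (3, 10, 2), (4, 20, 1)]
```

Filtering inside the subquery matters in this sketch: if the date filter were applied only in the outer query, the row with seqnum = 1 for a given week could fall outside the range and be dropped, which would undercount that week.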