Oracle: count number of rows that occur for each date in column date range

Notice: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/1394153/

Time: 2020-09-18 18:57:35  Source: igfitidea

sql oracle plsql oracle9i

Asked by Will Dieterich

I have a table with data such as below

Group   Start Date   End Date
A       01/01/01     01/03/01
A       01/01/01     01/02/01
A       01/03/01     01/04/01
B       01/01/01     01/01/01
etc.

I am looking to produce a view that gives a count for each day, like this

Group   Date       Count
A       01/01/01   2
A       01/02/01   2
A       01/03/01   2
A       01/04/01   1
B       01/01/01   1

I am using Oracle 9 and am at a total loss on how to handle this, so I am looking for any idea to get me started.
Note: Generating a table to hold the dates is not practical because my final product has to break down to the minute.

Accepted answer by Cade Roux

Typically I solve this kind of problem with a numbers table:

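-- Assumes a helper table Numbers(n) holding the integers 1, 2, 3, ...
-- and a table GroupData(grp, start_date, end_date). GROUP, DATE and NUMBER
-- are reserved words in Oracle, so the columns are renamed here.
WITH Dates AS (
    SELECT DATE '2001-01-01' + Numbers.n - 1 AS dt -- Oracle date arithmetic in place of DateAdd
    FROM Numbers
    WHERE Numbers.n BETWEEN 1 AND 100000 -- Arbitrary date range
)
SELECT GroupData.grp, Dates.dt, COUNT(*)
FROM Dates
JOIN GroupData -- inner join: with a LEFT JOIN, COUNT(*) would report 1 for dates that no range covers
    ON Dates.dt BETWEEN GroupData.start_date AND GroupData.end_date
GROUP BY GroupData.grp, Dates.dt
ORDER BY GroupData.grp, Dates.dt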

Answer by Quassnoi

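-- GROUP is a reserved word in Oracle, so the grouping column is called grouper here.
WITH    q AS
        (
        SELECT  (
                SELECT  MIN(start_date)
                FROM    mytable
                ) + level - 1 AS mydate
        FROM    dual
        CONNECT BY
                level <= (
                -- + 1 so that the last day of the overall range is generated too
                SELECT  MAX(end_date) - MIN(start_date) + 1
                FROM    mytable
                )
        )
SELECT  grouper, mydate,
        (
        SELECT  COUNT(*)
        FROM    mytable mi
        WHERE   mi.grouper = mo.grouper
                AND q.mydate BETWEEN mi.start_date AND mi.end_date
        ) AS ranges
FROM    q
CROSS JOIN
        (
        SELECT  DISTINCT grouper
        FROM    mytable
        ) mo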
Update:

A better and faster query making use of analytic functions.

The main idea is that the number of ranges containing each date is the difference between the count of ranges that started on or before that date and the count of ranges that ended before it.

SELECT  cur_date,
        grouper,
        SUM(COALESCE(scnt, 0) - COALESCE(ecnt, 0)) OVER (PARTITION BY grouper ORDER BY cur_date) AS ranges
FROM    (
        SELECT  (
                SELECT  MIN(start_date)
                FROM    t_range
                ) + level - 1 AS cur_date
        FROM    dual
        CONNECT BY
                level <=
                (
                SELECT  MAX(end_date)
                FROM    t_range
                ) -
                (
                SELECT  MIN(start_date)
                FROM    t_range
                ) + 1
        ) dates
CROSS JOIN
        (
        SELECT  DISTINCT grouper AS grouper
        FROM    t_range
        ) groups
LEFT JOIN
        (
        SELECT  grouper AS sgrp, start_date, COUNT(*) AS scnt
        FROM    t_range
        GROUP BY
                grouper, start_date
        ) starts
ON      sgrp = grouper
        AND start_date = cur_date
LEFT JOIN
        (
        SELECT  grouper AS egrp, end_date, COUNT(*) AS ecnt
        FROM    t_range
        GROUP BY
                grouper, end_date
        ) ends
ON      egrp = grouper
        AND end_date = cur_date - 1
ORDER BY
        grouper, cur_date

This query completes in 1 second on 1,000,000 rows.
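
The counting idea can be checked by hand on the question's sample rows for group A. The following is a minimal sketch, not the author's query: it inlines those three ranges as a hypothetical t_range, assumes the sample dates mean 1 to 4 January 2001, and computes both counts with plain correlated subqueries instead of analytic functions.

```sql
-- Hypothetical inline data: the three group A ranges from the question,
-- read as consecutive days in January 2001 (an assumption about the date format).
WITH t_range AS (
    SELECT DATE '2001-01-01' AS start_date, DATE '2001-01-03' AS end_date FROM dual
    UNION ALL
    SELECT DATE '2001-01-01', DATE '2001-01-02' FROM dual
    UNION ALL
    SELECT DATE '2001-01-03', DATE '2001-01-04' FROM dual
),
day_list AS (
    -- One row per day from 1 to 4 January 2001
    SELECT DATE '2001-01-01' + LEVEL - 1 AS cur_date
    FROM dual
    CONNECT BY LEVEL <= 4
)
SELECT cur_date, started, ended, started - ended AS ranges
FROM (
    SELECT d.cur_date,
           -- ranges that started on or before this day
           (SELECT COUNT(*) FROM t_range r WHERE r.start_date <= d.cur_date) AS started,
           -- ranges that ended strictly before this day
           (SELECT COUNT(*) FROM t_range r WHERE r.end_date < d.cur_date) AS ended
    FROM day_list d
)
ORDER BY cur_date;
```

For each of the four days the difference started - ended should equal the number of ranges covering that day, which for this sample is 2, 2, 2, 1, the counts the question expects for group A.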

See this entry in my blog for more detail:

Answer by Vincent Malgrat

You could use the method described in these SO:

Basically: join with a generated calendar and GROUP BY your subset of columns.

-- Explicit format masks avoid depending on the session's NLS_DATE_FORMAT;
-- the sample dates are read as MM/DD/RR (consecutive days in January 2001).
WITH DATA AS (
SELECT 'A' grp, to_date('01/01/01', 'MM/DD/RR') start_date, to_date('01/03/01', 'MM/DD/RR') end_date FROM DUAL
UNION ALL SELECT 'A', to_date('01/01/01', 'MM/DD/RR'), to_date('01/02/01', 'MM/DD/RR') FROM DUAL
UNION ALL SELECT 'A', to_date('01/03/01', 'MM/DD/RR'), to_date('01/04/01', 'MM/DD/RR') FROM DUAL
UNION ALL SELECT 'B', to_date('01/01/01', 'MM/DD/RR'), to_date('01/01/01', 'MM/DD/RR') FROM DUAL
), calendar AS (
SELECT to_date('01/01/01', 'MM/DD/RR') + ROWNUM - 1 d
  FROM dual
  CONNECT BY LEVEL <= to_date('01/04/01', 'MM/DD/RR') - to_date('01/01/01', 'MM/DD/RR') + 1
)
SELECT data.grp, calendar.d, COUNT(*) cnt
  FROM data
  JOIN calendar ON calendar.d BETWEEN data.start_date AND data.end_date
 GROUP BY data.grp, calendar.d;

GRP D                  CNT
--- ----------- ----------
A   04/01/2001           1
A   02/01/2001           2
B   01/01/2001           1
A   03/01/2001           2
A   01/01/2001           2
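
The question notes that the final result has to break down to the minute. A generated calendar can step by minutes instead of days. The sketch below is only an illustration under stated assumptions: sample_ranges holds two made-up ranges, the window is a made-up four minutes, and NUMTODSINTERVAL adds whole minutes to a DATE.

```sql
-- Hypothetical data: two overlapping ranges with minute precision.
WITH sample_ranges AS (
    SELECT 'A' AS grp,
           TO_DATE('2001-01-01 00:00', 'YYYY-MM-DD HH24:MI') AS start_date,
           TO_DATE('2001-01-01 00:02', 'YYYY-MM-DD HH24:MI') AS end_date
    FROM dual
    UNION ALL
    SELECT 'A',
           TO_DATE('2001-01-01 00:01', 'YYYY-MM-DD HH24:MI'),
           TO_DATE('2001-01-01 00:03', 'YYYY-MM-DD HH24:MI')
    FROM dual
),
calendar AS (
    -- One row per minute; the window start and length (4 minutes) are assumptions
    SELECT TO_DATE('2001-01-01 00:00', 'YYYY-MM-DD HH24:MI')
           + NUMTODSINTERVAL(LEVEL - 1, 'MINUTE') AS d
    FROM dual
    CONNECT BY LEVEL <= 4
)
SELECT r.grp, c.d, COUNT(*) AS cnt
FROM sample_ranges r
JOIN calendar c ON c.d BETWEEN r.start_date AND r.end_date
GROUP BY r.grp, c.d
ORDER BY r.grp, c.d;
```

Because the calendar rows are generated on the fly, no table of minutes has to be stored. The number of generated rows still grows with the length of the window, so it is worth limiting the window to the period of interest.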