Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/15691127/

Time: 2020-09-01 14:31:55  Source: igfitidea

PostgreSQL query to count/group by day and display days with no data

Tags: sql, postgresql, join, group-by

Asked by Marcel Chastain

I need to create a PostgreSQL query that returns:

  • a day
  • the number of objects found for that day

It's important that every single day appear in the results, even if no objects were found on that day. (This has been discussed before but I haven't been able to get things working in my specific case.)

First, I found a SQL query that generates a range of days, which I can join against:

SELECT to_char(date_trunc('day', (current_date - offs)), 'YYYY-MM-DD')
AS date 
FROM generate_series(0, 365, 1) 
AS offs

Results in:

    date    
------------
 2013-03-28
 2013-03-27
 2013-03-26
 2013-03-25
 ...
 2012-03-28
(366 rows)

Now I'm trying to join that to a table named 'sharer_emailshare' which has a 'created' column:

Table 'public.sharer_emailshare'
 column  |           type
---------+--------------------------
 id      | integer
 created | timestamp with time zone
 message | text
 to      | character varying(75)

Here's the best GROUP BY query I have so far:

SELECT d.date, count(se.id) FROM (
    select to_char(date_trunc('day', (current_date - offs)), 'YYYY-MM-DD')
    AS date 
    FROM generate_series(0, 365, 1) 
    AS offs
    ) d 
JOIN sharer_emailshare se 
ON (d.date=to_char(date_trunc('day', se.created), 'YYYY-MM-DD'))  
GROUP BY d.date;

The results:

    date    | count 
------------+-------
 2013-03-27 |    11
 2013-03-24 |     2
 2013-02-14 |     2
(3 rows)

Desired results:

    date    | count 
------------+-------
 2013-03-28 |     0
 2013-03-27 |    11
 2013-03-26 |     0
 2013-03-25 |     0
 2013-03-24 |     2
 2013-03-23 |     0
 ...
 2012-03-28 |     0
(366 rows)

If I understand correctly, this is because I'm using a plain (implied INNER) JOIN, which drops every generated day that has no matching row in the table. This is the expected behavior, as discussed in the postgres docs.

I've looked through dozens of StackOverflow solutions, and all the ones with working queries seem specific to MySQL/Oracle/MSSQL and I'm having a hard time translating them to PostgreSQL.

The guy asking this question found his answer, with Postgres, but put it on a pastebin link that expired some time ago.

I've tried switching to LEFT OUTER JOIN, RIGHT JOIN, RIGHT OUTER JOIN, and CROSS JOIN, using a CASE statement to substitute another value if null, using COALESCE to provide a default value, and so on, but I haven't been able to use them in a way that gets me what I need.

Any assistance is appreciated! And I promise I'll get around to reading that giant PostgreSQL book soon ;)

Accepted answer by Gordon Linoff

You just need a left outer join instead of an inner join:

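SELECT d.date, count(se.id)
FROM (SELECT to_char(date_trunc('day', (current_date - offs)), 'YYYY-MM-DD') AS date
      FROM generate_series(0, 365, 1) AS offs
     ) d
LEFT OUTER JOIN sharer_emailshare se
     -- count(se.id) ignores the NULLs produced for days with no match, so those days count as 0
     ON d.date = to_char(date_trunc('day', se.created), 'YYYY-MM-DD')
GROUP BY d.date
ORDER BY d.date DESC;  -- newest day first, as in the desired results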
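
To try the left join without the real table, here is a minimal self-contained sketch. The sample_shares rows, the fixed anchor date 2013-03-28, and the six-day window are all made up for illustration; they stand in for sharer_emailshare and current_date. It also joins on a date value rather than on formatted text, and pins the time zone to UTC as an assumption so the result does not depend on the session setting.

```sql
-- Hypothetical sample rows standing in for sharer_emailshare.
WITH sample_shares(id, created) AS (
    VALUES (1, TIMESTAMPTZ '2013-03-27 09:15:00+00'),
           (2, TIMESTAMPTZ '2013-03-27 17:40:00+00'),
           (3, TIMESTAMPTZ '2013-03-24 12:00:00+00')
),
-- Six consecutive days counting back from a fixed anchor date.
days AS (
    SELECT DATE '2013-03-28' - offs AS day
    FROM generate_series(0, 5) AS offs
)
SELECT days.day,
       count(s.id) AS share_count   -- counts only non-NULL ids, so empty days give 0
FROM days
LEFT JOIN sample_shares s
       ON (s.created AT TIME ZONE 'UTC')::date = days.day
GROUP BY days.day
ORDER BY days.day DESC;
```

With these sample rows the query returns six rows: 2 for 2013-03-27, 1 for 2013-03-24, and 0 for the other four days. Changing LEFT JOIN to JOIN leaves only the two days that have data.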

Answer by Travis

Extending Gordon Linoff's helpful answer, I would suggest a couple of improvements such as:

  • Use ::date instead of date_trunc('day', ...)
  • Join on a date type rather than a character type (it's cleaner).
  • Use specific date ranges so they're easier to change later. In this case I select a year before the most recent entry in the table - something that couldn't have been done easily with the other query.
  • Compute the totals for an arbitrary subquery (using a CTE). You just have to cast the column of interest to the date type and call it date_column.
  • Include a column for cumulative total. (Why not?)

Here's my query:

WITH dates_table AS (
    SELECT created::date AS date_column FROM sharer_emailshare WHERE showroom_id=5
)
SELECT series_table.date,
       COUNT(dates_table.date_column),
       SUM(COUNT(dates_table.date_column)) OVER (ORDER BY series_table.date)
FROM (
    SELECT (last_date - b.offs) AS date
        FROM (
            SELECT GENERATE_SERIES(0, last_date - first_date, 1) AS offs, last_date from (
                 SELECT MAX(date_column) AS last_date, (MAX(date_column) - '1 year'::interval)::date AS first_date FROM dates_table
            ) AS a
        ) AS b
) AS series_table
LEFT OUTER JOIN dates_table
    ON (series_table.date = dates_table.date_column)
GROUP BY series_table.date
ORDER BY series_table.date

I tested the query, and it produces the same results, plus the column for cumulative total.
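
The same idea can be sketched on a tiny scale with CTEs in place of the nested subqueries. This is only an illustration under stated assumptions: the three sample dates, the 3-day window (instead of one year), and the names bounds, day, daily_count, and running_total are invented here, and the explicit LATERAL join needs PostgreSQL 9.3 or later.

```sql
-- Hypothetical sample data: one date per row of interest.
WITH dates_table(date_column) AS (
    VALUES (DATE '2013-03-24'), (DATE '2013-03-24'), (DATE '2013-03-27')
),
-- Window ends at the most recent entry and starts 3 days earlier.
bounds AS (
    SELECT max(date_column) AS last_date,
           (max(date_column) - INTERVAL '3 days')::date AS first_date
    FROM dates_table
),
-- date - date yields an integer number of days, which drives the series.
series_table AS (
    SELECT bounds.last_date - offs AS day
    FROM bounds
    CROSS JOIN LATERAL generate_series(0, bounds.last_date - bounds.first_date) AS offs
)
SELECT series_table.day,
       count(dates_table.date_column) AS daily_count,
       -- window function applied on top of the per-day aggregate
       sum(count(dates_table.date_column)) OVER (ORDER BY series_table.day) AS running_total
FROM series_table
LEFT JOIN dates_table ON dates_table.date_column = series_table.day
GROUP BY series_table.day
ORDER BY series_table.day;
```

For the sample rows this yields daily counts of 2, 0, 0, 1 for 2013-03-24 through 2013-03-27, with running totals of 2, 2, 2, 3.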

Answer by Marcel Chastain

Based on Gordon Linoff's answer, I realized another problem was that I had a WHERE clause that I didn't mention in the original question.

Instead of a naked WHERE, I made a subquery:

SELECT d.date, count(se.id) FROM (
    select to_char(date_trunc('day', (current_date - offs)), 'YYYY-MM-DD')
    AS date 
    FROM generate_series(0, 365, 1) 
    AS offs
    ) d 
LEFT OUTER JOIN (
    SELECT * FROM sharer_emailshare 
    WHERE showroom_id=5
) se
ON (d.date=to_char(date_trunc('day', se.created), 'YYYY-MM-DD')) 
GROUP BY d.date;
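
A likely reason the naked WHERE got in the way: a WHERE condition on the right-hand table is applied after the join, so the NULL-filled rows that the LEFT JOIN produces for empty days fail the test and disappear. An alternative to the subquery is to put the filter in the ON clause. The sketch below assumes made-up sample_shares rows with a plain date column (created_on) to keep it self-contained; only showroom_id=5 comes from the query above.

```sql
-- Hypothetical sample rows; showroom 7 must not be counted.
WITH sample_shares(id, showroom_id, created_on) AS (
    VALUES (1, 5, DATE '2013-03-27'),
           (2, 7, DATE '2013-03-26'),
           (3, 5, DATE '2013-03-24')
),
days AS (
    SELECT DATE '2013-03-28' - offs AS day
    FROM generate_series(0, 4) AS offs
)
SELECT days.day, count(s.id) AS share_count
FROM days
LEFT JOIN sample_shares s
       ON s.created_on = days.day
      AND s.showroom_id = 5          -- filter as part of the join condition
GROUP BY days.day
ORDER BY days.day DESC;
```

This returns all five days, with 1 for 2013-03-27 and 2013-03-24 and 0 for the rest. Moving the showroom_id test into a WHERE clause after the join would leave only the two days that have showroom 5 rows.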

Answer by Jason Swett

I'll try to provide an answer that includes some explanation. I'll start with the smallest building block and work up.

If you run a query like this:

SELECT series.number FROM generate_series(0, 9) AS series(number)

You get output like this:

 number 
--------
      0
      1
      2
      3
      4
      5
      6
      7
      8
      9
(10 rows)

This can be turned into dates like this:

SELECT CURRENT_DATE + sequential_dates.date AS date
  FROM generate_series(0, 9) AS sequential_dates(date)

Which will give output like this:

    date    
------------
 2019-09-29
 2019-09-30
 2019-10-01
 2019-10-02
 2019-10-03
 2019-10-04
 2019-10-05
 2019-10-06
 2019-10-07
 2019-10-08
(10 rows)
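
As a side note, generate_series can also step over timestamps directly, which avoids the integer offset. A minimal sketch of that variant, assuming the same ten-day window starting today (the alias ts is just a placeholder name):

```sql
-- Step one day at a time from today through nine days from now,
-- then cast each timestamp back to a date.
SELECT ts::date AS date
  FROM generate_series(
           CURRENT_DATE::timestamp,
           (CURRENT_DATE + 9)::timestamp,
           INTERVAL '1 day'
       ) AS ts;
```

This produces the same ten dates as the integer-offset version; which form reads better is mostly a matter of taste.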

Then you can do a query like this (for example), joining the original query as a subquery against whatever table you're ultimately interested in:

   SELECT sequential_dates.date,
          COUNT(calendar_items.*) AS calendar_item_count
     FROM (SELECT CURRENT_DATE + sequential_dates.date AS date
             FROM generate_series(0, 9) AS sequential_dates(date)) sequential_dates
LEFT JOIN calendar_items ON calendar_items.starts_at::date = sequential_dates.date
 GROUP BY sequential_dates.date
 ORDER BY sequential_dates.date

Which will give output like this:

    date    | calendar_item_count 
------------+---------------------
 2019-09-29 |                   1
 2019-09-30 |                   8
 2019-10-01 |                  15
 2019-10-02 |                  11
 2019-10-03 |                   1
 2019-10-04 |                  12
 2019-10-05 |                   0
 2019-10-06 |                   0
 2019-10-07 |                  27
 2019-10-08 |                  24