SQL: Calculating a cumulative total in PostgreSQL

Question

Asked by khairul

I am using COUNT and GROUP BY to get the number of subscribers registered each day:

  SELECT created_at, COUNT(email)
    FROM subscriptions
GROUP BY created_at;

Result:

created_at  count
-----------------
04-04-2011  100
05-04-2011   50
06-04-2011   50
07-04-2011  300

Instead, I want the cumulative total of subscribers as of each day. How do I get this?

created_at  count
-----------------
04-04-2011  100
05-04-2011  150
06-04-2011  200
07-04-2011  500

Answer 1

Answered by intgr

With larger datasets, window functions are the most efficient way to perform these kinds of queries: the table is scanned only once, instead of once for each date as a self-join would do. The query also looks a lot simpler. :) PostgreSQL 8.4 and up support window functions.

This is what it looks like:

SELECT created_at, sum(count(email)) OVER (ORDER BY created_at)
FROM subscriptions
GROUP BY created_at;

Here OVER creates the window; ORDER BY created_at means that it has to sum up the counts in created_at order.
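
As a small self-contained check, the sketch below feeds the daily counts from the question into the same kind of window expression. The daily_counts name and its column names are made up for illustration, and the dates assume the question's values are in DD-MM-YYYY format. With ORDER BY and no explicit frame clause, the window frame defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which is what turns sum() into a running total.

```sql
-- Hypothetical inline data: the daily counts shown in the question.
WITH daily_counts (created_at, daily_count) AS (
    VALUES (DATE '2011-04-04', 100),
           (DATE '2011-04-05',  50),
           (DATE '2011-04-06',  50),
           (DATE '2011-04-07', 300)
)
SELECT created_at,
       daily_count,
       -- Running total over all rows up to and including the current date.
       sum(daily_count) OVER (ORDER BY created_at) AS cumulative_count
FROM daily_counts
ORDER BY created_at;
```

The cumulative_count column should come out as 100, 150, 200 and 500, matching the output the question asks for.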

Edit: If you want to remove duplicate emails within a single day, you can use sum(count(distinct email)). Unfortunately this won't remove duplicates that span different dates.
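
To make that limitation concrete, here is a minimal sketch on a handful of made-up rows. The inline subscriptions data and the example.com addresses are assumptions for illustration only; the CTE merely stands in for the real table.

```sql
-- Hypothetical sample rows standing in for the real subscriptions table.
WITH subscriptions (created_at, email) AS (
    VALUES (DATE '2011-04-04', 'a@example.com'),
           (DATE '2011-04-04', 'a@example.com'),  -- same-day duplicate, removed
           (DATE '2011-04-04', 'b@example.com'),
           (DATE '2011-04-05', 'a@example.com'),  -- cross-day duplicate, still counted
           (DATE '2011-04-05', 'c@example.com')
)
SELECT created_at,
       sum(count(DISTINCT email)) OVER (ORDER BY created_at) AS cumulative_count
FROM subscriptions
GROUP BY created_at
ORDER BY created_at;
```

Each day contributes 2 distinct emails, so the running total is 2 and then 4, even though only 3 different addresses exist overall.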

If you want to remove all duplicates, I think the easiest way is to use a subquery and DISTINCT ON. This attributes each email to its earliest date (because I'm sorting by created_at in ascending order, it picks the earliest one):

SELECT created_at, sum(count(email)) OVER (ORDER BY created_at)
FROM (
    SELECT DISTINCT ON (email) created_at, email
    FROM subscriptions ORDER BY email, created_at
) AS subq
GROUP BY created_at;
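
DISTINCT ON is a PostgreSQL extension. If a more portable form is preferred, the same "earliest date per email" step can probably be expressed with min() and GROUP BY instead. This is a sketch of that swapped-in variant, not the query from the answer; the first_seen and first_subscriptions names are made up here.

```sql
SELECT first_seen AS created_at,
       sum(count(email)) OVER (ORDER BY first_seen) AS cumulative_count
FROM (
    -- One row per email, carrying the earliest date it was seen.
    SELECT email, min(created_at) AS first_seen
    FROM subscriptions
    GROUP BY email
) AS first_subscriptions
GROUP BY first_seen
ORDER BY first_seen;
```

Which of the two forms runs faster is likely to depend on the data and the available indexes, so it is worth comparing them with EXPLAIN ANALYZE.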

If you create an index on (email, created_at), this query shouldn't be too slow either.

(If you want to test, this is how I created the sample dataset:)

create table subscriptions as
   select date '2000-04-04' + (i/10000)::int as created_at,
          '[email protected]' || (i%700000)::text as email
   from generate_series(1,1000000) i;
create index on subscriptions (email, created_at);

Answer 2

Answered by OMG Ponies

Use:

SELECT a.created_at,
       (SELECT COUNT(b.email)
          FROM SUBSCRIPTIONS b
         WHERE b.created_at <= a.created_at) AS count
  FROM SUBSCRIPTIONS a
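
Note that, as written, this query returns one row per row in subscriptions, so a date with 100 subscribers appears 100 times. If one row per date is wanted, a variant along these lines should collapse the repeats; adding DISTINCT, the ORDER BY and the cumulative_count alias are my changes, not part of the answer.

```sql
SELECT DISTINCT
       a.created_at,
       -- Correlated subquery: counts every subscription up to this date.
       (SELECT count(b.email)
          FROM subscriptions b
         WHERE b.created_at <= a.created_at) AS cumulative_count
  FROM subscriptions a
 ORDER BY a.created_at;
```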

Answer 3

Answered by Andriy M

SELECT
  s1.created_at,
  COUNT(s2.email) AS cumul_count
FROM subscriptions s1
  INNER JOIN subscriptions s2 ON s1.created_at >= s2.created_at
GROUP BY s1.created_at

Answer 4

Answered by Endy Tjahjono

I assume you want only one row per day and still want to show days without any subscriptions (if nobody subscribes on a certain date, do you want to show that date with the balance of the previous day?). If this is the case, you can use the 'with' feature:

with recursive serialdates(adate) as (
    select cast('2011-04-04' as date)
    union all
    select adate + 1 from serialdates where adate < cast('2011-04-07' as date)
)
select D.adate,
(
    select count(distinct email)
    from subscriptions
    where created_at between date_trunc('month', D.adate) and D.adate
)
from serialdates D
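
In PostgreSQL the date series can also be produced with generate_series rather than a recursive CTE. The sketch below combines that with the window-function approach from Answer 1. It assumes created_at is a date column, counts plain (not distinct) emails, and only counts subscriptions that fall inside the generated range; the d(day) alias and the column aliases are made up.

```sql
SELECT d.day::date AS adate,
       -- count(s.email) is 0 on days with no match, so the total carries forward.
       sum(count(s.email)) OVER (ORDER BY d.day) AS cumulative_count
FROM generate_series(TIMESTAMP '2011-04-04',
                     TIMESTAMP '2011-04-07',
                     INTERVAL '1 day') AS d(day)
LEFT JOIN subscriptions s ON s.created_at = d.day::date
GROUP BY d.day
ORDER BY d.day;
```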

Answer 5

Answered by mentat

The best way is to have a calendar table: calendar ( date date, month int, quarter int, half int, week int, year int )

Then you can join this table to summarize whichever field you need.
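
As a rough illustration of such a join, the sketch below assumes a calendar table with the columns listed above, populated with one row per day, and a date-typed created_at column; none of this is spelled out in the answer. It produces a monthly count plus a running total, and the column names are quoted only because they coincide with SQL keywords.

```sql
SELECT c."year",
       c."month",
       count(s.email) AS monthly_count,
       -- Running total across months, in calendar order.
       sum(count(s.email)) OVER (ORDER BY c."year", c."month") AS cumulative_count
FROM calendar c
LEFT JOIN subscriptions s ON s.created_at = c."date"
GROUP BY c."year", c."month"
ORDER BY c."year", c."month";
```

Because of the LEFT JOIN, months without any subscriptions still appear, with a monthly_count of 0 and the previous cumulative total.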
