SQL Query for 7 Day Rolling Average in SQL Server

Note: this page is a translation of a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/25922379/

Tags: sql, sql-server, moving-average

Asked by Andy T

I have a table of hourly product usage (how many times the product is used) data –

ID (bigint) | ProductId (tinyint) | Date (int - YYYYMMDD) | Hour (tinyint) | UsageCount (int)
#           | 1                   | 20140901              | 0              | 10
#           | 1                   | 20140901              | 1              | 15
#           | 1                   | 20140902              | 5              | 25
#           | 1                   | 20140903              | 5              | 25
#           | 1                   | 20140904              | 3              | 25
#           | 1                   | 20140905              | 7              | 25
#           | 1                   | 20140906              | 10             | 25
#           | 1                   | 20140907              | 9              | 25
#           | 1                   | 20140908              | 5              | 25
#           | 2                   | 20140903              | 16             | 10
#           | 2                   | 20140903              | 13             | 115
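
To experiment with the queries further down, the sample rows above could be loaded with a script along these lines. This is only a sketch: the table name product_usage is the one the question mentions, the identity key is my assumption (the question shows the ID values only as "#"), and the NOT NULL constraints are assumed as well.

```sql
-- Sketch of the table described above; the identity key and NOT NULL
-- constraints are assumptions, not something the question states.
create table product_usage (
    ID         bigint identity(1,1) primary key,
    ProductId  tinyint not null,
    [Date]     int     not null,   -- YYYYMMDD
    [Hour]     tinyint not null,   -- hour of day
    UsageCount int     not null
);

-- The eleven sample rows from the question.
insert into product_usage (ProductId, [Date], [Hour], UsageCount)
values (1, 20140901,  0,  10),
       (1, 20140901,  1,  15),
       (1, 20140902,  5,  25),
       (1, 20140903,  5,  25),
       (1, 20140904,  3,  25),
       (1, 20140905,  7,  25),
       (1, 20140906, 10,  25),
       (1, 20140907,  9,  25),
       (1, 20140908,  5,  25),
       (2, 20140903, 16,  10),
       (2, 20140903, 13, 115);
```

The answers below refer to the table as tbl or hourly, so the name needs to be adjusted to match whichever query is being tried.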

Likewise, I have the usage data for 4 different products (ProductId from 1 through 4) stored for every hour in the product_usage table. As you can imagine, it is constantly growing as the nightly ETL process dumps the data for the entire previous day. If a product is not used on any hour of a day, the record for that hour won't appear in this table. Similarly, if a product is not used for the entire day, there won't be any record for that day in the table. I need to generate a report that gives daily usage and last 7 days' rolling average –

For example:

ProductId | Date     | DailyUsage                | RollingAverage
1         | 20140901 | sum of usages of that day | (Sum of usages from 20140826 through 20140901) / 7
2         | 20140901 | sum of usages of that day | (Sum of usages from 20140826 through 20140901) / 7
1         | 20140902 | sum of usages of that day | (Sum of usages from 20140827 through 20140902) / 7
2         | 20140902 | sum of usages of that day | (Sum of usages from 20140827 through 20140902) / 7

And so on. I am planning to create an indexed view in SQL Server 2014. Can you think of an efficient SQL query to do this?

Answered by Brian DeMilia

Try:

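-- Inner query: collapse the hourly rows into one row per product per day.
-- Outer query: per product, average the current row and the 6 rows before it.
select x.*,
       avg(dailyusage) over (partition by productid
                             order by date
                             rows between 6 preceding and current row) as rolling_avg
  from (select productid, date, sum(usagecount) as dailyusage
          from tbl
         group by productid, date) x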

Fiddle:

http://sqlfiddle.com/#!6/f674a7/4/0

Replace "avg(dailyusage) over ..." with sum (rather than avg) if what you really want is the sum for the past week. In your title you say you want the average, but later you say you want the sum. Other than that the query is the same, so use whichever you actually want.
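
For reference, the question's example defines the rolling figure as the 7-day sum divided by 7. That differs from avg over the window whenever fewer than 7 rows fall in the frame, because avg divides by the number of rows actually present. A minimal sketch of that variant, reusing the tbl name from the query above; the alias rolling_avg_7 is my own, and dividing by 7.0 rather than 7 is there to avoid integer division on int columns:

```sql
-- Sketch: rolling 7-row sum divided by a fixed 7, as in the question's example.
-- Dividing by 7.0 (not 7) keeps SQL Server from doing integer division.
select x.*,
       sum(dailyusage) over (partition by productid
                             order by date
                             rows between 6 preceding and current row) / 7.0 as rolling_avg_7
  from (select productid, date, sum(usagecount) as dailyusage
          from tbl
         group by productid, date) x;
```

Like the original query, this still counts rows rather than calendar days, so gaps in the data stretch the window.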

As Gordon pointed out, this is really the average over the last 7 dates on which the product was used (the current row plus the 6 preceding rows). That can span more than the past 7 calendar days if some days have no rows for that product because it was not used at all. To get around that, you could use a date table together with your products table.
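
One way the date-table idea might look is sketched below. The helper tables dates(date) and products(productid) are hypothetical: I am assuming dates holds one row per calendar day in the same int YYYYMMDD form and products holds one row per product. Unused days are counted as zero, and the multiplication by 1.0 is there so the average is not truncated to an integer.

```sql
-- Sketch only: dates(date) and products(productid) are assumed helper tables,
-- not tables named in the question.
select p.productid, d.date,
       coalesce(u.dailyusage, 0) as dailyusage,
       avg(coalesce(u.dailyusage, 0) * 1.0)
           over (partition by p.productid
                 order by d.date
                 rows between 6 preceding and current row) as rolling_avg
  from products p
 cross join dates d
  left join (select productid, date, sum(usagecount) as dailyusage
               from tbl
              group by productid, date) u
    on u.productid = p.productid
   and u.date = d.date;
```

Because every product now has a row for every calendar day, 6 preceding rows really are the 6 preceding days. The first 6 days in the dates table still see a shorter window, and in practice the dates table would probably be filtered to the reporting range.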

Answered by Gordon Linoff

You have to be careful if you can be missing data on some days. If I assume that there is data for some product on each day, then this approach will work:

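-- Build every (product, date) pair, then left join the hourly rows so that
-- days on which a product was not used still get a row.
select p.productid, d.date, sum(h.usagecount) as DailyUsage,
       sum(sum(h.usagecount)) over (partition by p.productid order by d.date
                                    rows between 6 preceding and current row) as Sum7day
from (select distinct productid from hourly) p cross join
     (select distinct date from hourly) d left join
     hourly h
     on h.productid = p.productid and h.date = d.date  -- d.date: p has no date column
group by p.productid, d.date;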
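
If the average rather than the sum is wanted, one option is to divide the windowed sum by 7.0 and treat unused days as zero. The sketch below keeps the hourly table name from the query above; the aliases DailyUsage and Avg7day and the use of coalesce are my additions, not part of the original answer.

```sql
-- Sketch: same product x date grid as above, reporting a 7-day average.
-- coalesce turns days with no hourly rows into 0 instead of NULL.
select p.productid, d.date,
       coalesce(sum(h.usagecount), 0) as DailyUsage,
       sum(coalesce(sum(h.usagecount), 0))
           over (partition by p.productid order by d.date
                 rows between 6 preceding and current row) / 7.0 as Avg7day
from (select distinct productid from hourly) p cross join
     (select distinct date from hourly) d left join
     hourly h
     on h.productid = p.productid and h.date = d.date
group by p.productid, d.date;
```

This relies on the same assumption as the answer: every calendar day has data for at least one product, so the distinct dates form a gap-free list.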