MySQL: How to fill missing dates in a range?
Original URL: http://stackoverflow.com/questions/3538858/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
MySQL how to fill missing dates in range?
Asked by Jerry2
I have a table with 2 columns, date and score. It has at most 30 entries, one for each of the last 30 days.
date score
-----------------
1.8.2010 19
2.8.2010 21
4.8.2010 14
7.8.2010 10
10.8.2010 14
My problem is that some dates are missing - I want to see:
date score
-----------------
1.8.2010 19
2.8.2010 21
3.8.2010 0
4.8.2010 14
5.8.2010 0
6.8.2010 0
7.8.2010 10
...
What I need from the single query is to get: 19,21,0,14,0,0,10,0,0,14... That means that the missing dates are filled with 0.
I know how to get all the values and fill in the blanks in a server-side language by iterating through the dates. But is it possible to do this in MySQL, so that I sort the result by date and get the missing pieces filled in?
EDIT: This table has another column named UserID. I have 30.000 users, and some of them have scores in this table. Every day I delete the rows whose date is more than 30 days old, because I only need the last 30 days of scores for each user. The reason is that I am making a graph of user activity over the last 30 days, and to plot the chart I need the 30 values separated by commas. So I want to ask in the query for the activity of USERID=10203 and get back 30 scores, one for each of the last 30 days. I hope that is clearer now.
Accepted answer by OMG Ponies
MySQL (before version 8.0) doesn't have recursive functionality, so you're left with using the NUMBERS table trick:
Create a table that only holds incrementing numbers - easy to do using an auto_increment:
DROP TABLE IF EXISTS `example`.`numbers`;
CREATE TABLE `example`.`numbers` (
  `id` int(10) unsigned NOT NULL auto_increment,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Populate the table using:
INSERT INTO `example`.`numbers` ( `id` ) VALUES ( NULL )
...for as many values as you need.
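Running that single-row INSERT once per number gets tedious. One way to fill the table in a single statement is sketched below. It assumes the `example`.`numbers` table is still empty, and the aliases `ones` and `tens` are made up for illustration; they are not part of the original answer.

```sql
-- Sketch: insert the ids 1..100 in one statement by combining two digit lists.
-- `ones` and `tens` are illustrative aliases.
INSERT INTO `example`.`numbers` (`id`)
SELECT ones.d + 10 * tens.d + 1
FROM (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
      UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
      UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS ones
CROSS JOIN
     (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
      UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
      UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS tens;
```

One hundred rows is more than enough for a 30-day window; a third digit list multiplied by 100 would extend the range to 1000.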
Use DATE_ADD to construct a list of dates, increasing the days based on the NUMBERS.id value. Replace "2010-06-06" and "2010-06-14" with your respective start and end dates (but use the same format, YYYY-MM-DD):
SELECT `x`.*
FROM (SELECT DATE_ADD('2010-06-06', INTERVAL `n`.`id` - 1 DAY)
      FROM `numbers` `n`
      WHERE DATE_ADD('2010-06-06', INTERVAL `n`.`id` - 1 DAY) <= '2010-06-14') x
LEFT JOIN onto your table of data based on the date (replace `your_table` with the name of your own table):
-- `ts` is kept as a date value (not a formatted string) so that it compares correctly with STR_TO_DATE()
SELECT `x`.`ts` AS `timestamp`,
       COALESCE(`y`.`score`, 0) AS `cnt`
FROM (SELECT DATE_ADD('2010-06-06', INTERVAL `n`.`id` - 1 DAY) AS `ts`
      FROM `numbers` `n`
      WHERE DATE_ADD('2010-06-06', INTERVAL `n`.`id` - 1 DAY) <= '2010-06-14') x
LEFT JOIN `your_table` `y` ON STR_TO_DATE(`y`.`date`, '%d.%m.%Y') = `x`.`ts`
If you want to maintain the date format, use the DATE_FORMAT function:
DATE_FORMAT(`x`.`ts`, '%d.%m.%Y') AS `timestamp`
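For the rolling 30-day window described in the question, the fixed start and end dates could be swapped for expressions based on CURDATE(). A minimal sketch follows, under these assumptions: the data lives in a table called `scores` (a made-up name), its `date` column is a real DATE rather than a string, and `numbers` holds at least the ids 1 to 30.

```sql
-- Sketch: one row for each of the last 30 days (today included), 0 where no score exists.
-- `scores` is a hypothetical table name; its `date` column is assumed to be of type DATE.
SELECT DATE_FORMAT(x.ts, '%e.%c.%Y') AS `date`,   -- formats 2010-08-01 as 1.8.2010
       COALESCE(y.score, 0)          AS score
FROM (SELECT DATE_ADD(CURDATE() - INTERVAL 29 DAY, INTERVAL n.id - 1 DAY) AS ts
      FROM numbers n
      WHERE n.id <= 30) x
LEFT JOIN scores y ON y.`date` = x.ts
ORDER BY x.ts;
```

Sorting on `x.ts` rather than on the formatted string keeps the rows in calendar order.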
Answer by Michael Conard
I'm not a fan of the other answers, requiring tables to be created and such. This query does it efficiently without helper tables.
SELECT
IF(score IS NULL, 0, score) AS score,
b.Days AS date
FROM
(SELECT a.Days
FROM (
SELECT curdate() - INTERVAL (a.a + (10 * b.a) + (100 * c.a)) DAY AS Days
FROM (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a
CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b
CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS c
) a
WHERE a.Days >= curdate() - INTERVAL 30 DAY) b
LEFT JOIN your_table
ON date = b.Days
ORDER BY b.Days;
So let's dissect this.
SELECT
IF(score IS NULL, 0, score) AS score,
b.Days AS date
The IF detects days that have no score and sets them to 0. b.Days holds the generated dates, counting back from the current date for as many days as you configured, up to 1000.
(SELECT a.Days
FROM (
SELECT curdate() - INTERVAL (a.a + (10 * b.a) + (100 * c.a)) DAY AS Days
FROM (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a
CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b
CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS c
) a
WHERE a.Days >= curdate() - INTERVAL 30 DAY) b
This subquery is something I saw on stackoverflow. It efficiently generates a list of the past 1000 days from the current date. The interval (currently 30) in the WHERE clause at the end determines which days are returned; the maximum is 1000. This query could be easily modified to return 100s of years worth of dates, but 1000 should be good for most things.
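As a sketch of that modification: adding a fourth digit list multiplied by 1000 extends the generator from 1,000 to 10,000 days (a little over 27 years), and a fifth would reach roughly 270 years. The aliases `d` and `g` and the 365-day cut-off are illustrative choices, not part of the original query.

```sql
-- Sketch: the same generator with a fourth digit list, covering 10,000 days.
-- Returns today and the 365 days before it.
SELECT g.Days
FROM (
    SELECT CURDATE() - INTERVAL (a.a + (10 * b.a) + (100 * c.a) + (1000 * d.a)) DAY AS Days
    FROM (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
          UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a
    CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
          UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b
    CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
          UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS c
    CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
          UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS d
) g
WHERE g.Days >= CURDATE() - INTERVAL 365 DAY
ORDER BY g.Days;
```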
LEFT JOIN your_table
ON date = b.Days
ORDER BY b.Days;
This is the part that brings in your table containing the score. You compare against the date range selected by the date generator query so that 0s can be filled in where needed (the score will be NULL initially, because it is a LEFT JOIN; this is fixed in the SELECT list). I also order it by the dates, just because. This is a preference; you could also order by score.
Before the ORDER BY you could easily join with the user info table you mentioned in your edit, to add that last requirement.
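For the single-user case from the question's edit, one possible sketch is to put the user filter in the ON clause rather than in a WHERE clause, so that days without a row for that user survive the LEFT JOIN and come back as 0. It assumes `your_table` has `UserID`, `date` (of type DATE) and `score` columns; the aliases `t`, `gen` and `d` are illustrative. `INTERVAL 29 DAY` is used so that, counting today, exactly 30 rows come back.

```sql
-- Sketch: the last 30 days for one user, with 0 for days that have no row.
-- Two digit lists (100 days) are enough for a 30-day window.
SELECT IF(t.score IS NULL, 0, t.score) AS score,
       d.Days AS `date`
FROM
    (SELECT gen.Days
     FROM (
        SELECT CURDATE() - INTERVAL (a.a + (10 * b.a)) DAY AS Days
        FROM (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
              UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a
        CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
              UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b
     ) gen
     WHERE gen.Days >= CURDATE() - INTERVAL 29 DAY) d
LEFT JOIN your_table t
       ON t.`date` = d.Days
      AND t.UserID = 10203      -- filter here, not in WHERE, to keep the empty days
ORDER BY d.Days;
```

Moving `t.UserID = 10203` into a WHERE clause would discard every generated day that has no matching row, which defeats the purpose of the outer join.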
I hope this version of the query helps someone. Thanks for reading.
Answer by Soundlink
You can accomplish this by using a Calendar Table. That's a table which you create once and fill with a date range (e.g. one row for each day from 2000 to 2050; the range depends on your data). Then you can make an outer join of the calendar table against your table. If a date is missing in your table, you return 0 for the score.
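A minimal sketch of such a calendar table is shown below. The names `digits`, `calendar`, `dt` and `scores` are all assumptions made for illustration, and `scores`.`date` is assumed to be of type DATE.

```sql
-- Helper table holding the digits 0..9 (illustrative name).
CREATE TABLE digits (n TINYINT UNSIGNED NOT NULL PRIMARY KEY);
INSERT INTO digits (n) VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

-- The calendar table itself: one row per day.
CREATE TABLE calendar (dt DATE NOT NULL PRIMARY KEY);

-- Fill it once with every day from 2000-01-01 through 2050-12-31.
-- Five digit positions generate offsets 0..99999; the WHERE clause trims the excess.
INSERT INTO calendar (dt)
SELECT g.dt
FROM (
    SELECT DATE_ADD(CAST('2000-01-01' AS DATE),
                    INTERVAL (a.n + 10 * b.n + 100 * c.n + 1000 * d.n + 10000 * e.n) DAY) AS dt
    FROM digits a
    CROSS JOIN digits b
    CROSS JOIN digits c
    CROSS JOIN digits d
    CROSS JOIN digits e
) g
WHERE g.dt <= '2050-12-31';

-- Outer join: every calendar day in the range appears, with 0 where no score exists.
SELECT c.dt AS `date`,
       COALESCE(s.score, 0) AS score
FROM calendar c
LEFT JOIN scores s ON s.`date` = c.dt
WHERE c.dt BETWEEN '2010-08-01' AND '2010-08-10'
ORDER BY c.dt;
```

The range filter sits on the calendar side of the join, so it limits which days are listed without removing the days that have no score.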
Answer by GMB
Time went by since this question was asked. MySQL 8.0 was released in 2018 and added support for recursive common table expressions, which provide an elegant, state-of-the-art way to solve this question.
The following query can be used to generate a list of dates, say for the first 15 days of August 2010:
with recursive all_dates(dt) as (
-- anchor
select '2010-08-01' dt
union all
-- recursion with stop condition
select dt + interval 1 day from all_dates where dt + interval 1 day <= '2010-08-15'
)
select * from all_dates
You can then left join this resultset with your table to generate the expected output:
with recursive all_dates(dt) as (
-- anchor
select '2010-08-01' dt
union all
-- recursion with stop condition
select dt + interval 1 day from all_dates where dt + interval 1 day <= '2010-08-15'
)
select d.dt date, coalesce(t.score, 0) score
from all_dates d
left join mytable t on t.date = d.dt
order by d.dt
date       | score
:--------- | ----:
2010-08-01 |    19
2010-08-02 |    21
2010-08-03 |     0
2010-08-04 |    14
2010-08-05 |     0
2010-08-06 |     0
2010-08-07 |    10
2010-08-08 |     0
2010-08-09 |     0
2010-08-10 |    14
2010-08-11 |     0
2010-08-12 |     0
2010-08-13 |     0
2010-08-14 |     0
2010-08-15 |     0
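The question's edit asks for 30 comma-separated values for one user. A possible adaptation of the recursive query is sketched below; it requires MySQL 8.0 or later and assumes that `mytable` has a `UserID` column and that its `date` column is of type DATE. The output alias `scores` is made up.

```sql
-- Sketch: the last 30 days (today included) for one user as a single comma-separated string.
WITH RECURSIVE all_dates (dt) AS (
    -- anchor: 29 days ago
    SELECT CURDATE() - INTERVAL 29 DAY
    UNION ALL
    -- recursion: add one day at a time until today is reached
    SELECT dt + INTERVAL 1 DAY FROM all_dates WHERE dt < CURDATE()
)
SELECT GROUP_CONCAT(COALESCE(t.score, 0) ORDER BY d.dt SEPARATOR ',') AS scores
FROM all_dates d
LEFT JOIN mytable t
       ON t.`date` = d.dt
      AND t.UserID = 10203;   -- user filter in ON so that empty days still count as 0
```

GROUP_CONCAT sorts by date inside the aggregate, so the values come out oldest first regardless of how the join orders the rows.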
Answer by phoenix
Michael Conard's answer is great, but I needed intervals of 15 minutes, where each interval must start exactly on a quarter-hour boundary:
SELECT a.Days
FROM (
SELECT FROM_UNIXTIME( FLOOR( UNIX_TIMESTAMP() / (15 * 60) ) * (15 * 60)) - INTERVAL 15 * (a.a + (10 * b.a) + (100 * c.a)) MINUTE AS Days
FROM (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a
CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b
CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS c
) a
WHERE a.Days >= curdate() - INTERVAL 30 DAY
This rounds the current time down to the previous quarter-hour boundary:
FROM_UNIXTIME( FLOOR( UNIX_TIMESTAMP() / (15 * 60) ) * (15 * 60))
And this subtracts time in 15-minute steps:
- INTERVAL 15 * (a.a + (10 * b.a) + (100 * c.a)) MINUTE
If there's a simpler way to do it, please let me know.
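On MySQL 8.0 or later, a recursive common table expression may be a more compact option. The sketch below reuses the same rounding expression; the CTE name `slots`, the column name `ts` and the one-day look-back are assumptions chosen for illustration.

```sql
-- Sketch: quarter-hour slots going back one day from now (MySQL 8.0+).
WITH RECURSIVE slots (ts) AS (
    -- anchor: the current time rounded down to the quarter hour
    SELECT FROM_UNIXTIME(FLOOR(UNIX_TIMESTAMP() / (15 * 60)) * (15 * 60))
    UNION ALL
    -- recursion: step back 15 minutes at a time, stopping one day ago
    SELECT ts - INTERVAL 15 MINUTE
    FROM slots
    WHERE ts - INTERVAL 15 MINUTE >= NOW() - INTERVAL 1 DAY
)
SELECT ts
FROM slots
ORDER BY ts;
```

A one-day window yields fewer than 100 rows, well under MySQL's default recursion limit of 1000; longer windows may need a higher `cte_max_recursion_depth`.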