在 PostgreSQL 中查找重叠的日期范围

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/4480715/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-20 00:26:52  来源:igfitidea点击:

Find overlapping date ranges in PostgreSQL

sqlpostgresqloverlapdate-range

提问by aocferreira

Is this correct?

这样对吗?

SELECT * 
FROM   contract 
JOIN   team USING (name_team) 
JOIN   player USING(name_player) 
WHERE  name_team = ? 
AND    DATE_PART('YEAR',date_join)>= ? 
AND    DATE_PART('YEAR',date_leave)<= ?

My table contracthas the player name, team name and the dates when he joined and left the club.
I want to make a function listing all players that were on the team in specific years.
The above query doesn't seem to be working ...

我的表上contract有球员姓名、球队名称以及他加入和离开俱乐部的日期。
我想制作一个功能,列出特定年份球队中的所有球员。
上面的查询似乎不起作用......

采纳答案by Scott Marlowe

Why not use between without the date part thing:

为什么不在没有日期部分的情况下使用:

WHERE datefield BETWEEN '2009-10-10 00:00:00' AND '2009-10-11 00:00:00'

or something like that?

或类似的东西?

回答by Erwin Brandstetter

The currently accepted answerdoes not answer the question. And it is wrong in principle. a BETWEEN x AND ytranslates to:

目前接受的答案不回答这个问题。而且原则上是错误的。a BETWEEN x AND y翻译成:

a >= x AND a <= y

Includingthe upper border, while people typically need to excludeit:

包括上边界,而人们通常需要排除它:

a >= x AND a < y

With datesyou can easily adjust. For the year 2009 use '2009-12-31' as upper border.
But it's not as simple with timestampswhich allow fractional digits. Modern Postgres versions use an 8-byte integer internally to store up to 6 fractional seconds (μs resolution). Knowing this we couldstill make it work, but that's not intuitive and depends on an implementation detail. Bad idea.

有了日期,您可以轻松调整。对于 2009 年,使用“2009-12-31”作为上边界。
但是对于允许小数位数的时间戳,它并不那么简单。现代 Postgres 版本在内部使用 8 字节整数来存储最多 6 个小数秒(μs 分辨率)。知道了这一点,我们可能仍然使它的工作,但是这并不直观,依赖于实现细节。馊主意。

Moreover, a BETWEEN x AND ydoes not find overlapping ranges. We need:

此外,a BETWEEN x AND y没有找到重叠的范围。我们需要:

b >= x AND a < y

And players that never leftare not considered, yet.

从未离开的球员不被考虑在内。

Proper answer

正确答案

Assuming the year 2009, I'll rephrase the question without changing its meaning:

假设为 year 2009,我将在不改变其含义的情况下重新表述这个问题:

"Find all players of a given team who joined before 2010 and did not leave before 2009."

“查找给定球队中所有在 2010 年之前加入但在 2009 年之前没有离开的球员。”

Basic query:

基本查询:

SELECT p.* 
FROM   team     t
JOIN   contract c USING (name_team) 
JOIN   player   p USING (name_player) 
WHERE  t.name_team = ? 
AND    c.date_join  <  date '2010-01-01'
AND    c.date_leave >= date '2009-01-01';

But there is more:

但还有更多:

If referential integrity is enforced with FK constraints, the table teamitself is just noise in the query and can be removed.

如果使用 FK 约束强制执行参照完整性,则表team本身只是查询中的噪音,可以删除。

While the same player can leave and rejoin the same team, we also need to fold possible duplicates, for instance with DISTINCT.

虽然同一个玩家可以离开并重新加入同一个团队,但我们还需要折叠可能的重复项,例如使用DISTINCT.

And we mayneed to provide for a special case: players that never left. Assuming those players have NULL in date_leave.

我们可能需要提供一种特殊情况:从未离开的球员。假设这些玩家在date_leave.

"A player that is not known to have left is assumed to be playing for the team to this day."

“一个不知道已经离开的球员被认为是为球队效力至今。”

Refined query:

细化查询:

SELECT DISTINCT p.* 
FROM   contract c
JOIN   player   p USING (name_player) 
WHERE  c.name_team = ? 
AND    c.date_join  <  date '2010-01-01'
AND   (c.date_leave >= date '2009-01-01' OR c.date_leave IS NULL);

Operator precedenceworks against us, ANDbinds before OR. We need parentheses.

运算符优先级对我们不利,AND在 之前绑定OR。我们需要括号。

Related answer with optimized DISTINCT(if duplicates are common):

优化的相关答案DISTINCT(如果重复很常见):

Typically, namesof natural persons aren't unique and a surrogate primary key is used. But, obviously, name_playeris the primary key of player. If all you need is player names we don't need the table playerin the query, either:

通常,自然人的姓名不是唯一的,并且使用代理主键。但是,显然,name_player是 的主键player。如果您只需要玩家姓名,我们也不需要player查询中的表,或者:

SELECT DISTINCT name_player 
FROM   contract
WHERE  name_team = ? 
AND    date_join  <  date '2010-01-01'
AND   (date_leave >= date '2009-01-01' OR date_leave IS NULL);

SQL OVERLAPSoperator

SQLOVERLAPS运算符

The manual:

手册:

OVERLAPSautomatically takes the earlier value of the pair as the start. Each time period is considered to represent the half-open interval start <= time < end, unless startand endare equal in which case it represents that single time instant.

OVERLAPS自动以该对的较早值作为开始。每个时间段都被认为代表半开区间start <= time < end,除非startend相等,在这种情况下它代表单个时间瞬间。

To take care of potential NULLvalues, COALESCEseems easiest:

要处理潜在的NULL价值,COALESCE似乎最简单:

SELECT DISTINCT name_player 
FROM   contract
WHERE  name_team = ? 
AND    (date_join, COALESCE(date_leave, CURRENT_DATE)) OVERLAPS
       (date '2009-01-01', date '2010-01-01');  -- upper bound excluded

Range type with index support

带索引支持的范围类型

In Postgres 9.2 or lateryou can also operate with actual range types:

在 Postgres 9.2 或更高版本中,您还可以使用实际范围类型进行操作

SELECT DISTINCT name_player 
FROM   contract
WHERE  name_team = ? 
AND    daterange(date_join, date_leave) &&
       daterange '[2009-01-01,2010-01-01)';  -- upper bound excluded

Range types add some overhead and occupy more space. 2 x date= 8 bytes; 1 x daterange= 14 bytes on disk or 17 bytes in RAM. But in combination with the overlap operator &&the query can be supported with a GiST index.

范围类型会增加一些开销并占用更多空间。2 x date= 8 个字节;1 x daterange= 磁盘上的 14 个字节或 RAM 中的 17 个字节。但是结合重叠运算符&&,可以使用 GiST 索引支持查询。

Also, no need to special-case NULL values. NULL means "open range" in a range type - exactly what we need. The table definition doesn't even have to change: we can create the range type on the fly - and support the query with a matching expression index:

此外,不需要特殊情况下的 NULL 值。NULL 表示范围类型中的“开放范围”——正是我们所需要的。表定义甚至不必更改:我们可以动态创建范围类型 - 并支持具有匹配表达式索引的查询:

CREATE INDEX mv_stock_dr_idx ON mv_stock USING gist (daterange(date_join, date_leave));

Related:

有关的: