PostgreSQL query to detect overlapping time ranges

Question

Asked by EM0

I have a table in PostgreSQL 9.2 that looks like this (simplified):

CREATE TABLE my_features
(
  id integer NOT NULL,
  feature_id integer NOT NULL,
  begin_time timestamp NOT NULL,
  end_time timestamp
)

For each feature_id there may be multiple rows with time ranges specified by begin_time/end_time. They may overlap, but this is relatively rare. I'm looking for a fast way to find all feature_ids that have (or don't have) any overlaps.

I tried to do this using window functions, like this:

SELECT feature_id, bool_or(end_time > lead(begin_time) OVER ts_win) OVER ts_win AS overlaps_any
FROM my_features
WINDOW ts_win AS (PARTITION BY feature_id ORDER BY begin_time)

... but this doesn't work:

ERROR:  window function calls cannot be nested

The algorithm is simple: order the rows for a given feature_id by begin_time and check if any end_time > the next begin_time (if any). I suspect there must be an easy way to do this, perhaps with tsrange functions, but can't seem to find it just now.

Answer 1

Answered by a_horse_with_no_name

This can indeed be done using range types.

The following selects all those rows that do have overlapping ranges:

select f1.*
from my_features f1
where exists (select 1
              from my_features f2
              where tsrange(f2.begin_time, f2.end_time, '[]') && tsrange(f1.begin_time, f1.end_time, '[]')
                and f2.feature_id = f1.feature_id
                and f2.id <> f1.id);

When you change the condition to NOT EXISTS, you'll find the rows that don't have any overlapping ranges.
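
The question asks for feature_ids rather than rows. Note that the row-level NOT EXISTS form still returns the non-overlapping rows of a feature whose other rows do overlap, so it cannot simply be wrapped in DISTINCT. A minimal sketch of both feature-level variants follows, assuming the my_features table from the question and assuming id is unique per row (the question does not declare a key):

```sql
-- feature_ids that have at least one pair of overlapping rows
select distinct f1.feature_id
from my_features f1
where exists (select 1
              from my_features f2
              where f2.feature_id = f1.feature_id
                and f2.id <> f1.id
                and tsrange(f2.begin_time, f2.end_time, '[]') && tsrange(f1.begin_time, f1.end_time, '[]'));

-- feature_ids with no overlaps at all:
-- every feature_id minus the ones that have an overlapping pair
select feature_id
from my_features
except
select f1.feature_id
from my_features f1
where exists (select 1
              from my_features f2
              where f2.feature_id = f1.feature_id
                and f2.id <> f1.id
                and tsrange(f2.begin_time, f2.end_time, '[]') && tsrange(f1.begin_time, f1.end_time, '[]'));
```

EXCEPT already removes duplicates, so the second query needs no DISTINCT.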

SQLFiddle example: http://sqlfiddle.com/#!15/40b1e/1

tsrange(f2.begin_time, f2.end_time, '[]') creates a range that includes both the lower and the upper bound. You can also create ranges that exclude either one or both.
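
The choice of bounds matters when one range ends exactly where the next begins. The sketch below uses made-up timestamps (not taken from the question) to compare closed '[]' ranges with half-open '[)' ranges:

```sql
select
    -- closed ranges: both contain 12:00, so they overlap
    tsrange(timestamp '2024-01-01 00:00', timestamp '2024-01-01 12:00', '[]')
    && tsrange(timestamp '2024-01-01 12:00', timestamp '2024-01-02 00:00', '[]') as closed_overlap,     -- true
    -- half-open ranges: the first one stops just before 12:00, so they only touch
    tsrange(timestamp '2024-01-01 00:00', timestamp '2024-01-01 12:00', '[)')
    && tsrange(timestamp '2024-01-01 12:00', timestamp '2024-01-02 00:00', '[)') as half_open_overlap;  -- false
```

If back-to-back periods should not count as overlapping, '[)' is probably the better fit than '[]'.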

More details can be found in the manual:
http://www.postgresql.org/docs/current/static/rangetypes.html#RANGETYPES-INCLUSIVITY

The && operator checks whether two ranges overlap: http://www.postgresql.org/docs/current/static/functions-range.html

(I just wish Oracle had something fancy like that...)

Answer 2

Answered by Gordon Linoff

Here is an observation. If there are overlapping time periods for a feature, then at least one time period overlaps with the preceding one as defined by begin_time. (You can look at this the other way. If there are no such overlaps then there is always a gap between one time frame and the next and nothing overlaps.)

This leads to the following query for determining overlaps:

select f.feature_id
from (select f.feature_id,
             -- flag a row whose predecessor (by begin_time) ends after this row begins
             (case when lag(end_time) over (partition by feature_id order by begin_time) > begin_time
                   then 1 else 0
              end) as HasOverlap
      from my_features f
     ) f
group by f.feature_id
having max(HasOverlap) = 1;
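
A variant of the same idea, sketched under two assumptions that the answer does not state: a NULL end_time is treated as an open-ended period (the table in the question allows NULL there), and the caller wants one row per feature_id with a boolean flag, as in the question's overlaps_any attempt. It avoids the nested window call by computing lag() in a subquery and applying bool_or as an ordinary aggregate:

```sql
select feature_id,
       -- bool_or ignores NULLs; a feature with a single row has no predecessor at all,
       -- so coalesce turns the resulting NULL into false
       coalesce(bool_or(prev_end > begin_time), false) as overlaps_any
from (select feature_id,
             begin_time,
             -- assumption: NULL end_time means "still open", so treat it as infinity
             lag(coalesce(end_time, 'infinity'::timestamp))
                 over (partition by feature_id order by begin_time) as prev_end
      from my_features) f
group by feature_id;
```

Filtering on overlaps_any (or its negation) in an outer query gives the features with and without overlaps. The strict > comparison means periods that merely touch are not reported, which differs from the '[]' ranges used in Answer 1.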
