PostgreSQL: What is the fastest way to truncate timestamps to 5 minutes in Postgres?

Question

Asked by DNS

Postgres can round (truncate) timestamps using the date_trunc function, like this:

date_trunc('hour', val)
date_trunc('minute', val)

I'm looking for a way to truncate a timestamp down to a 5-minute boundary so that, for example, 14:26:57 becomes 14:25:00. The straightforward way to do it is like this:

date_trunc('hour', val) + date_part('minute', val)::int / 5 * interval '5 min'
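
To make the arithmetic concrete, here is a small self-contained sketch that applies the expression to the example value from the question. The literal date and the aliases t and bucket are only illustrative; the subselect stands in for a real table.

```sql
-- date_part('minute', val) is 26; integer division 26 / 5 gives 5;
-- 5 * interval '5 min' adds 25 minutes to the truncated hour.
SELECT date_trunc('hour', val)
       + date_part('minute', val)::int / 5 * interval '5 min' AS bucket
FROM (SELECT timestamp '2024-01-01 14:26:57' AS val) AS t;
-- bucket: 2024-01-01 14:25:00
```

The ::int cast is what makes this truncate: without it the division would be floating point and the result would keep the fractional part of the 5-minute slice.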

Since this is a performance-critical part of the query, I'm wondering whether this is the fastest solution, or whether there's some shortcut (compatible with Postgres 8.1+) that I've overlooked.

Answer 1

Accepted answer by a_horse_with_no_name

I don't think there is any quicker method.

And I don't think you should be worried about the performance of the expression.

Everything else that is involved in executing your (SELECT, UPDATE, ...) statement is most probably a lot more expensive (e.g. the I/O to retrieve rows) than that date/time calculation.
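
One way to check this on your own data is to time the same scan with and without the expression. This is a rough sketch, assuming a hypothetical table events with a timestamp column ts (neither name comes from the question).

```sql
-- Baseline: cost of reading the rows and returning the raw column.
EXPLAIN ANALYZE
SELECT ts FROM events;

-- Same scan with the 5-minute truncation evaluated for every row.
EXPLAIN ANALYZE
SELECT date_trunc('hour', ts)
       + date_part('minute', ts)::int / 5 * interval '5 min'
FROM events;
```

The difference between the two reported total times approximates what the expression itself costs. Run each statement several times, since caching can distort the first run.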

Answer 2

Answered by André C. Andersen

I was wondering the same thing. I found two alternative ways of doing this, but the one you suggested was faster.

I informally benchmarked against one of our larger tables. I limited the query to the first 4 million rows. I alternated between the two queries in order to avoid giving one an unfair advantage due to database caching.

Going through epoch/unix time

SELECT to_timestamp(
    floor(EXTRACT(epoch FROM ht.time) / EXTRACT(epoch FROM interval '5 min'))
    * EXTRACT(epoch FROM interval '5 min')
) FROM huge_table AS ht LIMIT 4000000

(Note that this produces a timestamptz even if you used a time-zone-unaware data type.)
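
If a plain timestamp is wanted back, one option is to convert the result with AT TIME ZONE 'UTC'. This is a sketch, assuming the input column is timestamp without time zone; the literal value and the aliases are illustrative, and 300 is simply EXTRACT(epoch FROM interval '5 min') written out.

```sql
-- For timestamp without time zone, EXTRACT(epoch ...) returns nominal seconds
-- since 1970-01-01 00:00:00, so reading the result back in UTC restores the
-- original wall-clock value regardless of the session time zone.
SELECT to_timestamp(floor(extract(epoch FROM val) / 300) * 300)
       AT TIME ZONE 'UTC' AS bucket
FROM (SELECT timestamp '2024-01-01 14:26:57' AS val) AS t;
-- bucket: 2024-01-01 14:25:00
```

A bare cast to timestamp would instead use the session's time zone setting, which shifts the value whenever that setting is not UTC.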

Results

Run 1: 39.368 seconds
Run 3: 39.526 seconds
Run 5: 39.883 seconds

Using date_trunc and date_part

SELECT 
    date_trunc('hour', ht.time) 
    + date_part('minute', ht.time)::int / 5 * interval '5 min'
FROM huge_table AS ht LIMIT 4000000

Results

Run 2: 34.189 seconds
Run 4: 37.028 seconds
Run 6: 32.397 seconds

System

DB version: PostgreSQL 9.6.2 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit
Cores: Intel® Xeon® E5-1650v2, Hexa-Core
RAM: 64 GB, DDR3 ECC RAM

Conclusion

Your version seems to be faster, but not fast enough for my specific use case. Not having to specify the hour makes the epoch version more versatile and gives simpler parameterization in client-side code: it handles 2 hour intervals just as well as 5 minute intervals without having to bump the date_trunc time unit argument up. On a final note, I wish this time unit argument were changed to a time interval argument instead.
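
To illustrate the parameterization point, the sketch below writes the bucket size once as an interval, so switching from 2 hours to 5 minutes means changing a single literal. The aliases bucket_width, t and b and the sample value are illustrative. The second statement uses date_bin, which is only available on PostgreSQL 14 or later and takes an interval argument directly.

```sql
-- Epoch-based truncation with the bucket size supplied as one interval value.
SELECT to_timestamp(
           floor(extract(epoch FROM val) / extract(epoch FROM b.bucket_width))
           * extract(epoch FROM b.bucket_width)
       ) AT TIME ZONE 'UTC' AS bucket
FROM (SELECT timestamp '2024-01-01 14:26:57' AS val) AS t,
     (SELECT interval '2 hours' AS bucket_width) AS b;
-- bucket: 2024-01-01 14:00:00

-- PostgreSQL 14+ only: date_bin(stride, source, origin).
SELECT date_bin(interval '2 hours',
                timestamp '2024-01-01 14:26:57',
                timestamp '2000-01-01 00:00:00') AS bucket;
-- bucket: 2024-01-01 14:00:00
```

Both statements align buckets to even hours here because the epoch origin and the chosen date_bin origin both fall on midnight.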

Answer 3

Answered by Benjamin Crouzier

Full query for those wondering (based on @DNS's question):

Assuming you have orders and you want to count them by 5-minute slices and shop_id:

SELECT date_trunc('hour', created_at)
       + date_part('minute', created_at)::int / 5 * interval '5 min' AS minute,
       shop_id,
       count(id) AS orders_count
FROM orders
GROUP BY 1, shop_id
ORDER BY 1 ASC
