Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/2461947/

Time: 2020-09-20 00:04:18  Source: igfitidea

Speeding up PostgreSQL query where data is between two dates

sql, performance, postgresql, spatial-index

Asked by Roger

I have a large table (> 50m rows) which has some data with an ID and timestamp:

id, timestamp, data1, ..., dataN

...with a multi-column index on (id, timestamp).

I need to query the table to select all rows with a certain ID where the timestamp is between two dates, which I am currently doing using:

SELECT * FROM mytable WHERE id = x AND timestamp BETWEEN y AND z

This currently takes over 2 minutes on a high-end machine (2x 3 GHz dual-core Xeons w/HT, 16 GB RAM, 2x 1 TB drives in RAID 0) and I'd really like to speed it up.

I have found this tip, which recommends using a spatial index, but the example it gives is for IP addresses. Still, the speed increase (436s to 3s) is impressive.

How can I use this with timestamps?

Answered by Konrad Garus

That tip is only suitable when you have two columns A and B and use queries like:

where 'a' between A and B

It does not apply to queries like:

where A between 'a' and 'b'
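
To make the distinction concrete, here is a rough sketch of both query shapes. The ip_ranges table, its range_start and range_end columns, and all literal values are hypothetical and only serve as illustration.

```sql
-- Shape the tip targets: one constant tested against two columns
-- (hypothetical ip_ranges table).
SELECT *
FROM ip_ranges
WHERE 12345 BETWEEN range_start AND range_end;

-- Shape in the question: one column tested against two constants.
-- An ordinary B-tree index on (id, "timestamp") can already serve this.
SELECT *
FROM mytable
WHERE id = 42
  AND "timestamp" BETWEEN '2010-01-01' AND '2010-02-01';
```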

Using an index on date(column) rather than on column could speed it up a little bit.
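
A minimal sketch of such an expression index, assuming the column is a timestamp without time zone (for timestamp with time zone, date() is not immutable and PostgreSQL would reject the index). The index name and literal values are made up for illustration.

```sql
-- Index the day rather than the full timestamp (hypothetical index name).
CREATE INDEX mytable_id_day_idx
    ON mytable (id, (date("timestamp")));

-- The query has to use the same expression for the index to be considered.
SELECT *
FROM mytable
WHERE id = 42
  AND date("timestamp") BETWEEN '2010-01-01' AND '2010-01-31';
```

This only helps when whole-day granularity is enough for the range bounds.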

Answered by Frank Heikens

Could you EXPLAIN the query for us? Then we would know how the database executes your query. And what about the configuration? What are the settings for shared_buffers and work_mem? When did you (or your system) last run VACUUM and ANALYZE? And one last thing: what OS and PostgreSQL version are you using?

You can create wonderful indexes, but without proper settings the database can't use them very efficiently.
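
The information asked for above could be gathered roughly as follows. The id and date literals are placeholders, and mytable is the table name from the question.

```sql
-- Show the actual plan and timings (this executes the query).
EXPLAIN ANALYZE
SELECT *
FROM mytable
WHERE id = 42
  AND "timestamp" BETWEEN '2010-01-01' AND '2010-02-01';

-- Current memory settings.
SHOW shared_buffers;
SHOW work_mem;

-- Last manual and automatic VACUUM / ANALYZE for the table.
SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'mytable';

-- Server version.
SELECT version();
```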

Answered by KM.

Make sure the index is on (TableID, TableTimestamp), and that you run a query like:

SELECT
    ....
    FROM YourTable
    WHERE TableID=..YourID.. 
        AND TableTimestamp>=..startrange.. 
        AND TableTimestamp<=..endrange..

If you apply functions to the table's TableTimestamp column in the WHERE clause, you will not be able to use the index completely.
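
As a sketch of what that means in practice, compare a predicate that wraps the column in a function with an equivalent range on the bare column. The literal values are placeholders, and the example assumes TableTimestamp is a timestamp without time zone.

```sql
-- The function call hides TableTimestamp from a plain
-- (TableID, TableTimestamp) index; only TableID can be matched.
SELECT *
FROM YourTable
WHERE TableID = 42
  AND date_trunc('day', TableTimestamp) = '2010-01-15';

-- Same rows, written as a half-open range on the bare column,
-- so both index columns can be used.
SELECT *
FROM YourTable
WHERE TableID = 42
  AND TableTimestamp >= '2010-01-15'
  AND TableTimestamp <  '2010-01-16';
```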

If you are already doing all of this, then your hardware might not be up to the task.

If you are using version 8.2 or later, you should try:

WHERE (TableID, TableTimestamp) >= (..YourID.., ..startrange.. ) 
    and (TableID, TableTimestamp) <= (..YourID.., ..endrange..)
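
Filled in with placeholder values, the row-comparison form might look like the sketch below. Rows are compared column by column from left to right, so with the same TableID in both bounds this selects exactly the rows where TableID = 42 and TableTimestamp lies within the range, bounds included.

```sql
SELECT *
FROM YourTable
WHERE (TableID, TableTimestamp) >= (42, '2010-01-01'::timestamp)
  AND (TableID, TableTimestamp) <= (42, '2010-02-01'::timestamp);
```

Whether this is actually faster than the plain form is worth checking with EXPLAIN ANALYZE on your own data.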