PostgreSQL performance tuning: create an index for a boolean column
Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/12025094/
Performance Tuning: Create index for boolean column
Asked by Pavunkumar
I have written a daemon processor which fetches rows from one database and inserts them into another for synchronization. It fetches rows based on a boolean indication flag, sync_done.
My table has hundreds of thousands of rows. When I select all rows where sync_done is false, will it cause any database performance issues? Should I index the sync_done column to improve performance, since only rows with a sync_done value of false are fetched?
Say I have 10000 rows. Of those, 9500 have already been synchronized (sync_done is true) and will not be selected.
Please suggest how I might proceed.
Answered by Erwin Brandstetter
For a query like this, a partial index would serve you best.
CREATE INDEX ON tbl (id) WHERE sync_done = FALSE;
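To see the effect of a partial index without a PostgreSQL server at hand, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. SQLite is not PostgreSQL, but it accepts the same CREATE INDEX ... WHERE form. The payload column and the index name tbl_unsynced_idx are made up for illustration; the row counts come from the question.

```python
import sqlite3

# SQLite stand-in for the PostgreSQL table (assumption: in-memory database).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl ("
    " id INTEGER PRIMARY KEY,"
    " payload TEXT,"                       # hypothetical data column
    " sync_done BOOLEAN NOT NULL DEFAULT 0)"
)

# 10000 rows, of which the first 9500 are already synchronized.
conn.executemany(
    "INSERT INTO tbl (id, payload, sync_done) VALUES (?, ?, ?)",
    [(i, f"row {i}", 1 if i <= 9500 else 0) for i in range(1, 10001)],
)

# Partial index: only rows with sync_done = 0 (false) get an index entry,
# so the index covers 500 rows instead of 10000.
conn.execute("CREATE INDEX tbl_unsynced_idx ON tbl (id) WHERE sync_done = 0")

# The query repeats the index predicate so the planner is able to use the index.
unsynced_ids = [
    row[0]
    for row in conn.execute("SELECT id FROM tbl WHERE sync_done = 0 ORDER BY id")
]
print(len(unsynced_ids))  # 500

# Once a row is marked as synchronized it drops out of the partial index.
conn.execute("UPDATE tbl SET sync_done = 1 WHERE id = 9501")
remaining = conn.execute(
    "SELECT count(*) FROM tbl WHERE sync_done = 0"
).fetchone()[0]
print(remaining)  # 499
```

The point of the design is that the index stays small no matter how many synchronized rows pile up, because only the pending rows are ever indexed.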
However, for a use case like this, other synchronization methods may be preferable to begin with:
- Have a look at LISTEN/NOTIFY.
- Or use a trigger in combination with dblink.
- Or one of the many available replication methods. Streaming Replication was added with Postgres 9.0 and has become increasingly popular.
Answered by Johan
I suggest that you do not index the table (the boolean is a low-cardinality field), but partition it on the boolean value instead.
See: http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html
Answered by Bharat Sinha
A table with records and a boolean field should be the way to do it.
Here is something which I believe might help you...
Answered by Gavin
An index will certainly help, but polling can impose load and concurrency issues if your database is heavily used. It might be worth considering a notification method such as AMQP, or a trigger/database-queue-based approach like Slony or Skytools Londiste, instead. I have used both Slony and Londiste for trigger-based replication and have found both excellent. My preference is for Londiste, as it is much simpler to set up and manage (and if you have a simple use case, stick to the older 2. branch).
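To illustrate the general idea behind a trigger/database-queue approach, here is a rough sketch, again using Python's sqlite3 as a stand-in. It is not how Slony or Londiste are implemented; those tools do far more. The sync_queue table, the tbl_enqueue trigger and the drain_queue helper are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl (id INTEGER PRIMARY KEY, payload TEXT);

    -- Hypothetical queue table: one entry per row awaiting synchronization.
    CREATE TABLE sync_queue (tbl_id INTEGER PRIMARY KEY);

    -- The trigger records every newly inserted row in the queue.
    CREATE TRIGGER tbl_enqueue AFTER INSERT ON tbl
    BEGIN
        INSERT INTO sync_queue (tbl_id) VALUES (NEW.id);
    END;
    """
)

conn.executemany("INSERT INTO tbl (payload) VALUES (?)", [("a",), ("b",), ("c",)])


def drain_queue(connection):
    """Return the queued row ids in order and remove them from the queue."""
    ids = [
        row[0]
        for row in connection.execute(
            "SELECT tbl_id FROM sync_queue ORDER BY tbl_id"
        )
    ]
    # A real worker would copy these rows to the target database at this point.
    connection.executemany(
        "DELETE FROM sync_queue WHERE tbl_id = ?", [(i,) for i in ids]
    )
    return ids


first_batch = drain_queue(conn)
second_batch = drain_queue(conn)
print(first_batch)   # [1, 2, 3]
print(second_batch)  # []
```

The worker only ever reads the small queue table, so the large main table is never scanned for pending rows and no sync_done flag is needed on it.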