PostgreSQL: GIN or GiST indexes?
Note: this page reproduces a popular Stack Overflow question and its accepted answer under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/21830/
Asked by Grey Panther
From what information I could find, they both solve the same problems: more esoteric operations like array containment and intersection (&&, @>, <@, etc). However, I would be interested in advice about when to use one or the other (or possibly neither).
The PostgreSQL documentation has some information about this:
- GIN index lookups are about three times faster than GiST
- GIN indexes take about three times longer to build than GiST
- GIN indexes are about ten times slower to update than GiST
- GIN indexes are two-to-three times larger than GiST
However, I would be particularly interested to know whether there is a performance impact when the ratio of memory to index size gets small (i.e. the index becomes much bigger than the available memory). I've been told on the #postgresql IRC channel that GIN needs to keep the whole index in memory, otherwise it won't be effective, because, unlike B-Tree, it doesn't know which part to read in from disk for a particular query. The question is: is this true (I've also been told the opposite)? Does GiST have the same restriction? Are there other restrictions I should be aware of when using either of these index types?
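As a rough illustration of the array operators mentioned in the question, here is a toy Python sketch of an inverted index (GIN stands for Generalized Inverted Index). The class ToyInvertedIndex and its method names are made up for this example, and nothing here reflects how PostgreSQL actually lays out, pages, or caches a GIN index. It only shows the general idea: each array element maps to the set of rows containing it, so `&&` becomes a union of those sets and `@>` an intersection, while inserting one row touches one entry per distinct element.

```python
from collections import defaultdict


class ToyInvertedIndex:
    """Toy inverted index: each array element maps to the row ids containing it.

    Hypothetical illustration only; not PostgreSQL's GIN implementation.
    """

    def __init__(self):
        self.postings = defaultdict(set)  # element -> set of row ids

    def insert(self, row_id, elements):
        """Index one row and return how many index entries were touched."""
        keys = set(elements)
        for key in keys:
            self.postings[key].add(row_id)
        return len(keys)

    def overlaps(self, query):
        """Rows sharing at least one element with `query` (in the spirit of &&)."""
        result = set()
        for key in set(query):
            result |= self.postings.get(key, set())
        return result

    def contains(self, query):
        """Rows holding every element of a non-empty `query` (in the spirit of @>)."""
        keys = set(query)
        if not keys:
            raise ValueError("this toy version needs a non-empty query")
        return set.intersection(*(self.postings.get(key, set()) for key in keys))


index = ToyInvertedIndex()
print(index.insert(1, [1, 2, 3]))         # 3 entries touched for a single row
index.insert(2, [3, 4])
index.insert(3, [5])
print(sorted(index.overlaps([3, 5])))     # [1, 2, 3]
print(sorted(index.contains([1, 3])))     # [1]
```

The insert count hints at why updates to an inverted index tend to be more expensive than lookups: a single row with many elements means many index entries to maintain.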
Accepted answer by Andrea Bertani
First of all, do you need to use them for text search indexing? GIN and GiST are index types specialized for certain data types. If you need to index simple char or integer values, then the normal B-Tree index is the best choice.
Anyway, the PostgreSQL documentation has a chapter on GiST and one on GIN, where you can find more info.
And, last but not least, the best way to find out which is better for you is to generate sample data (as much as you need to approximate a real scenario) and then create a GiST index, measuring how much time it takes to create the index, insert a new value, and execute a sample query. Then drop the index and do the same with a GIN index. Compare the values and you will have the answer you need, based on your own data.
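The measuring procedure described above could be sketched in Python roughly like this. Everything named here is an assumption for illustration: a table called docs with a text column body and a tsvector column tsv already filled with sample data, the sample INSERT and SELECT statements, and a DB-API style cursor (for example one obtained from psycopg2) passed in by the caller. Transaction handling is left to the caller, and a single run is only a rough indication, since caching can skew the numbers.

```python
import time

# Hypothetical sample statements; replace them with ones that match your data.
INSERT_SQL = (
    "INSERT INTO docs (body, tsv) "
    "VALUES ('a new sample row', to_tsvector('english', 'a new sample row'))"
)
QUERY_SQL = "SELECT count(*) FROM docs WHERE tsv @@ to_tsquery('english', 'sample')"


def timed(cursor, sql):
    """Execute one statement on a DB-API cursor and return elapsed seconds."""
    start = time.perf_counter()
    cursor.execute(sql)
    return time.perf_counter() - start


def benchmark_index(cursor, method):
    """Time index build, one insert, and one query for 'gin' or 'gist'."""
    if method not in ("gin", "gist"):
        raise ValueError("method must be 'gin' or 'gist'")
    index_name = f"docs_tsv_{method}_idx"
    results = {
        "build": timed(cursor, f"CREATE INDEX {index_name} ON docs USING {method} (tsv)"),
        "insert": timed(cursor, INSERT_SQL),
        "query": timed(cursor, QUERY_SQL),
    }
    # Drop the index so the next method starts from the same state.
    cursor.execute(f"DROP INDEX {index_name}")
    return results


def compare(cursor):
    """Run the same measurements for GiST first, then GIN."""
    return {method: benchmark_index(cursor, method) for method in ("gist", "gin")}
```

Calling compare(cursor) returns a dictionary of timings per index type, which you can then compare side by side. Repeating the run several times and looking at the spread is probably more informative than trusting any single measurement.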

