postgresql GiST和GIN索引的区别
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/28975517/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Difference between GiST and GIN index
提问by Walker Farrow
I am implementing a table that has a column with a data type of tsvector
and I am trying to understand what index would be better to use?
我正在实现一个包含数据类型为 的列的表,tsvector
我试图了解使用哪种索引会更好?
GIN or GiST?
GIN 还是 GiST?
In looking through the postgres documentation hereI seem to get that:
在查看这里的postgres 文档时,我似乎明白了:
GiST is faster to update and build the index and less accurate than gin.
GIN is slower to update and build the index but is more accurate.
GiST 更新和构建索引的速度更快,但不如 gin 准确。
GIN 更新和构建索引的速度较慢,但更准确。
OK, so why would anybody want a gist indexed field over gin? If gist could give you the wrong results? There must be some advantage (outside performance) on this.
好的,那么为什么有人想要一个 gist 索引字段而不是 gin 呢?如果要点可以给你错误的结果?在这方面肯定有一些优势(外部性能)。
Can anybody explain in layman's terms when I would want to use GIN vs. GiST?
当我想使用 GIN 与 GiST 时,有人可以用外行的术语解释一下吗?
回答by Erwin Brandstetter
I don't think I could explain it better than the manualalready does:
我认为我无法比手册更好地解释它:
In choosing which index type to use, GiST or GIN, consider these performance differences:
GIN index lookups are about three times faster than GiST
GIN indexes take about three times longer to build than GiST
GIN indexes are moderately slower to update than GiST indexes, but about 10 times slower if fast-update support was disabled [...]
GIN indexes are two-to-three times larger than GiST indexes
在选择使用哪种索引类型时,GiST 或 GIN,请考虑以下性能差异:
GIN 索引查找速度大约是 GiST 的三倍
GIN 索引的构建时间大约是 GiST 的三倍
GIN 索引的更新速度比 GiST 索引慢,但如果禁用快速更新支持,速度会慢 10 倍 [...]
GIN 索引比 GiST 索引大两到三倍
The link is to the currentmanual, the quote is from the version for 9.4, while your link is to version 9.1 (for some reason?).
链接指向当前手册,引用来自 9.4 版本,而您的链接指向 9.1 版本(出于某种原因?)。
Size and performance estimates seem slightly outdated in the manual - and have since been removed.
With Postgres 9.4 the odds have shifted substantially in favor of GIN.
The release notes of Postgres 9.4include:
手册中的大小和性能估计似乎有些过时 - 并且已被删除。
在 Postgres 9.4 中,可能性已经大大改变,有利于 GIN。Postgres 9.4
的发行说明包括:
Reduce GIN index size (Alexander Korotkov, Heikki Linnakangas) [...]
Improve speed of multi-key GIN lookups (Alexander Korotkov, Heikki Linnakangas)
减少 GIN 索引大小 (Alexander Korotkov, Heikki Linnakangas) [...]
提高多键 GIN 查找的速度(Alexander Korotkov、Heikki Linnakangas)
Note that there are special use casesthat require one or the other.
请注意,有一些特殊用例需要其中之一。
One thing you misunderstood: You neverget wrong results with a GiST index. The index operates on hash values, which canlead to false positives in the index. This should only become relevant with a very big number of different words in your documents. False positives are eliminated after re-checking the actual row in any case. The manual:
您误解了一件事:使用 GiST 索引永远不会得到错误的结果。索引对哈希值进行操作,这可能导致索引中出现误报。这应该只与文档中的大量不同单词相关。在任何情况下重新检查实际行后都会消除误报。手册:
A GiST index is lossy, meaning that the index may produce false matches, and it is necessary to check the actual table row to eliminate such false matches. (PostgreSQL does this automatically when needed.)
GiST 索引是有损的,这意味着该索引可能会产生错误匹配,因此需要检查实际表行以消除此类错误匹配。(PostgreSQL 会在需要时自动执行此操作。)
Bold emphasis mine.
大胆强调我的。