Notice: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/13071479/

Time: 2020-10-21 00:28:43  Source: igfitidea

PostgreSQL prefix wildcard for full text

Tags: sql, postgresql, full-text-search, full-text-indexing

Asked by Paul Carey

I am trying to run a full-text query in PostgreSQL that can handle partial matches using wildcards.

It seems easy enough to put a postfix wildcard after the search term; however, I cannot figure out how to specify a prefix wildcard.

For example, I can perform a postfix search easily enough using something like...

SELECT "t1".* 
FROM "t1" 
WHERE (to_tsvector('simple', "t1"."city") @@ to_tsquery('simple', 'don:*') )

should return results matching "London"

However, I can't seem to do a prefix search like...

SELECT "t1".* 
FROM "t1" 
WHERE (to_tsvector('simple', "t1"."city") @@ to_tsquery('simple', ':*don') )

Ideally I'd like to have a wildcard at both the front and the end of the search term, something like...

SELECT "t1".* 
FROM "t1" 
WHERE (to_tsvector('simple', "t1"."city") @@ to_tsquery('simple', ':*don:*') )

I can use a LIKE condition; however, I was hoping to benefit from the performance of the full text search features in Postgres.

Accepted answer by Tometzky

Full text search is good for finding words, not substrings.
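
As a small illustration of that point (a sketch using literal strings rather than a real table), the `:*` marker in a tsquery does prefix matching against whole lexemes. It matches words that begin with the term, not words that merely contain it:

```sql
-- The 'simple' configuration lowercases the input,
-- so 'London' is stored as the single lexeme 'london'.

-- 'london' begins with 'lon', so the prefix query matches.
SELECT to_tsvector('simple', 'London') @@ to_tsquery('simple', 'lon:*') AS lon_matches_london;  -- true

-- 'london' contains 'don' but does not begin with it, so there is no match.
SELECT to_tsvector('simple', 'London') @@ to_tsquery('simple', 'don:*') AS don_matches_london;  -- false

-- 'doncaster' begins with 'don', so the same query matches here.
SELECT to_tsvector('simple', 'Doncaster') @@ to_tsquery('simple', 'don:*') AS don_matches_doncaster;  -- true
```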

For substring searches you'd better use like '%don%' with the pg_trgm extension (available from PostgreSQL 9.1) and an index created with using gin (column_name gin_trgm_ops) or using gist (column_name gist_trgm_ops). But your index would be very big (even several times bigger than your table) and write performance would not be very good.
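
A minimal sketch of that setup might look like the following. The table and column names (t1, city) are taken from the question; the temporary table, the sample rows, and the index name t1_city_trgm_idx are made up for illustration, and CREATE EXTENSION assumes you have the privileges to install pg_trgm.

```sql
-- Install the trigram extension (needs sufficient privileges).
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical sample data, kept in a temporary table so nothing permanent is touched.
CREATE TEMP TABLE IF NOT EXISTS t1 (city text);
INSERT INTO t1 (city) VALUES ('London'), ('Doncaster'), ('Paris');

-- Trigram GIN index on the searched column; the index name is an assumption.
CREATE INDEX IF NOT EXISTS t1_city_trgm_idx
    ON t1 USING gin (city gin_trgm_ops);

-- Substring search with wildcards on both sides.
-- ILIKE is used so that 'Doncaster' also qualifies; LIKE would be case-sensitive.
SELECT city
FROM t1
WHERE city ILIKE '%don%'
ORDER BY city;
```

On a table this small the planner will most likely choose a sequential scan anyway; the index only starts to pay off on larger tables, and EXPLAIN shows whether it is actually being used.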

There's a very good example of using pg_trgm for substring search on the "select * from depesz;" blog.

Answer by Neil McGuigan

One wild and crazy way of doing it would be to create a tsvector index of all your documents, reversed. And reverse your queries for postfix search too.

This is essentially what Solr does with its ReversedWildcardFilterFactory.

SELECT
  reverse('brown fox')::tsvector @@ (reverse('rown') || ':*')::tsquery; -- true
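
Applied to the table from the question, the same idea could be sketched roughly as below. This is only an illustration under assumptions: the temporary table, the sample rows, and the index name t1_city_rev_fts_idx are invented here, and the approach only covers a leading wildcard (words ending in the term), not a wildcard on both sides.

```sql
-- Hypothetical sample data in a temporary table.
CREATE TEMP TABLE IF NOT EXISTS t1 (city text);
INSERT INTO t1 (city) VALUES ('London'), ('Doncaster'), ('Paris');

-- Expression index over the reversed text; the index name is an assumption.
CREATE INDEX IF NOT EXISTS t1_city_rev_fts_idx
    ON t1 USING gin (to_tsvector('simple', reverse(city)));

-- reverse('don') is 'nod', so the query becomes 'nod:*'.
-- reverse('London') is 'nodnoL', stored as the lexeme 'nodnol', which begins with 'nod'.
SELECT city
FROM t1
WHERE to_tsvector('simple', reverse(city))
      @@ to_tsquery('simple', reverse('don') || ':*');
```

The WHERE clause has to repeat the indexed expression exactly for the planner to consider the index. The search term is concatenated straight into the tsquery text, so terms containing tsquery operators or punctuation would need to be sanitized first.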