PostgreSQL LIKE 查询性能变化
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/1566717/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
PostgreSQL LIKE query performance variations
提问by Jason
I have been seeing quite a large variation in response times regarding LIKE
queries to a particular table in my database. Sometimes I will get results within 200-400 ms (very acceptable) but other times it might take as much as 30 seconds to return results.
我已经看到有关LIKE
对数据库中特定表的查询的响应时间有很大差异。有时我会在 200-400 毫秒内得到结果(非常可接受),但有时可能需要多达 30 秒才能返回结果。
I understand that LIKE
queries are very resource intensive but I just don't understand why there would be such a large difference in response times. I have built a btree index on the owner1
field but I don't think it helps with LIKE
queries. Anyone have any ideas?
我知道LIKE
查询非常占用资源,但我不明白为什么响应时间会有如此大的差异。我在该owner1
领域建立了一个 btree 索引,但我认为它对LIKE
查询没有帮助。谁有想法?
Sample SQL:
示例 SQL:
SELECT gid, owner1 FORM parcels
WHERE owner1 ILIKE '%someones name%' LIMIT 10
I've also tried:
我也试过:
SELECT gid, owner1 FROM parcels
WHERE lower(owner1) LIKE lower('%someones name%') LIMIT 10
And:
和:
SELECT gid, owner1 FROM parcels
WHERE lower(owner1) LIKE lower('someones name%') LIMIT 10
With similar results.
Table Row Count: about 95,000.
有类似的结果。
表行数:约 95,000。
回答by Erwin Brandstetter
FTS does not support LIKE
FTS 不支持 LIKE
The previously accepted answerwas incorrect. Full Text Searchwith its full text indexes is notfor the LIKE
operator at all, it has its own operators and doesn't work for arbitrary strings. It operates on wordsbased on dictionaries and stemming. It doessupport prefix matching for words, but not with the LIKE
operator:
在以前接受的答案是不正确的。带有全文索引的全文搜索根本不适用于LIKE
操作符,它有自己的操作符并且不适用于任意字符串。它对基于字典和词干的单词进行操作。它确实支持words 的前缀匹配,但不支持LIKE
运算符:
Trigram indexes for LIKE
三元组索引 LIKE
Install the additional module pg_trgm
which provides operator classes for GIN and GiST trigram indexesto support all LIKE
and ILIKE
patterns, not just left-anchored ones:
安装附加模块pg_trgm
,该模块为GIN 和 GiST 三元索引提供运算符类以支持allLIKE
和ILIKE
patterns,而不仅仅是左锚定的:
Example index:
示例索引:
CREATE INDEX tbl_col_gin_trgm_idx ON tbl USING gin (col gin_trgm_ops);
Or:
或者:
CREATE INDEX tbl_col_gist_trgm_idx ON tbl USING gist (col gist_trgm_ops);
Example query:
示例查询:
SELECT * FROM tbl WHERE col LIKE '%foo%'; -- leading wildcard
SELECT * FROM tbl WHERE col ILIKE '%foo%'; -- works case insensitively as well
Trigrams? What about shorter strings?
三元组?更短的字符串呢?
Words with less than 3 lettersin indexed values still work. The manual:
文字少于3个字母的索引值仍然工作。手册:
Each word is considered to have two spaces prefixed and one space suffixed when determining the set of trigrams contained in the string.
在确定字符串中包含的一组三元组时,每个单词都被认为有两个前缀和一个空格。
And search patterns with less than 3 letters? The manual:
和少于 3 个字母的搜索模式?手册:
For both
LIKE
and regular-expression searches, keep in mind that a pattern with no extractable trigrams will degenerate to a full-index scan.
对于
LIKE
正则表达式搜索和正则表达式搜索,请记住,没有可提取三元组的模式将退化为全索引扫描。
Meaning, that index / bitmap index scans still work (query plans for prepared statement won't break), it just won't buy you better performance. Typically no big loss, since 1- or 2-letter strings are hardly selective (more than a few percent of the underlying table matches) and index support would not improve performance to begin with, because a full table scan is faster.
意思是,索引/位图索引扫描仍然有效(准备好的语句的查询计划不会中断),它不会为您带来更好的性能。通常没有大的损失,因为 1 或 2 个字母的字符串几乎没有选择性(超过百分之几的基础表匹配)并且索引支持不会提高性能,因为全表扫描速度更快。
text_pattern_ops
for prefix matching
text_pattern_ops
用于前缀匹配
For just left-anchoredpatterns (no leading wildcard) you get the optimum with a suitable operator classfor a btree index: text_pattern_ops
or varchar_pattern_ops
. Both built-in features of standard Postgres, no additional module needed. Similar performance, but much smaller index.
对于左锚定模式(无前导通配符),您可以使用适合btree 索引的运算符类获得最佳值:text_pattern_ops
or varchar_pattern_ops
。标准 Postgres 的两个内置功能,不需要额外的模块。性能相似,但索引要小得多。
Example index:
示例索引:
CREATE INDEX tbl_col_text_pattern_ops_idx ON tbl(col text_pattern_ops);
Example query:
示例查询:
SELECT * FROM tbl WHERE col LIKE 'foo%'; -- no leading wildcard
Or, if you should be running your database with the 'C'locale (effectively nolocale), then everything is sorted according to byte order anyway and a plain btree index with default operator class does the job.
或者,如果您应该使用“C”语言环境(实际上没有语言环境)运行您的数据库,那么无论如何都将根据字节顺序对所有内容进行排序,并且带有默认运算符类的普通 btree 索引可以完成这项工作。
More details, explanation, examples and links in these related answers on dba.SE:
dba.SE 上这些相关答案中的更多详细信息、解释、示例和链接:
回答by Ants Aasma
Possibly the fast ones are anchored patterns with case-sensitive like that can use indexes. i.e. there is no wild card at the beginning of the match string so the executor can use an index range scan. (the relevant comment in the docs is here) Lower and ilike will also lose your ability to use the index unless you specifically create an index for that purpose (see functional indexes).
可能快速的是锚定模式,区分大小写,可以使用索引。即匹配字符串的开头没有通配符,因此执行程序可以使用索引范围扫描。(文档中的相关评论在这里)Lower 和 ilike 也将失去您使用索引的能力,除非您专门为此目的创建索引(请参阅功能索引)。
If you want to search for string in the middle of the field, you should look into full textor trigram indexes. First of them is in Postgres core, the other is available in the contrib modules.
如果你想在字段中间搜索字符串,你应该查看全文或三元索引。第一个在 Postgres 核心中,另一个在 contrib 模块中可用。
回答by Frank Heikens
回答by Noyal
Please Execute below mentioned query for improve the LIKE query performance in postgresql. create an index like this for bigger tables:
请执行下面提到的查询以提高 postgresql 中的 LIKE 查询性能。为更大的表创建一个这样的索引:
CREATE INDEX <indexname> ON <tablename> USING btree (<fieldname> text_pattern_ops)
回答by Stephen Quan
I recently had a similar issue with a table containing 200000 records and I need to do repeated LIKE queries. In my case, the string being search was fixed. Other fields varied. Because that, I was able to rewrite:
我最近在一个包含 200000 条记录的表中遇到了类似的问题,我需要重复执行 LIKE 查询。在我的情况下,正在搜索的字符串是固定的。其他领域各不相同。因此,我能够重写:
SELECT owner1 FROM parcels
WHERE lower(owner1) LIKE lower('%someones name%');
as
作为
CREATE INDEX ix_parcels ON parcels(position(lower('someones name') in lower(owner1)));
SELECT owner1 FROM parcels
WHERE position(lower('someones name') in lower(owner1)) > 0;
I was delighted when the queries came back fast and verified the index is being used with EXPLAIN ANALYZE
:
当查询快速返回并验证索引正在使用时,我很高兴EXPLAIN ANALYZE
:
Bitmap Heap Scan on parcels (cost=7.66..25.59 rows=453 width=32) (actual time=0.006..0.006 rows=0 loops=1)
Recheck Cond: ("position"(lower(owner1), 'someones name'::text) > 0)
-> Bitmap Index Scan on ix_parcels (cost=0.00..7.55 rows=453 width=0) (actual time=0.004..0.004 rows=0 loops=1)
Index Cond: ("position"(lower(owner1), 'someones name'::text) > 0)
Planning time: 0.075 ms
Execution time: 0.025 ms
回答by omer Farooq
When ever you use a clause on a column with functions eg LIKE, ILIKE, upper, lower etc. Then postgres wont take your normal index into consideration. It will do a full scan of the table going through each row and therefore it will be slow.
当您在具有函数的列上使用子句时,例如 LIKE、ILIKE、upper、lower 等。那么 postgres 不会考虑您的正常索引。它将对通过每一行的表进行全面扫描,因此它会很慢。
The correct way would be to create a new index according to your query. For example if i want to match a column without case sensitivity and my column is a varchar. Then you can do it like this.
正确的方法是根据您的查询创建一个新索引。例如,如果我想匹配一个不区分大小写的列,而我的列是一个 varchar。然后你可以这样做。
create index ix_tblname_col_upper on tblname (UPPER(col) varchar_pattern_ops);
Similarly if your column is a text then you do something like this
同样,如果您的列是文本,那么您可以执行以下操作
create index ix_tblname_col_upper on tblname (UPPER(col) text_pattern_ops);
Similarly you can change the function upper to any other function that you want.
同样,您可以将函数 upper 更改为您想要的任何其他函数。
回答by MrE
for what it's worth, DjangoORM tends to use UPPER(text)
for all LIKE
queries to make it case insensitive,
就其价值而言,DjangoORM 倾向于UPPER(text)
用于所有LIKE
查询以使其不区分大小写,
Adding an index on UPPER(column::text)
has greatly sped up my system, unlike any other thing.
UPPER(column::text)
与其他任何事情不同,添加索引大大加快了我的系统速度。
As far as leading %, yes that will not use an index. See this blog for a great explanation:
至于领先的%,是的,不会使用索引。请参阅此博客以获得很好的解释:
https://use-the-index-luke.com/sql/where-clause/searching-for-ranges/like-performance-tuning
https://use-the-index-luke.com/sql/where-clause/searching-for-ranges/like-performance-tuning
回答by Asaph
Your like queries probably cannot use the indexes you created because:
您喜欢的查询可能无法使用您创建的索引,因为:
1) your LIKE criteria begins with a wildcard.
1) 您的 LIKE 标准以通配符开头。
2) you've used a function with your LIKE criteria.
2) 您使用了符合 LIKE 标准的函数。