postgresql 较低的喜欢与喜欢
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/20336665/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
LOWER LIKE vs iLIKE
提问by user664833
How does the performance of the following two query components compare?
以下两个查询组件的性能比较如何?
LOWER LIKE
低喜欢
... LOWER(description) LIKE '%abcde%' ...
iLIKE
我喜欢
... description iLIKE '%abcde%' ...
采纳答案by Erwin Brandstetter
The answer depends on many factors like Postgres version, encoding and locale - LC_COLLATE
in particular.
答案取决于许多因素,例如 Postgres 版本、编码和语言环境 -LC_COLLATE
特别是。
The bare expression lower(description) LIKE '%abc%'
is typically a bit faster than description ILIKE '%abc%'
, and either is a bit faster than the equivalent regular expression: description ~* 'abc'
. This matters for sequential scans where the expression has to be evaluated for every tested row.
裸表达式lower(description) LIKE '%abc%'
通常比 快一点description ILIKE '%abc%'
,或者比等效的正则表达式快一点:description ~* 'abc'
。这对于顺序扫描很重要,其中必须为每个测试行评估表达式。
Butfor big tables like you demonstrate in your answer one would certainly use an index. For arbitrary patterns (not only left-anchored) I suggest a trigram index using the additional module pg_trgm
. Then we talk about milliseconds instead of seconds and the difference between the above expressions is nullified.
但是对于像您在答案中演示的大表,肯定会使用索引。对于任意模式(不仅是左锚定的),我建议使用附加模块的三元组索引pg_trgm
。然后我们谈论毫秒而不是秒,并且上述表达式之间的差异无效。
GIN and GiST indexes (using the gin_trgm_ops
or gist_trgm_ops
operator classes) support LIKE
(~~
), ILIKE
(~~*
), ~
, ~*
(and some more variants) alike. With a trigram GIN index on description
(typically bigger than GiST, but faster for reads), your query would use description ILIKE 'case_insensitive_pattern'
.
GIN 和 GiST 索引(使用gin_trgm_ops
或gist_trgm_ops
运算符类)支持LIKE
( ~~
), ILIKE
( ~~*
), ~
, ~*
(以及更多变体)。使用三元组 GIN 索引description
(通常比 GiST 大,但读取速度更快),您的查询将使用description ILIKE 'case_insensitive_pattern'
.
Related:
有关的:
Basics for pattern matching in Postgres:
Postgres 中模式匹配的基础:
When working with said trigram index it's typicallymore practical to work with:
使用上述三元组索引时,通常更实用:
description ILIKE '%abc%'
Or with the case-insensitive regexp operator (without %
wildcards):
或者使用不区分大小写的正则表达式运算符(不带%
通配符):
description ~* 'abc'
An index on (description)
does not support queries on lower(description)
like:
上的索引(description)
不支持对以下内容的查询lower(description)
:
lower(description) LIKE '%abc%'
And vice versa.
反之亦然。
With predicates on lower(description)
exclusively, the expression index is the slightly better option.
随着谓词lower(description)
独家,表达指数稍微更好的选择。
In all other cases, an index on (description)
is preferable as it supports bothcase-sensitive and -insensitive predicates.
在所有其他情况下,一个索引上(description)
是优选的,因为它支持两者区分大小写和不敏感的谓词。
回答by user664833
According to my tests (tenof each query), LOWER
LIKE
is about 17%
faster than iLIKE
.
根据我的测试(每个查询十个),LOWER
LIKE
大约17%
比iLIKE
.
Explanation
解释
I created a million rows contain some random mixed text data:
我创建了一百万行包含一些随机混合文本数据:
require 'securerandom'
inserts = []
1000000.times do |i|
inserts << "(1, 'fake', '#{SecureRandom.urlsafe_base64(64)}')"
end
sql = "insert into books (user_id, title, description) values #{inserts.join(', ')}"
ActiveRecord::Base.connection.execute(sql)
Verify the number of rows:
验证行数:
my_test_db=# select count(id) from books ;
count
---------
1000009
(Yes, I have nine extra rows from other tests - not a problem.)
(是的,我有来自其他测试的 9 行额外数据 - 没问题。)
Example query and results:
示例查询和结果:
my_test_db=# SELECT "books".* FROM "books" WHERE "books"."published" = 'f'
my_test_db=# and (LOWER(description) LIKE '%abcde%') ;
id | user_id | title | description | published
---------+---------+-------+----------------------------------------------------------------------------------------+------
1232322 | 1 | fake | 5WRGr7oCKABcdehqPKsUqV8ji61rsNGS1TX6pW5LJKrspOI_ttLNbaSyRz1BwTGQxp3OaxW7Xl6fzVpCu9y3fA | f
1487103 | 1 | fake | J6q0VkZ8-UlxIMZ_MFU_wsz_8MP3ZBQvkUo8-2INiDIp7yCZYoXqRyp1Lg7JyOwfsIVdpPIKNt1uLeaBCdelPQ | f
1817819 | 1 | fake | YubxlSkJOvmQo1hkk5pA1q2mMK6T7cOdcU3ADUKZO8s3otEAbCdEcmm72IOxiBdaXSrw20Nq2Lb383lq230wYg | f
Results for LOWER LIKE
LOWER LIKE 的结果
my_test_db=# EXPLAIN ANALYZE SELECT "books".* FROM "books" WHERE "books"."published" = 'f' and (LOWER(description) LIKE '%abcde%') ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Seq Scan on books (cost=0.00..32420.14 rows=1600 width=117) (actual time=938.627..4114.038 rows=3 loops=1)
Filter: ((NOT published) AND (lower(description) ~~ '%abcde%'::text))
Rows Removed by Filter: 1000006
Total runtime: 4114.098 ms
Results for iLIKE
iLIKE 的结果
my_test_db=# EXPLAIN ANALYZE SELECT "books".* FROM "books" WHERE "books"."published" = 'f' and (description iLIKE '%abcde%') ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Seq Scan on books (cost=0.00..29920.11 rows=100 width=117) (actual time=1147.612..4986.771 rows=3 loops=1)
Filter: ((NOT published) AND (description ~~* '%abcde%'::text))
Rows Removed by Filter: 1000006
Total runtime: 4986.831 ms
Database info disclosure
数据库信息公开
Postgres version:
Postgres 版本:
my_test_db=# select version();
version
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 9.2.4 on x86_64-apple-darwin12.4.0, compiled by i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00), 64-bit
Collation setting:
校对设置:
my_test_db=# select datcollate from pg_database where datname = 'my_test_db';
datcollate
-------------
en_CA.UTF-8
Table definition:
表定义:
my_test_db=# \d books
Table "public.books"
Column | Type | Modifiers
-------------+-----------------------------+-------------------------------------------------------
id | integer | not null default nextval('books_id_seq'::regclass)
user_id | integer | not null
title | character varying(255) | not null
description | text | not null default ''::text
published | boolean | not null default false
Indexes:
"books_pkey" PRIMARY KEY, btree (id)
回答by lfx_cool
In my rails Project. ILIKE
is almost 10x faster then LOWER LIKE
, I add a GIN
index on entities.name
column
在我的 rails 项目中。ILIKE
几乎快了 10 倍LOWER LIKE
,我GIN
在entities.name
列上添加了一个索引
> Entity.where("LOWER(name) LIKE ?", name.strip.downcase).limit(1).first
Entity Load (2443.9ms) SELECT "entities".* FROM "entities" WHERE (lower(name) like 'baidu') ORDER BY "entities"."id" ASC LIMIT [["LIMIT", 1]]
> Entity.where("name ILIKE ?", name.strip).limit(1).first
Entity Load (285.0ms) SELECT "entities".* FROM "entities" WHERE (name ilike 'Baidu') ORDER BY "entities"."id" ASC LIMIT [["LIMIT", 1]]
# explain analyze SELECT "entities".* FROM "entities" WHERE (name ilike 'Baidu') ORDER BY "entities"."id" ASC LIMIT 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=3186.03..3186.04 rows=1 width=1588) (actual time=7.812..7.812 rows=1 loops=1)
-> Sort (cost=3186.03..3187.07 rows=414 width=1588) (actual time=7.811..7.811 rows=1 loops=1)
Sort Key: id
Sort Method: quicksort Memory: 26kB
-> Bitmap Heap Scan on entities (cost=1543.21..3183.96 rows=414 width=1588) (actual time=7.797..7.805 rows=1 loops=1)
Recheck Cond: ((name)::text ~~* 'Baidu'::text)
Rows Removed by Index Recheck: 6
Heap Blocks: exact=7
-> Bitmap Index Scan on index_entities_on_name (cost=0.00..1543.11 rows=414 width=0) (actual time=7.787..7.787 rows=7 loops=1)
Index Cond: ((name)::text ~~* 'Baidu'::text)
Planning Time: 6.375 ms
Execution Time: 7.874 ms
(12 rows)
GIN index is really helpful to improve ILIKE
performance
GIN索引对提高ILIKE
性能真的很有帮助