LIKE '%...%' 通配符查询的 PL/SQL 性能调优

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/6229445/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 10:50:43  来源:igfitidea点击:

PL/SQL Performance Tuning for LIKE '%...%' Wildcard Queries

sqloracleindexingquery-optimization

提问by Грозный

We're using Oracle 11g database.
As you may or may not know, if you use wildcard query with "%" in front of the string, the column index is not being usedand a full table scanis happening.

我们使用的是 Oracle 11g 数据库。
您可能知道也可能不知道,如果在字符串前面使用带有“%”的通配符查询,则不会使用索引,并且正在执行全表扫描

It looks like there isn't a definitive suggestion on how to improve this kind of query, but perhaps you could share some valuable information from your experience on how to optimize the following query:

似乎没有关于如何改进这种查询的明确建议,但也许您可以从您的经验中分享一些关于如何优化以下查询的宝贵信息:

SELECT * 
  FROM myTable 
 WHERE UPPER(CustomerName) like '%ABC%' 
    OR UPPER(IndemnifierOneName) like '%ABC%' 
    OR UPPER(IndemnifierTwoName) like '%ABC%';

...where all 3 columns are of type varchar2(100)and ABCis a value of variable input parameter.

...其中所有 3 列都是varchar2(100)类型,ABC是变量输入参数的值。

@All suggesting CONTEXindex, please note my data gets updated any time of the day every day and this index requires re-syncing, hence it's nota good option for a table of 1.5 million rows, sorry.

@All 建议使用CONTEX索引,请注意我的数据每天都会更新,并且该索引需要重新同步,因此对于150 万行的表来说,这不是一个好的选择,抱歉。

P.S. I'll upvote every answer, so please do keep them coming.

PS我会赞成每个答案,所以请让他们继续。

采纳答案by Kevin Burton

As already mentioned you could add a ctx context index to the name columns.

如前所述,您可以向名称列添加 ctx 上下文索引。

assuming a small number of records get updated, 1 option is to refresh your index daily. (and record when it happened)

假设更新了少量记录,一个选项是每天刷新您的索引。(并记录它发生的时间)

then add a lastupdate date column & index to your table being searched.

然后将 lastupdate 日期列和索引添加到正在搜索的表中。

It should be possible to scan your ctx index for the majority of the old unchanged data and select from the small percentage of updated data using the traditonal LIKE e.g:

应该可以扫描您的 ctx 索引以查找大多数旧的未更改数据,并使用传统的 LIKE 从更新数据的小百分比中进行选择,例如:

WHERE (lastupdated<lastrefresh AND contains(name,'%ABC%')) 
   OR (lastupdated>lastrefresh AND name like '%ABC%')

NOTE: you may find your query plan goes a little mental (lots of bitmap conversions to row ids) in that case split the 2 parts of the OR into a UNION ALL query. e.g

注意:在这种情况下,您可能会发现您的查询计划有点脑残(大量位图转换为行 ID),在这种情况下将 OR 的 2 部分拆分为 UNION ALL 查询。例如

SELECT id FROM mytable   
    WHERE 
    (lastupdate>lastrefresh and name LIKE '%ABC%')
    UNION ALL
    SELECT id FROM mytable   
    WHERE lastupdate<lastrefresh and CONTAINS(name, '%ABC%', 1) > 0

回答by Fosco

The only optimization is to not use that type of query and instead use the native capabilities of the database platform:

唯一的优化是不使用这种类型的查询,而是使用数据库平台的本机功能:

See Oracle Text: http://www.oracle.com/technetwork/database/enterprise-edition/index-098492.html

请参阅 Oracle 文本:http: //www.oracle.com/technetwork/database/enterprise-edition/index-098492.html

The common answer for SQL Server related questions would be Full Text Search.. nice to see Oracle has something as good or better.

SQL Server 相关问题的常见答案是全文搜索……很高兴看到 Oracle 有同样好的或更好的东西。

回答by Samuel

The UPPER()is killing your indexes before anything, consider using a regexp. The initial %may avoid a normal index scan, but not always results in a Full Table Scan but into a Full Index Scan, which is faster than the FTS.

UPPER()是什么,之前杀死你的指标,可以考虑使用正则表达式。初始%可能会避免正常的索引扫描,但并不总是导致全表扫描,而是进入全索引扫描,这比 FTS 更快。

I suppose that 'ABC'is variable. If not, a function index is the way to go.

我想这'ABC'是可变的。如果没有,函数索引是可行的方法。

回答by Tim

Sometimes this kind of query is unavoidable -- extracting the domain from a URL, or perhaps the root from a word with a prefix and a suffix.

有时这种查询是不可避免的——从 URL 中提取域,或者从带有前缀和后缀的词中提取词根。

You can resort to a full text index with or without a custom tokenizer.

您可以使用或不使用自定义标记器使用全文索引。

Or if the strings you're searching for are finite in number and known in advance (e.g. you're working with a limited set of domain names that need to be extracted from a URL) you can use a deterministic function which can be indexed.

或者,如果您要搜索的字符串数量有限且事先已知(例如,您正在使用一组需要从 URL 中提取的有限域名),您可以使用可编入索引的确定性函数。

http://www.akadia.com/services/ora_function_based_index_2.html

http://www.akadia.com/services/ora_function_based_index_2.html

回答by Martin Ridgway

Use Oracle text BUT the slightly newer CTXCAT variation - this domain index is updated as part of the transaction that inserts/updates the row in question and is therefore always up-to-date - See Oracle's own Oracle Text documentation for details.

使用 Oracle 文本但稍新的 CTXCAT 变体 - 此域索引作为插入/更新相关行的事务的一部分进行更新,因此始终是最新的 - 有关详细信息,请参阅 Oracle 自己的 Oracle Text 文档。