Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/1676064/

Time: 2020-09-18 19:21:44  Source: igfitidea

Oracle query using 'like' on indexed number column, poor performance

Tags: sql, oracle, indexing, oracle10g, sql-like

Asked by James Collins

In Query 1, a full table scan is performed even though id is an indexed column. Query 2 achieves the same result but much faster. If Query 1 returns only an indexed column, it returns quickly, but if it returns non-indexed columns or the entire row, the query takes longer.

Query 3 runs fast, but the column 'code' is a VARCHAR2(10) instead of a NUMBER(12) and is indexed the same way as 'id'.

Why does Query 1 not pick up that it should use the index? Is there something that should be changed to allow indexed number columns to perform quicker?

[Query 1]

select a1.*
from people a1
where a1.id like '119%' 
and rownum < 5

Explain Plan
SELECT STATEMENT ALL_ROWS
Cost: 67 Bytes: 2,592 Cardinality: 4
2 COUNT STOPKEY
    1 TABLE ACCESS FULL TABLE people
     Cost: 67 Bytes: 3,240 Cardinality: 5

[Query 2]

select a1.*
from people a1, people a2
where a1.id = a2.id
and a2.id like '119%' 
and rownum < 5

Explain Plan
SELECT STATEMENT ALL_ROWS
Cost: 11 Bytes: 2,620 Cardinality: 4
5 COUNT STOPKEY
    4 TABLE ACCESS BY INDEX ROWID TABLE people
    Cost: 3 Bytes: 648 Cardinality: 1
        3 NESTED LOOPS
        Cost: 11 Bytes: 2,620 Cardinality: 4
            1 INDEX FAST FULL SCAN INDEX people_IDX3
            Cost: 2 Bytes: 54,796 Cardinality: 7,828
            2 INDEX RANGE SCAN INDEX people_IDX3
            Cost: 2 Cardinality: 1

[Query 3]

select a1.*
from people a1
where a1.code like '119%' 
and rownum < 5

Explain Plan
SELECT STATEMENT ALL_ROWS
Cost: 6 Bytes: 1,296 Cardinality: 2
   3 COUNT STOPKEY
      2 TABLE ACCESS BY INDEX ROWID TABLE people
      Cost: 6 Bytes: 1,296 Cardinality: 2
         1 INDEX RANGE SCAN INDEX people_IDX4
         Cost: 3 Cardinality: 2

Answered by Sergey Stadnik

The LIKE pattern-matching condition expects character types as both its left-side and right-side operands. When it encounters a NUMBER, it implicitly converts it to a character string. Your Query 1 is basically silently rewritten to this:

SELECT a1.*
  FROM people a1
 WHERE TO_CHAR(a1.id) LIKE '119%'
   AND ROWNUM < 5

That happens in your case, and that is bad for 2 reasons:

  1. The conversion is executed for every row, which is slow;
  2. Because of a function (though implicit) in the WHERE predicate, Oracle is unable to use the index on the A1.ID column.
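
One way to check for the implicit conversion, sketched here on the assumption that the DBMS_XPLAN package is available (it ships with Oracle 10g), is to explain Query 1 and read the Predicate Information section of the output. If the conversion takes place, the filter on the full table scan should mention TO_CHAR applied to ID rather than a bare comparison on the column:

```sql
-- Store the plan for Query 1 in the plan table.
EXPLAIN PLAN FOR
SELECT a1.*
  FROM people a1
 WHERE a1.id LIKE '119%'
   AND ROWNUM < 5;

-- Print the plan, including the Predicate Information section.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```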

To get around it, you need to do one of the following:

  1. Create a function-based index on the A1.ID column:

    CREATE INDEX people_idx5 ON people (TO_CHAR(id));

  2. If you need to match records on the first 3 characters of the ID column, create another column of type NUMBER containing just these 3 characters and use a plain = operator on it.

  3. Create a separate column ID_CHAR of type VARCHAR2 and fill it with TO_CHAR(id). Index it and use it instead of ID in your WHERE condition.

    Of course, if you choose to create an additional column based on the existing ID column, you need to keep those 2 synchronized. You can do that in batch as a single UPDATE, or in an ON-UPDATE trigger, or add that column to the appropriate INSERT and UPDATE statements in your code.
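
As a rough sketch of option 3 with the trigger approach, something like the following could work. The names id_char, people_idx6 and people_id_char_trg are hypothetical, the column width of 13 assumes a NUMBER(12) id plus a possible minus sign, and the lone "/" is the SQL*Plus terminator for the PL/SQL block:

```sql
-- Shadow character column for id (hypothetical name id_char).
ALTER TABLE people ADD (id_char VARCHAR2(13));

-- One-off backfill of the existing rows.
UPDATE people SET id_char = TO_CHAR(id);
COMMIT;

-- Ordinary index on the character column (hypothetical name).
CREATE INDEX people_idx6 ON people (id_char);

-- Keep id_char in step with id on every insert or change of id.
CREATE OR REPLACE TRIGGER people_id_char_trg
  BEFORE INSERT OR UPDATE OF id ON people
  FOR EACH ROW
BEGIN
  :NEW.id_char := TO_CHAR(:NEW.id);
END;
/

-- The prefix search now runs against a character column.
SELECT a1.*
  FROM people a1
 WHERE a1.id_char LIKE '119%'
   AND ROWNUM < 5;
```

The trigger keeps the application code unchanged, at the cost of a little extra work on every write to the table.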

Answered by Gary Myers

LIKE is a string function, so a numeric index can't be used as easily. In a numeric index, you'll have 119, 120, 130, ..., 1191, 1192, 1193, ..., 11921, 11922, and so on. That is, the rows starting with '119' won't all be in the same place, so the whole index has to be read (hence the FAST FULL SCAN). In a character-based index they will be together (e.g. '119', '1191', '11911', '120', ...), so a more efficient RANGE SCAN can be used.

If you were looking for id values in a particular range (eg 119000 to 119999) then specify that as the predicate (id between 119000 and 119999).
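
If the prefix match has to cover ids of several lengths, the same idea can be extended with one range per length. The sketch below assumes ids are non-negative integers of at most six digits; that limit is an illustration only, and a NUMBER(12) column would need further ranges:

```sql
-- Numeric equivalent of "starts with 119" for ids of 3 to 6 digits.
SELECT a1.*
  FROM people a1
 WHERE (   a1.id = 119
        OR a1.id BETWEEN 1190 AND 1199
        OR a1.id BETWEEN 11900 AND 11999
        OR a1.id BETWEEN 119000 AND 119999)
   AND ROWNUM < 5;
```

Each branch compares the bare NUMBER column with numeric values, so no conversion is applied to id and the optimizer is free to consider the existing index.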

Answered by user188658

The optimizer decided that it's faster to do a table scan, most probably due to the low number of actual records.

Also, you should know that non-exact matching is always way worse than exact matching. If your WHERE clause were "a1.id='123456'", it would most probably use the index. But then again, even an index lookup takes two reads (first find the record in the index, then read the block from the table), and for very small tables the optimizer could decide on a table scan.
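
The equality case differs from LIKE for a reason worth spelling out: when a NUMBER column is compared with a character literal using =, Oracle converts the literal to a number rather than the column to a string, so the column stays bare and its index remains usable. A minimal sketch, reusing the people table from the question (the value 123456 is only an illustration):

```sql
-- The literal is converted, not the column, so this is effectively id = 123456.
SELECT a1.*
  FROM people a1
 WHERE a1.id = '123456';

-- Writing the number directly avoids the implicit conversion altogether.
SELECT a1.*
  FROM people a1
 WHERE a1.id = 123456;
```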

Answered by davek

Try placing a hint in one of your queries to force it to use the desired index, and then check your plan: it could be that (due to skewing or whatever) the optimizer does take the index into account but decides against using it because of the perceived cost.
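
A hinted version of Query 1 might look like the sketch below. It assumes people_IDX3, the index named in the Query 2 plan, is the index on id. The optimizer only follows a hint when the requested access path is valid, so the resulting plan still needs to be checked:

```sql
-- Ask the optimizer to use people_IDX3 for alias a1.
SELECT /*+ INDEX(a1 people_IDX3) */ a1.*
  FROM people a1
 WHERE a1.id LIKE '119%'
   AND ROWNUM < 5;
```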

Answered by Michael Dillon

The LIKE keyword tells SQL that you are doing a regular expression match. You should never use regular expressions in SQL or in any programming library until you have checked the string functions available to see if the query could be expressed simply with them. In this case, you could change this to an equals condition by only comparing the substring consisting of the first 3 characters of the code. In Oracle, this would look like:

SELECT *
FROM people
WHERE SUBSTR(code,1,3) = '119'
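
Note that SUBSTR(code,1,3) is itself a function applied to the column, so a plain index on code cannot serve this predicate directly. If this form of the query is used, a function-based index on the same expression is one option. A sketch, with people_idx7 as a hypothetical index name:

```sql
-- Index the exact expression used in the WHERE clause.
CREATE INDEX people_idx7 ON people (SUBSTR(code, 1, 3));

-- The predicate matches the indexed expression.
SELECT *
  FROM people
 WHERE SUBSTR(code, 1, 3) = '119'
   AND ROWNUM < 5;
```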