Note: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/8428328/

Time: 2020-09-19 00:29:40  Source: igfitidea

Oracle: how to ensure that a function in the WHERE clause is called only after all the other WHERE conditions have filtered the result?

oracle, function, optimization, filter

Asked by Daud

I am writing a query to this effect:

select * 
from players 
where player_name like '%K%' 
  and player_rank < 10 
  and check_if_player_is_eligible(player_name) > 1;

The function check_if_player_is_eligible() is expensive, so I want the other conditions to narrow the result set first and the function to run only on the rows that remain.

How can I ensure that all the other filtering happens before the function is executed, so that it runs the minimum number of times?
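
One way to check how often the function really runs is to count the calls. The following is a minimal sketch, not part of the original question: the package call_counter and the wrapper counted_eligibility are hypothetical names, and it assumes the function takes a VARCHAR2 and returns a NUMBER.

```sql
SET SERVEROUTPUT ON

-- Hypothetical helper package: holds a session-level call counter.
CREATE OR REPLACE PACKAGE call_counter AS
  calls PLS_INTEGER := 0;
END call_counter;
/

-- Hypothetical wrapper: counts each call, then delegates to the real function.
-- Assumes check_if_player_is_eligible(VARCHAR2) returns NUMBER.
CREATE OR REPLACE FUNCTION counted_eligibility(p_name IN VARCHAR2)
  RETURN NUMBER
AS
BEGIN
  call_counter.calls := call_counter.calls + 1;
  RETURN check_if_player_is_eligible(p_name);
END counted_eligibility;
/

-- Reset the counter before the query under test.
BEGIN
  call_counter.calls := 0;
END;
/

SELECT COUNT(*)
  FROM players
 WHERE player_name LIKE '%K%'
   AND player_rank < 10
   AND counted_eligibility(player_name) > 1;

-- Report how many times the wrapper was invoked by the query above.
BEGIN
  DBMS_OUTPUT.PUT_LINE('function calls: ' || call_counter.calls);
END;
/
```

Running the same measurement against each rewritten query makes it possible to compare the variants on real data rather than relying on the expected evaluation order.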

Answered by Vincent Malgrat

Here are two methods to trick Oracle into not evaluating your function until all the other WHERE clauses have been evaluated:

  1. Using rownum

    Using the pseudo-column ROWNUM in a subquery forces Oracle to "materialize" the subquery. See this askTom thread for examples.

    SELECT *
      FROM (SELECT *
               FROM players
              WHERE player_name LIKE '%K%'
                AND player_rank < 10
                AND ROWNUM >= 1)
     WHERE check_if_player_is_eligible(player_name) > 1
    

    Here's the documentation reference "Unnesting of Nested Subqueries":

    The optimizer can unnest most subqueries, with some exceptions. Those exceptions include hierarchical subqueries and subqueries that contain a ROWNUM pseudocolumn, one of the set operators, a nested aggregate function, or a correlated reference to a query block that is not the immediate outer query block of the subquery.

  2. Using CASE

    Using CASE, you can force Oracle to evaluate your function only when the other conditions evaluate to TRUE. Unfortunately, it involves duplicating code if you also want the other clauses to be able to use indexes, as in:

    SELECT *
      FROM players
     WHERE player_name LIKE '%K%'
       AND player_rank < 10
       AND CASE 
             WHEN player_name LIKE '%K%'
              AND player_rank < 10 
                THEN check_if_player_is_eligible(player_name) 
           END > 1
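
To see where Oracle actually applies the function predicate for either method, the execution plan can be inspected. This is a sketch using the standard EXPLAIN PLAN statement and DBMS_XPLAN.DISPLAY, shown here for the ROWNUM variant; the plan you get depends on your data, statistics, and Oracle version.

```sql
-- Store the plan for the ROWNUM variant in the plan table.
EXPLAIN PLAN FOR
SELECT *
  FROM (SELECT *
          FROM players
         WHERE player_name LIKE '%K%'
           AND player_rank < 10
           AND ROWNUM >= 1)
 WHERE check_if_player_is_eligible(player_name) > 1;

-- Display the plan, including the "Predicate Information" section.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

In the predicate section, check which plan step lists the call to check_if_player_is_eligible: it should appear on a step above the one that applies the LIKE and rank filters.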
    

Answered by Alessandro Rossi

The NO_PUSH_PRED hint does this without involving ROWNUM evaluation (which is a good trick anyway):

SELECT /*+ NO_PUSH_PRED(v) */ *
FROM (
        SELECT *
        FROM players
        WHERE player_name LIKE '%K%'
            AND player_rank < 10
    ) v
WHERE check_if_player_is_eligible(player_name) > 1

Answered by Jon Heller

You usually want to avoid forcing a specific order of execution. If the data or the query changes, your hints and tricks may backfire. It's usually better to provide useful metadata to Oracle so it can make the correct decisions for you.

In this case, you can provide better optimizer statistics about the function with ASSOCIATE STATISTICS.

For example, if your function is very slow because it has to read 50 blocks each time it is called:

associate statistics with functions
check_if_player_is_eligible default cost(1000 /*cpu*/, 50 /*IO*/, 0 /*network*/);

By default, Oracle assumes that a function will select a row 1/20th of the time. Oracle wants to eliminate as many rows as early as possible, so changing the selectivity should make the function less likely to be executed first:

associate statistics with functions
check_if_player_is_eligible default selectivity 90;

But this raises some other issues. You have to pick one selectivity for ALL possible conditions, and 90% certainly won't always be accurate. The IO cost is the number of blocks fetched, but the CPU cost is "machine instructions used". What exactly does that mean?
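
Because these defaults are guesses, it helps to be able to review and undo them. The sketch below assumes the function lives in the current schema; it reads the USER_ASSOCIATIONS dictionary view and uses DISASSOCIATE STATISTICS to remove the association.

```sql
-- Review the default selectivity and costs currently associated
-- with the function (object names are stored in upper case).
SELECT object_name,
       def_selectivity,
       def_cpu_cost,
       def_io_cost,
       def_net_cost
  FROM user_associations
 WHERE object_name = 'CHECK_IF_PLAYER_IS_ELIGIBLE';

-- Remove the association if the chosen values turn out to mislead the optimizer.
DISASSOCIATE STATISTICS FROM FUNCTIONS check_if_player_is_eligible;
```

After disassociating, the optimizer falls back to its built-in defaults for the function.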

There are more advanced ways to customize statistics, for example using the Oracle Data Cartridge Extensible Optimizer. But Data Cartridge is probably one of the most difficult Oracle features.

Answered by HAL 9000

You didn't specify whether players.player_name is unique or not. One could assume that it is, in which case the database has to call the function at least once per result record.

But if players.player_name is not unique, you would want to reduce the number of calls to count(distinct players.player_name). As (Ask)Tom shows in Oracle Magazine, the scalar subquery cache is an efficient way to do this.
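
Whether caching is worth the effort depends on how many duplicate names survive the cheap filters. As a rough check, using only the table and conditions from the question, the two counts can be compared directly:

```sql
-- Compare the number of filtered rows with the number of distinct names.
-- If distinct_names is much smaller than filtered_rows, caching the
-- function result per name can save many calls.
SELECT COUNT(*)                    AS filtered_rows,
       COUNT(DISTINCT player_name) AS distinct_names
  FROM players
 WHERE player_name LIKE '%K%'
   AND player_rank < 10;
```

If the two numbers are close, the scalar subquery cache will bring little benefit for this query.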

You would have to wrap your function call into a subselect in order to make use of the scalar subquery cache:

SELECT players.*
FROM   players
WHERE  player_name LIKE '%K%'
  AND  player_rank < 10
  AND  ROWNUM >= 1
  -- scalar subquery: Oracle can cache the result per distinct player_name
  AND  (SELECT check_if_player_is_eligible(players.player_name) FROM dual) > 1

Answered by Adrian

Put the original query in a derived table, then place the additional predicate in the WHERE clause of the outer query that selects from the derived table.

select * 
from (
   select * 
   from players 
   where player_name like '%K%' 
     and player_rank < 10 
) derived_tab1
where check_if_player_is_eligible(player_name) > 1;