Optimizing an Oracle query by removing "exists" and "not exists"

Question

Asked by Lawtonfogle

I recently moved a piece of code into production on an Oracle database. One of the more experienced developers who reviewed it mentioned that I had far too many exists and not exists statements and that there should be a way to remove them, but it had been too long since he last had to do so and he didn't remember much about how it worked. I'm now going back to make the code more maintainable, as it is likely to change multiple times in the coming years as business logic and requirements change, and I want to optimize it at the same time.

I've tried looking it up, but all I can find are recommendations to replace not in with not exists and to avoid returning actual results from the subquery.

As such, I'm wondering what can be done to optimize out exists/not exists, or whether there is a way to write exists/not exists so that Oracle will optimize it internally (likely to a better degree than I can).

For example, how can the following be optimized?

UPDATE
    SCOTT.TABLE_N N
SET
    N.VALUE_1 = 'Data!'
WHERE
    N.VALUE_2 = 'Y'
    AND
    EXISTS
    (
        SELECT
            1
        FROM
            SCOTT.TABLE_Q Q
        WHERE
            N.ID = Q.N_ID
    )
    AND
    NOT EXISTS
    (
        SELECT
            1
        FROM
            SCOTT.TABLE_W W
        WHERE
            N.ID = W.N_ID
    )

Answer 1

Answered by Kirill Leontev

Your statement seems perfectly fine to me.

In any optimization task, don't think in patterns. Don't think, "(not) exists is bad and slow, (not) in is super cool and fast".

Instead, ask how much work the database does at each step and how you can measure it.

A simple example:

-- NOT IN:

23:59:41 HR@sandbox> alter system flush buffer_cache;

System altered.

Elapsed: 00:00:00.03
23:59:43 HR@sandbox> set autotrace traceonly explain statistics
23:59:49 HR@sandbox> select country_id from countries where country_id not in (select country_id from locations);

11 rows selected.

Elapsed: 00:00:00.02

Execution Plan
----------------------------------------------------------
Plan hash value: 1748518851

------------------------------------------------------------------------------------------
| Id  | Operation              | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                 |     1 |     6 |     4   (0)| 00:00:01 |
|*  1 |  FILTER                |                 |       |       |            |          |
|   2 |   NESTED LOOPS ANTI SNA|                 |    11 |    66 |     4  (75)| 00:00:01 |
|   3 |    INDEX FULL SCAN     | COUNTRY_C_ID_PK |    25 |    75 |     1   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN    | LOC_COUNTRY_IX  |    13 |    39 |     0   (0)| 00:00:01 |
|*  5 |   TABLE ACCESS FULL    | LOCATIONS       |     1 |     3 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "LOCATIONS" "LOCATIONS" WHERE
              "COUNTRY_ID" IS NULL))
   4 - access("COUNTRY_ID"="COUNTRY_ID")
   5 - filter("COUNTRY_ID" IS NULL)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         11  consistent gets
          8  physical reads
          0  redo size
        446  bytes sent via SQL*Net to client
        363  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         11  rows processed

-- NOT EXISTS

23:59:57 HR@sandbox> alter system flush buffer_cache;

System altered.

Elapsed: 00:00:00.17
00:00:02 HR@sandbox> select country_id from countries c where not exists (select 1 from locations l where l.country_id = c.country_id );

11 rows selected.

Elapsed: 00:00:00.30

Execution Plan
----------------------------------------------------------
Plan hash value: 840074837

-------------------------------------------------------------------------------------
| Id  | Operation         | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                 |    11 |    66 |     1   (0)| 00:00:01 |
|   1 |  NESTED LOOPS ANTI|                 |    11 |    66 |     1   (0)| 00:00:01 |
|   2 |   INDEX FULL SCAN | COUNTRY_C_ID_PK |    25 |    75 |     1   (0)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN| LOC_COUNTRY_IX  |    13 |    39 |     0   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("L"."COUNTRY_ID"="C"."COUNTRY_ID")


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          2  physical reads
          0  redo size
        446  bytes sent via SQL*Net to client
        363  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         11  rows processed

In this example, NOT IN reads about twice as many database blocks (11 consistent gets versus 5) and performs more complicated filtering. Ask yourself: why would you choose it over NOT EXISTS?
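
The same kind of measurement can be applied to the UPDATE from the question. A minimal sketch, assuming the SCOTT tables from the question exist and that a plan table is available (recent Oracle versions provide one by default); EXPLAIN PLAN only records the plan and does not run the update:

```sql
-- Record the optimizer's plan for the UPDATE without executing it
EXPLAIN PLAN FOR
UPDATE SCOTT.TABLE_N N
SET N.VALUE_1 = 'Data!'
WHERE N.VALUE_2 = 'Y'
  AND EXISTS (SELECT 1 FROM SCOTT.TABLE_Q Q WHERE Q.N_ID = N.ID)
  AND NOT EXISTS (SELECT 1 FROM SCOTT.TABLE_W W WHERE W.N_ID = N.ID);

-- Display the plan that was just recorded
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the optimizer unnests the subqueries, they would typically show up as semi-join and anti-join operations, much like the NESTED LOOPS ANTI step in the plan above. To compare actual block reads rather than estimates, the autotrace approach shown earlier also works for DML, but it executes the statement, so roll back afterwards if the change should not persist.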

Answer 2

Answered by Todd Gibson

There is no reason to avoid using EXISTS or NOT EXISTS when that is what you need. In the example you gave, that is probably exactly what you want to use.

The typical dilemma is whether to use IN/NOT IN, or EXISTS/NOT EXISTS. They are evaluated quite differently, and one may be faster or slower depending on your specific circumstances.
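
One concrete difference is how NULLs are handled, which is also why the NOT IN plan in the previous answer needed an extra IS NULL filter. A small sketch using inline data from DUAL; the names c, l, id and c_id are made up for illustration:

```sql
-- NOT IN: a single NULL in the subquery makes the predicate UNKNOWN
-- for every non-matching row, so nothing is returned
WITH c AS (
  SELECT 1 AS id FROM dual UNION ALL
  SELECT 2 FROM dual
),
l AS (
  SELECT 1 AS c_id FROM dual UNION ALL
  SELECT CAST(NULL AS NUMBER) FROM dual
)
SELECT c.id
FROM c
WHERE c.id NOT IN (SELECT l.c_id FROM l);
-- returns no rows

-- NOT EXISTS: the NULL row simply never matches, so id 2 is kept
WITH c AS (
  SELECT 1 AS id FROM dual UNION ALL
  SELECT 2 FROM dual
),
l AS (
  SELECT 1 AS c_id FROM dual UNION ALL
  SELECT CAST(NULL AS NUMBER) FROM dual
)
SELECT c.id
FROM c
WHERE NOT EXISTS (SELECT 1 FROM l WHERE l.c_id = c.id);
-- returns one row: 2
```

So the two forms are only interchangeable when the subquery column cannot be NULL; otherwise the choice is about correctness before it is about speed.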

See here for more details than you probably want.

Answer 3

Answered by Markus Jarderot

I don't know if it is much faster, but here is a way to write it without EXISTS/NOT EXISTS:

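MERGE INTO SCOTT.TABLE_N T
USING (
  -- DISTINCT keeps one source row per ID even if TABLE_Q has several matches;
  -- duplicate source rows for one target row can raise ORA-30926 in a MERGE
  SELECT DISTINCT N.ID, 'Data!' AS NEW_VALUE_1
  FROM SCOTT.TABLE_N N
  INNER JOIN SCOTT.TABLE_Q Q
      ON Q.N_ID = N.ID
  LEFT JOIN SCOTT.TABLE_W W
      ON W.N_ID = N.ID
  WHERE N.VALUE_2 = 'Y'
  -- anti-join: keep only rows with no match in TABLE_W
  -- (tests the join column, the only TABLE_W column shown in the question)
  AND W.N_ID IS NULL
) X
ON ( T.ID = X.ID )
WHEN MATCHED THEN UPDATE
    SET T.VALUE_1 = X.NEW_VALUE_1;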
