oracle 如何减少大数据表的查询执行时间

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/19136444/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-19 01:58:01  来源:igfitidea点击:

How to reduce query execution time for table with huge data

sqldatabaseperformanceoracle

提问by RahSin

I am running this query in production(Oracle) and it is taking more than 3 minutes . Is there any way out to reduce the execution time ? Both svc_order and event table contains almost 1million records .

我在生产(Oracle)中运行这个查询,它需要超过 3 分钟。有什么办法可以减少执行时间吗?svc_order 和事件表都包含近 100 万条记录。

select 0 test_section, count(1) count, 'DD' test_section_value  
from svc_order so, event e  
where so.svc_order_id = e.svc_order_id  
  and so.entered_date >= to_date('01/01/2012', 'MM/DD/YYYY')  
  and e.event_type = 230 and e.event_level = 'O'  
  and e.current_sched_date between 
      to_date( '09/01/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS')
      and to_date('09/29/2013 23:59:59', 'MM/DD/YYYY HH24:MI:SS')  
  and (((so.sots_ta = 'N') and (so.action_type = 0)) 
       or  ((so.sots_ta is null) and (so.action_type = 0)) 
       or  ((so.sots_ta = 'N') and (so.action_type is null)))
  and so.company_code = 'LL'

采纳答案by Aditya Kakirde

Looking at the what you said that you cannot create indexes. I hope that the query is making a full table scan on the table. Please try a parallel hint.

看看你说的不能创建索引。我希望查询正在对表进行全表扫描。请尝试并行提示。

select /*+ full(so) parallel(so, 4) */ 0 test_section, count(1) count, 'DD' test_section_value  
from svc_order so, event e  
where so.svc_order_id = e.svc_order_id  
  and so.entered_date >= to_date('01/01/2012', 'MM/DD/YYYY')  
  and e.event_type = 230 and e.event_level = 'O'  
  and e.current_sched_date between 
      to_date( '09/01/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS')
      and to_date('09/29/2013 23:59:59', 'MM/DD/YYYY HH24:MI:SS')  
  and (((so.sots_ta = 'N') and (so.action_type = 0)) 
       or  ((so.sots_ta is null) and (so.action_type = 0)) 
       or  ((so.sots_ta = 'N') and (so.action_type is null)))
  and so.company_code = 'LL'

回答by wildplasser

You could at leastavoid the triple AND/OR list by using COALESCE()(or its oracle equivalent IFNULL()) Note: this does not catch the case where bothsots_ta and action_type are NULL.

你可以至少避免了三人间和/或按使用COALESCE()(或等同的Oracle IFNULL())注:这不赶在那里的情况下sots_ta和ACTION_TYPE是NULL。

SELECT 0 test_section, count(1) count, 'DD' test_section_value
FROM svc_order so 
JOIN event e  ON so.svc_order_id = e.svc_order_id
WHERE e.event_type = 230 and e.event_level = 'O'  
  AND so.entered_date >= to_date('01/01/2012', 'MM/DD/YYYY')
  AND e.current_sched_date >= to_date('09/01/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS')
  AND e.current_sched_date  < to_date('10/01/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS') 
  AND  COALESCE(so.sots_ta, 'N') = 'N'
  AND  COALESCE(so.action_type, 0) = 0   
  AND so.company_code = 'LL'

I replaced the between by a plain t >= low AND t. < high)test because I don't like betweens semantics. I replaced the FROM kommalistby a JOINbecause I like joins better.

我用一个简单的t >= low AND t. < high)测试替换了 between,因为我不喜欢betweens 语义。我用FROM kommalista代替了,JOIN因为我更喜欢加入。

回答by Anup Shah

We cannot have additional indexes but tables must have at least meaning full primary key, right so is there one? That should result in at least index, non/clustered, anything. Look at it lets and try to make use of it.

我们不能有额外的索引,但表必须至少具有完整的主键,对吗?有吗?这应该至少导致索引,非/聚集,任何东西。看看它让我们尝试使用它。

In case table is a heap, and we want to deal with it as it is, then we should reduce the number rows in each table individually by applying respective where filters and then combine that result set. In your query only meaning full result column depends on base tables is count(1). Other two columns are constants. Because also JOIN/Cartesian Product etc….. will lead DB engine to look for Indexes so instead use INTERSECT which I feel should better in your case. Some other changes you can do: Avoid using TO_DATE or any kind of function in Right Side of the WHERE condition column. Prepare data in local Variable and use Local Variable in query. Also you need to check is there any good performance gain using >= than BETWEEN ?

如果 table 是一个堆,并且我们想要按原样处理它,那么我们应该通过应用各自的 where 过滤器来单独减少每个 table 中的行数,然后组合该结果集。在您的查询中,仅意味着完整结果列取决于基表是 count(1)。其他两列是常量。因为还有 JOIN/Cartesian Product 等......会导致数据库引擎寻找索引,所以改用 INTERSECT ,我觉得在你的情况下应该更好。您可以做的其他一些更改: 避免在 WHERE 条件列的右侧使用 TO_DATE 或任何类型的函数。在本地变量中准备数据并在查询中使用本地变量。您还需要检查使用 >= 比 BETWEEN 是否有任何良好的性能提升?

I have modified the query and also combined one redundant where condition. Remember that if this changes works for you right now that doesn't mean it will work always. As son your table start hitting more data that qualifies those WHERE conditions this swill again come back as slow query. so for short term this might work but longer term you have to think about alternate options

我修改了查询并结合了一个冗余的 where 条件。请记住,如果此更改现在对您有效,并不意味着它会一直有效。随着您的表开始命中更多符合 WHERE 条件的数据,这将再次作为慢查询返回。所以在短期内这可能会奏效,但从长远来看,你必须考虑其他选择

    1)  for example Indexed Views on top of this tables
    2)  Create same tables with different name and sync data 
        between new and original table using  “Insert/Update/Delete Trigger”.




    SELECT COUNT(1) AS [COUNT], 'DD' test_section_value  ,0 test_section
    FROM
    (
        SELECT  so.svc_order_id
        FROM    svc_order so
        WHERE   so.entered_date >= to_date('01/01/2012', 'MM/DD/YYYY')
                AND so.company_code = 'LL'

        INTERSECT

        SELECT  e.svc_order_id
        FROM    event e
        WHERE   e.event_type = 230
                AND e.event_level = 'O'
                AND e.current_sched_date BETWEEN
                    to_date('09/01/2010 00:00:00','MM/DD/YYYY HH24:MI:SS')
                    AND to_date('09/29/2013 23:59:59','MM/DD/YYYY HH24:MI:SS')
                AND ( 
                        (( so.sots_ta = 'N' ) AND ( so.action_type IS NULL OR so.action_type = 0))
                        OR 
                        (( so.sots_ta IS NULL ) AND ( so.action_type = 0 )) 
                        --or ((so.sots_ta = 'N') and (so.action_type is null))
                    )
    )qry1

回答by Jon Heller

First, ensure statistics are up-to-date.

首先,确保统计数据是最新的。

begin
    dbms_stats.gather_table_stats('[schema]', 'svc_order');
    dbms_stats.gather_table_stats('[schema]', 'event');
end;
/

This query is a very simple join between two small tables, but with complex predicates. You almost certainly do notwant to significantly re-write all of your queries in search of some magic syntax that will make everything run fast. Yes, there are some rare cases where BETWEENdoes not work well, or moving the predicates into an inline view helps, or replacing the join with an INTERSECTmight help. But that sounds like cargo-cult programmingto me. Ask yourself, whywould those changes make any difference? If those types of changes always improved performance, why wouldn't Oracle just translate the queries internally?

此查询是两个小表之间的非常简单的连接,但具有复杂的谓词。你几乎可以肯定做希望显著重新编写所有的疑问在搜索一些神奇的语法,这将让一切跑得快的。是的,在极少数情况下BETWEEN效果不佳,或者将谓词移动到内联视图中INTERSECT会有所帮助,或者可能会帮助替换连接。但这对我来说听起来像是货物崇拜编程。问问自己,为什么这些变化会产生任何影响?如果这些类型的更改总能提高性能,为什么 Oracle 不直接在内部转换查询?

Normally, you should try to provide better information to the optimizer so it can make better decisions. Usually this is as simple as gathering statistics with the default settings. Some predicates are just too complex, and for that you should try to use dynamic sampling, such as /*+ dynamic_sampling(6) */. Or maybe add some histograms. Or perhaps add an expression statisticlike this:

通常,您应该尝试向优化器提供更好的信息,以便它可以做出更好的决策。通常这就像使用默认设置收集统计信息一样简单。有些谓词太复杂了,为此您应该尝试使用 动态采样,例如/*+ dynamic_sampling(6) */. 或者添加一些直方图。或者也许添加这样的表达式统计信息

SELECT 
    DBMS_STATS.CREATE_EXTENDED_STATS(null,'SVC_ORDER',
        '(((so.sots_ta = 'N') and (so.action_type = 0)) 
        or  ((so.sots_ta is null) and (so.action_type = 0)) 
        or  ((so.sots_ta = 'N') and (so.action_type is null)))'
    ) 
FROM DUAL;
--Don't forget to re-gather statistics after this.

The optimizer is probably under-estimating the number of rows, and using a nested loop instead of a hash join. After providing it with more information, ideally it will start using a hash join. But at some point, after you've tried the above methods and possibly many other features, you can just tell it what kind of join to use. Which would be @Florin Ghita's suggestion, /*+use_hash(so e)*/.

优化器可能低估了行数,并使用了嵌套循环而不是散列连接。在为其提供更多信息后,理想情况下它将开始使用散列连接。但是在某些时候,在您尝试了上述方法和可能的许多其他功能之后,您可以告诉它使用哪种连接。这将是@Florin Ghita 的建议,/*+use_hash(so e)*/.