Oracle query slow with REGEXP_SUBSTR(AGGREGATOR,'[^;]+',1,LEVEL)

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/15106247/

Time: 2020-09-19 01:28:40  Source: igfitidea

sql oracle

Asked by user2114275

Hi, I am using this query to split a ;-separated value into separate rows.

The table looks like this:

row_id  aggregator
1             12;45
2             25

Using this query, I want output like this:

row_id  aggregator
1        12
1        45
2        25
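
To try the queries on this page against the sample data shown in the question, a table along these lines could be created first. This is only a sketch: the question does not give the DDL, so the column types and the primary key are assumptions.

```sql
-- Hypothetical DDL: column types and the primary key are assumptions,
-- the question only shows the column names and two sample rows.
CREATE TABLE dummy_1 (
  row_id     NUMBER PRIMARY KEY,
  aggregator VARCHAR2(4000)
);

-- The two sample rows from the question.
INSERT INTO dummy_1 (row_id, aggregator) VALUES (1, '12;45');
INSERT INTO dummy_1 (row_id, aggregator) VALUES (2, '25');
COMMIT;
```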

I am using the query below:

SELECT
DISTINCT ROW_ID,
REGEXP_SUBSTR(AGGREGATOR,'[^;]+',1,LEVEL) as AGGREGATOR
FROM DUMMY_1
CONNECT BY REGEXP_SUBSTR(AGGREGATOR,'[^;]+',1,LEVEL) IS NOT NULL;

But it is very slow even for 300 records, and I have to process 40,000 records.

Answered by A.B.Cade

Sometimes a pipelined table function can be faster; try this:

create or replace type t is object(word varchar2(100), pk number);
/
create or replace type t_tab as table of t;
/

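create or replace function split_string(del in varchar2) return t_tab
  pipelined is

  word    varchar2(4000);  -- current token
  str_t   varchar2(4000);  -- part of the string not yet split
  v_del_i number;          -- position of the next delimiter, 0 if none
  iid     number;          -- row_id of the row being split

  cursor c is
    select row_id, aggregator from DUMMY_1;

begin

  for r in c loop
    str_t := r.aggregator;
    iid   := r.row_id;

    -- peel off one token per iteration until nothing is left
    while str_t is not null loop

      v_del_i := instr(str_t, del, 1, 1);

      if v_del_i = 0 then
        -- no delimiter left: the remainder is the last token
        word  := str_t;
        str_t := '';
      else
        word  := substr(str_t, 1, v_del_i - 1);
        -- skip the whole delimiter, which may be longer than one character
        str_t := substr(str_t, v_del_i + length(del));
      end if;

      pipe row(t(word, iid));

    end loop;

  end loop;

  return;
end split_string;
/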
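
The function above is queried through the TABLE operator. A minimal usage sketch, assuming the types t and t_tab and the function split_string have been created as shown and that DUMMY_1 holds the sample rows from the question:

```sql
-- pk carries row_id and word carries one token, as defined in type t.
-- The column aliases are only for illustration.
SELECT pk   AS row_id,
       word AS aggregator
  FROM TABLE(split_string(';'))
 ORDER BY pk;
```

With the two sample rows this should return one row per token, that is, two rows for row_id 1 and one row for row_id 2.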

Here is a sqlfiddle demo.

And here is another demo with 22 rows, each containing 3 values in aggregator; see the difference between the first and second query.

Answered by Vincent Malgrat

Regular expressions are known to be expensive functions, so you should try to minimize their use when performance is critical (for example, by using standard functions in the CONNECT BY clause).

Using standard functions (INSTR, SUBSTR, REPLACE) will be more efficient, but the resulting code will be hard to read, understand, and maintain.
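
As an illustration of what a version built only on standard functions might look like, here is a sketch rather than the answerer's own code. It assumes that no row holds more than 100 values (the cap in the number generator is arbitrary). Unlike the '[^;]+' pattern, it returns an empty (NULL) token for consecutive delimiters instead of skipping them.

```sql
-- Sketch: split with INSTR/SUBSTR/REPLACE only, no regular expressions.
-- The alias nums and the cap of 100 values per row are assumptions.
SELECT d.row_id,
       SUBSTR(d.aggregator,
              -- start of the n-th token: position of the n-th ';' in ';' || aggregator
              INSTR(';' || d.aggregator, ';', 1, nums.n),
              -- token length: distance to the n-th ';' in aggregator || ';'
              INSTR(d.aggregator || ';', ';', 1, nums.n)
                - INSTR(';' || d.aggregator, ';', 1, nums.n)) AS aggregator
  FROM dummy_1 d
  JOIN (SELECT LEVEL AS n FROM dual CONNECT BY LEVEL <= 100) nums
    -- number of tokens = number of ';' characters + 1
    ON nums.n <= LENGTH(d.aggregator)
                 - NVL(LENGTH(REPLACE(d.aggregator, ';')), 0) + 1;
```

The nested INSTR calls show the readability cost mentioned above: the logic is simple, but it is much harder to see at a glance than a single REGEXP_SUBSTR call.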

I could not resist writing a recursive CTE, which I think is much more efficient than both regular expressions and standard functions. Furthermore, recursive CTE queries arguably have some elegance. You'll need Oracle 11.2:

WITH rec_sql(row_id, aggregator, lvl, tail) AS (
SELECT row_id, 
       nvl(substr(aggregator, 1, instr(aggregator, ';') - 1), 
           aggregator),
       1 lvl,
       CASE WHEN instr(aggregator, ';') > 0 THEN
          substr(aggregator, instr(aggregator, ';') + 1)
       END tail
  FROM dummy_1 initialization
UNION ALL
SELECT r.row_id, 
       nvl(substr(tail, 1, instr(tail, ';') - 1), tail), 
       lvl + 1, 
       CASE WHEN instr(tail, ';') > 0 THEN
          substr(tail, instr(tail, ';') + 1)
       END tail
  FROM rec_sql r
 WHERE r.tail IS NOT NULL
)
SELECT * FROM rec_sql;

You can see on SQLFiddle that this solution is very performant and on par with @A.B.Cade's solution. (Thanks to A.B.Cade for the test case.)

Answered by mik

Your connect by produces many more records than needed; that is why the performance is poor and why you need distinct to limit the number of records. An approach that does not need distinct would be:

select row_id, regexp_substr(aggregator,'[^;]+',1,n) aggregator
  from dummy_1, (select level n from dual connect by level < 100)
 where n <= regexp_count(aggregator,';')+1
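
The extra records come from the fact that the original CONNECT BY has no condition tying a child row to its parent, so at each level every row connects to every row that still has a value at that position. Another commonly used way to avoid this, sketched here under the assumption that row_id is unique in dummy_1, is to keep CONNECT BY but restrict each hierarchy to a single source row:

```sql
-- Sketch: each row only connects to itself, so no DISTINCT is needed.
-- Assumes row_id is unique in dummy_1.
SELECT row_id,
       REGEXP_SUBSTR(aggregator, '[^;]+', 1, LEVEL) AS aggregator
  FROM dummy_1
CONNECT BY LEVEL <= REGEXP_COUNT(aggregator, ';') + 1
       AND PRIOR row_id = row_id
       -- a non-deterministic PRIOR expression keeps Oracle from
       -- reporting a loop (ORA-01436) when a row connects to itself
       AND PRIOR SYS_GUID() IS NOT NULL;
```

REGEXP_COUNT is still a regular-expression function, so this removes the row explosion but not the regular-expression cost itself.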

Answered by Art

I think the DISTINCT may be the problem. Besides, I do not understand why you need CONNECT BY REGEXP_SUBSTR(AGGREGATOR,'[^;]+',1,LEVEL) IS NOT NULL. You are using a regexp in both your select and your connect by. Can you use WHERE AGGREGATOR IS NOT NULL instead of connect by? Find a way to get rid of distinct and revise your query. You can use EXISTS instead of distinct. To help you more, I need the tables and data.

SELECT * FROM
(
 SELECT REGEXP_SUBSTR(AGGREGATOR ,'[^;]+',1,LEVEL) as AGGREGATOR                      
   FROM your_table
)
WHERE AGGREGATOR IS NOT NULL
/