Oracle query slow with REGEXP_SUBSTR(AGGREGATOR,'[^;]+',1,LEVEL)
Original question: http://stackoverflow.com/questions/15106247/
License: this content comes from Stack Overflow and is provided under CC BY-SA 4.0. You are free to use and share it, but you must attribute it to the original authors.
Asked by user2114275
Hi, I am using this query to split a semicolon-separated value into separate rows.
The table looks like this:
row_id  aggregator
1       12;45
2       25
Using this query, I want output like this:
row_id  aggregator
1       12
1       45
2       25
I am using the query below:
SELECT DISTINCT
       ROW_ID,
       REGEXP_SUBSTR(AGGREGATOR,'[^;]+',1,LEVEL) as AGGREGATOR
  FROM DUMMY_1
CONNECT BY REGEXP_SUBSTR(AGGREGATOR,'[^;]+',1,LEVEL) IS NOT NULL;
But it is very slow even for 300 records, and I need it to work for 40,000 records.
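To try the query above and the answers against the sample data, a minimal setup could look like the sketch below. The column types and the primary key are assumptions; the question only shows the table name DUMMY_1, the two column names, and the two sample rows.

```sql
-- Hypothetical definition of the question's table; types are assumed.
CREATE TABLE dummy_1 (
  row_id     NUMBER PRIMARY KEY,
  aggregator VARCHAR2(4000)
);

-- The two sample rows shown in the question.
INSERT INTO dummy_1 (row_id, aggregator) VALUES (1, '12;45');
INSERT INTO dummy_1 (row_id, aggregator) VALUES (2, '25');
COMMIT;
```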
Answered by A.B.Cade
Sometimes a pipelined table function can be faster. Try this:
-- Object type for one output row: a single token plus the source row's id
create or replace type t is object(word varchar2(100), pk number);
/
create or replace type t_tab as table of t;
/
create or replace function split_string(del in varchar2) return t_tab
  pipelined is
  word    varchar2(4000);
  str_t   varchar2(4000);
  v_del_i number;
  iid     number;
  cursor c is
    select * from DUMMY_1;
begin
  for r in c loop
    str_t := r.aggregator;
    iid   := r.row_id;
    -- Peel one token off the front of str_t per iteration
    while str_t is not null loop
      v_del_i := instr(str_t, del, 1, 1);
      if v_del_i = 0 then
        -- No delimiter left: the remainder is the last token
        word  := str_t;
        str_t := '';
      else
        word  := substr(str_t, 1, v_del_i - 1);
        str_t := substr(str_t, v_del_i + 1);
      end if;
      pipe row(t(word, iid));
    end loop;
  end loop;
  return;
end split_string;
/
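Once the types and the function are compiled, the function can be queried like a table. The sketch below assumes ';' as the delimiter, as in the question, and renames the object attributes word and pk back to the question's column names.

```sql
-- Query the pipelined function through TABLE(); one row per token.
SELECT pk   AS row_id,
       word AS aggregator
  FROM TABLE(split_string(';'))
 ORDER BY pk;
```

Note that the function reads DUMMY_1 through its own cursor, so the only argument it takes is the delimiter.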
And here is another demo with 22 rows, each containing 3 values in aggregator. Compare the first and second queries.
Answered by Vincent Malgrat
Regular expressions are known to be expensive functions, so you should try to minimize their use when performance is critical (for example, by using standard functions in the CONNECT BY clause).
Using standard functions (INSTR, SUBSTR, REPLACE) will be more efficient, but the resulting code will be hard to read, understand, and maintain.
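As an illustration of that trade-off, here is one possible way to split the values with INSTR, SUBSTR, and REPLACE only. This is a sketch, not the answer author's code: the alias nums, the cap of 100 tokens per row, and the ';' delimiter are assumptions.

```sql
-- Split AGGREGATOR on ';' without regular expressions.
-- nums supplies token positions 1..100 (assumed upper bound per row).
SELECT d.row_id,
       SUBSTR(d.aggregator,
              -- start of the n-th token
              INSTR(';' || d.aggregator, ';', 1, nums.n),
              -- length of the n-th token: next delimiter minus start
              INSTR(d.aggregator || ';', ';', 1, nums.n)
                - INSTR(';' || d.aggregator, ';', 1, nums.n)) AS aggregator
  FROM dummy_1 d
  JOIN (SELECT LEVEL AS n FROM dual CONNECT BY LEVEL <= 100) nums
    -- token count = number of delimiters + 1
    ON nums.n <= LENGTH(d.aggregator)
                 - NVL(LENGTH(REPLACE(d.aggregator, ';')), 0) + 1;
```

For the sample table in the question this should return 12 and 45 for row 1 and 25 for row 2. The position arithmetic is exactly the readability cost the paragraph above describes.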
I could not resist writing a recursive CTE, which I think is much more efficient than both regular expressions and standard functions. Furthermore, recursive CTE queries arguably have some elegance. You'll need Oracle 11.2:
WITH rec_sql(row_id, aggregator, lvl, tail) AS (
   -- Anchor member: first token of each row, with the remainder kept in tail
   SELECT row_id,
          nvl(substr(aggregator, 1, instr(aggregator, ';') - 1),
              aggregator),
          1 lvl,
          CASE WHEN instr(aggregator, ';') > 0 THEN
             substr(aggregator, instr(aggregator, ';') + 1)
          END tail
     FROM dummy_1 initialization
   UNION ALL
   -- Recursive member: peel the next token off tail until nothing is left
   SELECT r.row_id,
          nvl(substr(tail, 1, instr(tail, ';') - 1), tail),
          lvl + 1,
          CASE WHEN instr(tail, ';') > 0 THEN
             substr(tail, instr(tail, ';') + 1)
          END tail
     FROM rec_sql r
    WHERE r.tail IS NOT NULL
)
SELECT * FROM rec_sql;
You can see on SQLFiddle that this solution is very performant and on par with @A.B.Cade's solution. (Thanks to A.B.Cade for the test case.)
Answered by mik
Your connect by produces many more records than needed. That is why the performance is poor and why you need distinct to limit the number of records. An approach that does not need distinct would be:
select row_id, regexp_substr(aggregator,'[^;]+',1,n) aggregator
from dummy_1, (select level n from dual connect by level < 100)
where n <= regexp_count(aggregator,';')+1
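A different way to avoid the extra records, not taken from this answer, is to keep CONNECT BY but tie each generated row to its own source row. The sketch below uses a commonly seen Oracle idiom and assumes row_id is unique in DUMMY_1; the PRIOR SYS_GUID() condition is there only so that Oracle does not report a loop in the hierarchy.

```sql
-- Each source row expands only into its own tokens, so DISTINCT is not needed.
SELECT row_id,
       REGEXP_SUBSTR(aggregator, '[^;]+', 1, LEVEL) AS aggregator
  FROM dummy_1
CONNECT BY LEVEL <= REGEXP_COUNT(aggregator, ';') + 1
       AND PRIOR row_id = row_id            -- stay within the same source row
       AND PRIOR SYS_GUID() IS NOT NULL;    -- non-deterministic, avoids the cycle error
```

Without the PRIOR row_id condition, every row at one level is connected to every row at the next level, which is where the surplus records in the original query come from.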
Answered by Art
I think the DISTINCT may be the problem. Besides, I do not understand why you need CONNECT BY REGEXP_SUBSTR(AGGREGATOR,'[^;]+',1,LEVEL) IS NOT NULL. You are using a regexp in both your select and your connect by. Can you use WHERE AGGREGATOR IS NOT NULL instead of connect by? Find a way to get rid of distinct and revise your query. You can use EXISTS instead of distinct... To help you more, I need tables and data.
SELECT * FROM
(
SELECT REGEXP_SUBSTR(AGGREGATOR ,'[^;]+',1,LEVEL) as AGGREGATOR
FROM your_table
)
WHERE AGGREGATOR IS NOT NULL
/