Oracle 查询以查找字符串中所有出现的字符
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/17906105/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Oracle query to find all occurrences of a charcter in a string
提问by Rajeena Safeer
I have to write an Oracle query in toad to find all the occurrences of a character in a string. For example if I'm searching for R
in the string SSSRNNSRSSR
, it should return positions 4, 8 and 11.
我必须在 toad 中编写一个 Oracle 查询来查找字符串中出现的所有字符。例如,如果我R
在 string 中搜索SSSRNNSRSSR
,它应该返回位置 4、8 和 11。
I am new to Oracle and tried this.
我是 Oracle 的新手并尝试过这个。
select instr(mtr_ctrl_flags, 'R', pos + 1, 1) as pos1
from mer_trans_reject
where pos in ( select instr(mtr_ctrl_flags, 'R', 1, 1) as pos
from mer_trans_reject
);
where mtr_ctrl_flags
is the column name. I'm getting an error indicating that pos
is an invalid identifier.
mtr_ctrl_flags
列名在哪里。我收到一个错误,指出这pos
是一个无效的标识符。
回答by Ben
Extending GolezTrol's answer you can use regular expressions to significantly reduce the number of recursive queries you do:
扩展 GolezTrol 的答案,您可以使用正则表达式来显着减少您执行的递归查询的数量:
select instr('SSSRNNSRSSR','R', 1, level)
from dual
connect by level <= regexp_count('SSSRNNSRSSR', 'R')
REGEXP_COUNT()returns the number of times the pattern matches, in this case the number of times R
exists in SSSRNNSRSSR
. This limits the level of recursion to the exact number you need to.
REGEXP_COUNT()返回模式匹配的次数,在这种情况下,R
存在于 中的次数SSSRNNSRSSR
。这将递归级别限制为您需要的确切数量。
INSTR()simply searches for the index of R in your string. level
is the depth of the recursion but in this case it's also the level thoccurrence of the string as we restricted to the number of recurses required.
INSTR()只是在您的字符串中搜索 R 的索引。level
是递归的深度,但在这种情况下,它也是字符串出现的第 th级,因为我们限制了所需的递归次数。
If the string you're wanting to pick out is more complicated you could go for regular expressions ans REGEXP_INSTR()as opposed to INSTR() but it will be slower (not by much) and it's unnecessary unless required.
如果您要选择的字符串更复杂,您可以使用正则表达式和REGEXP_INSTR()而不是 INSTR() 但它会更慢(不是很多)并且除非需要,否则它是不必要的。
Simple benchmark as requested:
根据要求进行简单基准测试:
The two CONNECT BY solutions would indicate that using REGEXP_COUNT is 20% quicker on a string of this size.
两个 CONNECT BY 解决方案表明在这种大小的字符串上使用 REGEXP_COUNT 快 20%。
SQL> set timing on
SQL>
SQL> -- CONNECT BY with REGEX
SQL> declare
2 type t__num is table of number index by binary_integer;
3 t_num t__num;
4 begin
5 for i in 1 .. 100000 loop
6 select instr('SSSRNNSRSSR','R', 1, level)
7 bulk collect into t_num
8 from dual
9 connect by level <= regexp_count('SSSRNNSRSSR', 'R')
10 ;
11 end loop;
12 end;
13 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:03.94
SQL>
SQL> -- CONNECT BY with filter
SQL> declare
2 type t__num is table of number index by binary_integer;
3 t_num t__num;
4 begin
5 for i in 1 .. 100000 loop
6 select pos
7 bulk collect into t_num
8 from ( select substr('SSSRNNSRSSR', level, 1) as character
9 , level as pos
10 from dual t
11 connect by level <= length('SSSRNNSRSSR') )
12 where character = 'R'
13 ;
14 end loop;
15 end;
16 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:04.80
The pipelined table function is a fair bit slower, though it would be interesting to see how it performs over large strings with lots of matches.
流水线表函数要慢一些,但看看它如何在具有大量匹配项的大字符串上执行会很有趣。
SQL> -- PIPELINED TABLE FUNCTION
SQL> declare
2 type t__num is table of number index by binary_integer;
3 t_num t__num;
4 begin
5 for i in 1 .. 100000 loop
6 select *
7 bulk collect into t_num
8 from table(string_indexes('SSSRNNSRSSR','R'))
9 ;
10 end loop;
11 end;
12 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:06.54
回答by GolezTrol
This is a solution:
这是一个解决方案:
select
pos
from
(select
substr('SSSRNNSRSSR', level, 1) as character,
level as pos
from
dual
connect by
level <= length(t.text))
where
character = 'R'
dual
is a built in table that just returns a single row. Very convenient!
dual
是一个内置表,只返回一行。很方便!
connect by
lets you build recursive queries. This is often used to generate lists from tree-like data (parent/child relations). It allows you to more or less repeat the query in front of it. And you've got special fields, like level
that allows you to check how deeply the recursion went.
connect by
允许您构建递归查询。这通常用于从树状数据(父/子关系)生成列表。它允许您或多或少地重复它前面的查询。而且你有特殊的字段,这样level
你就可以检查递归的深度。
In this case, I use it to split the string to characters and return a row for each character. Using level
, I can repeat the query and get a character until the end of the string is reached.
在这种情况下,我使用它将字符串拆分为字符并为每个字符返回一行。使用level
,我可以重复查询并获取一个字符,直到到达字符串的末尾。
Then it is just a matter of returning the pos
for all rows containing the character 'R'
然后它只是返回pos
包含字符的所有行的问题'R'
回答by Ben
To take up a_horse_with_no_name's challengehere is another answer with a pipelined table function.
在这里接受a_horse_with_no_name 的挑战是使用流水线表函数的另一个答案。
A pipelined function returns an array, which you can query normally. I would expect that over strings with large numbers of matches this will perform better than the recursive query but as with everything test yourself first.
流水线函数返回一个数组,您可以正常查询。我希望在具有大量匹配项的字符串上,这将比递归查询执行得更好,但与所有事情一样,首先要测试自己。
create type num_array as table of number
/
create function string_indexes (
PSource_String in varchar2
, PSearch_String in varchar2
) return num_array pipelined is
begin
for i in 1 .. length(PSource_String) loop
if substr(PSource_String, i, 1) = PSearch_String then
pipe row(i);
end if;
end loop;
return;
end;
/
Then in order to access it:
然后为了访问它:
select *
from table(string_indexes('SSSRNNSRSSR','R'))