How to best split CSV strings in Oracle 9i
Original question: http://stackoverflow.com/questions/1089508/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by Joyce
I want to be able to split CSV strings in Oracle 9i.
I've read the following article: http://www.oappssurd.com/2009/03/string-split-in-oracle.html
But I didn't understand how to make it work. Here are some of my questions about it:
- Would this work in Oracle 9i? If not, why not?
- Is there a better way of splitting CSV strings than the solution presented above?
- Do I need to create a new type? If so, do I need specific privileges for that?
- Can I declare the type within the function?
Accepted answer by Michael Sofaer
Here's a string tokenizer for Oracle that's a little more straightforward than that page, but I have no idea whether it's as fast:
create or replace function splitter_count(str in varchar2, delim in char) return int as
  val int;
begin
  -- Each delimiter grows the string by one character, so the
  -- length difference equals the number of delimiters.
  val := length(replace(str, delim, delim || ' '));
  return val - length(str);
end;
/
create type token_list is varray(100) of varchar2(200);
/
create or replace function tokenize (str varchar2, delim char) return token_list as
  ret token_list;
  target int;
  i int;
  this_delim int;
  last_delim int;
begin
  ret := token_list();
  i := 1;
  last_delim := 0;
  target := splitter_count(str, delim);
  -- One token in front of each delimiter
  while i <= target
  loop
    ret.extend();
    this_delim := instr(str, delim, 1, i);
    ret(i) := substr(str, last_delim + 1, this_delim - last_delim - 1);
    i := i + 1;
    last_delim := this_delim;
  end loop;
  -- Final token after the last delimiter
  ret.extend();
  ret(i) := substr(str, last_delim + 1);
  return ret;
end;
/
You can use it like this:
select tokenize('hi you person', ' ') from dual;
VARCHAR(hi,you,person)
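The returned collection can also be walked from PL/SQL. The following is only a minimal sketch, assuming the splitter_count and tokenize functions and the token_list type above have been created in the current schema; the variable name l_tokens is made up for illustration. Keep in mind that token_list is declared as varray(100) of varchar2(200), so this approach is limited to 100 tokens of at most 200 characters each.

```sql
set serveroutput on
declare
  -- l_tokens is a hypothetical local name, not part of the original answer
  l_tokens token_list;
begin
  l_tokens := tokenize('a,sd,dfg', ',');
  -- count is the number of elements tokenize extended the varray to
  for i in 1 .. l_tokens.count
  loop
    dbms_output.put_line(i || ': ' || l_tokens(i));
  end loop;
end;
/
-- prints:
-- 1: a
-- 2: sd
-- 3: dfg
```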
Answer by Rob van Wijk
Joyce,
Here are three examples:
1) Using dbms_utility.comma_to_table. This is not a general-purpose routine, because the elements must be valid identifiers. With some dirty tricks we can make it work more universally:
SQL> declare
2 cn_non_occuring_prefix constant varchar2(4) := 'zzzz';
3 mystring varchar2(2000):='a:sd:dfg:31456:dasd: :sdfsdf'; -- just an example
4 l_tablen binary_integer;
5 l_tab dbms_utility.uncl_array;
6 begin
7 dbms_utility.comma_to_table
8 ( list => cn_non_occuring_prefix || replace(mystring,':',','||cn_non_occuring_prefix)
9 , tablen => l_tablen
10 , tab => l_tab
11 );
12 for i in 1..l_tablen
13 loop
14 dbms_output.put_line(substr(l_tab(i),1+length(cn_non_occuring_prefix)));
15 end loop;
16 end;
17 /
a
sd
dfg
31456
dasd
sdfsdf
PL/SQL procedure successfully completed.
2) Using SQL's connect by level. If you are on 10g or higher, you can use the connect-by-level approach in combination with regular expressions, like this:
SQL> declare
2 mystring varchar2(2000):='a:sd:dfg:31456:dasd: :sdfsdf'; -- just an example
3 begin
4 for r in
5 ( select regexp_substr(mystring,'[^:]+',1,level) element
6 from dual
7 connect by level <= length(regexp_replace(mystring,'[^:]+')) + 1
8 )
9 loop
10 dbms_output.put_line(r.element);
11 end loop;
12 end;
13 /
a
sd
dfg
31456
dasd
sdfsdf
PL/SQL procedure successfully completed.
3) Again using SQL's connect by level, but now in combination with good old SUBSTR/INSTR, for when you are on version 9, as you are:
SQL> declare
2 mystring varchar2(2000):='a:sd:dfg:31456:dasd: :sdfsdf'; -- just an example
3 begin
4 for r in
5 ( select substr
6 ( str
7 , instr(str,':',1,level) + 1
8 , instr(str,':',1,level+1) - instr(str,':',1,level) - 1
9 ) element
10 from (select ':' || mystring || ':' str from dual)
11 connect by level <= length(str) - length(replace(str,':')) - 1
12 )
13 loop
14 dbms_output.put_line(r.element);
15 end loop;
16 end;
17 /
a
sd
dfg
31456
dasd
sdfsdf
PL/SQL procedure successfully completed.
You can see some more techniques like these in this blog post: http://rwijk.blogspot.com/2007/11/interval-based-row-generation.html
Hope this helps.
Regards, Rob.
To address your comment:
Here is an example of inserting the separated values into a normalized table.
First create the tables:
SQL> create table csv_table (col)
2 as
3 select 'a,sd,dfg,31456,dasd,,sdfsdf' from dual union all
4 select 'a,bb,ccc,dddd' from dual union all
5 select 'zz,yy,' from dual
6 /
Table created.
SQL> create table normalized_table (value varchar2(10))
2 /
Table created.
Because you seem interested in the dbms_utility.comma_to_table approach, I mention it here. However, I certainly do not recommend this variant, because of the identifier quirks and because of the slow row-by-row processing.
SQL> declare
2 cn_non_occuring_prefix constant varchar2(4) := 'zzzz';
3 l_tablen binary_integer;
4 l_tab dbms_utility.uncl_array;
5 begin
6 for r in (select col from csv_table)
7 loop
8 dbms_utility.comma_to_table
9 ( list => cn_non_occuring_prefix || replace(r.col,',',','||cn_non_occuring_prefix)
10 , tablen => l_tablen
11 , tab => l_tab
12 );
13 forall i in 1..l_tablen
14 insert into normalized_table (value)
15 values (substr(l_tab(i),length(cn_non_occuring_prefix)+1))
16 ;
17 end loop;
18 end;
19 /
PL/SQL procedure successfully completed.
SQL> select * from normalized_table
2 /
VALUE
----------
a
sd
dfg
31456
dasd
sdfsdf
a
bb
ccc
dddd
zz
yy
14 rows selected.
I do recommend this single-SQL variant:
SQL> truncate table normalized_table
2 /
Table truncated.
SQL> insert into normalized_table (value)
2 select substr
3 ( col
4 , instr(col,',',1,l) + 1
5 , instr(col,',',1,l+1) - instr(col,',',1,l) - 1
6 )
7 from ( select ',' || col || ',' col from csv_table )
8 , ( select level l from dual connect by level <= 100 )
9 where l <= length(col) - length(replace(col,',')) - 1
10 /
14 rows created.
SQL> select * from normalized_table
2 /
VALUE
----------
a
a
zz
sd
bb
yy
dfg
ccc
31456
dddd
dasd
sdfsdf
14 rows selected.
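The row filter in the single-SQL variant relies on counting delimiters by comparing the string length before and after removing the commas. As a rough illustration only, assuming the csv_table created above (the alias element_count is made up here), the same arithmetic can be shown on its own:

```sql
-- Assumes the csv_table from the example above; element_count is a hypothetical alias.
-- Number of elements = number of commas + 1.
select col
     , length(col) - length(replace(col, ',')) + 1 as element_count
from   csv_table;
-- For the three sample rows the counts are 7, 4 and 3, i.e. 14 in total,
-- which matches the "14 rows created" reported by the insert.
```

Note that replace returns NULL for a string that consists only of commas, so this bare count would need an nvl around length(replace(...)) for such input.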
Regards, Rob.
Answer by Brian
It sounds like you don't want to add schema objects (types, functions). One SQL-only way to parse the delimited text is to 'go crazy' with instr and substr calls.
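DECLARE
  V_CSV_STRING  VARCHAR2(100);
  V_FIRST_NAME  VARCHAR2(100);
  V_LAST_NAME   VARCHAR2(100);
  V_MIDDLE_INIT VARCHAR2(100);
BEGIN
  --Create a test delimited list of first_name, last_name, middle_init
  V_CSV_STRING := 'Brian,Hart,M';
  --A SELECT inside a PL/SQL block needs an INTO clause
  select substr( V_CSV_STRING||',', 1, instr(V_CSV_STRING||',',',')-1 ),
         substr( V_CSV_STRING||',,', instr( V_CSV_STRING||',,', ',') +1,
                 instr( V_CSV_STRING||',,', ',', 1, 2 )-instr(V_CSV_STRING||',,',',')-1 ),
         rtrim(substr( V_CSV_STRING||',,', instr( V_CSV_STRING||',,',',',1,2)+1),',')
    into V_FIRST_NAME, V_LAST_NAME, V_MIDDLE_INIT
    from dual;
  dbms_output.put_line(V_FIRST_NAME || ' / ' || V_LAST_NAME || ' / ' || V_MIDDLE_INIT);
END;
/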
If you're looking to formalize a structure and add the appropriate application code (functions, views, types, etc.), I would take a look at Tom Kyte's writing on this subject.
Answer by Mark Nold
You might want to be a bit clearer about what you want to do; then we can give you a specific answer. Showing some of your code is always helpful :)
If you are using parameters to split a string of CSV numbers (e.g. 1,2,3,4) and then use it in an IN statement, have a look at the function str2tbl() in Question 670922. With a few changes you could change it to a VARCHAR2 or whatever you need.
In the following you could set :sMyCatagories equal to '1,2,3,4'
create or replace type myTableType as table of number;
/
create or replace function str2tbl( p_str in varchar2 ) return myTableType
as
  l_str long default p_str || ',';
  l_n number;
  l_data myTableType := myTableType();
begin
  loop
    l_n := instr( l_str, ',' );
    exit when (nvl(l_n,0) = 0);
    l_data.extend;
    l_data( l_data.count ) := ltrim(rtrim(substr(l_str,1,l_n-1)));
    l_str := substr( l_str, l_n+1 );
  end loop;
  return l_data;
end;
/
and use it in a select statement:
SELECT
  *
FROM
  atable a
WHERE
  a.category in (
    -- TABLE() turns the collection returned by str2tbl into rows
    select column_value from table(
      cast(str2tbl(:sMyCatagories) as myTableType)
    )
  );
This is really only useful if you are using parameters. If you are munging together SQL in your application, then just use a normal IN statement.
SELECT
*
FROM
atable a
WHERE
a.category in (1,2,3,4);
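The VARCHAR2 change mentioned above could look roughly like the following. This is only a sketch: the names myVarcharTableType and str2tbl_vc are made up for illustration and do not come from the original answer or from Question 670922.

```sql
-- Hypothetical names: myVarcharTableType and str2tbl_vc are assumptions, not from the original answer.
create or replace type myVarcharTableType as table of varchar2(4000);
/
create or replace function str2tbl_vc( p_str in varchar2 ) return myVarcharTableType
as
  l_str  varchar2(32767) := p_str || ',';  -- trailing comma terminates the last element
  l_n    pls_integer;
  l_data myVarcharTableType := myVarcharTableType();
begin
  loop
    l_n := instr( l_str, ',' );
    -- instr returns NULL once l_str has been consumed completely
    exit when nvl(l_n, 0) = 0;
    l_data.extend;
    l_data( l_data.count ) := trim(substr( l_str, 1, l_n - 1 ));
    l_str := substr( l_str, l_n + 1 );
  end loop;
  return l_data;
end;
/
select column_value
from   table(cast(str2tbl_vc('red, green,blue') as myVarcharTableType));
-- returns three rows: red, green, blue
```

The only real differences from the numeric version are the element type of the collection and the use of trim instead of an implicit conversion to number.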
Answer by Joyce
I used this in the end:
create or replace function split
(
  p_list varchar2
) return sys.dbms_debug_vc2coll pipelined
is
  l_idx pls_integer;
  l_list varchar2(32767) := p_list;
  l_value varchar2(32767);
begin
  loop
    l_idx := instr(l_list,',');
    if l_idx > 0 then
      pipe row(substr(l_list,1,l_idx-1));
      l_list := substr(l_list,l_idx+length(','));
    else
      pipe row(l_list);
      exit;
    end if;
  end loop;
  return;
end split;
/
declare
  CURSOR c IS
    select occurrence_num, graphics
    from supp
    where graphics is not null and graphics not like ' %';
begin
  FOR r IN c LOOP
    insert into image (photo_id, report_id, filename)
    select image_key_seq.nextval photo_id, r.occurrence_num report_id,
           t.column_value filename
    from table(split(cast(r.graphics as varchar2(1000)))) t
    where t.column_value is not null;
  END LOOP;
end;
/
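To see what the pipelined function produces on its own, it can be queried directly. This is a small sketch that assumes the split function above has been created in the current schema; the alias element is made up for illustration.

```sql
-- Assumes the split function above exists; "element" is a hypothetical alias.
select column_value as element
from   table(split('a,b,,c'));
-- Rows are piped in this order: 'a', 'b', NULL, 'c'
```

The empty element between the two adjacent commas comes back as NULL, which is why the insert above filters on t.column_value is not null.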