How to best split CSV strings in Oracle 9i

Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/1089508/

Time: 2020-09-10 02:00:33  Source: igfitidea

oracle csv tokenize

Asked by Joyce

I want to be able to split CSV strings in Oracle 9i.

I've read the following article: http://www.oappssurd.com/2009/03/string-split-in-oracle.html

But I didn't understand how to make this work. Here are some of my questions about it:

  1. Would this work in Oracle 9i? If not, why not?
  2. Is there a better way of splitting CSV strings than the solution presented above?
  3. Do I need to create a new type? If so, do I need specific privileges for that?
  4. Can I declare the type within the function?

Accepted answer by Michael Sofaer

Here's a string tokenizer for Oracle that's a little more straightforward than that page, but I have no idea if it's as fast:

-- Counts the delimiters in str: padding every delimiter with one space
-- lengthens the string by exactly one character per delimiter.
create or replace function splitter_count(str in varchar2, delim in char) return int as
val int;
begin
  val := length(replace(str, delim, delim || ' '));
  return val - length(str);
end;
/

-- At most 100 tokens of up to 200 characters each.
create type token_list is varray(100) of varchar2(200);
/

CREATE or replace function tokenize (str varchar2, delim char) return token_list as
ret token_list;
target int;
i int;
this_delim int;
last_delim int;
BEGIN
  ret := token_list();
  i := 1;
  last_delim := 0;
  target := splitter_count(str, delim);
  -- One token per delimiter: the text between the previous and the i-th delimiter.
  while i <= target
  loop
    ret.extend();
    this_delim := instr(str, delim, 1, i);
    ret(i):= substr(str, last_delim + 1, this_delim - last_delim -1);
    i := i + 1;
    last_delim := this_delim;
  end loop;
  -- The final token is whatever follows the last delimiter.
  ret.extend();
  ret(i):= substr(str, last_delim + 1);
  return ret;
end;
/

You can use it like this:

select tokenize('hi you person', ' ') from dual;
VARCHAR(hi,you,person)
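
If the tokens are needed as rows rather than as a single collection value, the varray can be unnested with TABLE(). The following is only a sketch, assuming the token_list type and the tokenize function above have already been created in the current schema; the explicit CAST is a precaution for older releases and may not be required.

```sql
-- Assumes token_list and tokenize from the answer above already exist.
-- Each token comes back as one row in the COLUMN_VALUE pseudocolumn.
select column_value as token
  from table(cast(tokenize('hi you person', ' ') as token_list));
```

For this input the query should return three rows, one per word.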

Answer by Rob van Wijk

Joyce,

Here are three examples:

1) Using dbms_utility.comma_to_table. This is not a general-purpose routine, because the elements must be valid identifiers. With some dirty tricks we can make it work more universally:

declare
  cn_non_occuring_prefix constant varchar2(4) := 'zzzz';
  mystring varchar2(2000):='a:sd:dfg:31456:dasd: :sdfsdf'; -- just an example
  l_tablen binary_integer;
  l_tab    dbms_utility.uncl_array;
begin
  dbms_utility.comma_to_table
  ( list   => cn_non_occuring_prefix || replace(mystring,':',','||cn_non_occuring_prefix)
  , tablen => l_tablen
  , tab    => l_tab
  );
  for i in 1..l_tablen
  loop
    dbms_output.put_line(substr(l_tab(i),1+length(cn_non_occuring_prefix)));
  end loop;
end;
/
a
sd
dfg
31456
dasd

sdfsdf

PL/SQL procedure successfully completed.
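
To see why the prefix trick is needed, one can call comma_to_table without it. This is a sketch rather than a tested script: the list 'a,sd,31456' is a made-up example, and the exact error text depends on the Oracle version, so the handler simply prints whatever is raised.

```sql
declare
  l_tablen binary_integer;
  l_tab    dbms_utility.uncl_array;
begin
  -- '31456' is not a valid identifier (it starts with a digit),
  -- so this call is expected to raise an error instead of returning 3 elements.
  dbms_utility.comma_to_table
  ( list   => 'a,sd,31456'
  , tablen => l_tablen
  , tab    => l_tab
  );
  dbms_output.put_line('parsed ' || l_tablen || ' elements');
exception
  when others then
    dbms_output.put_line('comma_to_table failed: ' || sqlerrm);
end;
/
```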

2) Using SQL's connect by level. If you are on 10g or higher you can use the connect-by-level approach in combination with regular expressions, like this:

declare
  mystring varchar2(2000):='a:sd:dfg:31456:dasd: :sdfsdf'; -- just an example
begin
  for r in
  ( select regexp_substr(mystring,'[^:]+',1,level) element
      from dual
   connect by level <= length(regexp_replace(mystring,'[^:]+')) + 1
  )
  loop
    dbms_output.put_line(r.element);
  end loop;
end;
/
a
sd
dfg
31456
dasd

sdfsdf

PL/SQL procedure successfully completed.
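
One caveat with the pattern '[^:]+' is that it only matches non-empty runs of characters, so a truly empty element (two adjacent delimiters) is skipped and the elements after it shift up by one position. The example string above avoids this because its sixth element is a single space. A small illustration on 10g or higher, using the made-up input 'a::b':

```sql
-- 'a::b' has three elements: 'a', an empty one, and 'b'.
-- regexp_substr finds only two non-empty matches, so the rows come back as
-- 'a', 'b', NULL instead of 'a', NULL, 'b'.
select level as position
     , regexp_substr('a::b','[^:]+',1,level) as element
  from dual
connect by level <= length(regexp_replace('a::b','[^:]+')) + 1;
```

The SUBSTR/INSTR variant in example 3 works from delimiter positions and so keeps empty elements in their place.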

3) Again using SQL's connect by level, but now in combination with good old SUBSTR/INSTR in case you are on version 9, like you are:

declare
  mystring varchar2(2000):='a:sd:dfg:31456:dasd: :sdfsdf'; -- just an example
begin
  for r in
  ( select substr
           ( str
           , instr(str,':',1,level) + 1
           , instr(str,':',1,level+1) - instr(str,':',1,level) - 1
           ) element
      from (select ':' || mystring || ':' str from dual)
   connect by level <= length(str) - length(replace(str,':')) - 1
  )
  loop
    dbms_output.put_line(r.element);
  end loop;
end;
/
a
sd
dfg
31456
dasd

sdfsdf

PL/SQL procedure successfully completed.

You can see some more techniques like these in this blog post: http://rwijk.blogspot.com/2007/11/interval-based-row-generation.html

Hope this helps.

Regards, Rob.

To address your comment:

An example of inserting the separated values into a normalized table.

First create the tables:

create table csv_table (col)
as
select 'a,sd,dfg,31456,dasd,,sdfsdf' from dual union all
select 'a,bb,ccc,dddd' from dual union all
select 'zz,yy,' from dual
/

Table created.

create table normalized_table (value varchar2(10))
/

Table created.

Because you seem interested in the dbms_utility.comma_to_table approach, I mention it here. However, I certainly do not recommend this variant, because of the identifier quirks and because of the slow row by row processing.

declare
  cn_non_occuring_prefix constant varchar2(4) := 'zzzz';
  l_tablen binary_integer;
  l_tab    dbms_utility.uncl_array;
begin
  for r in (select col from csv_table)
  loop
    dbms_utility.comma_to_table
    ( list   => cn_non_occuring_prefix || replace(r.col,',',','||cn_non_occuring_prefix)
    , tablen => l_tablen
    , tab    => l_tab
    );
    forall i in 1..l_tablen
      insert into normalized_table (value)
      values (substr(l_tab(i),length(cn_non_occuring_prefix)+1))
    ;
  end loop;
end;
/

PL/SQL procedure successfully completed.

select * from normalized_table
/

VALUE
----------
a
sd
dfg
31456
dasd

sdfsdf
a
bb
ccc
dddd
zz
yy


14 rows selected.

I do recommend this single SQL variant:

truncate table normalized_table
/

Table truncated.

insert into normalized_table (value)
 select substr
        ( col
        , instr(col,',',1,l) + 1
        , instr(col,',',1,l+1) - instr(col,',',1,l) - 1
        )
   from ( select ',' || col || ',' col from csv_table )
      , ( select level l from dual connect by level <= 100 )
  where l <= length(col) - length(replace(col,',')) - 1
/

14 rows created.

select * from normalized_table
/

VALUE
----------
a
a
zz
sd
bb
yy
dfg
ccc

31456
dddd
dasd

sdfsdf

14 rows selected.
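
Note that the row generator in this statement is capped at 100, so a string with more than 100 elements would lose its tail without any error. A quick way to check whether the cap is high enough for the data at hand is to count the delimiters first. This is a sketch against the csv_table defined above:

```sql
-- Elements per row = number of commas + 1.
-- nvl covers a row consisting only of commas, where replace() returns NULL.
select max(length(col) - nvl(length(replace(col,',')),0) + 1) as max_elements
  from csv_table;
```

For the three sample rows this gives 7, well below the limit.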

Regards, Rob.

Answer by Brian

It sounds like you don't want to add schema objects (types, functions). One SQL-only way to parse the delimited text is to 'go crazy' with instr and substr calls.

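    DECLARE
      V_CSV_STRING  VARCHAR2(100);
      V_FIRST_NAME  VARCHAR2(100);
      V_LAST_NAME   VARCHAR2(100);
      V_MIDDLE_INIT VARCHAR2(100);
    BEGIN
      --Create a test delimited list of first_name, last_name, middle_init
      V_CSV_STRING := 'Brian,Hart,M';

      -- A SELECT inside PL/SQL needs an INTO clause; the appended commas keep
      -- instr from returning 0 when trailing fields are missing.
      select substr( V_CSV_STRING||',', 1, instr(V_CSV_STRING||',',',')-1 ),
             substr( V_CSV_STRING||',,', instr( V_CSV_STRING||',,', ',') +1,
                     instr( V_CSV_STRING||',,', ',', 1, 2 )-instr(V_CSV_STRING||',,',',')-1 ),
             rtrim(substr( V_CSV_STRING||',,', instr( V_CSV_STRING||',,',',',1,2)+1),',')
        into V_FIRST_NAME, V_LAST_NAME, V_MIDDLE_INIT
        from dual;

      dbms_output.put_line(V_FIRST_NAME || ' ' || V_MIDDLE_INIT || ' ' || V_LAST_NAME);
    END;
    /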

If you're looking to formalize a structure and add the appropriate application code (functions, views, types, etc.), I would take a look at Tom Kyte's writing on this subject.

Answer by Mark Nold

You might want to be a bit clearer on what you want to do, then we can give you a specific answer. Showing some of your code is always helpful :)

If you are using parameters to split a string of CSV numbers (e.g. 1,2,3,4) and then use that in an IN statement, have a look at the function str2tbl() in Question 670922. With a few changes you could change it to a VARCHAR2 or whatever you need.

In the following you could set :sMyCatagories equal to '1,2,3,4':

create or replace type myTableType as table of number;

create or replace function str2tbl( p_str in varchar2 ) return myTableType
  as
     l_str   long default p_str || ',';
     l_n        number;
     l_data    myTableType := myTabletype();
  begin
      loop
          l_n := instr( l_str, ',' );
          exit when (nvl(l_n,0) = 0);
          l_data.extend;
          l_data( l_data.count ) := ltrim(rtrim(substr(l_str,1,l_n-1)));
          l_str := substr( l_str, l_n+1 );
      end loop;
      return l_data;
  end;
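
Before wiring the function into a larger query, it can help to check it on its own. A minimal sketch, assuming myTableType and str2tbl have been compiled as shown; the input string is a made-up example:

```sql
-- The ltrim/rtrim inside str2tbl strips the blanks around each number,
-- so '1, 2, 3,4' should come back as four numeric rows.
select column_value
  from table(cast(str2tbl('1, 2, 3,4') as myTableType));
```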

and using it in a select statement:

SELECT 
  *
FROM
  atable a 
WHERE 
  a.category in (
        -- TABLE() exposes the collection returned by str2tbl as rows
        select column_value from table (
           cast(str2tbl(:sMyCatagories) as myTableType)
        ) 
  );

This is really only useful if you are using parameters. If you are munging together SQL in your application, then just use a normal IN statement.

SELECT 
  *
FROM
  atable a 
WHERE 
  a.category in (1,2,3,4);

Answer by Joyce

I used this in the end:

create or replace function split
(
   p_list varchar2

) return sys.dbms_debug_vc2coll pipelined
is
   l_idx    pls_integer;
   l_list    varchar2(32767) := p_list;
   l_value    varchar2(32767);
begin
   loop
       l_idx := instr(l_list,',');
       if l_idx > 0 then
           pipe row(substr(l_list,1,l_idx-1));
           l_list := substr(l_list,l_idx+length(','));

       else
           pipe row(l_list);
           exit;
       end if;
   end loop;
   return;
end split;


declare
  CURSOR c IS
    select occurrence_num, graphics
      from supp
     where graphics is not null and graphics not like ' %';
begin
  FOR r IN c LOOP
    insert into image (photo_id, report_id, filename)
    select image_key_seq.nextval photo_id, r.occurrence_num report_id, t.column_value filename
      from table(split(cast(r.graphics as varchar2(1000)))) t
     where t.column_value is not null;
  END LOOP;
end;
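
The pipelined function can also be queried directly, which is a convenient way to see how it treats empty elements. A small sketch, assuming the split function above has been compiled; the input string is a made-up example:

```sql
-- 'a,b,,c' yields four rows: 'a', 'b', NULL and 'c'.
-- The NULL row for the empty element is why the insert above filters on
-- "t.column_value is not null".
select column_value
  from table(split('a,b,,c'));
```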