SQL：如何将 Oracle 表中的 100,000 条记录拆分为 5 个块？

Question

提问by Shaun Kinnair

I'm trying to figure out away to split the first 100,000 records from a table that has 1 million+ records into 5 (five) 20,000 records chunks to go into a file? Maybe some SQL that will get the min and max rowid or primary id for each 5 chunks of 20,000 records, so I can put the min and max value into a variable and pass it into the SQL and use a BETWEEN in the where clause to the SQL.

我试图找出将具有 100 万多条记录的表中的前 100,000 条记录拆分为 5（五）条 20,000 条记录块以放入一个文件？也许某些 SQL 会为每 5 个 20,000 条记录块获取最小和最大 rowid 或主 ID，因此我可以将最小和最大值放入变量并将其传递到 SQL 并在 where 子句中使用 BETWEEN 到SQL。

Can this be done?

这能做到吗？

I'm on an Oracle 11g database.

我在 Oracle 11g 数据库上。

Thanks in advance.

提前致谢。

Answer 1

回答by Gordon Linoff

If you just want to assign values 1-5 to basically equal sized groups, then use ntile():

如果您只想将值 1-5 分配给大小基本相同的组，请使用ntile()：

select t.*, ntile(5) over (order by NULL) as num
from (select t.*
      from t
      where rownum <= 100000
     ) t;

If you want to insert into 5 different tables, then use insert all:

如果要插入 5 个不同的表，请使用insert all：

insert all
    when num = 1 then into t1
    when num = 2 then into t2
    when num = 3 then into t3
    when num = 4 then into t4
    when num = 5 then into t5
    select t.*, ntile(5) over (order by NULL) as num
    from (select t.*
          from t
          where rownum <= 100000
         ) t;

Answer 2

回答by Shaun Kinnair

Thanks so much to Gordon Linoff for giving me a starter to the code.

非常感谢Gordon Linoff 为我提供了代码入门。

just an update on how to get the min and max values for 5 chunks.

只是关于如何获取 5 个块的最小值和最大值的更新。

select num, min(cre_surr_id), max(cre_surr_id)
from
(select p.cre_surr_id, ntile(5) over (order by NULL) as num
from (select p.*
      from productions p
      where rownum <= 100000
 ) p )
group by num

Answer 3

回答by Aleksej

You can even try with simple aggregation:

您甚至可以尝试使用简单的聚合：

create table test_chunk(val) as
(
    select floor(dbms_random.value(1, level * 10)) from dual
    connect by level <= 100
)

select min(val), max(val), floor((num+1)/2)
from (select rownum as num, val from test_chunk)
group by floor((num+1)/2)

Answer 4

回答by TenG

A bit harsh down voting another fair question.

投票另一个公平的问题有点苛刻。

Anyway, NTILE is new to me, so I wouldn't have discovered that were it not for your question.

无论如何，NTILE 对我来说是新的，所以如果不是你的问题，我不会发现它。

My way of doing this , the old school way, would have been to MOD the rownum to get the group number, e.g.

我这样做的方式，老派的方式，本来是修改 rownum 来获取组号，例如

select t.*, mod(rn,5) as num
from (select t.*, rownnum rn
      from t
     ) t;

This solves the SQL part, or rather how to group rows into equal chunks, but that is only half your question. The next half is how to write these to 5 separate files.

这解决了 SQL 部分，或者更确切地说如何将行分组为相等的块，但这只是您问题的一半。下半部分是如何将这些写入 5 个单独的文件。

You can either have 5 separate queries each spooling to a separate file, e.g:

您可以有 5 个单独的查询，每个查询都假脱机到一个单独的文件，例如：

spool f1.dat
    select t.*
    from (select t.*, rownnum rn
          from t
         ) t
    where mod(t.rn,5) = 0;
spool off

spool f2.dat
    select t.*
    from (select t.*, rownnum rn
          from t
         ) t
    where mod(t.rn,5) = 1;
spool off

etc.

等等。

Or, using UTL_FILE. You could try something clever with a single query and have an array of UTL_FILE types where the array index matches the MOD(rn,5) then you wouldn't need logic like "IF rn = 0 THEN UTL_FILE.WRITELN(f0, ...".

或者，使用 UTL_FILE。您可以使用单个查询尝试一些巧妙的方法，并拥有一个 UTL_FILE 类型的数组，其中数组索引与 MOD(rn,5) 匹配，那么您就不需要像“IF rn = 0 THEN UTL_FILE.WRITELN(f0, .. .”。

So, something like (not tested, just in a rough form for guidance, never tried this myself):

所以，像（未经测试，只是粗略的指导形式，我自己从未尝试过）：

DECLARE
   TYPE fname IS VARRAY(5) OF VARCHAR2(100);
   TYPE fh    IS VARRAY(5) OF UTL_FILE.FILE_TYPE;
   CURSOR c1 IS 
    select t.*, mod(rn,5) as num
    from (select t.*, rownnum rn
          from t
         ) t;
   idx INTEGER;
BEGIN
  FOR idx IN 1..5 LOOP
      fname(idx) := 'data_' || idx || '.dat';
      fh(idx) := UTL_FILE.'THE_DIR', fname(idx), 'w');
  END LOOP;
  FOR r1 IN c1 LOOP
     UTL_FILE.PUT_LINE ( fh(r1.num+1), r1.{column value from C1} );
  END LOOP;
  FOR idx IN 1..5 LOOP
      UTL_FILE.FCLOSE (fh(idx));
  END LOOP;
END;

SQL：如何将 Oracle 表中的 100,000 条记录拆分为 5 个块？

提问by Shaun Kinnair

回答by Gordon Linoff

回答by Shaun Kinnair

回答by Aleksej

回答by TenG

相关推荐

最近更新

标签

SQL：如何将 Oracle 表中的 100,000 条记录拆分为 5 个块？

提问by Shaun Kinnair

回答by Gordon Linoff

回答by Shaun Kinnair

回答by Aleksej

回答by TenG

相关推荐

在 sys.servers 中找不到服务器“服务器名称”。SQL Server 2014

在 Powershell 中使用 select SQL 语句检索数据

SQL 遍历表，对每一行进行计算

SQL 与 FOREIGN KEY 约束冲突

相关推荐

最近更新

标签