Selecting a subset of records from a very large record set in Oracle runs out of memory
Original question: http://stackoverflow.com/questions/4181362/
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): StackOverFlow
Asked by Cyntech
I have a process that is converting dates from GMT to Australian Eastern Standard Time. To do this, I need to select the records from the database, process them and then save them back.
To select the records, I have the following query:
    SELECT id,
           user_id,
           event_date,
           event,
           resource_id,
           resource_name
      FROM (SELECT rowid id,
                   rownum r,
                   user_id,
                   event_date,
                   event,
                   resource_id,
                   resource_name
              FROM user_activity
             ORDER BY rowid)
     WHERE r BETWEEN 0 AND 50000
to select a block of 50000 rows from a total of approx. 60 million rows. I am splitting them up because a) Java (which the update process is written in) runs out of memory with too many rows (I have a bean object for each row) and b) I only have 4 GB of Oracle temp space to play with.
In the process, I use the rowid to update the record (so I have a unique value) and the rownum to select the blocks. I then call this query in iterations, selecting the next 50000 records until none remain (the java program controls this).
The problem I'm getting is that I'm still running out of Oracle temp space with this query. My DBA has told me that more temp space cannot be granted, so another method must be found.
I've tried replacing the subquery (which I presume is what uses all the temp space, because of the sort) with a view, but the explain plan using the view is identical to that of the original query.
Is there a different/better way to achieve this without running into the memory/temp space problems? I'm assuming an update query to update the dates (as opposed to a Java program) would suffer from the same problem with the temp space available?
Your assistance on this is greatly appreciated.
Update
I went down the path of the pl/sql block as suggested below:
    declare
      cursor c is select event_date from user_activity for update;
    begin
      for t_row in c loop
        update user_activity
           set event_date = t_row.event_date + 10/24 where current of c;
        commit;
      end loop;
    end;
However, I'm running out of undo space. I was under the impression that if the commit was made after each update, then the need for undo space is minimal. Am I incorrect in this assumption?
Answered by Jon Heller
A single update probably would not suffer from the same issue, and would probably be orders of magnitude faster. The large amount of temp tablespace is only needed because of the sorting. Although if your DBA is so stingy with the temp tablespace you may end up running out of UNDO space or something else. (Take a look at ALL_SEGMENTS, how large is your table?)
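To check how large the table is, a query along these lines may help. It is only a sketch: it assumes the table lives in the current schema, so it reads USER_SEGMENTS. For a table owned by another schema, DBA_SEGMENTS with an OWNER filter would be the usual alternative, given the privileges.

```sql
-- Sketch only: assumes USER_ACTIVITY is owned by the connected user.
-- A partitioned table has one segment per partition, hence the SUM.
SELECT segment_name,
       ROUND(SUM(bytes) / 1024 / 1024) AS size_mb
  FROM user_segments
 WHERE segment_name = 'USER_ACTIVITY'
 GROUP BY segment_name;
```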
But if you really must use this method, maybe you can use a filter instead of an order by. Create 1200 buckets and process them one at a time:
where ora_hash(rowid, 1200) = 1
where ora_hash(rowid, 1200) = 2
...
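The bucket idea could be driven from a PL/SQL loop, roughly as sketched here. Assumptions: event_date is a DATE column, the shift is the fixed 10 hours used in the question, and a commit per bucket is acceptable. ORA_HASH(expr, n) returns values from 0 through n, so this sketch passes 1199 to get exactly 1200 buckets, numbered from 0.

```sql
-- Sketch only: update the table one hash bucket at a time.
BEGIN
  FOR bucket IN 0 .. 1199 LOOP
    UPDATE user_activity
       SET event_date = event_date + 10/24
     WHERE ORA_HASH(rowid, 1199) = bucket;  -- each pass still scans the whole table
    COMMIT;  -- one transaction per bucket, to limit the undo held at any one time
  END LOOP;
END;
/
```

If the loop fails partway through, some buckets are already shifted and others are not, so the last completed bucket number would have to be recorded somewhere before restarting.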
But this will be horribly, horribly slow. And what happens if a value changes halfway through the process? A single SQL statement is almost certainly the best way to do this.
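For comparison, the single-statement version might look like this. It is a sketch under the same assumptions as the question's PL/SQL block: event_date is a DATE column and the conversion is a fixed shift of 10 hours (10/24 of a day), with no daylight saving adjustment.

```sql
-- Sketch only: shift every row in one statement; no sort, so no large temp usage.
UPDATE user_activity
   SET event_date = event_date + 10/24;

COMMIT;
```

This still generates undo for every changed row until the commit, so undo space remains a possible limit on a table of this size.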
Answered by xt.and.r
Why not just a single update or merge? Alternatively, you can write an anonymous PL/SQL block that processes the data with a cursor. For example:
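    declare
      cursor c is select * from aa for update;
    begin
      for t_row in c loop
        -- restrict the update to the row the cursor is currently on;
        -- without this clause every iteration would update all rows
        update aa
           set val = t_row.val || ' new value'
         where current of c;
      end loop;
      commit;
    end;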
Answered by Tony Andrews
How about not updating it at all?
    rename user_activity to user_activity_gmt;

    create view user_activity as
    select id,
           user_id,
           event_date + 10/24 as event_date,
           event,
           resource_id,
           resource_name
      from user_activity_gmt;