SQL Oracle 简单的更新语句,性能 POOR
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/8586773/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Oracle simple update statement, POOR performance
提问by Ehab Al-Hakawati
I have a table with a big data (around 10 million record), so that the simplest update statement take for ever.
我有一个包含大数据的表(大约 1000 万条记录),因此最简单的更新语句永远存在。
For example:
例如:
update mesg
set archived = 1
, last_update = SYSDATE
where id = 0
and crea_date_time < '07/27/2011 13:53:36'
and archived = 0;
This statement takes around 3 Hours. although we have index on id, and composite index on crea_date_time, and there is no triggers.
此语句大约需要 3 小时。虽然我们在 id 上有索引,在 crea_date_time 上有复合索引,但没有触发器。
Is there any enhancement I can do to increase the performance.
我可以做任何改进来提高性能。
I tried to add index on archive but with no effect.
我试图在存档上添加索引但没有效果。
here is some extra information.
这是一些额外的信息。
CREATE TABLE "MESG"
(
"ID" NUMBER(3,0) NOT NULL ENABLE,
"UMIDL" NUMBER(10,0) NOT NULL ENABLE,
"UMIDH" NUMBER(10,0) NOT NULL ENABLE,
.
.
.
"ARCHIVED" NUMBER(1,0) NOT NULL ENABLE,
"LAST_UPDATE" DATE,
"CREA_DATE_TIME" DATE NOT NULL ENABLE,
.
.
.
CONSTRAINT "PK_RMESG" PRIMARY KEY ("AID", "UMIDH", "UMIDL") USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS STORAGE(INITIAL 524288 NEXT 524288 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT) TABLESPACE "XXXX_IDX" ENABLE
)
SEGMENT CREATION IMMEDIATE PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING STORAGE
(
INITIAL 524288 NEXT 524288 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT
)
TABLESPACE "XXXX_MESG" ;
The indexes:
指标:
CREATE INDEX "E_RCREATIONDATE" ON "RMESG"
(
"CREA_DATE_TIME"
)
PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS STORAGE
(
INITIAL 524288 NEXT 524288 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT
)
TABLESPACE "XXXX_IDX" ;
CREATE UNIQUE INDEX "PK_RMESG" ON "RMESG"
(
"ID", "UMIDH", "UMIDL"
)
PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS STORAGE
(
INITIAL 524288 NEXT 524288 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT
)
TABLESPACE "XXXX_IDX" ;
And The query plan is for my local machine with 180K records
查询计划适用于我的本地机器,有 180K 条记录
ID PID Operation Name Rows Bytes Cost CPU Cost IO Cost Temp space IN-OUT PQ Dist PStart PStop
0 UPDATE STATEMENT 1 44 877 6245703 877
1 0 UPDATE MESG
2 1 TABLE ACCESS BY INDEX ROWID MESG 1 44 877 6245703 877
3 2 INDEX RANGE SCAN IX_MESG_CREATIONDATE 158K 877 6245703 877
回答by René Nyffenegger
I assume that archived is a pseudo-boolean that can be either 0 or 1. In such a case the optimizer might choose to ignore the index.
我假设 archived 是一个伪布尔值,可以是 0 或 1。在这种情况下,优化器可能会选择忽略索引。
In order to verify if the index is used you might try
为了验证是否使用了索引,您可以尝试
explain plan for
update mesg set archived = :a,
last_update = :b
where id = :c and
crea_date_time < :d and
archived = :e;
and then
进而
select * from table(dbms_xplan.display);
回答by Florin Ghita
If you update many rows, indexes will not help you. They only improve acces to data.
如果您更新许多行,索引将无济于事。它们只会改进对数据的访问。
What is slow for you is the UPDATE itself.
对你来说慢的是 UPDATE 本身。
A. Is this table read/updated heavy by users?
A. 该表是否被用户大量读取/更新?
If not, you can try to recreate the table with new data.
如果没有,您可以尝试使用新数据重新创建表。
create table copy_table as
select case
when s.archived = 0 and s.crea_date_time < '07/27/2011 13:53:36' and s.id = 0
then 1 else s.archived as archived,
when s.archived = 0 and s.crea_date_time < '07/27/2011 13:53:36' and s.id = 0
then sysdate else s.last_update_date as last_update_date,
id,
other_columns
from mesg;
rename mesg to mesg_old;
rename copy_table to mesg;
B. Another ideea/help is, if you have license, to partition your mesg table on crea_date_time (I assume it is date
type).
In this case, your update, will not scan entire table, but more important, probably you'll never need to mark as archive things. Old partitions - old data.
B. 另一个想法/帮助是,如果你有许可证,在 crea_date_time 上对你的 mesg 表进行分区(我假设它是date
类型)。在这种情况下,您的更新不会扫描整个表,但更重要的是,您可能永远不需要将内容标记为存档。旧分区 - 旧数据。
C. The index on CREA_DATE_TIME is slowing your update. If it is not absolutley necessary, drop it.
C. CREA_DATE_TIME 上的索引正在减慢您的更新。如果它不是绝对必要的,就放弃它。
回答by HLGEM
First do you really mean to run:
首先你真的想运行:
update mesg
set archived = 1
, last_update = SYSDATE
where id = 0 and crea_date_time < '07/27/2011 13:53:36'
and archived <>1
( is <> the right code for ORacle? Or does is use !=? or maybe where archived is null or where archived = 0 depanding on how you store the data)
(<> 是适用于 ORacle 的正确代码吗?还是使用 !=?或者归档为 null 或归档 = 0 的位置取决于您如何存储数据)
Right now you are changing the last_update date and the archived field for all the records already archived. So you could be updating millions of records that are already archived. So rather than update the 120,000 records that need to be archived since teh last time you did it you are updating 35,000,000 records most of which are already archived. Can make a big difference in performance to only update records that need to be updated.
现在,您正在更改所有已存档记录的 last_update 日期和存档字段。因此,您可能正在更新数百万条已存档的记录。因此,与其更新自上次执行以来需要存档的 120,000 条记录,不如更新 35,000,000 条记录,其中大部分已存档。仅更新需要更新的记录可以对性能产生很大的影响。
Next I don't know about Oracle, but sometimes it is faster to run a large update/insert/delete in batches in SQl Server. So have you tried looping through 1000 (or 50000 you may have to test to see what works) at a time? This can reduce a lot of contention on the table and make things work faster.
接下来我不知道Oracle,但有时在SQl Server中批量运行大型更新/插入/删除会更快。那么您是否尝试过一次循环 1000(或 50000,您可能需要测试以查看哪些有效)?这可以减少桌面上的大量争用,并使工作更快。
回答by Rob van Laarhoven
Indexes are not the problem. It's the updating.
索引不是问题。是更新。
I think you'll have to dive into wait analysis, dictionary table v$session_waits. If you have enterprise manager database control you can use the performance tools to see what is causing the delay. My guess is that is has something to do with IO performance of the redo logs OR as others mentioned a locking issue.
我认为您必须深入研究等待分析,字典表 v$session_waits。如果您拥有企业管理器数据库控制,您可以使用性能工具查看导致延迟的原因。我的猜测是这与重做日志的 IO 性能或其他人提到的锁定问题有关。
To start with:
首先:
select
seq#, event, p1, p2, p3
from
v$session_wait_history
where
sid = <yoursid>
order by seq#
;
Google on oracle wait analysis, you'll find lot's of material.
谷歌上 oracle 等待分析,你会发现很多材料。
回答by Mayur Sawant
You should go for table partitioning and index partitioning. Performance will increase
您应该进行表分区和索引分区。性能会提高
回答by dani herrera
Three hours is too long, even for the amount of data.
三个小时太长了,即使对于数据量也是如此。
You can try to modify query to avoid type casting, apply to_data function to your date string (convert this '07/27/2011 13:53:36' to date)
您可以尝试修改查询以避免类型转换,将 to_data 函数应用于您的日期字符串(将此 '07/27/2011 13:53:36' 转换为日期)
But the problem should be locks, you can check for locks with Killing an oracle session to remove a lockscript.
但是问题应该是 locks,您可以使用Killing an oracle session 来检查 locks以删除 lockscript。
Also, you can split your big query in small updates, some thinkl like:
此外,您可以将大查询拆分为小更新,例如:
for each year:
begin transaction
update statement where year(date) = @year and other conditions
end transaction