In SQL, is UPDATE always faster than DELETE+INSERT?
Source: http://stackoverflow.com/questions/1271641/
License: this page reproduces a Stack Overflow question and its answers under CC BY-SA 4.0. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by Roee Adler
Say I have a simple table that has the following fields:
- ID: int, autoincremental (identity), primary key
- Name: varchar(50), unique, has unique index
- Tag: int
I never use the ID field for lookup, because my application is always based on working with the Name field.
I need to change the Tag value from time to time. I'm using the following trivial SQL code:
UPDATE Table SET Tag = XX WHERE Name = YY;
I wondered if anyone knows whether the above is always faster than:
DELETE FROM Table WHERE Name = YY;
INSERT INTO Table (Name, Tag) VALUES (YY, XX);
Again - I know that in the second example the ID is changed, but it does not matter for my application.
Answered by Dyptorden
A bit too late with this answer, but since I faced a similar question, I ran a test with JMeter and a MySQL server on the same machine, using:
- A Transaction Controller (generating a parent sample) that contained two JDBC Requests: a Delete and an Insert statement
- A separate JDBC Request containing the Update statement.
After running the test for 500 loops, I obtained the following results:
DEL + INSERT - Average: 62ms
Update - Average: 30ms
Answered by APC
The bigger the table (number and size of columns), the more expensive it becomes to delete and insert rather than update, because you have to pay the price of UNDO and REDO. DELETEs consume more UNDO space than UPDATEs, and your REDO contains twice as many statements as are necessary.
Besides, it is plain wrong from a business point of view. Consider how much harder it would be to understand a notional audit trail on that table.
There are some scenarios involving bulk updates of all the rows in a table where it is faster to create a new table using CTAS (CREATE TABLE AS SELECT) from the old table, applying the update in the projection of the SELECT clause, then drop the old table and rename the new one. The side effects are recreating indexes, managing constraints, and renewing privileges, but it is worth considering.
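The CTAS approach described above can be sketched roughly as follows. This is only an illustration using Python's built-in sqlite3 module instead of Oracle; the table name `items`, the index name `items_name_uq`, and the `Tag + 1` change are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table modelled on the question's schema.
cur.execute("CREATE TABLE items (ID INTEGER PRIMARY KEY, Name TEXT UNIQUE, Tag INTEGER)")
cur.executemany(
    "INSERT INTO items (Name, Tag) VALUES (?, ?)",
    [(f"name{i}", i) for i in range(1000)],
)

# CTAS: the "update" is applied in the projection of the SELECT.
cur.execute("CREATE TABLE items_new AS SELECT ID, Name, Tag + 1 AS Tag FROM items")

# Swap the tables, then recreate what CTAS does not carry over
# (here only the unique index on Name; in SQLite the primary key is lost too).
cur.execute("DROP TABLE items")
cur.execute("ALTER TABLE items_new RENAME TO items")
cur.execute("CREATE UNIQUE INDEX items_name_uq ON items (Name)")
conn.commit()

row_count = cur.execute("SELECT COUNT(*) FROM items").fetchone()[0]
first_tag = cur.execute("SELECT Tag FROM items WHERE Name = 'name0'").fetchone()[0]
print(row_count, first_tag)
```

Whether this beats an in-place UPDATE depends on the engine and on how expensive it is to rebuild indexes, constraints, and privileges, so it should be measured rather than assumed.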
Answered by KM.
One command on the same row should always be faster than two on that same row, so the UPDATE alone would be better.
EDIT: set up the table:
create table YourTable
(YourName varchar(50) primary key
,Tag int
)
insert into YourTable values ('first value',1)
Run this, which takes 1 second on my system (SQL Server 2005):
SET NOCOUNT ON
declare @x int
declare @y int
select @x=0,@y=0
UPDATE YourTable set YourName='new name'
while @x<10000
begin
Set @x=@x+1
update YourTable set YourName='new name' where YourName='new name'
SET @y=@y+@@ROWCOUNT
end
print @y
Run this, which took 2 seconds on my system:
SET NOCOUNT ON
declare @x int
declare @y int
select @x=0,@y=0
while @x<10000
begin
Set @x=@x+1
DELETE YourTable WHERE YourName='new name'
insert into YourTable values ('new name',1)
SET @y=@y+@@ROWCOUNT
end
print @y
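For readers without SQL Server at hand, a similar comparison can be sketched with Python's built-in sqlite3 module. This is not the test above: it runs against an in-memory SQLite database, it updates the Tag column (as in the question) rather than the key, and the helper names `setup`, `time_update`, and `time_delete_insert` are made up for this sketch. Timings will differ by engine and machine.

```python
import sqlite3
import time

def setup():
    # Same shape as YourTable above, in an in-memory SQLite database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE YourTable (YourName TEXT PRIMARY KEY, Tag INTEGER)")
    conn.execute("INSERT INTO YourTable VALUES ('first value', 1)")
    conn.commit()
    return conn

def time_update(conn, loops):
    start = time.perf_counter()
    for i in range(loops):
        conn.execute("UPDATE YourTable SET Tag = ? WHERE YourName = 'first value'", (i,))
    conn.commit()
    return time.perf_counter() - start

def time_delete_insert(conn, loops):
    start = time.perf_counter()
    for i in range(loops):
        conn.execute("DELETE FROM YourTable WHERE YourName = 'first value'")
        conn.execute("INSERT INTO YourTable VALUES ('first value', ?)", (i,))
    conn.commit()
    return time.perf_counter() - start

LOOPS = 10000
update_conn = setup()
delete_insert_conn = setup()
update_seconds = time_update(update_conn, LOOPS)
delete_insert_seconds = time_delete_insert(delete_insert_conn, LOOPS)
print(f"UPDATE:        {update_seconds:.3f}s")
print(f"DELETE+INSERT: {delete_insert_seconds:.3f}s")
```

Both loops leave the table in the same final state, so any difference in the printed times comes from how the work was done, not from what was stored.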
Answered by Gennady Vanin Геннадий Ванин
I am afraid the body of your question is unrelated to the title question.
To answer the title:
In SQL, is UPDATE always faster than DELETE+INSERT?
the answer is NO!
Just google for
- "Expensive direct update"* "sql server"
- "deferred update"* "sql server"
In these cases the engine carries out the UPDATE as a delete followed by an insert (a deferred update), which costs more processing than a direct, in-place update. This happens when
- one updates a field that has a unique (or primary) key, or
- the new data is bigger than the row space allocated before the update (or even exceeds the maximum row size), resulting in fragmentation,
- etc.
A quick search, which does not claim to be exhaustive, turned up [1] and [2].
[1]
Update Operations
(Sybase SQL Server Performance and Tuning Guide
Chapter 7: The SQL Server Query Optimizer)
http://www.lcard.ru/~nail/sybase/perf/11500.htm
[2]
UPDATE Statements May be Replicated as DELETE/INSERT Pairs
http://support.microsoft.com/kb/238254
Answered by jishi
Keep in mind that the fragmentation that occurs when DELETE+INSERT is issued, as opposed to a correctly implemented UPDATE, will make a great difference over time.
That's why, for instance, the REPLACE INTO that MySQL implements is discouraged in favor of the INSERT INTO ... ON DUPLICATE KEY UPDATE ... syntax.
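The difference between the two styles can be observed in a small sketch. It uses SQLite through Python's sqlite3 module as a stand-in, not MySQL: SQLite's `REPLACE INTO` and `INSERT ... ON CONFLICT ... DO UPDATE` (available from SQLite 3.24) are assumed here to be close enough analogues of MySQL's `REPLACE INTO` and `ON DUPLICATE KEY UPDATE` for illustration. The table `t` and the helper `current_id` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (ID INTEGER PRIMARY KEY AUTOINCREMENT, Name TEXT UNIQUE, Tag INTEGER)"
)
conn.execute("INSERT INTO t (Name, Tag) VALUES ('YY', 1)")

def current_id():
    return conn.execute("SELECT ID FROM t WHERE Name = 'YY'").fetchone()[0]

id_before = current_id()

# REPLACE deletes the conflicting row and inserts a new one, so the ID changes.
conn.execute("REPLACE INTO t (Name, Tag) VALUES ('YY', 2)")
id_after_replace = current_id()

# The upsert updates the existing row in place, so the ID is kept.
conn.execute(
    "INSERT INTO t (Name, Tag) VALUES ('YY', 3) "
    "ON CONFLICT(Name) DO UPDATE SET Tag = excluded.Tag"
)
id_after_upsert = current_id()
tag_after_upsert = conn.execute("SELECT Tag FROM t WHERE Name = 'YY'").fetchone()[0]

print(id_before, id_after_replace, id_after_upsert, tag_after_upsert)
```

The changing ID is the visible sign that REPLACE behaves like DELETE+INSERT, which is the behavior the answer warns about.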
Answered by jishi
I just tried updating 43 fields on a table with 44 fields; the remaining field was the primary clustered key.
The update took 8 seconds.
A Delete + Insert is faster than the minimum time interval that the "Client Statistics" reports via SQL Management Studio.
Peter
MS SQL 2008
Answered by jishi
In your case, I believe the update will be faster.
Remember indexes!
You have defined a primary key, so it will likely become a clustered index automatically (at least SQL Server does so). A clustered index means the records are physically laid out on disk according to the index. The DELETE operation itself won't cause much trouble: even after one record goes away, the index stays correct. But when you INSERT a new record, the DB engine has to put it in the correct location, which under some circumstances causes some "reshuffling" of the old records to "make room" for the new one. That is where the operation slows down.
An index (especially a clustered one) works best if the values are ever increasing, so new records just get appended to the tail. Maybe you can add an extra INT IDENTITY column to serve as the clustered index; this will simplify insert operations.
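The "reshuffling" argument can be illustrated with a deliberately simplified toy model. A plain sorted Python list stands in for records kept in key order; real engines use B-tree pages and page splits rather than shifting every record, so the numbers below only show the direction of the effect. The function `count_shifts` is invented for this sketch.

```python
import bisect
import random

def count_shifts(keys):
    """Insert keys into a sorted list and count how many existing
    entries sit after each insertion point (a stand-in for records
    that would have to move to make room)."""
    page = []
    shifts = 0
    for key in keys:
        pos = bisect.bisect_left(page, key)
        shifts += len(page) - pos
        page.insert(pos, key)
    return shifts

random.seed(0)
increasing = list(range(1000))   # like an IDENTITY column
shuffled = increasing[:]
random.shuffle(shuffled)         # like keys arriving in arbitrary order

shifts_increasing = count_shifts(increasing)
shifts_shuffled = count_shifts(shuffled)
print(shifts_increasing, shifts_shuffled)
```

With ever-increasing keys every insert lands at the tail and nothing moves, while keys arriving in arbitrary order keep landing in the middle.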
Answered by Andy Lester
The question of speed is irrelevant without a specific speed problem.
If you are writing SQL code to make a change to an existing row, you UPDATE it. Anything else is incorrect.
If you're going to break the rules of how code should work, then you'd better have a damn good, quantified reason for it, and not a vague idea of "This way is faster", when you don't have any idea what "faster" is.
Answered by brian
What if you have a few million rows? Each row starts with one piece of data, perhaps a client name. As you collect data for clients, their entries must be updated. Now, let's assume that the client data is gathered on numerous other machines, from which it is later collected and put into the database. If each client has unique information, then you cannot perform a bulk update; i.e., there is no WHERE clause criterion that lets you update multiple clients in one shot. On the other hand, you could perform bulk inserts. So the question might be better posed as follows: is it better to perform millions of single updates, or to compile them into large bulk deletes and inserts? In other words, instead of running "update [table] set field=data where clientid=123" a million times, you run "delete from [table] where clientid in ([all clients to be updated]); insert into [table] values (data for client1), (data for client2), etc."
Is either choice better than the other, or are you screwed both ways?
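The two approaches being compared can be written out side by side. This is a small sketch using Python's built-in sqlite3 module; the `clients` table, its columns, and the `changes` dictionary are assumptions for the example, and the batch is tiny so that the result is easy to inspect.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clients (clientid INTEGER PRIMARY KEY, data TEXT)")
    conn.executemany(
        "INSERT INTO clients VALUES (?, ?)", [(i, "old") for i in range(1, 6)]
    )
    return conn

# Hypothetical per-client changes collected from other machines.
changes = {2: "data for client2", 4: "data for client4"}

# Approach 1: one UPDATE per client.
single = make_db()
single.executemany(
    "UPDATE clients SET data = ? WHERE clientid = ?",
    [(data, cid) for cid, data in changes.items()],
)

# Approach 2: one bulk DELETE, then one multi-row INSERT.
bulk = make_db()
ids = list(changes)
placeholders = ", ".join("?" for _ in ids)
bulk.execute(f"DELETE FROM clients WHERE clientid IN ({placeholders})", ids)
rows = ", ".join("(?, ?)" for _ in ids)
params = [value for cid in ids for value in (cid, changes[cid])]
bulk.execute(f"INSERT INTO clients VALUES {rows}", params)

query = "SELECT clientid, data FROM clients ORDER BY clientid"
single_rows = single.execute(query).fetchall()
bulk_rows = bulk.execute(query).fetchall()
print(single_rows == bulk_rows)
```

Both paths end in the same table contents; which one is faster for millions of rows depends on the engine, its batching limits, and the indexes involved, so it has to be measured.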
Answered by erjiang
Obviously, the answer varies based on what database you are using, but UPDATE can always be implemented faster than DELETE+INSERT. Since in-memory operations are mostly trivial anyway, given a hard-drive-based database, an UPDATE can change a database field in place on the disk, while a DELETE would remove a row (leaving an empty space) and an INSERT would add a new row, perhaps at the end of the table (again, it's all in the implementation).
The other, minor, issue is that when you UPDATE a single column in a single row, the other columns in that row remain the same. If you DELETE and then INSERT, you run the risk of forgetting about the other columns and consequently losing them (in which case you would have to do a SELECT before your DELETE to temporarily store the other columns before writing them back with INSERT).
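The risk of losing other columns is easy to reproduce. The sketch below uses Python's built-in sqlite3 module; the `Note` column is a hypothetical extra column that is not part of the question's table, added only to show the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Note is a hypothetical extra column, not part of the question's table.
conn.execute(
    "CREATE TABLE t (ID INTEGER PRIMARY KEY, Name TEXT UNIQUE, Tag INTEGER, Note TEXT)"
)
conn.execute("INSERT INTO t (Name, Tag, Note) VALUES ('a', 1, 'keep me')")
conn.execute("INSERT INTO t (Name, Tag, Note) VALUES ('b', 1, 'keep me')")

# UPDATE touches only Tag; Note is left alone.
conn.execute("UPDATE t SET Tag = 2 WHERE Name = 'a'")

# A DELETE+INSERT that lists only Name and Tag silently drops Note.
conn.execute("DELETE FROM t WHERE Name = 'b'")
conn.execute("INSERT INTO t (Name, Tag) VALUES ('b', 2)")

note_after_update = conn.execute("SELECT Note FROM t WHERE Name = 'a'").fetchone()[0]
note_after_reinsert = conn.execute("SELECT Note FROM t WHERE Name = 'b'").fetchone()[0]
print(note_after_update, note_after_reinsert)
```

Row `a` keeps its note, while row `b` comes back with an empty one, which is exactly the case where a SELECT before the DELETE would be needed.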