MySQL UPDATE vs INSERT performance

Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use or share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/7299731/

Time: 2020-08-31 20:59:35  Source: igfitidea

UPDATE vs INSERT performance

mysql, sql

Asked by user838437

Am I correct to assume that an UPDATE query takes more resources than an INSERT query?

Accepted answer by Davide Piras

I am not a database guru, but here are my two cents:

Personally, I don't think there is much you can do in this regard. Even if INSERT were faster (which remains to be proven), can you convert an update into an insert?! Frankly, I don't think you can do that every time.

During an INSERT you don't usually have to use WHERE to identify which row to update, but depending on the indices on that table the operation can have some cost.

During an UPDATE, if you do not change any column included in any index, execution can be quick, provided the WHERE clause is simple and fast enough.

Nothing is set in stone, and I really imagine it depends on the whole database setup, indices and so on.

Anyway, I found this one as a reference:

Top 84 MySQL Performance Tips

Answer by GK10

If you plan to perform large-scale processing (such as rating or billing for a cellular company), this question has a huge impact on system performance.

Replacing large-scale updates with creating many new tables and indexes reduced my company's billing process from 26 hours to 1 hour!

I have tried it on 2 million records for 100,000 customers. In the first option, I created the billing table and then, for every customer's summary of calls, updated the billing table with the duration, price, discount.. a total of 10 fields.

In the second option I created 4 phases. Each phase reads the previous table(s), creates indexes (after the table insert has completed), and creates the next table for the next phase using "insert into from select ..".

Summary: although the second alternative requires much more disk space (all views and temporary tables are deleted at the end), there are 3 main advantages to this option:

1. It was 4 times faster than option 1.
2. If there was a problem in the middle of the process, I could restart from the point where it failed, since all the tables for the beginning of that phase were ready. If the process fails with the first option, you need to start the whole process all over again.
3. It made development and QA work much faster, as they could work in parallel.
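The phased approach described in this answer can be sketched roughly as follows. This is only an illustration, not the author's actual billing code: it uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs on its own, and the table and column names (`calls`, `phase1_summary`, `idx_phase1_customer`) are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical source table of raw call records (names are illustrative).
cur.execute("CREATE TABLE calls (customer_id INTEGER, duration INTEGER, price REAL)")
cur.executemany(
    "INSERT INTO calls VALUES (?, ?, ?)",
    [(1, 60, 0.5), (1, 120, 1.0), (2, 30, 0.25)],
)

# Phase 1: build the next table with INSERT ... SELECT instead of
# updating a billing table once per customer.
cur.execute(
    "CREATE TABLE phase1_summary "
    "(customer_id INTEGER, total_duration INTEGER, total_price REAL)"
)
cur.execute(
    """
    INSERT INTO phase1_summary (customer_id, total_duration, total_price)
    SELECT customer_id, SUM(duration), SUM(price)
    FROM calls
    GROUP BY customer_id
    """
)

# Create the index only after the bulk insert has completed.
cur.execute("CREATE INDEX idx_phase1_customer ON phase1_summary (customer_id)")
conn.commit()

summary = cur.execute(
    "SELECT customer_id, total_duration, total_price "
    "FROM phase1_summary ORDER BY customer_id"
).fetchall()
print(summary)
```

A later phase would read `phase1_summary` in the same way, and a failed phase can be rerun from its input table without repeating the earlier ones.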


Answer by Mark Wilkins

It depends. A simple UPDATE that uses a primary key in the WHERE clause and updates only a single non-indexed field would likely be less costly than an INSERT on the same table. But even that depends on the database engine involved. An UPDATE that involved modifying many indexed fields, however, might be more costly than the INSERT on that table because more index key modifications would be required. An UPDATE with a poorly constructed WHERE clause that required a table scan of millions of records would certainly be more expensive than an INSERT on that table.


These statements can take many forms, but if you limit the discussion to their "basic" forms that involve a single record, then the larger portion of the cost will usually be dedicated to modifying the indexes. Each indexed field that is modified during an UPDATE would typically involve two basic operations (delete the old key and add the new key) whereas the INSERT would require one (add the new key). Of course, a clustered index would then add some other dynamics as would locking issues, transaction isolation, etc. So, ultimately, the comparison between these statements in a general sense is not really possible and would probably require benchmarking of specific statements if it actually mattered.


Typically, though, it makes sense to just use the correct statement and not worry about it since it is usually not an option to choose between an UPDATE and an INSERT.
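If it does matter, benchmarking the specific statements might look something like the sketch below. It is a minimal harness under stated assumptions: SQLite (through Python's `sqlite3` module) stands in for MySQL so the example is self-contained, the `users` table and its columns are hypothetical, and the timings it prints say nothing about how MySQL or any particular storage engine would behave. The same three statements would have to be run against the real server and schema.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table: `email` is indexed, `note` is not.
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, note TEXT)")
cur.execute("CREATE INDEX idx_users_email ON users (email)")
N = 20000
cur.executemany(
    "INSERT INTO users (id, email, note) VALUES (?, ?, ?)",
    [(i, f"user{i}@example.com", "") for i in range(N)],
)
conn.commit()


def timed(sql, rows):
    """Run the statement once per parameter tuple in one transaction; return seconds."""
    start = time.perf_counter()
    cur.executemany(sql, rows)
    conn.commit()
    return time.perf_counter() - start


timings = {
    "insert": timed(
        "INSERT INTO users (id, email, note) VALUES (?, ?, ?)",
        [(N + i, f"new{i}@example.com", "") for i in range(N)],
    ),
    "update non-indexed column": timed(
        "UPDATE users SET note = ? WHERE id = ?",
        [("x", i) for i in range(N)],
    ),
    "update indexed column": timed(
        "UPDATE users SET email = ? WHERE id = ?",
        [(f"changed{i}@example.com", i) for i in range(N)],
    ),
}

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")

row_count = cur.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

The three cases mirror the ones discussed in this answer: a plain insert, an update by primary key that touches no index, and an update by primary key that changes an indexed field.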


Answer by Simone

It depends. If the update doesn't require changing the key, it will most likely cost only about as much as a search, and so probably less than an insert, unless the database is organized like a heap.

This is the only thing I can state, because performance depends greatly on the database organization used.

If, for example, you use MyISAM, which I suppose is organized like an ISAM, an insert should generally cost the same in terms of database read accesses, but it will require some additional write operations.

Answer by OlivierS

On Sybase / SQL Server, an update that affects a column with a read-only index is internally replaced by a delete followed by an insert, so it is obviously slower than an insert. I do not know how other engines implement this, but I think it is a common strategy, at least when indices are involved. For tables without indices (or for update requests not involving any index), I suppose there are cases where the update can be faster, depending on the structure of the table.

Answer by jayadev

The key resource here is disk access (IOPS, to be precise), and we should evaluate which statement needs the least of it.

I agree with the others that it is impossible to give a generic answer, but here are some thoughts to lead you in the right direction. Assume a simple key-value store where the key is indexed. Insertion inserts a new key, and update changes the value of an existing key.

If that is the case (a very common case), update would be faster than insertion, because update involves an indexed lookup and changing an existing value without touching the index. You can assume that is one disk read to get the data and possibly one disk write. Insertion, on the other hand, would involve two disk writes: one for the index and one for the data. Another hidden cost is B-tree node splitting and new node creation, which happen in the background during insertion and lead to more disk accesses on average.

Answer by bpgergo

You cannot compare an INSERT and an UPDATE in general. Give us an example (with schema definition) and we will explain which one costs more and why. Also, you can compare a concrete INSERT and an UPDATE by checking their plan and execution time.


Some rules of thumb, though:

  • If you update only one non-indexed field of a single record, and you find that record by rowid/primary key, then this UPDATE will cost less than an INSERT that also affects only one row but whose row has many NOT NULL constrained, indexed fields, since all those indexes have to be maintained (e.g. by adding a new leaf).