MySQL REPLACE INTO Query Syntax

Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/19820724/

Time: 2020-08-31 19:21:19  Source: igfitidea

Replace Into Query Syntax

Tags: mysql, sql, database, replace

Asked by kformeck

I want to be able to update a table of the same schema using a "replace into" statement. In the end, I need to be able to update a large table with values that may have changed.

Here is the query I am using to start off:

REPLACE INTO table_name
(visual, inspection_status, inspector_name, gelpak_name, gelpak_location)
VALUES (3, 'Partially Inspected', 'Me', 'GP1234', 'A01');

What I don't understand is how does the database engine know what is a duplicate row and what isn't? This data is extremely important and I can't risk the data being corrupted. Is it as simple as "if all columns listed have the same value, it is a duplicate row"?

I am just trying to figure out an efficient way of doing this so I can update > 45,000 rows in under a minute.

Answered by Filipe Silva

As the documentation says:

REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted.
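
To make the rule concrete, here is a minimal sketch. The table name inspection_demo and the UNIQUE key on (gelpak_name, gelpak_location) are assumptions for illustration; the question does not say which keys the real table has. The point it shows is that a row counts as a duplicate only when it collides on a PRIMARY KEY or UNIQUE index, not when "all listed columns match". If the table had neither kind of index, REPLACE would behave like a plain INSERT and simply add rows.

```sql
-- Hypothetical demo table; the unique key is an assumption, not from the question.
DROP TABLE IF EXISTS inspection_demo;
CREATE TABLE inspection_demo (
    id                INT AUTO_INCREMENT PRIMARY KEY,
    visual            INT,
    inspection_status VARCHAR(32),
    inspector_name    VARCHAR(64),
    gelpak_name       VARCHAR(16) NOT NULL,
    gelpak_location   VARCHAR(8)  NOT NULL,
    UNIQUE KEY uq_gelpak (gelpak_name, gelpak_location)
);

-- No row with ('GP1234', 'A01') exists yet, so this acts like a plain INSERT.
REPLACE INTO inspection_demo
    (visual, inspection_status, inspector_name, gelpak_name, gelpak_location)
VALUES (3, 'Partially Inspected', 'Me', 'GP1234', 'A01');

-- Same unique key, different non-key values: the old row is deleted
-- and this row is inserted in its place.
REPLACE INTO inspection_demo
    (visual, inspection_status, inspector_name, gelpak_name, gelpak_location)
VALUES (5, 'Inspected', 'Me', 'GP1234', 'A01');

-- Different unique key: nothing collides, so a second row is added.
REPLACE INTO inspection_demo
    (visual, inspection_status, inspector_name, gelpak_name, gelpak_location)
VALUES (3, 'Partially Inspected', 'Me', 'GP1234', 'A02');

-- The table now holds one row per (gelpak_name, gelpak_location) pair.
SELECT gelpak_name, gelpak_location, visual, inspection_status
FROM inspection_demo
ORDER BY gelpak_location;
```

For the second statement MySQL reports two affected rows, because the count is the sum of the rows deleted and the rows inserted.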

Answered by Andrew Dunn

REPLACE does work much like an INSERT that just overwrites records that have the same PRIMARY KEY or UNIQUE index. However, beware.

Shlomi Noach writes about the problem with using REPLACE INTO here:

But weak hearted people as myself should be aware of the following: it is a heavyweight solution. It may be just what you were looking for in terms of ease of use, but the fact is that on duplicate keys, a DELETE and INSERT are performed, and this calls for a closer look.

Whenever a row is deleted, all indexes need to be updated, and most importantly the PRIMARY KEY. When a new row is inserted, the same happens. Especially on InnoDB tables (because of their clustered nature), this means much overhead. The restructuring of an index is an expensive operation. Index nodes may need to be merged upon DELETE. Nodes may need to be split due to INSERT. After many REPLACE INTO executions, it is most probable that your index is more fragmented than it would have been, had you used SELECT/UPDATE or INSERT INTO ... ON DUPLICATE KEY

Also, there's the notion of "well, if the row isn't there, we create it. If it's there, it simply gets updated". This is false. The row doesn't just get updated, it is completely removed. The problem is, if there's a PRIMARY KEY on that table, and the REPLACE INTO does not specify a value for the PRIMARY KEY (for example, it's an AUTO_INCREMENT column), the new row gets a different value, and this may not be what you were looking for in terms of behavior.

Many uses of REPLACE INTO have no intention of changing PRIMARY KEY (or other UNIQUE KEY) values. In that case, it's better left alone. On a production system I've seen, changing REPLACE INTO to INSERT INTO ... ON DUPLICATE KEY resulted in a tenfold increase in throughput (measured in queries per second) and a drastic decrease in IO operations and in load average.
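
The AUTO_INCREMENT point in the quote can be reproduced with a small sketch. The table replace_id_demo and its columns are hypothetical names chosen for illustration. The statement collides on the UNIQUE column only and does not supply the primary key, so the replacement row is a brand-new row with a newly generated id.

```sql
-- Hypothetical demo table: surrogate AUTO_INCREMENT key plus a UNIQUE business key.
DROP TABLE IF EXISTS replace_id_demo;
CREATE TABLE replace_id_demo (
    id   INT AUTO_INCREMENT PRIMARY KEY,
    code VARCHAR(16) NOT NULL,
    note VARCHAR(64),
    UNIQUE KEY uq_code (code)
);

INSERT INTO replace_id_demo (code, note) VALUES ('GP1234', 'first version');

-- Remember which id the row was given.
SELECT id, code, note FROM replace_id_demo;

-- Collides on uq_code; no id is specified, so the old row is deleted
-- and the new row receives a fresh AUTO_INCREMENT value.
REPLACE INTO replace_id_demo (code, note) VALUES ('GP1234', 'second version');

-- Same code, new note, and a different id than before.
SELECT id, code, note FROM replace_id_demo;
```

Any column left out of the REPLACE statement is also reset to its default, since the old row no longer exists to supply the previous value.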

In summary, REPLACE INTO may be right for your implementation, but you might find it more appropriate (and less risky) to use INSERT ... ON DUPLICATE KEY UPDATE instead.
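
As a rough sketch of that alternative, the statement below upserts two rows in one round trip. The table upsert_demo and its UNIQUE key on (gelpak_name, gelpak_location) are assumptions for illustration, since the question does not describe the real keys. A row whose key already exists is updated in place and keeps its id; a row whose key is new is inserted.

```sql
-- Hypothetical demo table; the unique key is an assumption, not from the question.
DROP TABLE IF EXISTS upsert_demo;
CREATE TABLE upsert_demo (
    id                INT AUTO_INCREMENT PRIMARY KEY,
    gelpak_name       VARCHAR(16) NOT NULL,
    gelpak_location   VARCHAR(8)  NOT NULL,
    visual            INT,
    inspection_status VARCHAR(32),
    inspector_name    VARCHAR(64),
    UNIQUE KEY uq_gelpak (gelpak_name, gelpak_location)
);

INSERT INTO upsert_demo
    (gelpak_name, gelpak_location, visual, inspection_status, inspector_name)
VALUES ('GP1234', 'A01', 3, 'Partially Inspected', 'Me');

-- One multi-row statement: existing keys are updated, new keys are inserted.
INSERT INTO upsert_demo
    (gelpak_name, gelpak_location, visual, inspection_status, inspector_name)
VALUES
    ('GP1234', 'A01', 5, 'Inspected', 'Me'),           -- key exists: updated in place
    ('GP1234', 'A02', 3, 'Partially Inspected', 'Me')  -- key is new: inserted
ON DUPLICATE KEY UPDATE
    visual            = VALUES(visual),
    inspection_status = VALUES(inspection_status),
    inspector_name    = VALUES(inspector_name);

-- The 'A01' row keeps its original id; only the listed columns changed.
SELECT id, gelpak_name, gelpak_location, visual, inspection_status
FROM upsert_demo
ORDER BY gelpak_location;
```

VALUES(col) refers to the value the row would have inserted for that column. Recent MySQL 8.0 releases deprecate this use of VALUES() in favor of a row alias, so check the manual for your server version. Grouping many rows into each statement, as here, is one way to approach the 45,000-row target from the question, though the achievable rate depends on the server and the schema.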

Answered by emmanuel

Or something like this:

-- Copy rows from tbl2 whose keys are not yet present in tbl1
INSERT IGNORE INTO tbl1 SELECT * FROM tbl2;

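-- Then refresh the rows that already existed
UPDATE
    `tbl1` AS `dest`,
    (SELECT * FROM tbl2) AS `src`
SET
    /* overwrite only when the source value is non-empty;
       use dest.field = src.field for an unconditional overwrite,
       or anything like that */
    dest.field = IF(LENGTH(src.field) > 0, src.field, dest.field)
WHERE
    `dest`.id = `src`.id;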