How to 'insert if not exists' in MySQL?

Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/1361340/

Date: 2020-08-31 14:01:05  Source: igfitidea

mysql sql primary-key sql-insert

Asked by warren

I started by googling and found this article, which talks about mutex tables.

I have a table with ~14 million records. If I want to add more data in the same format, is there a way to ensure the record I want to insert does not already exist without using a pair of queries (i.e., one query to check and one to insert if the result set is empty)?

Does a UNIQUE constraint on a field guarantee the INSERT will fail if the value is already there?

It seems that with merely a constraint, when I issue the insert via PHP, the script croaks.

Accepted answer by knittl

Use INSERT IGNORE INTO table.

See http://bogdan.org.ua/2007/10/18/mysql-insert-if-not-exists-syntax.html

There's also the INSERT … ON DUPLICATE KEY UPDATE syntax; you can find explanations on dev.mysql.com.



Post from bogdan.org.ua according to Google's webcache:

18th October 2007

To start: as of the latest MySQL, syntax presented in the title is not possible. But there are several very easy ways to accomplish what is expected using existing functionality.

There are 3 possible solutions: using INSERT IGNORE, REPLACE, or INSERT … ON DUPLICATE KEY UPDATE.

Imagine we have a table:

CREATE TABLE `transcripts` (
`ensembl_transcript_id` varchar(20) NOT NULL,
`transcript_chrom_start` int(10) unsigned NOT NULL,
`transcript_chrom_end` int(10) unsigned NOT NULL,
PRIMARY KEY (`ensembl_transcript_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Now imagine that we have an automatic pipeline importing transcripts meta-data from Ensembl, and that due to various reasons the pipeline might be broken at any step of execution. Thus, we need to ensure two things:

  1. repeated executions of the pipeline will not destroy our database

  2. repeated executions will not die due to 'duplicate primary key' errors.

Method 1: using REPLACE

It's very simple:

REPLACE INTO `transcripts`
SET `ensembl_transcript_id` = 'ENSORGT00000000001',
`transcript_chrom_start` = 12345,
`transcript_chrom_end` = 12678;

If the record exists, it will be overwritten; if it does not yet exist, it will be created. However, using this method isn't efficient for our case: we do not need to overwrite existing records, it's fine just to skip them.

Method 2: using INSERT IGNORE

Also very simple:

INSERT IGNORE INTO `transcripts`
SET `ensembl_transcript_id` = 'ENSORGT00000000001',
`transcript_chrom_start` = 12345,
`transcript_chrom_end` = 12678;

Here, if the 'ensembl_transcript_id' is already present in the database, it will be silently skipped (ignored). (To be more precise, here's a quote from the MySQL reference manual: "If you use the IGNORE keyword, errors that occur while executing the INSERT statement are treated as warnings instead. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted.") If the record doesn't yet exist, it will be created.

This second method has several potential weaknesses, including the query not being aborted when any other problem occurs (see the manual). Thus it should be used only if the query was previously tested without the IGNORE keyword.
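
One way to see what INSERT IGNORE downgraded is to inspect the warnings right after the statement. The following is only a sketch, assuming the `transcripts` table defined above and a MySQL session in which the row may already exist:

```sql
-- Assumes the `transcripts` table from the example above.
INSERT IGNORE INTO `transcripts`
SET `ensembl_transcript_id` = 'ENSORGT00000000001',
    `transcript_chrom_start` = 12345,
    `transcript_chrom_end` = 12678;

-- Lists the errors that IGNORE turned into warnings for the statement above,
-- for example a duplicate-key warning when the row was skipped.
SHOW WARNINGS;
```

Running SHOW WARNINGS immediately after the insert matters, because the list refers to the most recent statement.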

Method 3: using INSERT … ON DUPLICATE KEY UPDATE:

The third option is to use the INSERT … ON DUPLICATE KEY UPDATE syntax, and in the UPDATE part just do some meaningless (empty) operation, like calculating 0+0 (Geoffray suggests doing the id=id assignment so that the MySQL optimization engine ignores this operation). The advantage of this method is that it only ignores duplicate key events, and still aborts on other errors.
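
The post gives no statement for this third method. A minimal sketch of the id=id idea, assuming the same `transcripts` table and using its primary key column for the self-assignment, might look like this:

```sql
-- Assumes the `transcripts` table from the example above.
INSERT INTO `transcripts`
SET `ensembl_transcript_id` = 'ENSORGT00000000001',
    `transcript_chrom_start` = 12345,
    `transcript_chrom_end` = 12678
ON DUPLICATE KEY UPDATE
    -- Self-assignment: the existing row keeps all of its current values.
    `ensembl_transcript_id` = `ensembl_transcript_id`;
```

If the key is new, the row is inserted; if it already exists, the UPDATE part runs but changes nothing.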

As a final notice: this post was inspired by Xaprb. I'd also advise to consult his other post on writing flexible SQL queries.

Answer by Server

Solution:

INSERT INTO `table` (`value1`, `value2`) 
SELECT 'stuff for value1', 'stuff for value2' FROM DUAL 
WHERE NOT EXISTS (SELECT * FROM `table` 
      WHERE `value1`='stuff for value1' AND `value2`='stuff for value2' LIMIT 1) 

Explanation:

The innermost query

SELECT * FROM `table` 
      WHERE `value1`='stuff for value1' AND `value2`='stuff for value2' LIMIT 1

used as the WHERE NOT EXISTS condition detects whether a row with the data to be inserted already exists. After one such row is found, the query may stop, hence the LIMIT 1 (a micro-optimization that may be omitted).

The intermediate query

SELECT 'stuff for value1', 'stuff for value2' FROM DUAL

represents the values to be inserted. DUAL refers to a special one-row, one-column table present by default in all Oracle databases (see https://en.wikipedia.org/wiki/DUAL_table). On MySQL Server version 5.7.26 I got a valid query when omitting FROM DUAL, but older versions (like 5.5.60) seem to require the FROM information. Because of WHERE NOT EXISTS, the intermediate query returns an empty result set if the innermost query finds matching data.

The outer query

INSERT INTO `table` (`value1`, `value2`) 

inserts the data, if any is returned by the intermediate query.

Answer by Zed

ON DUPLICATE KEY UPDATE or INSERT IGNORE can be viable solutions with MySQL.



Example of an ON DUPLICATE KEY UPDATE update, based on mysql.com

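INSERT INTO `table` (a,b,c) VALUES (1,2,3)
  ON DUPLICATE KEY UPDATE c=c+1;

-- If column a is declared UNIQUE and already contains the value 1,
-- the statement above has the same effect as:
UPDATE `table` SET c=c+1 WHERE a=1;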


Example of INSERT IGNORE syntax, based on mysql.com

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name [(col_name,...)]
    {VALUES | VALUE} ({expr | DEFAULT},...),(...),...
    [ ON DUPLICATE KEY UPDATE
      col_name=expr
        [, col_name=expr] ... ]

Or:

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name
    SET col_name={expr | DEFAULT}, ...
    [ ON DUPLICATE KEY UPDATE
      col_name=expr
        [, col_name=expr] ... ]

Or:

INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name [(col_name,...)]
    SELECT ...
    [ ON DUPLICATE KEY UPDATE
      col_name=expr
        [, col_name=expr] ... ]

Answer by KLE

Any simple constraint should do the job, if an exception is acceptable. Examples:

  • primary key if not surrogate
  • unique constraint on a column
  • multi-column unique constraint
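
The constraint types listed above could be declared roughly as follows. This is a sketch only: the `people` table and its columns are hypothetical, and it uses a surrogate `id` key alongside the unique constraints rather than a natural primary key.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE `people` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `email` VARCHAR(100) NOT NULL,
  `first_name` VARCHAR(50) NOT NULL,
  `last_name` VARCHAR(50) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `uq_email` (`email`),                  -- unique constraint on one column
  UNIQUE KEY `uq_name` (`first_name`, `last_name`)  -- multi-column unique constraint
) ENGINE=InnoDB;

INSERT INTO `people` (`email`, `first_name`, `last_name`)
VALUES ('ann@example.com', 'Ann', 'Lee');

-- Same email again: this statement fails with a duplicate-key error
-- (MySQL error 1062, ER_DUP_ENTRY) and no row is added.
INSERT INTO `people` (`email`, `first_name`, `last_name`)
VALUES ('ann@example.com', 'Bob', 'Ray');
```

The second insert is the "exception" case: the calling code has to be prepared to catch or check for that error.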

Sorry if this seems deceptively simple. I know it looks bad compared with the link you shared with us. ;-(

But I nevertheless give this answer, because it seems to fill your need. (If not, it may trigger you to update your requirements, which would also be "a Good Thing"(TM).)

Edited: If an insert would break the database's unique constraint, an exception is thrown at the database level and relayed by the driver. It will certainly stop your script with a failure. It must be possible in PHP to address that case ...

Answer by Jrm

Here is a PHP function that will insert a row only if all the specified columns values don't already exist in the table.

  • If one of the columns differs, the row will be added.

  • If the table is empty, the row will be added.

  • If a row exists where all the specified columns have the specified values, the row won't be added.

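    // Note: the mysql_* functions used here were removed in PHP 7.0;
    // on current PHP the same query would go through mysqli or PDO.
    function insert_unique($table, $vars)
    {
      if (count($vars)) {
        // Escape the table name and every value before building the query.
        $table = mysql_real_escape_string($table);
        $vars = array_map('mysql_real_escape_string', $vars);
    
        // INSERT ... SELECT the literal values only when no matching row exists.
        $req = "INSERT INTO `$table` (`". join('`, `', array_keys($vars)) ."`) ";
        $req .= "SELECT '". join("', '", $vars) ."' FROM DUAL ";
        $req .= "WHERE NOT EXISTS (SELECT 1 FROM `$table` WHERE ";
    
        // One equality test per column, joined with AND.
        foreach ($vars AS $col => $val)
          $req .= "`$col`='$val' AND ";
    
        // Drop the trailing " AND " (5 characters) and close the subquery.
        $req = substr($req, 0, -5) . ") LIMIT 1";
    
        $res = mysql_query($req) OR die();
        // Returns 0 when the query generated no AUTO_INCREMENT value.
        return mysql_insert_id();
      }
    
      return False;
    }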
    
Example usage:

<?php
insert_unique('mytable', array(
  'mycolumn1' => 'myvalue1',
  'mycolumn2' => 'myvalue2',
  'mycolumn3' => 'myvalue3'
  )
);
?>

Answer by Rocio

REPLACE INTO `transcripts`
SET `ensembl_transcript_id` = 'ENSORGT00000000001',
`transcript_chrom_start` = 12345,
`transcript_chrom_end` = 12678;

If the record exists, it will be overwritten; if it does not yet exist, it will be created.
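
"Overwritten" here means the old row is removed and a new one inserted, so columns that the REPLACE statement does not list fall back to their defaults. A small sketch of that effect, using a hypothetical `settings` table that is not part of the original answer:

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE `settings` (
  `name` VARCHAR(50) NOT NULL,
  `value` VARCHAR(50) NOT NULL,
  `note` VARCHAR(50) NOT NULL DEFAULT 'none',
  PRIMARY KEY (`name`)
) ENGINE=InnoDB;

INSERT INTO `settings` (`name`, `value`, `note`)
VALUES ('theme', 'dark', 'set by admin');

-- REPLACE removes the conflicting row and inserts a new one, so `note`,
-- which is not listed here, is reset to its column default 'none'.
REPLACE INTO `settings` (`name`, `value`) VALUES ('theme', 'light');

SELECT `name`, `value`, `note` FROM `settings`;
```

When the other columns must survive, INSERT … ON DUPLICATE KEY UPDATE is usually the safer choice.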

Answer by Jeb's

Try the following:

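-- MySQL accepts IF ... THEN ... END IF only inside a stored program;
-- the procedure name below is hypothetical.
DELIMITER //
CREATE PROCEDURE upsert_john()
BEGIN
  IF (SELECT COUNT(*) FROM beta WHERE name = 'John') > 0 THEN
    UPDATE alfa SET c1 = (SELECT id FROM beta WHERE name = 'John');
  ELSE
    INSERT INTO beta (name) VALUES ('John');
    INSERT INTO alfa (c1) VALUES (LAST_INSERT_ID());
  END IF;
END //
DELIMITER ;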

Answer by wortwart

There are several answers that cover how to solve this if you have a UNIQUE index that you can check against with ON DUPLICATE KEY or INSERT IGNORE. That is not always the case, and as UNIQUE has a length constraint (1000 bytes) you might not be able to change that. For example, I had to work with metadata in WordPress (wp_postmeta).

I finally solved it with two queries:

UPDATE wp_postmeta SET meta_value = ? WHERE meta_key = ? AND post_id = ?;

INSERT INTO wp_postmeta (post_id, meta_key, meta_value)
SELECT DISTINCT ?, ?, ?
FROM wp_postmeta
WHERE NOT EXISTS (
    SELECT * FROM wp_postmeta WHERE meta_key = ? AND post_id = ?
);

Query 1 is a regular UPDATE query with no effect when the dataset in question is not there. Query 2 is an INSERT which depends on a NOT EXISTS, i.e. the INSERT is only executed when the dataset doesn't exist.

Answer by Gilly

Something worth noting is that INSERT IGNORE will still increment the primary key's auto-increment counter whether or not the statement succeeded, just like a normal INSERT would.

This will cause gaps in your primary keys that might make a programmer mentally unstable. Or if your application is poorly designed and depends on perfect incremental primary keys, it might become a headache.

Look into innodb_autoinc_lock_mode = 0 (a server setting that comes with a slight performance hit), or use a SELECT first to make sure your query will not fail (which also comes with a performance hit and extra code).
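
The gap described in this answer can be reproduced with a few statements. This is a sketch under assumptions: the `tags` table is hypothetical, and the exact ids depend on the storage engine and on the innodb_autoinc_lock_mode in effect.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE `tags` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `label` VARCHAR(50) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `uq_label` (`label`)
) ENGINE=InnoDB;

INSERT IGNORE INTO `tags` (`label`) VALUES ('red');   -- inserted
INSERT IGNORE INTO `tags` (`label`) VALUES ('red');   -- skipped as a duplicate
INSERT IGNORE INTO `tags` (`label`) VALUES ('blue');  -- inserted

-- With the behaviour described above, the ids are not consecutive:
-- the skipped statement still consumed an auto-increment value.
SELECT `id`, `label` FROM `tags` ORDER BY `id`;

-- Shows which lock mode the server is running with.
SELECT @@innodb_autoinc_lock_mode;
```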

Answer by Yeti

Update or insert without known primary key

If you already have a unique or primary key, the other answers with either INSERT INTO ... ON DUPLICATE KEY UPDATE ... or REPLACE INTO ... should work fine (note that REPLACE INTO deletes the row if it exists and then inserts, and thus does not partially update existing values).

But suppose you have the values for some_column_id and some_type, the combination of which is known to be unique, and you want to update some_value if the row exists, or insert it if it does not. And you want to do it in just one query (to avoid using a transaction). This might be a solution:

INSERT INTO my_table (id, some_column_id, some_type, some_value)
SELECT t.id, t.some_column_id, t.some_type, t.some_value
FROM (
    SELECT id, some_column_id, some_type, some_value
    FROM my_table
    WHERE some_column_id = ? AND some_type = ?
    UNION ALL
    SELECT s.id, s.some_column_id, s.some_type, s.some_value
    FROM (SELECT NULL AS id, ? AS some_column_id, ? AS some_type, ? AS some_value) AS s
) AS t
LIMIT 1
ON DUPLICATE KEY UPDATE
some_value = ?

Basically, the query executes this way (less complicated than it may look):

  • Select an existing row via the WHERE clause match.
  • Union that result with a potential new row (table s), where the column values are explicitly given (s.id is NULL, so it will generate a new auto-increment identifier).
  • If an existing row is found, then the potential new row from table s is discarded (due to LIMIT 1 on table t), and it will always trigger an ON DUPLICATE KEY, which will UPDATE the some_value column.
  • If an existing row is not found, then the potential new row is inserted (as given by table s).

Note: Every table in a relational database should have at least a primary auto-increment id column. If you don't have this, add it, even if you don't need it at first sight. It is definitely needed for this "trick".