Original question: http://stackoverflow.com/questions/1109061/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Insert, on duplicate update in PostgreSQL?
Asked by Teifion
Several months ago I learned from an answer on Stack Overflow how to perform multiple updates at once in MySQL using the following syntax:
INSERT INTO table (id, field, field2) VALUES (1, A, X), (2, B, Y), (3, C, Z)
ON DUPLICATE KEY UPDATE field=VALUES(Col1), field2=VALUES(Col2);
I've now switched over to PostgreSQL and apparently this is not correct. It's referring to all the correct tables so I assume it's a matter of different keywords being used but I'm not sure where in the PostgreSQL documentation this is covered.
To clarify, I want to insert several things and, if they already exist, update them.
Accepted answer by Stephen Denne
Since version 9.5, PostgreSQL has UPSERT support through the ON CONFLICT clause, with the following syntax (similar to MySQL):
INSERT INTO the_table (id, column_1, column_2)
VALUES (1, 'A', 'X'), (2, 'B', 'Y'), (3, 'C', 'Z')
ON CONFLICT (id) DO UPDATE
SET column_1 = excluded.column_1,
column_2 = excluded.column_2;
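To try the statement above without a PostgreSQL server, here is a minimal sketch in Python. It assumes SQLite 3.24 or newer (via the standard sqlite3 module), which accepts the same ON CONFLICT ... DO UPDATE form. The helper name upsert_rows and the pre-existing 'old' row are made up for illustration. With a PostgreSQL driver the SQL would stay the same apart from the placeholder style.

```python
import sqlite3

# Same statement as above, with driver placeholders instead of literals.
UPSERT_SQL = """
INSERT INTO the_table (id, column_1, column_2)
VALUES (?, ?, ?)
ON CONFLICT (id) DO UPDATE
  SET column_1 = excluded.column_1,
      column_2 = excluded.column_2
"""

def upsert_rows(conn, rows):
    """Hypothetical helper: upsert every (id, column_1, column_2) tuple in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(UPSERT_SQL, rows)

# SQLite is only a stand-in here so the sketch is runnable as-is.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE the_table (id INTEGER PRIMARY KEY, column_1 TEXT, column_2 TEXT)"
)
conn.execute("INSERT INTO the_table VALUES (1, 'old', 'old')")  # id 1 already exists

upsert_rows(conn, [(1, "A", "X"), (2, "B", "Y"), (3, "C", "Z")])

result = conn.execute(
    "SELECT id, column_1, column_2 FROM the_table ORDER BY id"
).fetchall()
print(result)  # id 1 was updated in place, ids 2 and 3 were inserted
```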
Searching PostgreSQL's mailing list archives for "upsert" leads to an example in the manual of doing what you possibly want to do:
Example 38-2. Exceptions with UPDATE/INSERT
This example uses exception handling to perform either UPDATE or INSERT, as appropriate:
CREATE TABLE db (a INT PRIMARY KEY, b TEXT);

CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS
$$
BEGIN
    LOOP
        -- first try to update the key
        -- note that "a" must be unique
        UPDATE db SET b = data WHERE a = key;
        IF found THEN
            RETURN;
        END IF;
        -- not there, so try to insert the key
        -- if someone else inserts the same key concurrently,
        -- we could get a unique-key failure
        BEGIN
            INSERT INTO db(a,b) VALUES (key, data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing, and loop to try the UPDATE again
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;

SELECT merge_db(1, 'david');
SELECT merge_db(1, 'dennis');
There's possibly an example of how to do this in bulk, using CTEs in 9.1 and above, in the hackers mailing list:
WITH foos AS (SELECT (UNNEST(%foo[])).*),
updated AS (UPDATE foo SET a = foos.a ... RETURNING foo.id)
INSERT INTO foo SELECT foos.* FROM foos LEFT JOIN updated USING(id)
WHERE updated.id IS NULL;
See a_horse_with_no_name's answer for a clearer example.
Answer by bovine
Warning: this is not safe if executed from multiple sessions at the same time (see caveats below).
Another clever way to do an "UPSERT" in postgresql is to do two sequential UPDATE/INSERT statements that are each designed to succeed or have no effect.
UPDATE table SET field='C', field2='Z' WHERE id=3;
INSERT INTO table (id, field, field2)
SELECT 3, 'C', 'Z'
WHERE NOT EXISTS (SELECT 1 FROM table WHERE id=3);
The UPDATE will succeed if a row with "id=3" already exists, otherwise it has no effect.
The INSERT will succeed only if a row with "id=3" does not already exist.
You can combine these two into a single string and send them to the database in a single statement execution from your application. Running them together in a single transaction is highly recommended.
This works very well when run in isolation or on a locked table, but is subject to race conditions that mean it might still fail with a duplicate key error if a row is inserted concurrently, or might terminate with no row inserted when a row is deleted concurrently. A SERIALIZABLE transaction on PostgreSQL 9.1 or higher will handle it reliably at the cost of a very high serialization failure rate, meaning you'll have to retry a lot. See "Why is upsert so complicated", which discusses this case in more detail.
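Since a SERIALIZABLE transaction may have to be retried, the application needs some kind of retry loop around it. A rough sketch in Python follows. The SerializationFailure class is a hypothetical stand-in for whatever error your driver raises for SQLSTATE 40001, and run_with_retry, attempts and base_delay are names invented here, not part of any library.

```python
import time

class SerializationFailure(Exception):
    """Hypothetical stand-in for the driver error that carries SQLSTATE 40001."""

def run_with_retry(txn, attempts=5, base_delay=0.0):
    """Call txn() until it succeeds or the attempts are used up.

    txn is assumed to run one complete transaction (begin, statements,
    commit) and to raise SerializationFailure when the database aborts it.
    """
    for attempt in range(1, attempts + 1):
        try:
            return txn()
        except SerializationFailure:
            if attempt == attempts:
                raise  # give up and let the caller see the failure
            time.sleep(base_delay * attempt)  # simple linear backoff

# Demonstration with a fake transaction that is aborted twice, then commits.
calls = []

def flaky_txn():
    calls.append(1)
    if len(calls) < 3:
        raise SerializationFailure()
    return "committed"

outcome = run_with_retry(flaky_txn)
print(outcome, len(calls))
```

The whole transaction has to be re-run from the start on each attempt, which is why the sketch takes a callable rather than a single statement.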
This approach is also subject to lost updates in read committed isolation unless the application checks the affected row counts and verifies that either the insert or the update affected a row.
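The row-count check can be sketched from the application side like this. It is only an illustration: it uses Python's sqlite3 module as a stand-in so it runs without a server, the table name t and the function update_then_insert are invented, and a single in-memory session cannot reproduce the concurrent cases it guards against.

```python
import sqlite3

def update_then_insert(conn, row_id, field, field2):
    """Run the UPDATE and the guarded INSERT in one transaction.

    Returns (updated, inserted) row counts and raises if neither
    statement touched a row, so the caller knows to retry.
    """
    with conn:  # one transaction around both statements
        updated = conn.execute(
            "UPDATE t SET field = ?, field2 = ? WHERE id = ?",
            (field, field2, row_id),
        ).rowcount
        inserted = conn.execute(
            "INSERT INTO t (id, field, field2) "
            "SELECT ?, ?, ? WHERE NOT EXISTS (SELECT 1 FROM t WHERE id = ?)",
            (row_id, field, field2, row_id),
        ).rowcount
    if updated + inserted != 1:
        raise RuntimeError("neither statement affected a row; retry the transaction")
    return updated, inserted

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, field TEXT, field2 TEXT)")

first = update_then_insert(conn, 3, "C", "Z")   # row is missing: the INSERT does the work
second = update_then_insert(conn, 3, "D", "W")  # row exists: the UPDATE does the work
print(first, second)
```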
Answer by a_horse_with_no_name
With PostgreSQL 9.1 this can be achieved using a writeable CTE (common table expression):
WITH new_values (id, field1, field2) as (
  values
     (1, 'A', 'X'),
     (2, 'B', 'Y'),
     (3, 'C', 'Z')
),
upsert as
(
    update mytable m
        set field1 = nv.field1,
            field2 = nv.field2
    FROM new_values nv
    WHERE m.id = nv.id
    RETURNING m.*
)
INSERT INTO mytable (id, field1, field2)
SELECT id, field1, field2
FROM new_values
WHERE NOT EXISTS (SELECT 1
                  FROM upsert up
                  WHERE up.id = new_values.id);
Note that this solution does not prevent a unique key violation, but it is not vulnerable to lost updates.
See the follow-up by Craig Ringer on dba.stackexchange.com.
Answer by Craig Ringer
In PostgreSQL 9.5 and newer you can use INSERT ... ON CONFLICT ... DO UPDATE.
See the documentation.
A MySQL INSERT ... ON DUPLICATE KEY UPDATE can be directly rephrased as an ON CONFLICT ... DO UPDATE. Neither is SQL-standard syntax; they're both database-specific extensions. There are good reasons MERGE wasn't used for this; a new syntax wasn't created just for fun. (MySQL's syntax also has issues that meant it wasn't adopted directly.)
For example, given this setup:
CREATE TABLE tablename (a integer primary key, b integer, c integer);
INSERT INTO tablename (a, b, c) values (1, 2, 3);
the MySQL query:
INSERT INTO tablename (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE c=c+1;
becomes:
INSERT INTO tablename (a, b, c) values (1, 2, 10)
ON CONFLICT (a) DO UPDATE SET c = tablename.c + 1;
Differences:
- You must specify the column name (or unique constraint name) to use for the uniqueness check. That's the ON CONFLICT (columnname) DO part.
- The keyword SET must be used, as if this were a normal UPDATE statement.
It has some nice features too:
- You can have a WHERE clause on your UPDATE (letting you effectively turn ON CONFLICT ... DO UPDATE into ON CONFLICT ... DO NOTHING for certain values).
- The proposed-for-insertion values are available as the row-variable EXCLUDED, which has the same structure as the target table. You can get the original values in the table by using the table name. So in this case EXCLUDED.c will be 10 (because that's what we tried to insert) and tablename.c will be 3 because that's the current value in the table. You can use either or both in the SET expressions and WHERE clause.
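To see EXCLUDED and the table name side by side, here is a small sketch using the tablename setup from this answer. It runs on SQLite 3.24+ through Python's sqlite3 module purely as a stand-in (SQLite accepts the same clause). The helper name upsert and the particular SET and WHERE expressions are chosen for illustration and are not from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (a INTEGER PRIMARY KEY, b INTEGER, c INTEGER)")
conn.execute("INSERT INTO tablename (a, b, c) VALUES (1, 2, 3)")

def upsert(conn, a, b, c):
    """Illustrative helper: upsert one row and return what is stored afterwards."""
    conn.execute(
        """
        INSERT INTO tablename (a, b, c) VALUES (?, ?, ?)
        ON CONFLICT (a) DO UPDATE
          SET c = tablename.c + excluded.c   -- existing value plus proposed value
          WHERE excluded.c > tablename.c     -- otherwise leave the row untouched
        """,
        (a, b, c),
    )
    return conn.execute(
        "SELECT a, b, c FROM tablename WHERE a = ?", (a,)
    ).fetchone()

applied = upsert(conn, 1, 2, 10)  # excluded.c = 10, tablename.c = 3: update runs
skipped = upsert(conn, 1, 2, 0)   # excluded.c = 0 is not greater: nothing changes
print(applied, skipped)
```

The second call shows the WHERE clause acting as a filter: the conflict is detected, but the row keeps its current values.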
For background on upsert see How to UPSERT (MERGE, INSERT ... ON DUPLICATE UPDATE) in PostgreSQL?
Answer by Paul Scheltema
I was looking for the same thing when I came here, but the lack of a generic "upsert" function bothered me a bit, so I thought you could just pass the update and insert SQL as arguments to that function from the manual.
That would look like this:
CREATE FUNCTION upsert (sql_update TEXT, sql_insert TEXT)
RETURNS VOID
LANGUAGE plpgsql
AS $$
DECLARE
    affected INTEGER;
BEGIN
    LOOP
        -- first try to update
        EXECUTE sql_update;
        -- check if a row was updated; EXECUTE does not set FOUND,
        -- so read the row count with GET DIAGNOSTICS instead
        GET DIAGNOSTICS affected = ROW_COUNT;
        IF affected > 0 THEN
            RETURN;
        END IF;
        -- not found so insert the row
        BEGIN
            EXECUTE sql_insert;
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing and loop
        END;
    END LOOP;
END;
$$;
Perhaps, to do what you initially wanted to do (a batch "upsert"), you could use Tcl to split the sql_update and loop over the individual updates. The performance hit will be very small; see http://archives.postgresql.org/pgsql-performance/2006-04/msg00557.php
The highest cost is executing the query from your code; on the database side, the execution cost is much smaller.
Answer by Paul Scheltema
There is no simple command to do it.
The most correct approach is to use a function, like the one from the docs.
Another solution (although not that safe) is to do an update with RETURNING, check which rows were updated, and insert the rest of them.
Something along the lines of:
update table
set column = x.column
from (values (1,'aa'),(2,'bb'),(3,'cc')) as x (id, column)
where table.id = x.id
returning id;
assuming id:2 was returned:
insert into table (id, column) values (1, 'aa'), (3, 'cc');
Of course it will bail out sooner or later (in a concurrent environment), as there is a clear race condition here, but usually it will work.
Here's a longer and more comprehensive article on the topic.
Answer by Ch'marr
Personally, I've set up a "rule" attached to the insert statement. Say you had a "dns" table that recorded DNS hits per customer on a per-time basis:
CREATE TABLE dns (
"time" timestamp without time zone NOT NULL,
customer_id integer NOT NULL,
hits integer
);
You wanted to be able to re-insert rows with updated values, or create them if they didn't already exist, keyed on the customer_id and the time. Something like this:
CREATE RULE replace_dns AS
ON INSERT TO dns
WHERE (EXISTS (SELECT 1 FROM dns WHERE ((dns."time" = new."time")
AND (dns.customer_id = new.customer_id))))
DO INSTEAD UPDATE dns
SET hits = new.hits
WHERE ((dns."time" = new."time") AND (dns.customer_id = new.customer_id));
Update: This has the potential to fail if simultaneous inserts are happening, as it will generate unique_violation exceptions. However, the non-terminated transaction will continue and succeed, and you just need to repeat the terminated transaction.
However, if there are tons of inserts happening all the time, you will want to put a table lock around the insert statements: SHARE ROW EXCLUSIVE locking will prevent any operations that could insert, delete or update rows in your target table. However, updates that do not update the unique key are safe, so if no operation will do this, use advisory locks instead.
Also, the COPY command does not use RULES, so if you're inserting with COPY, you'll need to use triggers instead.
Answer by Mise
I use this merge function:
CREATE OR REPLACE FUNCTION merge_tabla(key INT, data TEXT)
RETURNS void AS
$BODY$
BEGIN
    IF EXISTS(SELECT a FROM tabla WHERE a = key)
    THEN
        UPDATE tabla SET b = data WHERE a = key;
        RETURN;
    ELSE
        INSERT INTO tabla(a,b) VALUES (key, data);
        RETURN;
    END IF;
END;
$BODY$
LANGUAGE plpgsql;
Answer by Felipe FMMobile
I customized the "upsert" function above, for when you want to INSERT and REPLACE:
CREATE OR REPLACE FUNCTION upsert(sql_insert text, sql_update text)
RETURNS void AS
$BODY$
BEGIN
    -- first try to insert, and update afterwards. Note: the insert has the pk and the update does not...
    EXECUTE sql_insert;
    RETURN;
EXCEPTION WHEN unique_violation THEN
    EXECUTE sql_update;
    IF FOUND THEN
        RETURN;
    END IF;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION upsert(text, text)
OWNER TO postgres;
Then, to execute it, do something like this:
SELECT upsert($$INSERT INTO ...$$,$$UPDATE... $$)
It is important to wrap the statements in double dollar signs ($$ quoting) to avoid compiler errors.
- check the speed...
Answer by alexkovelsky
Similar to the most-liked answer, but works slightly faster:
WITH upsert AS (UPDATE spider_count SET tally=1 WHERE date='today' RETURNING *)
INSERT INTO spider_count (spider, tally) SELECT 'Googlebot', 1 WHERE NOT EXISTS (SELECT * FROM upsert)