Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/7471625/

Time: 2020-09-01 12:10:57  Source: igfitidea

Fastest check if row exists in PostgreSQL

sql postgresql

Asked by Valentin Kuzub

I have a bunch of rows that I need to insert into a table, but these inserts are always done in batches. So I want to check whether a single row from the batch exists in the table, because then I know they were all inserted.

So it's not a primary key check, but that shouldn't matter too much. I would like to check only a single row, so count(*) probably isn't good; I guess it's something like exists.

But since I'm fairly new to PostgreSQL I'd rather ask people who know.

My batch contains rows with following structure:

userid | rightid | remaining_count

So if the table contains any rows with the provided userid, it means they are all present there.

Answered by MikeM

Use the EXISTS keyword to get a TRUE / FALSE result:

select exists(select 1 from contact where id=12)

Answered by NPE

How about simply:

select 1 from tbl where userid = 123 limit 1;

where 123 is the userid of the batch that you're about to insert.

The above query will return either an empty set or a single row, depending on whether there are records with the given userid.
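
To make the two result shapes concrete, here is a minimal sketch. The table definition is an assumption for illustration (the question only lists the column names), and 123 is a made-up userid.

```sql
-- Assumed table for illustration; the question only gives the column names.
CREATE TABLE tbl (
    userid          integer,
    rightid         integer,
    remaining_count integer
);

-- Returns zero rows if no row has this userid, one row otherwise.
SELECT 1 FROM tbl WHERE userid = 123 LIMIT 1;

-- Always returns exactly one row, containing true or false.
SELECT EXISTS (SELECT 1 FROM tbl WHERE userid = 123);
```

With the LIMIT 1 form the caller checks whether a row came back; with the EXISTS form the caller reads a boolean. Either way, only the first matching row needs to be found.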

If this turns out to be too slow, you could look into creating an index on tbl.userid.

"if even a single row from batch exists in table, in that case I don't have to insert my rows because I know for sure they all were inserted."

For this to remain true even if your program gets interrupted mid-batch, I'd recommend that you make sure you manage database transactions appropriately (i.e. that the entire batch gets inserted within a single transaction).
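
A minimal sketch of a batch inserted within a single transaction, assuming a table tbl with the userid, rightid and remaining_count columns from the question; the values are made up.

```sql
BEGIN;

-- All rows of one batch share the same userid (values are illustrative).
INSERT INTO tbl (userid, rightid, remaining_count) VALUES
    (123, 1, 10),
    (123, 2, 5),
    (123, 3, 0);

-- Nothing is visible to other sessions until this point.
COMMIT;
```

If a statement fails or the connection drops before COMMIT, the transaction is rolled back and none of the batch rows remain, so finding one row of the batch later does imply the rest are there.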

Answered by wildplasser

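INSERT INTO target (userid, rightid, count)
SELECT userid, rightid, count
FROM batch
WHERE NOT EXISTS (
    -- not correlated with the outer row: if any batch userid is already
    -- in target, nothing is inserted at all
    SELECT * FROM target t2, batch b2
    WHERE t2.userid = b2.userid
    -- ... other keyfields ...
);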

BTW: if you want the whole batch to fail in case of a duplicate, then (given a primary key constraint)

INSERT INTO target (userid, rightid, count)
SELECT userid, rightid, count
FROM batch;

will do exactly what you want: either it succeeds, or it fails.

Answered by hcnak

As @MikeM pointed out:

select exists(select 1 from contact where id=12)

With an index on contact, this can usually reduce the time cost to about 1 ms.

CREATE INDEX index_contact on contact(id);
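
Whether the index is actually used can be checked with EXPLAIN. This is only a sketch, assuming the contact table and the index above exist; the plan and timing depend on table size and statistics, so no output is shown here.

```sql
-- Runs the query and reports the chosen plan with actual timings.
EXPLAIN ANALYZE
SELECT EXISTS (SELECT 1 FROM contact WHERE id = 12);
```

On a very small table the planner may still prefer a sequential scan, which is not a sign that the index is broken.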

Answered by Royce

select true from tablename where condition limit 1;

I believe that this is the query that postgres uses for checking foreign keys.

In your case, you could do this in one go too:

insert into yourtable select $userid, $rightid, $count where not exists (select true from yourtable where userid = $userid limit 1);

Answered by MikeM

I would like to propose another thought to specifically address your sentence: "So I want to check if a single row from the batch exists in the table because then I know they all were inserted."

You are making things efficient by inserting in "batches", but then doing existence checks one record at a time? This seems counterintuitive to me. When you say "inserts are always done in batches", I take it you mean you are inserting multiple records with one insert statement. You need to realize that Postgres is ACID compliant. If you insert multiple records (a batch of data) with one insert statement, there is no need to check whether some were inserted or not. The statement either succeeds or fails: all records are inserted, or none.

On the other hand, if your C# code is simply running a set of separate insert statements, for example in a loop, and in your mind this is a "batch", then you should not in fact describe it as "inserts are always done in batches". The fact that you expect part of what you call a "batch" may not actually be inserted, and hence feel the need for a check, strongly suggests this is the case, and then you have a more fundamental problem. You need to change your paradigm to actually insert multiple records with one insert, and forgo checking whether the individual records made it.

Consider this example:

CREATE TABLE temp_test (
    id SERIAL PRIMARY KEY,
    sometext TEXT,
    userid INT,
    somethingtomakeitfail INT UNIQUE
);
-- insert a batch of 3 rows
INSERT INTO temp_test (sometext, userid, somethingtomakeitfail) VALUES
('foo', 1, 1),
('bar', 2, 2),
('baz', 3, 3);
-- inspect the data of what we inserted
SELECT * FROM temp_test;
-- this entire statement will fail .. no need to check which one made it
INSERT INTO temp_test (sometext, userid, somethingtomakeitfail) VALUES
('foo', 2, 4),
('bar', 2, 5),
('baz', 3, 3);  -- <<--(deliberately simulate a failure)
-- check it ... everything is the same from the last successful insert ..
-- no need to check which records from the 2nd insert may have made it in
SELECT * FROM temp_test;

This is in fact the paradigm for any ACID compliant DB, not just PostgreSQL. In other words, you are better off if you fix your "batch" concept and avoid having to do any row-by-row checks in the first place.

Answered by francs

If you are concerned about performance, maybe you can use "PERFORM" in a function, just like this:

PERFORM 1 FROM skytf.test_2 WHERE id=i LIMIT 1;
IF FOUND THEN
    RAISE NOTICE ' found record id=%', i;
ELSE
    RAISE NOTICE ' not found record id=%', i;
END IF;
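
The fragment above only runs inside PL/pgSQL. A minimal sketch of a complete function around it follows; the function name row_exists_demo and its boolean return value are assumptions for illustration, while skytf.test_2 and the variable i come from the snippet above.

```sql
-- Hypothetical wrapper; only the PERFORM / FOUND part is from the answer.
CREATE OR REPLACE FUNCTION row_exists_demo(i integer)
RETURNS boolean
LANGUAGE plpgsql
AS $$
BEGIN
    -- PERFORM runs the query, discards the result, and sets FOUND.
    PERFORM 1 FROM skytf.test_2 WHERE id = i LIMIT 1;
    IF FOUND THEN
        RAISE NOTICE ' found record id=%', i;
        RETURN true;
    END IF;
    RAISE NOTICE ' not found record id=%', i;
    RETURN false;
END;
$$;

-- Example call with a made-up id.
SELECT row_exists_demo(12);
```

Returning the flag lets the caller decide whether to insert, instead of relying only on the notices.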

Answered by Fabian Barney

SELECT 1 FROM user_right where userid = ? LIMIT 1

If your result set contains a row, then you do not have to insert. Otherwise, insert your records.
