Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/1415438/

Time: 2020-09-01 03:35:37  Source: igfitidea

How to find rows in one table that have no corresponding row in another table

Tags: sql, optimization, h2

Asked by Steve McLeod

I have a 1:1 relationship between two tables. I want to find all the rows in table A that don't have a corresponding row in table B. I use this query:

SELECT id 
  FROM tableA 
 WHERE id NOT IN (SELECT id 
                    FROM tableB) 
ORDER BY id desc

id is the primary key in both tables. Apart from the primary key indexes, I also have an index on tableA(id desc).

Using H2 (Java embedded database), this results in a full table scan of tableB. I want to avoid a full table scan.

How can I rewrite this query to run quickly? What index should I use?

Answered by SquareCog

select tableA.id from tableA left outer join tableB on (tableA.id = tableB.id)
where tableB.id is null
order by tableA.id desc 

If your database knows how to do index intersections, this will only touch the primary key index.
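
Whether the join really stays on the indexes depends on the engine, so it is worth asking the database for its plan. The following is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for H2; the sample ids are made up, and the plan text printed by SQLite will not match what H2 reports.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for H2; plans differ between engines.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER PRIMARY KEY);
    CREATE TABLE tableB (id INTEGER PRIMARY KEY);
    INSERT INTO tableA (id) VALUES (1), (2), (3), (4);
    INSERT INTO tableB (id) VALUES (2), (4);
""")

ANTI_JOIN = """
    SELECT tableA.id
      FROM tableA
      LEFT OUTER JOIN tableB ON tableA.id = tableB.id
     WHERE tableB.id IS NULL
     ORDER BY tableA.id DESC
"""

# Rows of tableA that have no matching row in tableB.
missing = [row[0] for row in conn.execute(ANTI_JOIN)]

# Ask the engine how it intends to run the query; the last column of each
# returned row is a human-readable description of one plan step.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + ANTI_JOIN)]

print(missing)
for step in plan:
    print(step)
```

H2 has its own EXPLAIN statement, so the same check can be repeated there to see whether tableB is scanned or looked up by key.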

Answered by Eric

You can also use exists, since sometimes it's faster than left join. You'd have to benchmark them to figure out which one you want to use.

select
    id
from
    tableA a
where
    not exists
    (select 1 from tableB b where b.id = a.id)
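
A rough way to run that benchmark is sketched below. It assumes SQLite via Python's sqlite3 module rather than H2, and the table sizes (20,000 rows, half of them matched) are hypothetical, so the timings only illustrate the method and say nothing about how H2 or SQL Server will behave.

```python
import sqlite3
import timeit

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER PRIMARY KEY);
    CREATE TABLE tableB (id INTEGER PRIMARY KEY);
""")
# Hypothetical data: 20,000 ids in tableA, every other id also in tableB.
conn.executemany("INSERT INTO tableA (id) VALUES (?)",
                 ((i,) for i in range(20000)))
conn.executemany("INSERT INTO tableB (id) VALUES (?)",
                 ((i,) for i in range(0, 20000, 2)))
conn.commit()

QUERIES = {
    "left join": """
        SELECT a.id FROM tableA a
          LEFT JOIN tableB b ON a.id = b.id
         WHERE b.id IS NULL
    """,
    "not exists": """
        SELECT a.id FROM tableA a
         WHERE NOT EXISTS (SELECT 1 FROM tableB b WHERE b.id = a.id)
    """,
}

def run(sql):
    """Execute a query and fetch every row so the full cost is measured."""
    return conn.execute(sql).fetchall()

# Both forms must return the same rows before their speed is compared.
results = {name: sorted(run(sql)) for name, sql in QUERIES.items()}

timings = {}
for name, sql in QUERIES.items():
    # Best of 3 repeats, each repeat executing the query 5 times.
    timings[name] = min(timeit.repeat(lambda: run(sql), number=5, repeat=3))

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s for 5 runs")
```

Taking the minimum of several repeats reduces noise from other activity on the machine; on a real system the test should use production-sized tables and the actual target engine.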

To show that exists can be more efficient than a left join, here are the execution plans of these queries in SQL Server 2008:

left join - total subtree cost: 1.09724

[Execution plan image: left join]

exists - total subtree cost: 1.07421

[Execution plan image: exists]

Answered by APC

You have to check every ID in tableA against every ID in tableB. A fully featured RDBMS (such as Oracle) would be able to optimize that into an INDEX FAST FULL SCAN and not touch the table at all. I don't know whether H2's optimizer is as smart as that.

H2 does support the MINUS syntax, so you should try this:

select id from tableA
minus
select id from tableB
order by id desc

That may perform faster; it is certainly worth benchmarking.
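
For readers without H2 at hand, the set-difference form can be tried in SQLite. This is only a sketch under assumptions: SQLite does not accept the MINUS keyword, so it uses EXCEPT, the standard SQL spelling of the same operation, and the sample ids are invented for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for H2; EXCEPT replaces MINUS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER PRIMARY KEY);
    CREATE TABLE tableB (id INTEGER PRIMARY KEY);
    INSERT INTO tableA (id) VALUES (1), (2), (3), (4), (5);
    INSERT INTO tableB (id) VALUES (2), (3), (9);
""")

# The ORDER BY applies to the result of the whole set operation,
# not just to the second SELECT.
rows = [row[0] for row in conn.execute("""
    SELECT id FROM tableA
    EXCEPT
    SELECT id FROM tableB
    ORDER BY id DESC
""")]

print(rows)  # [5, 4, 1]
```

Note that a set difference also removes duplicate rows from its result, which makes no difference here because id is a primary key.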

Answered by Leigh Riffel

For my small dataset, Oracle gives almost all of these queries the exact same plan that uses the primary key indexes without touching the table. The exception is the MINUS version which manages to do fewer consistent gets despite the higher plan cost.

--Create Sample Data.
drop table tableA;
drop table tableB;

create table tableA as (
   select rownum-1 ID, chr(rownum-1+70) bb, chr(rownum-1+100) cc 
      from dual connect by rownum<=4
);

create table tableB as (
   select rownum ID, chr(rownum+70) data1, chr(rownum+100) cc from dual
   UNION ALL
   select rownum+2 ID, chr(rownum+70) data1, chr(rownum+100) cc 
      from dual connect by rownum<=3
);

alter table tableA Add Primary Key (ID);
alter table tableB Add Primary Key (ID);

--View Tables.
select * from tableA;
select * from tableB;

--Find all rows in tableA that don't have a corresponding row in tableB.

--Method 1.
SELECT id FROM tableA WHERE id NOT IN (SELECT id FROM tableB) ORDER BY id DESC;

--Method 2.
SELECT tableA.id FROM tableA LEFT JOIN tableB ON (tableA.id = tableB.id)
WHERE tableB.id IS NULL ORDER BY tableA.id DESC;

--Method 3.
SELECT id FROM tableA a WHERE NOT EXISTS (SELECT 1 FROM tableB b WHERE b.id = a.id) 
   ORDER BY id DESC;

--Method 4.
SELECT id FROM tableA
MINUS
SELECT id FROM tableB ORDER BY id DESC;
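
The script above is Oracle-specific (rownum, dual, connect by). As a hedged cross-check, the sketch below loads the same id values into SQLite through Python's sqlite3 module and runs the four methods. It assumes only the id columns matter, omits the other columns, and replaces MINUS with EXCEPT because SQLite does not accept MINUS.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Oracle; only ids are reproduced.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER PRIMARY KEY);
    CREATE TABLE tableB (id INTEGER PRIMARY KEY);
    -- Same ids the Oracle script generates: 0..3 and 1, 3, 4, 5.
    INSERT INTO tableA (id) VALUES (0), (1), (2), (3);
    INSERT INTO tableB (id) VALUES (1), (3), (4), (5);
""")

METHODS = {
    "1: NOT IN": """
        SELECT id FROM tableA
         WHERE id NOT IN (SELECT id FROM tableB)
         ORDER BY id DESC
    """,
    "2: LEFT JOIN": """
        SELECT tableA.id FROM tableA
          LEFT JOIN tableB ON tableA.id = tableB.id
         WHERE tableB.id IS NULL
         ORDER BY tableA.id DESC
    """,
    "3: NOT EXISTS": """
        SELECT id FROM tableA a
         WHERE NOT EXISTS (SELECT 1 FROM tableB b WHERE b.id = a.id)
         ORDER BY id DESC
    """,
    "4: EXCEPT": """
        SELECT id FROM tableA
        EXCEPT
        SELECT id FROM tableB
        ORDER BY id DESC
    """,
}

# Run every method and collect the ids it returns, in order.
results = {name: [row[0] for row in conn.execute(sql)]
           for name, sql in METHODS.items()}

for name, ids in results.items():
    print(name, ids)  # each method yields [2, 0]
```

Agreement on a four-row table only confirms that the queries are equivalent for this data; it says nothing about which one is fastest on a given engine.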

Answered by Aaron Alton

I can't tell you which of these methods will be best on H2 (or even if all of them will work), but I did write an article detailing all of the (good) methods available in TSQL. You can give them a shot and see if any of them works for you:

http://code.msdn.microsoft.com/SQLExamples/Wiki/View.aspx?title=QueryBasedUponAbsenceOfData&referringTitle=Home

Answered by Faysal Maqsood

select parentTable.id from parentTable
left outer join childTable on (parentTable.id = childTable.parentTableID) 
where childTable.id is null