Original source: http://stackoverflow.com/questions/8737079/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Efficient way to select all values from one column not in another column
Asked by Flash
I need to return all values from colA that are not in colB from mytable. I am using:
SELECT DISTINCT(colA) FROM mytable WHERE colA NOT IN (SELECT colB FROM mytable)
It works, but the query takes an excessively long time to complete.
Is there a more efficient way to do this?
Answered by Erwin Brandstetter
In standard SQL there are no parentheses in DISTINCT colA. DISTINCT is not a function.
SELECT DISTINCT colA
FROM mytable
WHERE colA NOT IN (SELECT DISTINCT colB FROM mytable);
Added DISTINCT to the sub-select as well. If you have many duplicates it could speed up the query.
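To check that the parentheses around colA change nothing, here is a minimal sketch that runs both forms of the NOT IN query against a small in-memory SQLite database. The sample rows and the use of SQLite are assumptions made for illustration; the original question does not name a DBMS.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not from the question).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (colA INTEGER, colB INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 2), (2, 3), (3, 3), (4, 1), (4, 5), (7, 1), (7, 2)],
)

# The asker's form: DISTINCT followed by a parenthesized expression.
with_parens = conn.execute(
    "SELECT DISTINCT(colA) FROM mytable "
    "WHERE colA NOT IN (SELECT colB FROM mytable) "
    "ORDER BY colA"
).fetchall()

# The answer's form: no parentheses, DISTINCT also in the sub-select.
without_parens = conn.execute(
    "SELECT DISTINCT colA FROM mytable "
    "WHERE colA NOT IN (SELECT DISTINCT colB FROM mytable) "
    "ORDER BY colA"
).fetchall()

print(with_parens, without_parens)
```

Both queries return the same rows here; whether the extra DISTINCT in the sub-select helps performance depends on the DBMS and the data.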
A CTE might be faster, depending on your DBMS. I additionally demonstrate LEFT JOIN as an alternative to exclude the values in colB, and an alternative way to get distinct values with GROUP BY:
WITH x AS (SELECT colB FROM mytable GROUP BY colB)
SELECT m.colA
FROM mytable m
LEFT JOIN x ON x.colB = m.colA
WHERE x.colB IS NULL
GROUP BY m.colA;
Or, simplified further, and with a plain subquery (probably fastest):
SELECT DISTINCT m.colA
FROM mytable m
LEFT JOIN mytable x ON x.colB = m.colA
WHERE x.colB IS NULL;
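As a quick sanity check, the CTE variant and the plain LEFT JOIN variant can be run side by side. This is a sketch under assumptions: SQLite as the engine, hypothetical sample rows, and an ORDER BY added only to make the output deterministic.

```python
import sqlite3

# Hypothetical sample data; the question does not show the real table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (colA INTEGER, colB INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 2), (2, 3), (3, 3), (4, 1), (4, 5), (7, 1), (7, 2)],
)

# CTE de-duplicates colB with GROUP BY, then an anti-join via LEFT JOIN / IS NULL.
cte_query = """
WITH x AS (SELECT colB FROM mytable GROUP BY colB)
SELECT m.colA
FROM mytable m
LEFT JOIN x ON x.colB = m.colA
WHERE x.colB IS NULL
GROUP BY m.colA
ORDER BY m.colA
"""

# Simplified form: join the table to itself and de-duplicate with DISTINCT.
join_query = """
SELECT DISTINCT m.colA
FROM mytable m
LEFT JOIN mytable x ON x.colB = m.colA
WHERE x.colB IS NULL
ORDER BY m.colA
"""

cte_rows = conn.execute(cte_query).fetchall()
join_rows = conn.execute(join_query).fetchall()
print(cte_rows, join_rows)
```

The two forms agree on this data; which one is faster is something to measure on the actual DBMS.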
There are basically 4 techniques to exclude rows with keys present in another (or the same) table.
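The page does not list the four techniques at this point. A minimal sketch, assuming they are NOT IN, NOT EXISTS, LEFT JOIN / IS NULL, and EXCEPT (the last one does not appear elsewhere on this page), run against hypothetical sample rows in SQLite:

```python
import sqlite3

# Hypothetical, NULL-free sample data. NOT IN behaves differently
# from the other forms once the subquery can return NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (colA INTEGER, colB INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 2), (2, 3), (3, 3), (4, 1), (4, 5), (7, 1), (7, 2)],
)

# Assumed set of four techniques; names are labels for this sketch only.
techniques = {
    "NOT IN": (
        "SELECT DISTINCT colA FROM mytable "
        "WHERE colA NOT IN (SELECT colB FROM mytable)"
    ),
    "NOT EXISTS": (
        "SELECT DISTINCT m1.colA FROM mytable m1 "
        "WHERE NOT EXISTS (SELECT 1 FROM mytable m2 WHERE m2.colB = m1.colA)"
    ),
    "LEFT JOIN / IS NULL": (
        "SELECT DISTINCT m.colA FROM mytable m "
        "LEFT JOIN mytable x ON x.colB = m.colA "
        "WHERE x.colB IS NULL"
    ),
    "EXCEPT": "SELECT colA FROM mytable EXCEPT SELECT colB FROM mytable",
}

# Collect each technique's result as a sorted list of values.
results = {
    name: sorted(row[0] for row in conn.execute(sql))
    for name, sql in techniques.items()
}
for name, values in results.items():
    print(name, values)
```

All four return the same values on this data, so the choice between them comes down to NULL handling and how the particular DBMS plans each form.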
The deciding factor for speed will be indexes. You need to have indexes on colA and colB for this query to be fast.
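A sketch of what adding those indexes could look like, assuming SQLite and hypothetical index names (idx_mytable_cola, idx_mytable_colb). EXPLAIN QUERY PLAN is SQLite-specific; other systems have their own EXPLAIN syntax, and the plan text varies between versions, so it is only printed here.

```python
import sqlite3

# Hypothetical sample data and index names, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (colA INTEGER, colB INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 2), (2, 3), (3, 3), (4, 1), (4, 5), (7, 1), (7, 2)],
)

# One index per column involved in the anti-join.
conn.execute("CREATE INDEX idx_mytable_cola ON mytable (colA)")
conn.execute("CREATE INDEX idx_mytable_colb ON mytable (colB)")

query = """
SELECT DISTINCT m.colA
FROM mytable m
LEFT JOIN mytable x ON x.colB = m.colA
WHERE x.colB IS NULL
ORDER BY m.colA
"""

# Ask SQLite how it intends to execute the query; the last column
# of each plan row is the human-readable description.
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
for row in plan:
    print(row[-1])

rows = [r[0] for r in conn.execute(query)]
index_names = {
    r[0]
    for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
}
print(rows)
```

Reading the printed plan before and after creating the indexes is a simple way to see whether the planner actually uses them.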
Answered by Eric
You can use exists:
select distinct
colA
from
mytable m1
where
not exists (select 1 from mytable m2 where m2.colB = m1.colA)
exists does a semi-join to quickly match the values. not in completes the entire result set and then does an or on it. exists is typically faster for values in tables.
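Because not in compares the value against every element of the subquery result, a single NULL in colB makes the comparison unknown for every non-matching row, while not exists is unaffected. A minimal sketch of that difference, assuming SQLite and hypothetical sample rows (the answer itself does not discuss NULLs):

```python
import sqlite3

# Hypothetical sample data with one NULL in colB.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (colA INTEGER, colB INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 2), (2, 3), (4, 1), (7, None)],
)

# NOT EXISTS only asks whether a matching row exists; NULLs never match.
not_exists_rows = conn.execute(
    "select distinct colA from mytable m1 "
    "where not exists (select 1 from mytable m2 where m2.colB = m1.colA) "
    "order by colA"
).fetchall()

# NOT IN compares against every subquery value, including the NULL,
# so no row can evaluate to true.
not_in_rows = conn.execute(
    "select distinct colA from mytable "
    "where colA not in (select colB from mytable) "
    "order by colA"
).fetchall()

print(not_exists_rows, not_in_rows)
```

If colB can contain NULLs, the two forms are therefore not interchangeable, independent of any speed difference.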