Source: http://stackoverflow.com/questions/22975556/
License: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not to this page).
SQL JOIN Query to return rows where we did NOT find a match in joined table
Asked by twistedpixel
More of a theory/logic question, but what I have is two tables: links and options. Links is a table where I add rows that represent a link between a product ID (in a separate products table) and an option. The options table holds all available options.
What I'm trying to do (but struggling to create the logic for) is to join the two tables, returning only the rows where there is no option link in the links table, therefore representing which options are still available to add to the product.
Is there a feature of SQL that might help me here? I'm not tremendously experienced with SQL yet.
Answered by spencer7593
Your table design sounds fine.
If this query returns the id values of the "options" linked to a particular "product"...
SELECT k.option_id
FROM links k
WHERE k.product_id = 'foo'
Then this query would get the details of all the options related to the "product"
SELECT o.id
, o.name
FROM options o
JOIN links k
ON k.option_id = o.id
WHERE k.product_id = 'foo'
Note that we can actually move the "product_id='foo'" predicate from the WHERE clause to the ON clause of the JOIN, for an equivalent result, e.g.
SELECT o.id
, o.name
FROM options o
JOIN links k
ON k.option_id = o.id
AND k.product_id = 'foo'
(Not that it makes any difference here, but it would make a difference if we were using an OUTER JOIN. In the WHERE clause, the predicate would negate the "outer-ness" of the join and make it equivalent to an INNER JOIN.)
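To make the ON-versus-WHERE difference concrete, here is a minimal sketch. It assumes SQLite (via Python's sqlite3 module) as a stand-in for MySQL, and the sample rows and column types are hypothetical, not taken from the question.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE options (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE links (product_id TEXT NOT NULL, option_id INTEGER NOT NULL);
    INSERT INTO options (id, name) VALUES (1, 'red'), (2, 'green'), (3, 'blue');
    INSERT INTO links (product_id, option_id) VALUES ('foo', 1), ('bar', 2);
""")

# Predicate in the ON clause: every row from options survives the outer join.
on_rows = conn.execute("""
    SELECT o.id, k.product_id
      FROM options o
      LEFT JOIN links k
        ON k.option_id = o.id
       AND k.product_id = 'foo'
     ORDER BY o.id
""").fetchall()

# Same predicate in the WHERE clause: rows without a 'foo' link fail the
# comparison and are discarded, just as an inner join would discard them.
where_rows = conn.execute("""
    SELECT o.id, k.product_id
      FROM options o
      LEFT JOIN links k
        ON k.option_id = o.id
     WHERE k.product_id = 'foo'
     ORDER BY o.id
""").fetchall()

print(on_rows)     # [(1, 'foo'), (2, None), (3, None)]
print(where_rows)  # [(1, 'foo')]
```

With the predicate in the ON clause, options 2 and 3 are kept with a NULL product_id; with it in the WHERE clause, they disappear.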
But none of that answers your question; it only sets the stage for answering it:
How do we get the rows from "options" that are NOT linked to a particular product?
The most efficient approach is (usually) an anti-join pattern.
That is, we get all the rows from "options", along with any matching rows from "links" (for a particular product_id, in your case). That result set will include the rows from "options" that don't have a matching row in "links".
The "trick" is to filter out all the rows that had matching row(s) found in "links". That leaves us with only the rows that didn't have a match.
To filter those rows, we use a predicate in the WHERE clause that checks whether a match was found. We do that by checking a column that we know for certain will be NOT NULL if a matching row was found, and that we know for certain will be NULL if a matching row was NOT found.
Something like this:
SELECT o.id
, o.name
FROM options o
LEFT
JOIN links k
ON k.option_id = o.id
AND k.product_id = 'foo'
WHERE k.option_id IS NULL
The "LEFT" keyword specifies an "outer" join operation: we get all the rows from "options" (the table on the "left" side of the JOIN) even if a matching row is not found. (A normal inner join would filter out rows that didn't have a match.)
The "trick" is in the WHERE clause... if we found a matching row from links, we know that the "option_id" column returned from "links" would not be NULL. It can't be NULL if it "equals" something, and we know it had to "equal" something because of the predicate in the ON clause.
So, we know that the rows from options that didn't have a match will have a NULL value for that column.
It takes a bit to get your brain wrapped around it, but the anti-join quickly becomes a familiar pattern.
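The two stages of the anti-join can be seen side by side in a small sketch. This assumes SQLite through Python's sqlite3 module rather than MySQL, and the table contents are made-up sample data.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE options (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE links (product_id TEXT NOT NULL, option_id INTEGER NOT NULL);
    INSERT INTO options (id, name)
    VALUES (1, 'red'), (2, 'green'), (3, 'blue'), (4, 'large');
    INSERT INTO links (product_id, option_id)
    VALUES ('foo', 1), ('foo', 3), ('bar', 2);
""")

# Stage 1: the outer join alone. Unmatched options carry NULL in k.option_id.
joined = conn.execute("""
    SELECT o.id, o.name, k.option_id
      FROM options o
      LEFT JOIN links k
        ON k.option_id = o.id
       AND k.product_id = 'foo'
     ORDER BY o.id
""").fetchall()

# Stage 2: the anti-join. Keep only the rows where no match was found.
available = conn.execute("""
    SELECT o.id, o.name
      FROM options o
      LEFT JOIN links k
        ON k.option_id = o.id
       AND k.product_id = 'foo'
     WHERE k.option_id IS NULL
     ORDER BY o.id
""").fetchall()

print(joined)     # [(1, 'red', 1), (2, 'green', None), (3, 'blue', 3), (4, 'large', None)]
print(available)  # [(2, 'green'), (4, 'large')]
```

Option 2 is linked only to product 'bar', so it still counts as available for 'foo'.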
The "anti-join" pattern isn't the only way to get the result set. There are a couple of other approaches.
One option is to use a query with a "NOT EXISTS" predicate with a correlated subquery. This is somewhat easier to understand, but doesn't usually perform as well:
SELECT o.id
, o.name
FROM options o
WHERE NOT EXISTS ( SELECT 1
FROM links k
WHERE k.option_id = o.id
AND k.product_id = 'foo'
)
That says: get me all rows from the options table, but for each row, run a query and see if a matching row "exists" in the links table. (It doesn't matter what is returned in the select list; we're only testing whether it returns at least one row. I use a "1" in the select list to remind me I'm looking for "1 row".)
This usually doesn't perform as well as the anti-join, but sometimes it does run faster, especially if other predicates in the WHERE clause of the outer query filter out nearly every row, and the subquery only has to run for a couple of rows. (That is, when we only have to check a few needles in a haystack. When we need to process the whole stack of hay, the anti-join pattern is usually faster.)
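As a quick sanity check that the two forms agree, the sketch below runs the anti-join and the NOT EXISTS query against the same rows. It assumes SQLite via Python's sqlite3 module as a stand-in for MySQL, with hypothetical sample data; it says nothing about relative performance, which depends on the server and the data.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE options (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE links (product_id TEXT NOT NULL, option_id INTEGER NOT NULL);
    INSERT INTO options (id, name)
    VALUES (1, 'red'), (2, 'green'), (3, 'blue'), (4, 'large');
    INSERT INTO links (product_id, option_id)
    VALUES ('foo', 1), ('foo', 3), ('bar', 2);
""")

# Anti-join form.
anti_join = conn.execute("""
    SELECT o.id, o.name
      FROM options o
      LEFT JOIN links k
        ON k.option_id = o.id
       AND k.product_id = 'foo'
     WHERE k.option_id IS NULL
     ORDER BY o.id
""").fetchall()

# NOT EXISTS form with a correlated subquery.
not_exists = conn.execute("""
    SELECT o.id, o.name
      FROM options o
     WHERE NOT EXISTS ( SELECT 1
                          FROM links k
                         WHERE k.option_id = o.id
                           AND k.product_id = 'foo'
                      )
     ORDER BY o.id
""").fetchall()

print(not_exists)               # [(2, 'green'), (4, 'large')]
print(anti_join == not_exists)  # True
```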
And the beginner query you're most likely to see is a NOT IN (subquery). I'm not even going to give an example of that. If you've got a list of literals, then by all means, use a NOT IN. But with a subquery, it's rarely the best performer, though it does seem to be the easiest to understand.
Oh, what the hay, I'll give a demo of that as well (not that I'm encouraging you to do it this way):
SELECT o.id
, o.name
FROM options o
WHERE o.id NOT IN ( SELECT k.option_id
FROM links k
WHERE k.product_id = 'foo'
AND k.option_id IS NOT NULL
GROUP BY k.option_id
)
That subquery (inside the parens) gets a list of all the option_id values associated with a product.
Now, for each row in options (in the outer query), we can check the id value to see if it's in that list returned by the subquery.
If we have a guarantee that option_id will never be NULL, we can omit the predicate that tests for "option_id IS NOT NULL". (In the more general case, when a NULL creeps into the result set, the outer query can't tell whether o.id is in the list or not, and the query doesn't return any rows; so I usually include that predicate, even when it's not required.) The GROUP BY isn't strictly necessary either, especially if there's a unique constraint (guaranteed uniqueness) on the (product_id, option_id) tuple.
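The NULL pitfall is easy to reproduce. The sketch below assumes SQLite via Python's sqlite3 module as a stand-in for MySQL, and a hypothetical links table whose option_id column is nullable, with one NULL row inserted on purpose.

```python
import sqlite3

# Hypothetical sample data; option_id is deliberately nullable here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE options (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE links (product_id TEXT NOT NULL, option_id INTEGER);
    INSERT INTO options (id, name) VALUES (1, 'red'), (2, 'green'), (3, 'blue');
    INSERT INTO links (product_id, option_id)
    VALUES ('foo', 1), ('foo', NULL), ('bar', 2);
""")

# Without the IS NOT NULL guard, the subquery returns (1, NULL). For ids 2
# and 3, "id NOT IN (1, NULL)" evaluates to NULL, so no row qualifies.
unguarded = conn.execute("""
    SELECT o.id, o.name
      FROM options o
     WHERE o.id NOT IN ( SELECT k.option_id
                           FROM links k
                          WHERE k.product_id = 'foo'
                       )
     ORDER BY o.id
""").fetchall()

# With the guard, the NULL is excluded from the list and the result is sane.
guarded = conn.execute("""
    SELECT o.id, o.name
      FROM options o
     WHERE o.id NOT IN ( SELECT k.option_id
                           FROM links k
                          WHERE k.product_id = 'foo'
                            AND k.option_id IS NOT NULL
                       )
     ORDER BY o.id
""").fetchall()

print(unguarded)  # []
print(guarded)    # [(2, 'green'), (3, 'blue')]
```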
But, again, don't use that NOT IN (subquery), except for testing, unless there's some compelling reason to (for example, it manages to perform better than the anti-join).
You're unlikely to notice any performance differences with small sets: the overhead of transmitting the statement, parsing it, generating an access plan, and returning results dwarfs the actual "execution" time of the plan. It's with larger sets that the differences in "execution" time become apparent.
EXPLAIN SELECT ... is a really good way to get a handle on the execution plans, to see what MySQL is really doing with your statement.
Appropriate indexes, especially covering indexes, can noticeably improve performance of some statements.
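The sketch below shows the general workflow of adding an index and then asking the database for its plan. It is only an analogy: it assumes SQLite via Python's sqlite3 module, where the statement is EXPLAIN QUERY PLAN rather than MySQL's EXPLAIN SELECT ..., and the plan text differs between engines and versions. The index name links_product_option and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE options (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE links (product_id TEXT NOT NULL, option_id INTEGER NOT NULL);
    -- Composite index holding both columns the anti-join reads from links,
    -- so the lookup can be answered from the index alone.
    CREATE UNIQUE INDEX links_product_option ON links (product_id, option_id);
    INSERT INTO options (id, name)
    VALUES (1, 'red'), (2, 'green'), (3, 'blue'), (4, 'large');
    INSERT INTO links (product_id, option_id)
    VALUES ('foo', 1), ('foo', 3), ('bar', 2);
""")

query = """
    SELECT o.id, o.name
      FROM options o
      LEFT JOIN links k
        ON k.option_id = o.id
       AND k.product_id = 'foo'
     WHERE k.option_id IS NULL
     ORDER BY o.id
"""

# SQLite's counterpart to MySQL's EXPLAIN; the last column of each row is a
# human-readable description of one step in the plan.
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
for step in plan:
    print(step[-1])

available = conn.execute(query).fetchall()
print(available)  # [(2, 'green'), (4, 'large')]
```

On MySQL, the equivalent check would be to prefix the same SELECT with EXPLAIN and read the key and Extra columns of the output.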
Answered by RobP
Yes, you can do a LEFT JOIN (if MySQL; there are variations in other dialects), which will include rows in links that do NOT have a match in options. Then test whether options.someColumn IS NULL, and you will have exactly the rows in links that had no "matching" row in options.
Answered by Achilles
Try something along these lines.
To count:
SELECT Links.linkId, COUNT(*)
FROM Links
LEFT JOIN Options ON Links.optionId = Options.optionId
WHERE Options.optionId IS NULL
GROUP BY Links.linkId
To see the rows:
SELECT Links.linkId
FROM Links
LEFT JOIN Options ON Links.optionId = Options.optionId
WHERE Options.optionId IS NULL