Notice: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/1408141/

Time: 2020-09-01 03:33:22  Source: igfitidea

SQL selecting rows where one column's value is common across another criteria column

sql relational

Asked by Maciek

I have a cross reference table that looks like this:

id  document_id  subject_id
1   8            21
2   5            17
3   5            76
4   7            88
5   9            17
6   9            76
7   2            76

It matches documents to subjects. A document can belong to more than one subject. I want to return rows from this table where a given document matches all the subjects in a given set. For example, given the set of subjects:

(17,76)

I want to return only rows for documents which match all the subjects in that set (at least) somewhere in the cross reference table. The desired output set given the above set would be:

id  document_id  subject_id
2   5            17
3   5            76
5   9            17
6   9            76

Notice that the last row of the table is not returned because that document only matches one of the required subjects.

What is the simplest and most efficient way to query for this in SQL?

Answered by Alex Papadimoulis

I assume that the natural key of this table is document_id + subject_id, and that id is a surrogate; in other words, each (document_id, subject_id) pair is unique. As such, I'm just going to pretend that id doesn't exist and that a unique constraint is on the natural key.

Let's start with the obvious.

SELECT document_id, subject_id
  FROM document_subjects
 WHERE subject_id IN (17,76)

That gets you everything you want plus stuff you don't want. So all we need to do is filter out the other stuff. The "other stuff" is groups of rows, one group per document_id, whose row count is not equal to the number of desired subjects.

SELECT document_id
  FROM document_subjects
 WHERE subject_id IN (17,76)
 GROUP BY document_id
HAVING COUNT(*) = 2
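
As a rough check of the grouping step, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database (SQLite and the Python variable names are assumptions made for illustration, not part of the original answer). It first shows the per-document counts, then the same query with the HAVING filter applied.

```python
import sqlite3

# Sample rows from the question: (id, document_id, subject_id).
ROWS = [(1, 8, 21), (2, 5, 17), (3, 5, 76), (4, 7, 88),
        (5, 9, 17), (6, 9, 76), (7, 2, 76)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE document_subjects "
    "(id INTEGER, document_id INTEGER, subject_id INTEGER)"
)
conn.executemany("INSERT INTO document_subjects VALUES (?, ?, ?)", ROWS)

# How many of the wanted subjects does each document match?
match_counts = dict(conn.execute("""
    SELECT document_id, COUNT(*)
      FROM document_subjects
     WHERE subject_id IN (17, 76)
     GROUP BY document_id
"""))

# Keep only the groups whose count equals the number of wanted subjects.
full_matches = [row[0] for row in conn.execute("""
    SELECT document_id
      FROM document_subjects
     WHERE subject_id IN (17, 76)
     GROUP BY document_id
    HAVING COUNT(*) = 2
     ORDER BY document_id
""")]

print(match_counts)
print(full_matches)
```

Document 2 matches only one of the two subjects, so its group is dropped by the HAVING clause.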

Note that subject_id is removed from the select list because it doesn't participate in grouping. Taking this one step further, I'm going to add an imaginary table called subjects_i_want that contains N rows, one for each subject you want.

SELECT document_id
  FROM document_subjects
 WHERE subject_id IN (SELECT subject_id FROM subjects_i_want)
 GROUP BY document_id
HAVING COUNT(*) = (SELECT COUNT(*) FROM subjects_i_want)

Obviously subjects_i_want could be swapped out for another subquery, temporary table, or whatever. But, once you have this list of document_id, you can use it within a subselect of a bigger query.

SELECT document_id, subject_id, ...
  FROM document_subjects
 WHERE document_id IN(
        SELECT document_id
          FROM document_subjects
          WHERE subject_id IN (SELECT subject_id FROM subjects_i_want)
          GROUP BY document_id
         HAVING COUNT(*) = (SELECT COUNT(*) FROM subjects_i_want))

Or whatever.
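
To see the pieces working together, here is a small sketch that runs the final query against the question's sample data in an in-memory SQLite database. The choice of SQLite, the temporary table used for subjects_i_want, and the UNIQUE constraint (which encodes the natural-key assumption stated above) are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE document_subjects (
        id          INTEGER PRIMARY KEY,
        document_id INTEGER NOT NULL,
        subject_id  INTEGER NOT NULL,
        UNIQUE (document_id, subject_id)  -- assumed natural key
    );
    INSERT INTO document_subjects VALUES
        (1, 8, 21), (2, 5, 17), (3, 5, 76), (4, 7, 88),
        (5, 9, 17), (6, 9, 76), (7, 2, 76);

    -- Stand-in for the "imaginary" table of wanted subjects.
    CREATE TEMP TABLE subjects_i_want (subject_id INTEGER PRIMARY KEY);
    INSERT INTO subjects_i_want VALUES (17), (76);
""")

result = conn.execute("""
    SELECT id, document_id, subject_id
      FROM document_subjects
     WHERE document_id IN (
            SELECT document_id
              FROM document_subjects
             WHERE subject_id IN (SELECT subject_id FROM subjects_i_want)
             GROUP BY document_id
            HAVING COUNT(*) = (SELECT COUNT(*) FROM subjects_i_want))
     ORDER BY id
""").fetchall()

for row in result:
    print(row)
```

Note that the outer query returns every row of a matching document, so a document that also had a third subject would contribute that row as well.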

Answered by Joseph K. Strauss

Using Oracle (or any database that supports the WITH clause), you can define the subject_id values exactly once.

-- t holds the wanted subject ids (it must expose subject_id, not document_id)
with t as (select distinct subject_id from table1 where subject_id in (17,76) )
select document_id from table1 where subject_id in (select subject_id from t)
group by document_id
having count(*) = (select count(*) from t);
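
Here is a minimal sketch of the WITH-clause idea, run on SQLite rather than Oracle and assuming that the common table expression t is meant to hold the distinct wanted subject_id values. The table name table1 follows the answer; the sample rows are from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, document_id INTEGER, subject_id INTEGER);
    INSERT INTO table1 VALUES
        (1, 8, 21), (2, 5, 17), (3, 5, 76), (4, 7, 88),
        (5, 9, 17), (6, 9, 76), (7, 2, 76);
""")

# The subject list (17, 76) appears exactly once, inside the CTE.
cte_matches = [row[0] for row in conn.execute("""
    WITH t AS (SELECT DISTINCT subject_id
                 FROM table1
                WHERE subject_id IN (17, 76))
    SELECT document_id
      FROM table1
     WHERE subject_id IN (SELECT subject_id FROM t)
     GROUP BY document_id
    HAVING COUNT(*) = (SELECT COUNT(*) FROM t)
     ORDER BY document_id
""")]

print(cte_matches)
```

One caveat of building t from table1 itself: a wanted subject that never appears in the table is silently left out of t, so the count being compared against shrinks with it.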

Answered by Mike Dinescu

That's a very interesting question.

I'm assuming you would like a more generalized query, but this is what I would do in the case where you always have the same number of subjects (say two):

 SELECT T.id, T.document_id, T.subject_id
   FROM table1 T
        INNER JOIN table1 T1 ON T.document_id = T1.document_id AND T1.subject_id = 17
        INNER JOIN table1 T2 ON T.document_id = T2.document_id AND T2.subject_id = 76

Of course, you could add yet another INNER JOIN for each additional subject ID. But I admit it's not a very good general solution.
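
If the number of subjects varies, the joins can be generated rather than written by hand. The sketch below uses a hypothetical helper, build_join_query, that emits one INNER JOIN per subject with a bound parameter; the helper, the table name table1, and the use of SQLite are assumptions for illustration. It also relies on each (document_id, subject_id) pair being unique, otherwise the joins would multiply rows.

```python
import sqlite3

def build_join_query(subject_ids):
    """Return SQL with one INNER JOIN per required subject (hypothetical helper)."""
    joins = "\n".join(
        f"       INNER JOIN table1 T{i} ON T.document_id = T{i}.document_id"
        f" AND T{i}.subject_id = ?"
        for i in range(1, len(subject_ids) + 1)
    )
    return (
        "SELECT T.id, T.document_id, T.subject_id\n"
        "  FROM table1 T\n"
        f"{joins}\n"
        " ORDER BY T.id"
    )

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, document_id INTEGER, subject_id INTEGER);
    INSERT INTO table1 VALUES
        (1, 8, 21), (2, 5, 17), (3, 5, 76), (4, 7, 88),
        (5, 9, 17), (6, 9, 76), (7, 2, 76);
""")

# Subject ids are passed as bound parameters, one per generated join.
two_subjects = conn.execute(build_join_query([17, 76]), [17, 76]).fetchall()
three_subjects = conn.execute(build_join_query([17, 76, 88]), [17, 76, 88]).fetchall()

print(two_subjects)
print(three_subjects)
```

No document in the sample data has all of 17, 76 and 88, so the three-subject query comes back empty.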

Answered by manji

select document_id from table1
 where subject_id in (17, 76)
 group by document_id
having count(distinct subject_id) = 2
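
The DISTINCT in this query matters when (document_id, subject_id) pairs are not guaranteed to be unique. A small sketch on SQLite, with one duplicate row (id 8) added to the question's sample data purely for illustration, compares the two counting styles:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, document_id INTEGER, subject_id INTEGER);
    INSERT INTO table1 VALUES
        (1, 8, 21), (2, 5, 17), (3, 5, 76), (4, 7, 88),
        (5, 9, 17), (6, 9, 76), (7, 2, 76),
        (8, 2, 76);  -- hypothetical duplicate of (document 2, subject 76)
""")

QUERY = """
    SELECT document_id
      FROM table1
     WHERE subject_id IN (17, 76)
     GROUP BY document_id
    HAVING {count_expr} = 2
     ORDER BY document_id
"""

# COUNT(*) counts the duplicate row twice, so document 2 slips through.
with_star = [r[0] for r in conn.execute(QUERY.format(count_expr="COUNT(*)"))]

# COUNT(DISTINCT subject_id) counts each subject once per document.
with_distinct = [
    r[0] for r in conn.execute(QUERY.format(count_expr="COUNT(DISTINCT subject_id)"))
]

print(with_star)
print(with_distinct)
```

With a unique constraint on the pair, both forms give the same result; without one, the DISTINCT form is the safer choice.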