SQL: How to select a single id ("row") that meets multiple criteria from a single column

Question

Asked by W. Young

I have a very narrow table: user_id, ancestry.

The user_id column is self explanatory.

The ancestry column contains the country from where the user's ancestors hail.

A user can have multiple rows on the table, as a user can have ancestors from multiple countries.

My question is this: how do I select users whose ancestors hail from multiple, specified countries?

For instance, show me all users who have ancestors from England, France and Germany, and return 1 row per user that met that criteria.

What is that SQL?

 user_id     ancestry
---------   ----------
    1        England
    1        Ireland
    2        France
    3        Germany
    3        Poland
    4        England
    4        France
    4        Germany
    5        France
    5        Germany

In the case of the data above, I would expect the result to be "4" as user_id 4 has ancestors from England, France and Germany.

Thanks in advance.

P.S. To clarify: Yes, the user_id / ancestry columns make a unique pair, so a country would not be repeated for a given user.

P.P.S. I am looking for users who hail from all 3 countries - England, France, AND Germany (and the countries are arbitrary).

P.P.P.S. I am not looking for answers specific to a certain RDBMS. I'm looking to answer this problem "in general."

I'm content with regenerating the WHERE clause for each query, provided that generating it can be done programmatically (e.g. I can write a function that builds the WHERE or FROM / WHERE clause).

Answer 1

Answered by Hong Ning

Try this:

Select user_id
from yourtable
where ancestry in ('England', 'France', 'Germany')
group by user_id
having count(user_id) = 3

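The last line ensures that the user has all 3 countries. This relies on (user_id, ancestry) being a unique pair, as stated in the question; otherwise a repeated row would be counted more than once.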
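Since the question asks whether the clause can be generated programmatically, here is a minimal sketch of one way to do it. It assumes Python with the standard sqlite3 module and an in-memory copy of the sample data; the helper name users_with_all is hypothetical, and the table name yourtable is the placeholder from the query above. It counts DISTINCT ancestry rather than user_id, which is a small variation that also tolerates duplicate rows.

```python
import sqlite3

def users_with_all(conn, countries):
    """Return the user_ids that have a row for every country in `countries`."""
    wanted = sorted(set(countries))  # de-duplicate the requested countries
    if not wanted:
        raise ValueError("at least one country is required")
    placeholders = ", ".join("?" for _ in wanted)
    sql = (
        "SELECT user_id FROM yourtable "
        f"WHERE ancestry IN ({placeholders}) "
        "GROUP BY user_id "
        "HAVING COUNT(DISTINCT ancestry) = ? "
        "ORDER BY user_id"
    )
    # The country names and the required count are bound as parameters.
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

# Sample data from the question, loaded into an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable (user_id INTEGER, ancestry TEXT, "
    "UNIQUE (user_id, ancestry))"
)
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [(1, "England"), (1, "Ireland"), (2, "France"), (3, "Germany"),
     (3, "Poland"), (4, "England"), (4, "France"), (4, "Germany"),
     (5, "France"), (5, "Germany")],
)

print(users_with_all(conn, ["England", "France", "Germany"]))  # [4]
print(users_with_all(conn, ["France", "Germany"]))             # [4, 5]
```

Only the number of placeholders is interpolated into the SQL text; the values themselves are passed as bound parameters, so the same function works for any list of countries.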

Answer 2

Answered by HuckIt

-- [user] is bracketed throughout because USER is a reserved word in T-SQL
SELECT DISTINCT user_id
FROM [user]
WHERE user_id IN (SELECT user_id FROM [user] WHERE ancestry = 'England')
    AND user_id IN (SELECT user_id FROM [user] WHERE ancestry = 'France')
    AND user_id IN (SELECT user_id FROM [user] WHERE ancestry = 'Germany');

Answer 3

Answered by Chains

Users who have at least one of the 3 countries:

SELECT DISTINCT user_id
FROM yourtable
WHERE ancestry IN ('England', 'France', 'Germany')

Users who have all 3 countries:

SELECT DISTINCT A.user_id
FROM yourtable A
   INNER JOIN yourtable B ON A.user_id = B.user_id
   INNER JOIN yourtable C ON A.user_id = C.user_id
WHERE A.ancestry = 'England'
   AND B.ancestry = 'Germany'
   AND C.ancestry = 'France'

Answer 4

Answered by Konerak

First way: JOIN

Get people with multiple countries:

SELECT u1.user_id 
FROM users u1
JOIN users u2
on u1.user_id  = u2.user_id 
AND u1.ancestry <> u2.ancestry

Get people from 2 specific countries:

SELECT u1.user_id 
FROM users u1
JOIN users u2
on u1.user_id  = u2.user_id 
WHERE u1.ancestry = 'Germany'
AND u2.ancestry = 'France'

For 3 countries, join three times. To get each result only once, use DISTINCT.
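The "join once per country" idea generalises to any number of countries if the query text is generated. The following is a rough sketch, assuming Python with the standard sqlite3 module and a users table holding the question's sample data; the helper names self_join_sql and users_from_all are hypothetical.

```python
import sqlite3

def self_join_sql(n):
    """Build a query that self-joins `users` so there is one alias per country."""
    if n < 1:
        raise ValueError("at least one country is required")
    joins = "".join(
        f" JOIN users u{i} ON u{i}.user_id = u1.user_id"
        for i in range(2, n + 1)
    )
    conditions = " AND ".join(f"u{i}.ancestry = ?" for i in range(1, n + 1))
    return (
        f"SELECT DISTINCT u1.user_id FROM users u1{joins} "
        f"WHERE {conditions} ORDER BY u1.user_id"
    )

def users_from_all(conn, countries):
    """Run the generated self-join query for the given list of countries."""
    sql = self_join_sql(len(countries))
    return [row[0] for row in conn.execute(sql, list(countries))]

# Sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, ancestry TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "England"), (1, "Ireland"), (2, "France"), (3, "Germany"),
     (3, "Poland"), (4, "England"), (4, "France"), (4, "Germany"),
     (5, "France"), (5, "Germany")],
)

print(users_from_all(conn, ["Germany", "France"]))             # [4, 5]
print(users_from_all(conn, ["England", "France", "Germany"]))  # [4]
```

Each extra country adds one more join, so this approach is best suited to short lists of countries.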

Second way: GROUP BY

This gets the users that have 3 matching rows (HAVING ... count), where the WHERE clause specifies which rows are permitted. Note that with a plain count(*), if you don't have a UNIQUE KEY on (user_id, ancestry), a user whose (user_id, 'England') row appears 3 times would also match, so it depends on your table structure and/or data. Counting DISTINCT ancestry, as in the query below, avoids that problem.

SELECT user_id 
FROM users u1
WHERE ancestry = 'Germany'
OR ancestry = 'France'
OR ancestry = 'England'
GROUP BY user_id
HAVING count(DISTINCT ancestry) = 3
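To make the duplicate-row caveat concrete, here is a small sketch, assuming Python with the standard sqlite3 module and a users table without a unique key. User 6, with three identical 'England' rows, is a made-up example and not part of the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Deliberately no UNIQUE constraint, so duplicate rows are allowed.
conn.execute("CREATE TABLE users (user_id INTEGER, ancestry TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(4, "England"), (4, "France"), (4, "Germany"),
     (6, "England"), (6, "England"), (6, "England")],  # hypothetical duplicates
)

QUERY = """
SELECT user_id
FROM users
WHERE ancestry IN ('England', 'France', 'Germany')
GROUP BY user_id
HAVING {count_expr} = 3
ORDER BY user_id
"""

def run(count_expr):
    """Run the GROUP BY query with the given counting expression."""
    return [row[0] for row in conn.execute(QUERY.format(count_expr=count_expr))]

with_star = run("count(*)")                  # counts rows, duplicates included
with_distinct = run("count(DISTINCT ancestry)")  # counts distinct countries

print(with_star)      # [4, 6]
print(with_distinct)  # [4]
```

With count(*), user 6 is reported even though only one country is present; count(DISTINCT ancestry) returns only user 4.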

Answer 5

Answered by Martin K.

This question is some years old, but I came to it via a duplicate. I want to suggest a more general solution too. If you know you always have a fixed number of ancestors, you can use self joins as already suggested in the other answers. If you want a generic approach, read on.

What you need here is called the quotient (division) in relational algebra. The quotient is more or less the reverse of the Cartesian product (a cross join in SQL).

Let's say your ancestor set A is (I use table notation here, which I think is easier to understand)

ancestry
-----------
'England'
'France'
'Germany'

and your user set U is

user_id
--------
   1
   2
   3

The Cartesian product C = A x U is then:

user_id  |  ancestry
---------+-----------
   1     | 'England'
   1     | 'France'
   1     | 'Germany'
   2     | 'England'
   2     | 'France'
   2     | 'Germany'
   3     | 'England'
   3     | 'France'
   3     | 'Germany'

If you calculate the set quotient U = C / A, you get

user_id
--------
   1
   2
   3

If you redo the Cartesian product U x A, you get C again. But note that for a set T, (T/A) x A will not necessarily reproduce T. For example, if T is

user_id  |  ancestry
---------+-----------
   1     | 'England'
   1     | 'France'
   1     | 'Germany'
   2     | 'England'
   2     | 'France'

then (T/A) is

user_id
--------
   1

(T/A) x A will then be

user_id  |  ancestry
---------+------------
   1     | 'England'
   1     | 'France'
   1     | 'Germany'

Note that the records for user_id = 2 have been eliminated by the quotient and Cartesian product operations.
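The same worked example can be checked with plain sets. This is only an illustrative sketch in Python, not part of the SQL solution; the function name quotient is an assumption, and T and A are the sets from the tables above.

```python
def quotient(T, A):
    """Relational division T / A, where T is a set of (user_id, ancestry) pairs."""
    users = {user_id for user_id, _ in T}
    # Keep a user only if T pairs that user with every element of A.
    return {u for u in users if all((u, a) in T for a in A)}

A = {"England", "France", "Germany"}
T = {
    (1, "England"), (1, "France"), (1, "Germany"),
    (2, "England"), (2, "France"),
}

U = quotient(T, A)
product = {(u, a) for u in U for a in A}  # (T/A) x A

print(U)             # {1}
print(product == T)  # False: the rows for user_id 2 are gone
print(product < T)   # True: (T/A) x A is a proper subset of T
```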

Your question is: which user_ids have ancestors from all countries in your ancestor set? In other words, you want U = T / A, where T is your original set (your table).

To implement the quotient in SQL you have to do 4 steps:

1) Create the Cartesian product of your ancestry set and the set of all user_ids.
2) Find all records in the Cartesian product which have no partner in the original set (left join).
3) Extract the user_ids from the result set of 2).
4) Return all user_ids from the original set which are not included in the result set of 3).

So let's do it step by step. I will use T-SQL syntax (Microsoft SQL Server), but it should be easily adaptable to other DBMSs. As the name for the table (user_id, ancestry) I chose ancestor.

CREATE TABLE ancestry_set (ancestry nvarchar(25))
INSERT INTO ancestry_set (ancestry) VALUES ('England')
INSERT INTO ancestry_set (ancestry) VALUES ('France')
INSERT INTO ancestry_set (ancestry) VALUES ('Germany')

CREATE TABLE ancestor ([user_id] int, ancestry nvarchar(25))
INSERT INTO ancestor ([user_id],ancestry) VALUES (1,'England')
INSERT INTO ancestor ([user_id],ancestry) VALUES(1,'Ireland')
INSERT INTO ancestor ([user_id],ancestry) VALUES(2,'France')
INSERT INTO ancestor ([user_id],ancestry) VALUES(3,'Germany')
INSERT INTO ancestor ([user_id],ancestry) VALUES(3,'Poland')
INSERT INTO ancestor ([user_id],ancestry) VALUES(4,'England')
INSERT INTO ancestor ([user_id],ancestry) VALUES(4,'France')
INSERT INTO ancestor ([user_id],ancestry) VALUES(4,'Germany')
INSERT INTO ancestor ([user_id],ancestry) VALUES(5,'France')
INSERT INTO ancestor ([user_id],ancestry) VALUES(5,'Germany')

1) Create the Cartesian Product of your ancestry set and the set of all user_ids.

SELECT a.[user_id],s.ancestry
FROM ancestor a, ancestry_set s
GROUP BY a.[user_id],s.ancestry

2) Find all records in the Cartesian product which have no partner in the original set (left join), and

3) Extract the user_ids from the result set of 2)

SELECT DISTINCT cp.[user_id]
FROM (SELECT a.[user_id],s.ancestry
      FROM ancestor a, ancestry_set s
      GROUP BY a.[user_id],s.ancestry) cp
   LEFT JOIN ancestor a ON cp.[user_id]=a.[user_id] AND cp.ancestry=a.ancestry
WHERE a.[user_id] is null

4) Return all user_ids from the original set which are not included in the result set of 3)

SELECT DISTINCT [user_id]
FROM ancestor
WHERE [user_id] NOT IN (
   SELECT DISTINCT cp.[user_id]
   FROM (SELECT a.[user_id],s.ancestry
         FROM ancestor a, ancestry_set s
         GROUP BY a.[user_id],s.ancestry) cp
   LEFT JOIN ancestor a ON cp.[user_id]=a.[user_id] AND cp.ancestry=a.ancestry
   WHERE a.[user_id] is null
   )
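As a quick check that the approach ports to another DBMS, the sketch below runs the final query against SQLite through Python's standard sqlite3 module, with the T-SQL brackets removed. It also runs a double NOT EXISTS query, which is a different, commonly used way to write relational division and not the method described in this answer. The table names ancestor and ancestry_set follow the setup above; everything else is an assumption of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ancestry_set (ancestry TEXT)")
conn.executemany(
    "INSERT INTO ancestry_set VALUES (?)",
    [("England",), ("France",), ("Germany",)],
)
conn.execute("CREATE TABLE ancestor (user_id INTEGER, ancestry TEXT)")
conn.executemany(
    "INSERT INTO ancestor VALUES (?, ?)",
    [(1, "England"), (1, "Ireland"), (2, "France"), (3, "Germany"),
     (3, "Poland"), (4, "England"), (4, "France"), (4, "Germany"),
     (5, "France"), (5, "Germany")],
)

# Step 4 of the answer, ported to SQLite syntax.
FOUR_STEP = """
SELECT DISTINCT user_id
FROM ancestor
WHERE user_id NOT IN (
   SELECT DISTINCT cp.user_id
   FROM (SELECT a.user_id, s.ancestry
         FROM ancestor a, ancestry_set s
         GROUP BY a.user_id, s.ancestry) cp
   LEFT JOIN ancestor a ON cp.user_id = a.user_id AND cp.ancestry = a.ancestry
   WHERE a.user_id IS NULL
)
ORDER BY user_id
"""

# Alternative: users for whom no required ancestry is missing.
NOT_EXISTS = """
SELECT DISTINCT a.user_id
FROM ancestor a
WHERE NOT EXISTS (
   SELECT 1 FROM ancestry_set s
   WHERE NOT EXISTS (
      SELECT 1 FROM ancestor a2
      WHERE a2.user_id = a.user_id AND a2.ancestry = s.ancestry
   )
)
ORDER BY a.user_id
"""

four_step_result = [row[0] for row in conn.execute(FOUR_STEP)]
not_exists_result = [row[0] for row in conn.execute(NOT_EXISTS)]
print(four_step_result)   # [4]
print(not_exists_result)  # [4]
```

Both queries read the wanted countries from ancestry_set, so changing the criteria only requires changing the rows in that table.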

Answer 6

Answered by CyberNinja

Like the answer above, but I have duplicate records, so I have to use a subquery with DISTINCT:

SELECT user_id
FROM (
   -- de-duplicate first so each (user_id, ancestry) pair is counted once
   SELECT DISTINCT user_id, ancestry
   FROM yourtable
   WHERE user_id = @userid
) t1
WHERE ancestry IN ('England', 'France', 'Germany')
GROUP BY user_id
HAVING count(user_id) = 3

This is what I used because I have multiple records (download logs), and it checks that all the required files have been downloaded.

Answer 7

Answered by StevenV

Brute force (only tested on an Oracle system, but I think this is pretty standard):

select distinct user_id from users where user_id in (
    select user_id from (
      -- count only the rows that match one of the wanted countries
      Select user_id, Count(User_Id) As Cc
      From users
      Where ancestry in ('England', 'France', 'Germany')
      GROUP BY user_id
    ) Where Cc = 3
  )
;

Edit: I like @HuckIt's answer even better.

Answer 8

Answered by mkk

One approach, if you want all user_ids that satisfy all conditions, is:

SELECT user_id FROM yourtable WHERE ancestry IN ('England', '...', '...') GROUP BY user_id HAVING count(*) = <number of conditions that have to be satisfied>

If you need all user_ids that satisfy at least one condition, you can do:

SELECT DISTINCT user_id FROM yourtable WHERE ancestry IN ('England', 'France', ... , '...')

I am not aware of anything similar to IN that joins its conditions with AND instead of OR.
