Oracle: filter a SQL query by a unique set of column values, regardless of their order

Question

Asked by Kevin Babcock

I have a table in Oracle containing two columns that I'd like to query for records containing a unique combination of values, regardless of the order of those values. For example, if I have the following table:

create table RELATIONSHIPS (
    PERSON_1 number not null,
    PERSON_2 number not null,
    RELATIONSHIP  number not null,
    constraint PK_RELATIONSHIPS
        primary key (PERSON_1, PERSON_2)
);

I'd like to query for all unique relationships. So if I have a record PERSON_1 = John and PERSON_2 = Jill, I don't want to see another record where PERSON_1 = Jill and PERSON_2 = John.

Is there an easy way to do this?

Answer 1

Accepted answer by Bill Karwin

There's some uncertainty as to whether you want to prevent duplicates from being inserted into the database. You might just want to fetch unique pairs, while preserving the duplicates.

So here's an alternative solution for the latter case, querying unique pairs even if duplicates exist:

SELECT r1.*
FROM Relationships r1
LEFT OUTER JOIN Relationships r2
  ON (r1.person_1 = r2.person_2 AND r1.person_2 = r2.person_1)
WHERE r1.person_1 < r1.person_2
  OR  r2.person_1 IS NULL;

So if there is a matching row with the IDs reversed, there's a rule for which one the query should prefer (the one with the IDs in numerical order).

If there is no matching row, then r2 will be NULL (this is the way outer join works), so just use whatever is found in r1 in that case.

No need to use GROUP BY or DISTINCT, because there can only be zero or one matching rows.

Trying this in MySQL, I get the following optimization plan:

+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                               | rows | Extra                    |
+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+
|  1 | SIMPLE      | r1    | ALL    | NULL          | NULL    | NULL    | NULL                              |    2 |                          | 
|  1 | SIMPLE      | r2    | eq_ref | PRIMARY       | PRIMARY | 8       | test.r1.person_2,test.r1.person_1 |    1 | Using where; Using index | 
+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+

This seems to be a reasonably good use of indexes.
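
To see the preference rule of the query above in action, here is a minimal sketch that runs it from Python. It assumes SQLite (via the standard `sqlite3` module) as a stand-in for Oracle, and the three sample rows are made up for illustration; only the join logic is taken from the answer.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE relationships (
        person_1     INTEGER NOT NULL,
        person_2     INTEGER NOT NULL,
        relationship INTEGER NOT NULL,
        PRIMARY KEY (person_1, person_2)
    )
    """
)

# Hypothetical sample data: (1,2) and (2,1) mirror each other, (3,1) has no mirror.
conn.executemany(
    "INSERT INTO relationships VALUES (?, ?, ?)",
    [(1, 2, 0), (2, 1, 0), (3, 1, 0)],
)

rows = conn.execute(
    """
    SELECT r1.person_1, r1.person_2
    FROM relationships r1
    LEFT OUTER JOIN relationships r2
      ON (r1.person_1 = r2.person_2 AND r1.person_2 = r2.person_1)
    WHERE r1.person_1 < r1.person_2
       OR r2.person_1 IS NULL
    """
).fetchall()

# Sort in Python because the query itself does not guarantee an order.
unique_pairs = sorted(rows)
print(unique_pairs)
```

The mirrored pair is reported once, as (1, 2), and the unmirrored row (3, 1) is kept exactly as stored.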

Answer 2

Answered by Marc Gravell

Is the relationship always there in both directions? I.e., if John and Jill are related, is there always a {John,Jill} and a {Jill,John}? If so, just limit to those where Person_1 < Person_2 and take the distinct set.

Answer 3

Answered by tekBlues

select distinct
case when PERSON_1>=PERSON_2 then PERSON_1 ELSE PERSON_2 END person_a,
case when PERSON_1>=PERSON_2 then PERSON_2 ELSE PERSON_1 END person_b
FROM RELATIONSHIPS;

Answer 4

Answered by Rob van Wijk

Untested:

select least(person_1,person_2)
     , greatest(person_1,person_2)
  from relationships
 group by least(person_1,person_2)
     , greatest(person_1,person_2)
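
The normalization idea in this query can be tried out with a small sketch like the following. It assumes SQLite (through Python's `sqlite3` module) instead of Oracle, so the multi-argument scalar functions `min()` and `max()` take the place of `least()` and `greatest()`; the sample rows and the column aliases `person_a` and `person_b` are made up for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE relationships (
        person_1     INTEGER NOT NULL,
        person_2     INTEGER NOT NULL,
        relationship INTEGER NOT NULL,
        PRIMARY KEY (person_1, person_2)
    )
    """
)

# Hypothetical sample data: (1,2) and (2,1) describe the same pair.
conn.executemany(
    "INSERT INTO relationships VALUES (?, ?, ?)",
    [(1, 2, 0), (2, 1, 0), (3, 1, 0)],
)

# With two arguments, SQLite's min()/max() are scalar functions,
# playing the role of Oracle's least()/greatest().
normalized_pairs = conn.execute(
    """
    SELECT min(person_1, person_2) AS person_a,
           max(person_1, person_2) AS person_b
    FROM relationships
    GROUP BY min(person_1, person_2), max(person_1, person_2)
    ORDER BY person_a, person_b
    """
).fetchall()

print(normalized_pairs)
```

Each pair comes back in a canonical (smaller, larger) form, so a row stored as (3, 1) is reported as (1, 3) rather than as it was inserted.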

To prevent such double entries, you can add a unique index, using the same idea (tested!):

SQL> create table relationships
  2  ( person_1 number not null
  3  , person_2 number not null
  4  , relationship number not null
  5  , constraint pk_relationships primary key (person_1, person_2)
  6  )
  7  /

Table created.

SQL> create unique index ui_relationships on relationships(least(person_1,person_2),greatest(person_1,person_2))
  2  /

Index created.

SQL> insert into relationships values (1,2,0)
  2  /

1 row created.

SQL> insert into relationships values (1,3,0)
  2  /

1 row created.

SQL> insert into relationships values (2,1,0)
  2  /
insert into relationships values (2,1,0)
*
ERROR at line 1:
ORA-00001: unique constraint (RWIJK.UI_RELATIONSHIPS) violated

Regards, Rob.

Answer 5

Answered by Bill Karwin

You should create a constraint on your Relationships table so that the numeric person_1 value must be less than the numeric person_2 value.

create table RELATIONSHIPS (
    PERSON_1 number not null,
    PERSON_2 number not null,
    RELATIONSHIP  number not null,
    constraint PK_RELATIONSHIPS
        primary key (PERSON_1, PERSON_2),
    constraint UNIQ_RELATIONSHIPS
        CHECK (PERSON_1 < PERSON_2)
);

That way you can be sure that (2,1) can never be inserted -- it would have to be (1,2). Then your PRIMARY KEY constraint will prevent duplicates.
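
As a rough illustration of how the two constraints work together, the sketch below recreates the table in SQLite (an assumption; the answer targets Oracle) using Python's `sqlite3` module. The helper `try_insert` is hypothetical and exists only for this example.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE relationships (
        person_1     INTEGER NOT NULL,
        person_2     INTEGER NOT NULL,
        relationship INTEGER NOT NULL,
        CONSTRAINT pk_relationships
            PRIMARY KEY (person_1, person_2),
        CONSTRAINT uniq_relationships
            CHECK (person_1 < person_2)
    )
    """
)
conn.execute("INSERT INTO relationships VALUES (1, 2, 0)")


def try_insert(person_1, person_2):
    """Hypothetical helper: True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO relationships VALUES (?, ?, 0)",
            (person_1, person_2),
        )
        return True
    except sqlite3.IntegrityError:
        return False


reversed_ok = try_insert(2, 1)   # rejected by the CHECK constraint
duplicate_ok = try_insert(1, 2)  # rejected by the primary key
new_ok = try_insert(1, 3)        # accepted: ordered and not yet present

print(reversed_ok, duplicate_ok, new_ok)
```

The trade-off of this design is that the application has to order the two IDs before inserting; the database only rejects rows that arrive in the wrong order.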

PS: I see Marc Gravell has answered more quickly than I have, with a similar solution.

Answer 6

Answered by Bill Karwin

Possibly the simplest solution (that does not require alteration of data structure or creation of triggers) is to create a set of results without the duplicate entries, and add one of the duplicate entries to that set.

It would look something like:

 select * from relationships where rowid not in 
    (select a.rowid from  relationships a,relationships b 
       where a.person_1=b.person_2 and a.person_2=b.person_1)
union all
 select * from relationships where rowid in 
    (select a.rowid from  relationships a,relationships b where 
       a.person_1=b.person_2 and a.person_2=b.person_1 and a.person_1>a.person_2)

That said, I almost never create a table without a one-column primary key.

Answer 7

Answered by Scott Swank

You could just do this:

with rel as (
select r.*,
       row_number() over (partition by least(person_1,person_2),
                                       greatest(person_1,person_2)
                          order by person_1) as rn
  from relationships r
       )
select *
  from rel
 where rn = 1;

Answer 8

Answered by MikeNereson

I think KM almost got it right; I added concat.

SELECT DISTINCT *
    FROM (SELECT DISTINCT concat(Person_1,Person_2) FROM RELATIONSHIPS
          UNION 
          SELECT DISTINCT concat(Person_2, Person_1) FROM RELATIONSHIPS
         ) dt

Answer 9

Answered by copaX

It's kludgy as heck, but it'd at least tell you what unique combinations you have, just not in a real handy way...

select distinct(case when person_1 <= person_2 then person_1||'|'||person_2 else person_2||'|'||person_1 end)
from relationships;

Answer 10

Answered by Aistina

I think something like this should do the trick:

select * from RELATIONSHIPS group by PERSON_1, PERSON_2
