Disclaimer: This page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/977648/

Time: 2020-09-18 18:21:55  Source: igfitidea

Filter SQL query by a unique set of column values, regardless of their order

sql oracle

Asked by Kevin Babcock

I have a table in Oracle containing two columns that I'd like to query for records containing a unique combination of values, regardless of the order of those values. For example, if I have the following table:

create table RELATIONSHIPS (
    PERSON_1 number not null,
    PERSON_2 number not null,
    RELATIONSHIP  number not null,
    constraint PK_RELATIONSHIPS
        primary key (PERSON_1, PERSON_2)
);

I'd like to query for all unique relationships. So if I have a record PERSON_1 = John and PERSON_2 = Jill, I don't want to see another record where PERSON_1 = Jill and PERSON_2 = John.

Is there an easy way to do this?

Accepted answer by Bill Karwin

It's unclear whether you want to prevent duplicates from being inserted into the database. You might just want to fetch unique pairs while keeping the duplicates in the table.

So here's an alternative solution for the latter case, querying unique pairs even if duplicates exist:

SELECT r1.*
FROM Relationships r1
LEFT OUTER JOIN Relationships r2
  ON (r1.person_1 = r2.person_2 AND r1.person_2 = r2.person_1)
WHERE r1.person_1 < r1.person_2
  OR  r2.person_1 IS NULL;

So if there is a matching row with the IDs reversed, there's a rule for which one the query should prefer (the one with the IDs in numerical order).

If there is no matching row, then the r2 columns will be NULL (this is how an outer join works), so just use whatever is found in r1 in that case.

No need to use GROUP BY or DISTINCT, because there can only be zero or one matching rows.
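
As a rough way to check the behavior described above, here is a minimal sketch that runs the same join against an in-memory SQLite database from Python. SQLite is only a stand-in for Oracle here, and the sample rows are hypothetical: (1,2) and (2,1) are one pair stored twice, while (5,4) exists only in reversed order.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; the table mirrors RELATIONSHIPS above.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE relationships (
        person_1 INTEGER NOT NULL,
        person_2 INTEGER NOT NULL,
        relationship INTEGER NOT NULL,
        PRIMARY KEY (person_1, person_2)
    )
""")

# Hypothetical sample data, not taken from the original question.
conn.executemany(
    "INSERT INTO relationships VALUES (?, ?, 0)",
    [(1, 2), (2, 1), (1, 3), (5, 4)],
)

# Same query as in the answer, with an ORDER BY added for a stable result.
unique_pairs = conn.execute("""
    SELECT r1.person_1, r1.person_2
    FROM relationships r1
    LEFT OUTER JOIN relationships r2
      ON (r1.person_1 = r2.person_2 AND r1.person_2 = r2.person_1)
    WHERE r1.person_1 < r1.person_2
       OR r2.person_1 IS NULL
    ORDER BY r1.person_1, r1.person_2
""").fetchall()

print(unique_pairs)  # [(1, 2), (1, 3), (5, 4)]
```

The reversed row (2,1) is dropped because its mirror exists and it is not in numerical order, while (5,4) survives as stored because it has no mirror row.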

Trying this in MySQL, I get the following optimization plan:

+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                               | rows | Extra                    |
+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+
|  1 | SIMPLE      | r1    | ALL    | NULL          | NULL    | NULL    | NULL                              |    2 |                          | 
|  1 | SIMPLE      | r2    | eq_ref | PRIMARY       | PRIMARY | 8       | test.r1.person_2,test.r1.person_1 |    1 | Using where; Using index | 
+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+

This seems to be a reasonably good use of indexes.

Answer by Marc Gravell

Is the relationship always there in both directions? i.e. if John and Jill are related, then is there always a {John,Jill} and {Jill,John}? If so, just limit to those where Person_1 < Person_2 and take the distinct set.

Answer by tekBlues

select distinct
case when PERSON_1>=PERSON_2 then PERSON_1 ELSE PERSON_2 END person_a,
case when PERSON_1>=PERSON_2 then PERSON_2 ELSE PERSON_1 END person_b
FROM RELATIONSHIPS;

Answer by Rob van Wijk

Untested:

select least(person_1,person_2)
     , greatest(person_1,person_2)
  from relationships
 group by least(person_1,person_2)
     , greatest(person_1,person_2)
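
Since the query above is marked untested, here is a minimal sketch of the same idea run against in-memory SQLite from Python. One assumption to flag: SQLite has no LEAST/GREATEST, so the sketch uses SQLite's multi-argument scalar min() and max() in their place. The sample rows are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; min(a, b) / max(a, b) replace
# Oracle's LEAST / GREATEST, which SQLite does not provide.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE relationships (
        person_1 INTEGER NOT NULL,
        person_2 INTEGER NOT NULL,
        relationship INTEGER NOT NULL,
        PRIMARY KEY (person_1, person_2)
    )
""")

# Hypothetical sample data: (1,2)/(2,1) is one pair stored in both orders.
conn.executemany(
    "INSERT INTO relationships VALUES (?, ?, 0)",
    [(1, 2), (2, 1), (1, 3), (5, 4)],
)

# Normalize each pair to (smaller, larger), then collapse duplicates.
normalized_pairs = conn.execute("""
    SELECT min(person_1, person_2) AS person_a,
           max(person_1, person_2) AS person_b
    FROM relationships
    GROUP BY min(person_1, person_2), max(person_1, person_2)
    ORDER BY person_a, person_b
""").fetchall()

print(normalized_pairs)  # [(1, 2), (1, 3), (4, 5)]
```

Note that this approach returns normalized pairs rather than the stored rows, so the row stored as (5,4) comes back as (4,5).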

To prevent such double entries, you can add a unique index, using the same idea (tested!):

SQL> create table relationships
  2  ( person_1 number not null
  3  , person_2 number not null
  4  , relationship number not null
  5  , constraint pk_relationships primary key (person_1, person_2)
  6  )
  7  /

Table created.

SQL> create unique index ui_relationships on relationships(least(person_1,person_2),greatest(person_1,person_2))
  2  /

Index created.

SQL> insert into relationships values (1,2,0)
  2  /

1 row created.

SQL> insert into relationships values (1,3,0)
  2  /

1 row created.

SQL> insert into relationships values (2,1,0)
  2  /
insert into relationships values (2,1,0)
*
ERROR at line 1:
ORA-00001: unique constraint (RWIJK.UI_RELATIONSHIPS) violated

Regards, Rob.

Answer by Bill Karwin

You should create a constraint on your Relationships table so that the numeric person_1 value must be less than the numeric person_2 value.

create table RELATIONSHIPS (
    PERSON_1 number not null,
    PERSON_2 number not null,
    RELATIONSHIP  number not null,
    constraint PK_RELATIONSHIPS
        primary key (PERSON_1, PERSON_2),
    constraint UNIQ_RELATIONSHIPS
        CHECK (PERSON_1 < PERSON_2)
);

That way you can be sure that (2,1) can never be inserted -- it would have to be (1,2). Then your PRIMARY KEY constraint will prevent duplicates.
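
To illustrate how the two constraints work together, here is a minimal sketch using in-memory SQLite from Python as a stand-in for Oracle (an assumption; Oracle would report ORA- errors instead of raising sqlite3.IntegrityError). The inserted values are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; the constraints mirror the DDL above.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE relationships (
        person_1 INTEGER NOT NULL,
        person_2 INTEGER NOT NULL,
        relationship INTEGER NOT NULL,
        CONSTRAINT pk_relationships PRIMARY KEY (person_1, person_2),
        CONSTRAINT uniq_relationships CHECK (person_1 < person_2)
    )
""")
conn.execute("INSERT INTO relationships VALUES (1, 2, 0)")

# (2, 1) violates the CHECK constraint; a second (1, 2) violates the primary key.
rejected = []
for pair in [(2, 1), (1, 2)]:
    try:
        conn.execute("INSERT INTO relationships VALUES (?, ?, 0)", pair)
    except sqlite3.IntegrityError as exc:
        rejected.append((pair, str(exc)))

stored = conn.execute(
    "SELECT person_1, person_2 FROM relationships"
).fetchall()

print(stored)    # [(1, 2)]
print(rejected)  # both attempted inserts, each with its error message
```

The trade-off is that the application has to order the two IDs itself before inserting, since the database will reject a pair given in the wrong order rather than reorder it.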

PS: I see Marc Gravell has answered more quickly than I have, with a similar solution.

Answer by Bill Karwin

Possibly the simplest solution (that does not require alteration of data structure or creation of triggers) is to create a set of results without the duplicate entries, and add one of the duplicate entries to that set.

It would look something like:

 select * from relationships where rowid not in 
    (select a.rowid from  relationships a,relationships b 
       where a.person_1=b.person_2 and a.person_2=b.person_1)
union all
 select * from relationships where rowid in 
    (select a.rowid from  relationships a,relationships b where 
       a.person_1=b.person_2 and a.person_2=b.person_1 and a.person_1>a.person_2)

But usually I never create a table without a one-column primary key.

Answer by Scott Swank

You could just,

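with rel as (
select r.*,  -- Oracle needs a qualified r.* when other columns follow
       row_number() over (partition by least(person_1,person_2),
                                       greatest(person_1,person_2)
                          order by person_1) as rn  -- row_number requires an ORDER BY
  from relationships r
       )
select *
  from rel
 where rn = 1;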
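
A minimal sketch of the same window-function approach, run against in-memory SQLite from Python. Several things here are assumptions rather than part of the answer: SQLite stands in for Oracle, window functions need SQLite 3.25 or later, min()/max() replace LEAST/GREATEST, and an ORDER BY person_1 is used inside the window so the choice of surviving row is deterministic. The sample rows are hypothetical.

```python
import sqlite3

# Assumption: SQLite (3.25+ for window functions) stands in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE relationships (
        person_1 INTEGER NOT NULL,
        person_2 INTEGER NOT NULL,
        relationship INTEGER NOT NULL,
        PRIMARY KEY (person_1, person_2)
    )
""")

# Hypothetical sample data: (1,2)/(2,1) is one pair stored in both orders.
conn.executemany(
    "INSERT INTO relationships VALUES (?, ?, 0)",
    [(1, 2), (2, 1), (1, 3), (5, 4)],
)

# Number the rows within each unordered pair and keep only the first one.
first_per_pair = conn.execute("""
    WITH rel AS (
        SELECT r.*,
               row_number() OVER (
                   PARTITION BY min(person_1, person_2),
                                max(person_1, person_2)
                   ORDER BY person_1
               ) AS rn
        FROM relationships r
    )
    SELECT person_1, person_2
    FROM rel
    WHERE rn = 1
    ORDER BY person_1, person_2
""").fetchall()

print(first_per_pair)  # [(1, 2), (1, 3), (5, 4)]
```

Unlike the GROUP BY variant, this keeps whole stored rows, so the RELATIONSHIP column of the surviving row is still available.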

Answer by MikeNereson

I think KM almost got it right, I added concat.

SELECT DISTINCT *
    FROM (SELECT DISTINCT concat(Person_1,Person_2) FROM RELATIONSHIPS
          UNION 
          SELECT DISTINCT concat(Person_2, Person_1) FROM RELATIONSHIPS
         ) dt

Answer by copaX

it's kludgy as heck, but it'd at least tell you what unique combinations you have, just not in a real handy way...

select distinct(case when person_1 <= person_2 then person_1||'|'||person_2 else person_2||'|'||person_1 end)
from relationships;

Answer by Aistina

I think something like this should do the trick:

select * from RELATIONSHIPS group by PERSON_1, PERSON_2