Removing Duplicates from a SQL Join

Question

Asked by Hammad Khan

The following is a hypothetical situation that is close to my real problem. Table1:

recid   firstname    lastname   company
1       A             B          AAA
2       D             E          DEF
3       G             H          IJK
4       A             B          ABC

I also have a Table2, which looks like this:

recid   firstname    lastname   company
10      A             B          ABC
20      D             E          DEF
30      M             D          DIM
40      A             B          CCC

Now, if I join the tables on recid, I get 0 rows; there are no duplicates because recid is unique. But if I join on the firstname and lastname columns, which are not unique and contain repeated values, the inner join returns duplicate rows. The more columns I add to the join, the worse it gets (more duplicates are created).
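
To make the row multiplication concrete, here is a minimal sketch using Python's built-in sqlite3 module and the data rows from the script in this question. It assumes the columns are named firstname and lastname, as in the table listings (the script itself calls them first and last). The name A B appears twice in each table, so the join on names returns 2 x 2 = 4 rows for it alone.

```python
import sqlite3

# Assumption: columns are named firstname/lastname as in the table listings;
# the data rows come from the script in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (recid INT NOT NULL PRIMARY KEY, firstname VARCHAR(20),
                     lastname VARCHAR(20), company VARCHAR(20));
CREATE TABLE table2 (recid INT NOT NULL PRIMARY KEY, firstname VARCHAR(20),
                     lastname VARCHAR(20), company VARCHAR(20));
INSERT INTO table1 VALUES (1,'A','B','ABC'), (2,'D','E','DEF'),
                          (3,'M','N','MNO'), (4,'A','B','ABC');
INSERT INTO table2 VALUES (10,'A','B','ABC'), (20,'D','E','DEF'),
                          (30,'Q','R','QRS'), (40,'A','B','ABC');
""")

# Join on the unique key: no recid exists in both tables, so nothing matches.
recid_join_rows = conn.execute("""
    SELECT t1.recid, t2.recid
    FROM table1 t1
    INNER JOIN table2 t2 ON t1.recid = t2.recid
""").fetchall()

# Join on the non-unique name columns: every matching pair is returned.
name_join_rows = conn.execute("""
    SELECT t1.recid, t2.recid, t2.firstname, t2.lastname
    FROM table1 t1
    INNER JOIN table2 t2
        ON t1.firstname = t2.firstname AND t1.lastname = t2.lastname
    ORDER BY t1.recid, t2.recid
""").fetchall()

print(len(recid_join_rows))  # 0
for row in name_join_rows:
    print(row)
# (1, 10, 'A', 'B')
# (1, 40, 'A', 'B')
# (2, 20, 'D', 'E')
# (4, 10, 'A', 'B')
# (4, 40, 'A', 'B')
```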

In the simple situation above, how can I remove the duplicates from the following query? I want to compare firstname and lastname and, if they match, return firstname, lastname and recid from table2.

select distinct * from
(select recid, first, last from table1) a
inner join
(select recid, first,last from table2) b
on a.first = b.first

Here is the script, in case anyone wants to experiment with it in the future:

create table table1 (recid int not null primary key, first varchar(20), last varchar(20), company varchar(20));
create table table2 (recid int not null primary key, first varchar(20), last varchar(20), company varchar(20));

insert into table1 values(1,'A','B','ABC');
insert into table1 values(2,'D','E','DEF');
insert into table1 values(3,'M','N','MNO');
insert into table1 values(4,'A','B','ABC');

insert into table2 values(10,'A','B','ABC');
insert into table2 values(20,'D','E','DEF');
insert into table2 values(30,'Q','R','QRS');
insert into table2 values(40,'A','B','ABC');

Answer 1

Answered by Code Magician

You don't want a join per se; you're merely testing for existence (set inclusion).

I don't know which flavor of SQL you're using, but this should work:

SELECT MAX(recid), firstname, lastname
FROM table2 T2
WHERE EXISTS (SELECT * FROM table1 WHERE firstname = T2.firstname AND lastname = T2.lastname)
GROUP BY lastname, firstname
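
As a quick check, here is a sketch that runs this EXISTS query against the question's sample rows using Python's built-in sqlite3 module. It assumes the columns are named firstname and lastname (the question's script calls them first and last); other DBMSs may return the groups in a different order, so the rows are sorted in Python.

```python
import sqlite3

# Assumption: columns are named firstname/lastname; data is from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (recid INT NOT NULL PRIMARY KEY, firstname VARCHAR(20),
                     lastname VARCHAR(20), company VARCHAR(20));
CREATE TABLE table2 (recid INT NOT NULL PRIMARY KEY, firstname VARCHAR(20),
                     lastname VARCHAR(20), company VARCHAR(20));
INSERT INTO table1 VALUES (1,'A','B','ABC'), (2,'D','E','DEF'),
                          (3,'M','N','MNO'), (4,'A','B','ABC');
INSERT INTO table2 VALUES (10,'A','B','ABC'), (20,'D','E','DEF'),
                          (30,'Q','R','QRS'), (40,'A','B','ABC');
""")

# Keep table2 rows whose name also exists in table1, then collapse each
# name to a single row carrying the highest recid.
exists_rows = sorted(conn.execute("""
    SELECT MAX(recid), firstname, lastname
    FROM table2 T2
    WHERE EXISTS (SELECT * FROM table1
                  WHERE firstname = T2.firstname AND lastname = T2.lastname)
    GROUP BY lastname, firstname
""").fetchall())

print(exists_rows)  # [(20, 'D', 'E'), (40, 'A', 'B')]
```

Each matching name comes back once; for A B, MAX(recid) picks 40 over 10.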

If you want to implement it as a join instead, the code stays largely the same:

SELECT MAX(t2.recid), t2.firstname, t2.lastname
FROM Table2 T2
INNER JOIN Table1 T1
    ON T2.firstname = T1.firstname AND T2.lastname = T1.lastname
GROUP BY t2.firstname, t2.lastname

Depending on the DBMS, an inner join may be implemented differently from an EXISTS (a join versus a semi-join), but the optimizer can sometimes figure it out anyway and choose the correct operator regardless of which way you write it.
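
The logical difference between the two forms can be sketched as follows, again with Python's built-in sqlite3 module and the question's sample rows (column names firstname and lastname are an assumption). This only illustrates the result sets; it says nothing about which physical operator a particular optimizer picks.

```python
import sqlite3

# Assumption: columns are named firstname/lastname; data is from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (recid INT NOT NULL PRIMARY KEY, firstname VARCHAR(20),
                     lastname VARCHAR(20), company VARCHAR(20));
CREATE TABLE table2 (recid INT NOT NULL PRIMARY KEY, firstname VARCHAR(20),
                     lastname VARCHAR(20), company VARCHAR(20));
INSERT INTO table1 VALUES (1,'A','B','ABC'), (2,'D','E','DEF'),
                          (3,'M','N','MNO'), (4,'A','B','ABC');
INSERT INTO table2 VALUES (10,'A','B','ABC'), (20,'D','E','DEF'),
                          (30,'Q','R','QRS'), (40,'A','B','ABC');
""")

# Semi-join (EXISTS): each table2 row is returned at most once,
# no matter how many table1 rows it matches.
semi_join_recids = [r[0] for r in conn.execute("""
    SELECT t2.recid
    FROM table2 t2
    WHERE EXISTS (SELECT 1 FROM table1 t1
                  WHERE t1.firstname = t2.firstname
                    AND t1.lastname = t2.lastname)
    ORDER BY t2.recid
""")]

# Plain inner join: a table2 row is returned once per matching table1 row.
inner_join_recids = [r[0] for r in conn.execute("""
    SELECT t2.recid
    FROM table2 t2
    INNER JOIN table1 t1
        ON t1.firstname = t2.firstname AND t1.lastname = t2.lastname
    ORDER BY t2.recid
""")]

print(semi_join_recids)   # [10, 20, 40]
print(inner_join_recids)  # [10, 10, 20, 40, 40]
```

This is why the join form needs GROUP BY (or DISTINCT) to remove the repeats, while EXISTS never produces them.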

Answer 2

Answered by sll

SELECT t2.recid, t2.first, t2.last 
FROM  table1 t1
INNER JOIN table2 t2 ON t1.first = t2.first AND t1.last = t2.last
GROUP BY t2.recid, t2.first, t2.last
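
For reference, a sketch of what this query returns on the question's sample rows, using Python's built-in sqlite3 module. The columns are renamed to firstname and lastname here (an assumption made to keep the example portable; the query above uses first and last). Because recid is part of the GROUP BY, the result has one row per matching table2 row, which is the same as a SELECT DISTINCT over those three columns.

```python
import sqlite3

# Assumption: first/last are renamed to firstname/lastname for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (recid INT NOT NULL PRIMARY KEY, firstname VARCHAR(20),
                     lastname VARCHAR(20), company VARCHAR(20));
CREATE TABLE table2 (recid INT NOT NULL PRIMARY KEY, firstname VARCHAR(20),
                     lastname VARCHAR(20), company VARCHAR(20));
INSERT INTO table1 VALUES (1,'A','B','ABC'), (2,'D','E','DEF'),
                          (3,'M','N','MNO'), (4,'A','B','ABC');
INSERT INTO table2 VALUES (10,'A','B','ABC'), (20,'D','E','DEF'),
                          (30,'Q','R','QRS'), (40,'A','B','ABC');
""")

# The join from this answer, de-duplicated by grouping on all output columns.
group_by_rows = sorted(conn.execute("""
    SELECT t2.recid, t2.firstname, t2.lastname
    FROM table1 t1
    INNER JOIN table2 t2
        ON t1.firstname = t2.firstname AND t1.lastname = t2.lastname
    GROUP BY t2.recid, t2.firstname, t2.lastname
""").fetchall())

# Equivalent formulation with DISTINCT instead of GROUP BY.
distinct_rows = sorted(conn.execute("""
    SELECT DISTINCT t2.recid, t2.firstname, t2.lastname
    FROM table1 t1
    INNER JOIN table2 t2
        ON t1.firstname = t2.firstname AND t1.lastname = t2.lastname
""").fetchall())

print(group_by_rows)  # [(10, 'A', 'B'), (20, 'D', 'E'), (40, 'A', 'B')]
```

Unlike the MAX(recid) approach in Answer 1, both table2 rows for A B (recid 10 and 40) are kept.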
