非重复记录的 SQL 查询

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/3686192/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 07:30:45  来源:igfitidea点击:

SQL query for non duplicate records

sqlduplicates

提问by nth

I'm attempting to build a query that will return all non duplicate (unique) records in a table. The query will need to use multiple fields to determine if the records are duplicate.

我正在尝试构建一个查询,该查询将返回表中的所有非重复(唯一)记录。查询将需要使用多个字段来确定记录是否重复。

For example, if a table has the following fields; PKID, ClientID, Name, AcctNo, OrderDate, Charge, I'd like to use the AcctNo, OrderDate and Charge fields to find unique records.

例如,如果一个表有以下字段;PKID、ClientID、Name、AcctNo、OrderDate、Charge,我想使用 AcctNo、OrderDate 和 Charge 字段来查找唯一记录。

Table

桌子

PKID-----ClientID-----Name-----AcctNo-----OrderDate-----Charge
1        JX100        John     12345      9/9/2010      0.00
2        JX220        Mark     55567      9/9/2010       .00
3        JX690        Matt     89899      9/9/2010      8.00
4        JX100        John     12345      9/9/2010      0.00

The result of the query would need to be:

查询的结果需要是:

PKID-----ClientID-----Name-----AcctNo-----OrderDate-----Charge
2        JX220        Mark     55567      9/9/2010       .00
3        JX690        Matt     89899      9/9/2010      8.00

I've tried using SELECT DISTINCT, but that doesn't work because it keeps one of the duplicate records in the result. I've also tried using HAVING COUNT = 1, but that returns all records.

我试过使用 SELECT DISTINCT,但这不起作用,因为它在结果中保留了重复记录之一。我也试过使用 HAVING COUNT = 1,但这会返回所有记录。

Thanks for the help.

谢谢您的帮助。

回答by heisenberg

HAVING COUNT(*) = 1will work if you only include the fields in the GROUP BYthat you're using to find the unique records. (i.e. not PKID, but you can use MAXor MINto return that since you'll only have one record per group in the results set.)

HAVING COUNT(*) = 1如果您只包含GROUP BY用于查找唯一记录的字段,则将起作用。(即 not PKID,但您可以使用MAXorMIN返回它,因为结果集中每组只有一条记录。)

回答by Martin Smith

SELECT   MAX(PKID)     AS PKID    ,
         MAX(ClientID) AS ClientID,
         MAX(Name)     AS Name    ,
         AcctNo                   ,
         OrderDate                ,
         Charge
FROM     T
GROUP BY AcctNo   ,
         OrderDate,
         Charge
HAVING   COUNT(*) = 1

or

或者

SELECT PKID      ,
       ClientID  ,
       Name      ,
       AcctNo    ,
       OrderDate ,
       Charge
FROM   YourTable t1
WHERE  NOT EXISTS
       (SELECT *
       FROM    YourTable t2
       WHERE   t1.PKID     <> t2.PKID
       AND     t1.AcctNo    = t2.AcctNo
       AND     t1.OrderDate = t2.OrderDate
       AND     t1.Charge    = t2.Charge
       )

回答by Gus

Simply add:

只需添加:

GROUP BY AcctNo, OrderDate, Charge
HAVING COUNT(1) = 1

The GROUP BYgroups all rows with the same AcctNo, OrderDate and Charge together, then the HAVING COUNT(1) = 1shows only the rows where there was just 1 progenitor.

GROUP BY具有相同 AcctNo、OrderDate 和 Charge 的所有行组合在一起,然后HAVING COUNT(1) = 1仅显示只有 1 个祖先的行。

回答by nth

Thanks kekekela for the nudge in the right direction.

感谢 kekekela 朝着正确的方向推动。

Here's the query that produced the result I wanted:

这是产生我想要的结果的查询:

SELECT AcctNo, OrderDate, Charge FROM Table1 GROUP BY AcctNo, OrderDate, Charge
HAVING (COUNT(AcctNo) = 1) AND (COUNT(OrderDate) = 1) AND (COUNT(Charge) = 1);

Or more simplified based on Gus's example:

或者更简化基于 Gus 的例子:

SELECT AcctNo, OrderDate, Charge FROM Table1 GROUP BY AcctNo, OrderDate, Charge
HAVING COUNT(1) = 1;

回答by vol7ron

You could just drop the PKIDto return all records:

您可以删除PKID以返回所有记录:

SELECT DISTINCT 
           ClientID
         , Name
         , AcctNo
         , OrderDate
         , Charge
FROM       table;


Note:This is slightly different from what you're asking.
It returns a unique set by removing the one non-unique field.
By your example, you're asking to return non-duplicates.

注意:这与您的要求略有不同。
它通过删除一个非唯一字段来返回一个唯一集。
通过您的示例,您要求返回非重复项。

I could only see your example being useful if you're trying
to clean up a table by extracting the "good" records.

如果您尝试
通过提取“好”记录来清理表格,我只能看到您的示例很有用。

回答by Ketil Duna

You could determine the non-unique records first, and then test for those records not in that set - like this

您可以先确定非唯一记录,然后测试不在该集中的那些记录 - 就像这样

select * from mytable where pkid not in
(select t1.pkid 
from mytable t1 inner join mytable t2
on t1.pkid <> t2.pkid
and t1.acctno = t2.acctno
and t1.orderdate = t2.orderdate
and t1.charge = t2.charge)

the last part of the inner query lets you fiddle with the criteria for "equality" - add the required number of columns to test. Of course, this gets a lot more interesting without that primary key :) In such cases I usually end up creating one

内部查询的最后一部分让您可以摆弄“相等”的标准 - 添加所需的列数进行测试。当然,如果没有那个主键,这会变得更有趣:) 在这种情况下,我通常最终会创建一个

Ketil

凯蒂尔

回答by waqas ahmad

 SELECT GMPS.gen.ProductDetail.PaperType, GMPS.gen.ProductDetail.Size FROM
 GMPS.gen.ProductDetail GROUP BY GMPS.gen.ProductDetail.PaperType,
 GMPS.gen.ProductDetail.Size
 HAVING COUNT(1) = 1;