非重复记录的 SQL 查询
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/3686192/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
SQL query for non duplicate records
提问by nth
I'm attempting to build a query that will return all non duplicate (unique) records in a table. The query will need to use multiple fields to determine if the records are duplicate.
我正在尝试构建一个查询,该查询将返回表中的所有非重复(唯一)记录。查询将需要使用多个字段来确定记录是否重复。
For example, if a table has the following fields; PKID, ClientID, Name, AcctNo, OrderDate, Charge, I'd like to use the AcctNo, OrderDate and Charge fields to find unique records.
例如,如果一个表有以下字段;PKID、ClientID、Name、AcctNo、OrderDate、Charge,我想使用 AcctNo、OrderDate 和 Charge 字段来查找唯一记录。
Table
桌子
PKID-----ClientID-----Name-----AcctNo-----OrderDate-----Charge
1 JX100 John 12345 9/9/2010 0.00
2 JX220 Mark 55567 9/9/2010 .00
3 JX690 Matt 89899 9/9/2010 8.00
4 JX100 John 12345 9/9/2010 0.00
The result of the query would need to be:
查询的结果需要是:
PKID-----ClientID-----Name-----AcctNo-----OrderDate-----Charge
2 JX220 Mark 55567 9/9/2010 .00
3 JX690 Matt 89899 9/9/2010 8.00
I've tried using SELECT DISTINCT, but that doesn't work because it keeps one of the duplicate records in the result. I've also tried using HAVING COUNT = 1, but that returns all records.
我试过使用 SELECT DISTINCT,但这不起作用,因为它在结果中保留了重复记录之一。我也试过使用 HAVING COUNT = 1,但这会返回所有记录。
Thanks for the help.
谢谢您的帮助。
回答by heisenberg
HAVING COUNT(*) = 1
will work if you only include the fields in the GROUP BY
that you're using to find the unique records. (i.e. not PKID
, but you can use MAX
or MIN
to return that since you'll only have one record per group in the results set.)
HAVING COUNT(*) = 1
如果您只包含GROUP BY
用于查找唯一记录的字段,则将起作用。(即 not PKID
,但您可以使用MAX
orMIN
返回它,因为结果集中每组只有一条记录。)
回答by Martin Smith
SELECT MAX(PKID) AS PKID ,
MAX(ClientID) AS ClientID,
MAX(Name) AS Name ,
AcctNo ,
OrderDate ,
Charge
FROM T
GROUP BY AcctNo ,
OrderDate,
Charge
HAVING COUNT(*) = 1
or
或者
SELECT PKID ,
ClientID ,
Name ,
AcctNo ,
OrderDate ,
Charge
FROM YourTable t1
WHERE NOT EXISTS
(SELECT *
FROM YourTable t2
WHERE t1.PKID <> t2.PKID
AND t1.AcctNo = t2.AcctNo
AND t1.OrderDate = t2.OrderDate
AND t1.Charge = t2.Charge
)
回答by Gus
Simply add:
只需添加:
GROUP BY AcctNo, OrderDate, Charge
HAVING COUNT(1) = 1
The GROUP BY
groups all rows with the same AcctNo, OrderDate and Charge together,
then the HAVING COUNT(1) = 1
shows only the rows where there was just 1 progenitor.
将GROUP BY
具有相同 AcctNo、OrderDate 和 Charge 的所有行组合在一起,然后HAVING COUNT(1) = 1
仅显示只有 1 个祖先的行。
回答by nth
Thanks kekekela for the nudge in the right direction.
感谢 kekekela 朝着正确的方向推动。
Here's the query that produced the result I wanted:
这是产生我想要的结果的查询:
SELECT AcctNo, OrderDate, Charge FROM Table1 GROUP BY AcctNo, OrderDate, Charge
HAVING (COUNT(AcctNo) = 1) AND (COUNT(OrderDate) = 1) AND (COUNT(Charge) = 1);
Or more simplified based on Gus's example:
或者更简化基于 Gus 的例子:
SELECT AcctNo, OrderDate, Charge FROM Table1 GROUP BY AcctNo, OrderDate, Charge
HAVING COUNT(1) = 1;
回答by vol7ron
You could just drop the PKID
to return all records:
您可以删除PKID
以返回所有记录:
SELECT DISTINCT
ClientID
, Name
, AcctNo
, OrderDate
, Charge
FROM table;
Note:This is slightly different from what you're asking.
It returns a unique set by removing the one non-unique field.
By your example, you're asking to return non-duplicates.
注意:这与您的要求略有不同。
它通过删除一个非唯一字段来返回一个唯一集。
通过您的示例,您要求返回非重复项。
I could only see your example being useful if you're trying
to clean up a table by extracting the "good" records.
如果您尝试
通过提取“好”记录来清理表格,我只能看到您的示例很有用。
回答by Ketil Duna
You could determine the non-unique records first, and then test for those records not in that set - like this
您可以先确定非唯一记录,然后测试不在该集中的那些记录 - 就像这样
select * from mytable where pkid not in
(select t1.pkid
from mytable t1 inner join mytable t2
on t1.pkid <> t2.pkid
and t1.acctno = t2.acctno
and t1.orderdate = t2.orderdate
and t1.charge = t2.charge)
the last part of the inner query lets you fiddle with the criteria for "equality" - add the required number of columns to test. Of course, this gets a lot more interesting without that primary key :) In such cases I usually end up creating one
内部查询的最后一部分让您可以摆弄“相等”的标准 - 添加所需的列数进行测试。当然,如果没有那个主键,这会变得更有趣:) 在这种情况下,我通常最终会创建一个
Ketil
凯蒂尔
回答by waqas ahmad
SELECT GMPS.gen.ProductDetail.PaperType, GMPS.gen.ProductDetail.Size FROM
GMPS.gen.ProductDetail GROUP BY GMPS.gen.ProductDetail.PaperType,
GMPS.gen.ProductDetail.Size
HAVING COUNT(1) = 1;