Finding duplicate rows in SQL Server
Source: http://stackoverflow.com/questions/2112618/
Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute the content to the original authors on Stack Overflow.
Asked by xtine
I have a SQL Server database of organizations, and there are many duplicate rows. I want to run a select statement to grab all of these and the amount of dupes, but also return the ids that are associated with each organization.
A statement like:
SELECT orgName, COUNT(*) AS dupes
FROM organizations
GROUP BY orgName
HAVING (COUNT(*) > 1)
Will return something like
orgName | dupes
ABC Corp | 7
Foo Federation | 5
Widget Company | 2
But I'd also like to grab the IDs of them. Is there any way to do this? Maybe like a
orgName | dupeCount | id
ABC Corp | 1 | 34
ABC Corp | 2 | 5
...
Widget Company | 1 | 10
Widget Company | 2 | 2
The reason is that there is also a separate table of users that link to these organizations, and I would like to unify them (that is, remove the dupes so the users link to the same organization instead of to dupe orgs). I would like to do part of this manually so I don't screw anything up, but I would still need a statement returning the IDs of all the dupe orgs so I can go through the list of users.
Answered by RedFilter
select o.orgName, oc.dupeCount, o.id
from organizations o
inner join (
SELECT orgName, COUNT(*) AS dupeCount
FROM organizations
GROUP BY orgName
HAVING COUNT(*) > 1
) oc on o.orgName = oc.orgName
Answered by Aykut Akıncı
You can run the following query and find the duplicates with max(id)
and delete those rows.
SELECT orgName, COUNT(*) AS dupes, MAX(ID) AS maxId
FROM organizations
GROUP BY orgName
HAVING (COUNT(*) > 1)
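But you'll have to run this query (and the delete) a few times: each pass finds only the highest id per name, so a name with n copies needs n - 1 passes.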
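The repeated passes mentioned above can be automated with a loop. What follows is only a sketch, assuming a throwaway temp table `#orgs_pass` filled with made-up rows (not the asker's real data). Each pass deletes the highest id of every name that is still duplicated, so the loop stops once every name has a single row left.

```sql
-- Hypothetical sample data in a temp table; names and ids are illustrative only.
CREATE TABLE #orgs_pass (id INT PRIMARY KEY, orgName VARCHAR(100));
INSERT INTO #orgs_pass (id, orgName) VALUES
    (34, 'ABC Corp'), (5, 'ABC Corp'), (7, 'ABC Corp'),
    (10, 'Widget Company'), (2, 'Widget Company'),
    (50, 'Solo Inc');

-- Keep looping while any name still appears more than once.
WHILE EXISTS (SELECT 1 FROM #orgs_pass GROUP BY orgName HAVING COUNT(*) > 1)
BEGIN
    -- One pass: remove the highest id of every duplicated name.
    DELETE FROM #orgs_pass
    WHERE id IN (
        SELECT MAX(id)
        FROM #orgs_pass
        GROUP BY orgName
        HAVING COUNT(*) > 1
    );
END;

-- What is left is the lowest id of each name.
SELECT id, orgName FROM #orgs_pass ORDER BY id;
```

With the sample rows, 'ABC Corp' needs two passes and 'Widget Company' one. On a real table, rows in other tables that reference the deleted ids would need to be handled first.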
Answered by Paul
You can do it like this:
SELECT
o.id, o.orgName, d.intCount
FROM (
SELECT orgName, COUNT(*) as intCount
FROM organizations
GROUP BY orgName
HAVING COUNT(*) > 1
) AS d
INNER JOIN organizations o ON o.orgName = d.orgName
If you want to return just the records that can be deleted (leaving one of each), you can use:
SELECT
id, orgName
FROM (
SELECT
orgName, id,
ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id) AS intRow
FROM organizations
) AS d
WHERE intRow != 1
Edit: SQL Server 2000 doesn't have the ROW_NUMBER() function. Instead, you can use:
SELECT
o.id, o.orgName, d.intCount
FROM (
SELECT orgName, COUNT(*) as intCount, MIN(id) AS minId
FROM organizations
GROUP BY orgName
HAVING COUNT(*) > 1
) AS d
INNER JOIN organizations o ON o.orgName = d.orgName
WHERE d.minId != o.id
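The question also mentions a separate users table that links to these organizations, so the rows identified above probably should not be deleted until the users point at the surviving row. A minimal sketch of one way to do that, assuming a hypothetical `#users_merge` table with an `orgId` column (the question does not give the real schema, so the table names, column names, and sample rows here are all illustrative):

```sql
-- Hypothetical schema and data; the real tables are not shown in the question.
CREATE TABLE #orgs_merge (id INT PRIMARY KEY, orgName VARCHAR(100));
CREATE TABLE #users_merge (userId INT PRIMARY KEY, orgId INT NOT NULL);
INSERT INTO #orgs_merge (id, orgName) VALUES
    (34, 'ABC Corp'), (5, 'ABC Corp'),
    (10, 'Widget Company'), (2, 'Widget Company');
INSERT INTO #users_merge (userId, orgId) VALUES (1, 34), (2, 5), (3, 10);

-- Step 1: point every user at the lowest id that shares their organization's name.
UPDATE u
SET orgId = k.keepId
FROM #users_merge u
INNER JOIN #orgs_merge o ON o.id = u.orgId
INNER JOIN (
    SELECT orgName, MIN(id) AS keepId
    FROM #orgs_merge
    GROUP BY orgName
) k ON k.orgName = o.orgName
WHERE u.orgId <> k.keepId;

-- Step 2: delete every organization except the lowest id per name.
DELETE o
FROM #orgs_merge o
INNER JOIN (
    SELECT orgName, MIN(id) AS keepId
    FROM #orgs_merge
    GROUP BY orgName
) k ON k.orgName = o.orgName
WHERE o.id <> k.keepId;

-- Check: every user should now join to a surviving organization.
SELECT u.userId, u.orgId, o.orgName
FROM #users_merge u
INNER JOIN #orgs_merge o ON o.id = u.orgId
ORDER BY u.userId;
```

Keeping the lowest id matches the MIN(id) choice in the query above; any other rule for picking the survivor would work as long as both steps use the same one.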
Answered by ecairol
The solution marked as correct didn't work for me, but I found this answer that worked just great: Get list of duplicate rows in MySql
SELECT n1.*
FROM myTable n1
INNER JOIN myTable n2
ON n2.repeatedCol = n1.repeatedCol
WHERE n1.id <> n2.id
Answered by code save
You can try this:
WITH CTE AS
(
SELECT *, RN = ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id) FROM organizations
)
select * from CTE where RN>1
go
Answered by akd
If you want to delete duplicates:
WITH CTE AS(
SELECT orgName,id,
RN = ROW_NUMBER()OVER(PARTITION BY orgName ORDER BY Id)
FROM organizations
)
DELETE FROM CTE WHERE RN > 1
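Since the asker wants to avoid mistakes, one cautious way to run a delete like this is inside a transaction with an OUTPUT clause, so the removed rows can be inspected before anything is made permanent. This is a sketch rather than part of the original answer, and it assumes a temp table `#orgs_preview` with illustrative rows:

```sql
-- Hypothetical sample data; names and ids are illustrative only.
CREATE TABLE #orgs_preview (id INT PRIMARY KEY, orgName VARCHAR(100));
INSERT INTO #orgs_preview (id, orgName) VALUES
    (34, 'ABC Corp'), (5, 'ABC Corp'), (7, 'ABC Corp'),
    (10, 'Widget Company'), (2, 'Widget Company');

BEGIN TRANSACTION;

WITH CTE AS (
    SELECT orgName, id,
           ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id) AS RN
    FROM #orgs_preview
)
DELETE FROM CTE
OUTPUT deleted.id, deleted.orgName   -- returns each row as it is removed
WHERE RN > 1;

-- Undo the delete; use COMMIT TRANSACTION instead once the output looks right.
ROLLBACK TRANSACTION;

-- After the rollback all original rows are back.
SELECT COUNT(*) AS rowsAfterRollback FROM #orgs_preview;
```

The rollback makes this a dry run: the OUTPUT result shows exactly which ids would go, and nothing changes until the ROLLBACK is swapped for a COMMIT.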
Answered by Debendra Dash
select * from [Employees]
To find duplicate records: 1) Using a CTE
with mycte
as
(
select Name,EmailId,ROW_NUMBER() over(partition by Name,EmailId order by id) as Duplicate from [Employees]
)
select * from mycte where Duplicate > 1
2) Using GROUP BY
select Name,EmailId,COUNT(name) as Duplicate from [Employees] group by Name,EmailId having COUNT(name) > 1
Answered by Mike Clark
Select * from (Select orgName,id,
ROW_NUMBER() OVER(Partition By OrgName ORDER by id DESC) Rownum
From organizations )tbl Where Rownum>1
So the records with Rownum > 1 will be the duplicate records in your table. PARTITION BY first groups the records by orgName and then numbers the rows within each group. Because the rows are ordered by id DESC, the row with the highest id in each group gets Rownum = 1 and is kept; the rows with Rownum > 1 are the duplicates that can be deleted.
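To make the numbering concrete, here is a small sketch on made-up rows (the temp table `#orgs_rownum` and its contents are assumptions for illustration). Within each orgName partition the rows are numbered by descending id, so the highest id gets 1 and every other row in that partition is flagged as a duplicate.

```sql
-- Hypothetical sample data; names and ids are illustrative only.
CREATE TABLE #orgs_rownum (id INT PRIMARY KEY, orgName VARCHAR(100));
INSERT INTO #orgs_rownum (id, orgName) VALUES
    (34, 'ABC Corp'), (5, 'ABC Corp'),
    (10, 'Widget Company'), (2, 'Widget Company'),
    (50, 'Solo Inc');

-- Show the number assigned to every row, not just the duplicates.
SELECT orgName, id,
       ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id DESC) AS Rownum
FROM #orgs_rownum
ORDER BY orgName, Rownum;
-- orgName         id  Rownum
-- ABC Corp        34  1
-- ABC Corp         5  2
-- Solo Inc        50  1
-- Widget Company  10  1
-- Widget Company   2  2
```

Filtering this result on Rownum > 1 leaves ids 5 and 2, the rows the query above would report. A name that occurs only once, like 'Solo Inc', never gets past Rownum = 1.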
Answered by iCrazybest
select column_name, count(column_name)
from table_name
group by column_name
having count (column_name) > 1;
Answered by user5336758
select a.orgName,b.duplicate, a.id
from organizations a
inner join (
SELECT orgName, COUNT(*) AS duplicate
FROM organizations
GROUP BY orgName
HAVING COUNT(*) > 1
) b on a.orgName = b.orgName
group by a.orgName, b.duplicate, a.id