Notice: This page is a Chinese-English parallel translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/2112618/

Time: 2020-09-01 05:09:51  Source: igfitidea

Finding duplicate rows in SQL Server

Tags: sql, sql-server, duplicates

Asked by xtine

I have a SQL Server database of organizations, and there are many duplicate rows. I want to run a select statement to grab all of these and the number of dupes, but also return the ids that are associated with each organization.

A statement like:

SELECT     orgName, COUNT(*) AS dupes  
FROM         organizations  
GROUP BY orgName  
HAVING      (COUNT(*) > 1)

Will return something like

orgName        | dupes  
ABC Corp       | 7  
Foo Federation | 5  
Widget Company | 2 

But I'd also like to grab their IDs. Is there any way to do this? Maybe something like:

orgName        | dupeCount | id  
ABC Corp       | 1         | 34  
ABC Corp       | 2         | 5  
...  
Widget Company | 1         | 10  
Widget Company | 2         | 2  

The reason is that there is also a separate table of users that link to these organizations, and I would like to unify them (that is, remove the dupes so the users link to the same organization instead of to dupe orgs). I would like to do part of this manually so I don't screw anything up, but I still need a statement that returns the IDs of all the dupe orgs so I can go through the list of users.

Answer by RedFilter

select o.orgName, oc.dupeCount, o.id
from organizations o
inner join (
    SELECT orgName, COUNT(*) AS dupeCount
    FROM organizations
    GROUP BY orgName
    HAVING COUNT(*) > 1
) oc on o.orgName = oc.orgName

Answer by Aykut Akıncı

You can run the following query to find, for each duplicated name, the row with MAX(id), and then delete those rows.

SELECT orgName, COUNT(*) AS dupes, MAX(id) AS maxId
FROM organizations 
GROUP BY orgName 
HAVING (COUNT(*) > 1)

But you'll have to run this query a few times, because each run returns only one id per duplicated name.

Answer by Paul

You can do it like this:

SELECT
    o.id, o.orgName, d.intCount
FROM (
     SELECT orgName, COUNT(*) as intCount
     FROM organizations
     GROUP BY orgName
     HAVING COUNT(*) > 1
) AS d
    INNER JOIN organizations o ON o.orgName = d.orgName

If you want to return just the records that can be deleted (leaving one of each), you can use:

SELECT
    id, orgName
FROM (
     SELECT 
         orgName, id,
         ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id) AS intRow
     FROM organizations
) AS d
WHERE intRow != 1
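
As a hedged variation on the two queries above, the sketch below returns output closer to the shape the question asks for: one row per duplicate, with a running number inside each group and the group size. It assumes the organizations(id, orgName) table from the question and SQL Server 2005 or later (windowed aggregates). The column names dupeNumber and dupeCount are made up for illustration.

```sql
-- Sketch: list every row that belongs to a duplicate group.
-- dupeNumber counts 1, 2, 3... inside each orgName; dupeCount is the group size.
SELECT orgName, dupeNumber, dupeCount, id
FROM (
    SELECT
        orgName,
        id,
        ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id) AS dupeNumber,
        COUNT(*)     OVER (PARTITION BY orgName)             AS dupeCount
    FROM organizations
) AS d
WHERE dupeCount > 1          -- keep only names that occur more than once
ORDER BY orgName, dupeNumber;
```

Rows with dupeNumber = 1 are the ones this sketch treats as the copy to keep; that choice (lowest id) is arbitrary and can be changed through the ORDER BY inside ROW_NUMBER().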

Edit: SQL Server 2000 doesn't have the ROW_NUMBER() function. Instead, you can use:

SELECT
    o.id, o.orgName, d.intCount
FROM (
     SELECT orgName, COUNT(*) as intCount, MIN(id) AS minId
     FROM organizations
     GROUP BY orgName
     HAVING COUNT(*) > 1
) AS d
    INNER JOIN organizations o ON o.orgName = d.orgName
WHERE d.minId != o.id

Answer by ecairol

The solution marked as correct didn't work for me, but this answer worked well: Get list of duplicate rows in MySql

SELECT n1.* 
FROM myTable n1
INNER JOIN myTable n2 
ON n2.repeatedCol = n1.repeatedCol
WHERE n1.id <> n2.id

Answer by code save

You can try this:

WITH CTE AS
(
    -- Number the rows within each orgName; ordering by id makes the numbering repeatable
    SELECT *, RN = ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id) FROM organizations
)
SELECT * FROM CTE WHERE RN > 1
GO

Answer by akd

If you want to delete duplicates:

WITH CTE AS(
   SELECT orgName,id,
       RN = ROW_NUMBER()OVER(PARTITION BY orgName ORDER BY Id)
   FROM organizations
)
DELETE FROM CTE WHERE RN > 1
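
The question mentions a separate users table that links to these organizations, so deleting duplicates directly could leave users pointing at rows that no longer exist. A cautious sketch of one way to handle that follows. The table name users and its column orgId are assumptions (the question does not give the schema), and keeping the lowest id per name is an arbitrary choice.

```sql
-- Hypothetical schema: users.orgId references organizations.id.
BEGIN TRANSACTION;

-- Step 1: point every user at the lowest id that shares the organization's name.
WITH keepers AS (
    SELECT id,
           MIN(id) OVER (PARTITION BY orgName) AS keepId
    FROM organizations
)
UPDATE u
SET orgId = k.keepId
FROM users AS u
INNER JOIN keepers AS k ON k.id = u.orgId
WHERE k.id <> k.keepId;

-- Step 2: with no users pointing at them, remove the extra organization rows.
WITH CTE AS (
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id) AS RN
    FROM organizations
)
DELETE FROM CTE WHERE RN > 1;

-- Step 3: inspect both tables, then replace ROLLBACK with COMMIT once satisfied.
ROLLBACK TRANSACTION;
```

Ending with ROLLBACK keeps the sketch a dry run, which fits the question's wish to check things manually before anything is changed for good.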

Answer by Debendra Dash

select * from [Employees]

To find duplicate records: 1) Using a CTE

with mycte
as
(
select Name,EmailId,ROW_NUMBER() over(partition by Name,EmailId order by id) as Duplicate from [Employees]
)
select * from mycte where Duplicate > 1

2) Using GROUP BY

select Name,EmailId,COUNT(*) as Duplicate from [Employees] group by Name,EmailId having COUNT(*) > 1

Answer by Mike Clark

Select * from (Select orgName,id,
ROW_NUMBER() OVER(Partition By OrgName ORDER by id DESC) Rownum
From organizations )tbl Where Rownum>1

So the records with Rownum > 1 are the duplicate records in your table. PARTITION BY first groups the records by orgName, and ROW_NUMBER() then numbers the rows within each group. The rows with Rownum > 1 are therefore the duplicates, which can be deleted.
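
To make the numbering concrete, here is a small self-contained sketch that runs the same ROW_NUMBER() expression against a table variable. The variable @orgs and the 'Solo Org' row are made up for illustration; the two 'Widget Company' ids are taken from the question's sample output.

```sql
-- Made-up sample data in a table variable, purely for illustration.
DECLARE @orgs TABLE (id INT, orgName VARCHAR(50));

INSERT INTO @orgs (id, orgName) VALUES (2, 'Widget Company');
INSERT INTO @orgs (id, orgName) VALUES (10, 'Widget Company');
INSERT INTO @orgs (id, orgName) VALUES (7, 'Solo Org');

-- Numbering restarts at 1 for each orgName; within a name, the highest id comes first.
SELECT orgName, id,
       ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id DESC) AS Rownum
FROM @orgs;
-- Widget Company: id 10 gets Rownum 1, id 2 gets Rownum 2
-- Solo Org:       id 7 gets Rownum 1
```

Only the Widget Company row with id 2 has Rownum > 1, so it is the only row the outer filter would report as a duplicate. Because of ORDER BY id DESC, the row kept in each group is the one with the highest id.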

Answer by iCrazybest

select column_name, count(column_name)
from table_name
group by column_name
having count (column_name) > 1;

Source: https://stackoverflow.com/a/59242/1465252

Answer by user5336758

select a.orgName,b.duplicate, a.id
from organizations a
inner join (
    SELECT orgName, COUNT(*) AS duplicate
    FROM organizations
    GROUP BY orgName
    HAVING COUNT(*) > 1
) b on a.orgName = b.orgName
group by a.orgName, b.duplicate, a.id