Finding duplicate rows in SQL Server

Question

Asked by xtine

I have a SQL Server database of organizations, and there are many duplicate rows. I want to run a select statement to grab all of these and the amount of dupes, but also return the ids that are associated with each organization.

A statement like:

SELECT orgName, COUNT(*) AS dupes
FROM organizations
GROUP BY orgName
HAVING COUNT(*) > 1

Will return something like

orgName        | dupes  
ABC Corp       | 7  
Foo Federation | 5  
Widget Company | 2

But I'd also like to grab their IDs. Is there any way to do this? Maybe something like:

orgName        | dupeCount | id  
ABC Corp       | 1         | 34  
ABC Corp       | 2         | 5  
...  
Widget Company | 1         | 10  
Widget Company | 2         | 2

The reason is that there is also a separate table of users that link to these organizations, and I would like to unify them (that is, remove the dupes so the users link to the same organization instead of to dupe orgs). I would like to do part of this manually so I don't screw anything up, but I still need a statement returning the IDs of all the dupe orgs so I can go through the list of users.

Answer 1

Answered by RedFilter

select o.orgName, oc.dupeCount, o.id
from organizations o
inner join (
    SELECT orgName, COUNT(*) AS dupeCount
    FROM organizations
    GROUP BY orgName
    HAVING COUNT(*) > 1
) oc on o.orgName = oc.orgName

Answer 2

Answered by Aykut Akıncı

You can run the following query to find the duplicates along with MAX(id), and then delete those rows.

SELECT orgName, COUNT(*) AS dupes, MAX(id) AS maxId
FROM organizations
GROUP BY orgName
HAVING COUNT(*) > 1

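But you'll have to run this query a few times, because each run returns only one id per orgName: a name with seven copies needs six rounds of find-and-delete.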
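
As a rough sketch of what "a few times" could look like when automated, the loop below repeats the delete until no name has more than one row left. It assumes id is unique in organizations and that nothing else (for example a foreign key from a users table) still references the rows being removed; neither is confirmed by the question, so try it on a copy of the data first.

```sql
-- Sketch only: each pass removes the highest id of every duplicated orgName.
-- Assumes organizations.id is unique.
WHILE 1 = 1
BEGIN
    DELETE FROM organizations
    WHERE id IN (
        SELECT MAX(id)
        FROM organizations
        GROUP BY orgName
        HAVING COUNT(*) > 1
    );

    -- Stop as soon as a pass finds nothing left to delete.
    IF @@ROWCOUNT = 0 BREAK;
END
```

This keeps the row with the lowest id for each orgName.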

Answer 3

Answered by Paul

You can do it like this:

SELECT
    o.id, o.orgName, d.intCount
FROM (
     SELECT orgName, COUNT(*) as intCount
     FROM organizations
     GROUP BY orgName
     HAVING COUNT(*) > 1
) AS d
    INNER JOIN organizations o ON o.orgName = d.orgName

If you want to return just the records that can be deleted (leaving one of each), you can use:

SELECT
    id, orgName
FROM (
     SELECT 
         orgName, id,
         ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id) AS intRow
     FROM organizations
) AS d
WHERE intRow != 1

Edit: SQL Server 2000 doesn't have the ROW_NUMBER() function. Instead, you can use:

SELECT
    o.id, o.orgName, d.intCount
FROM (
     SELECT orgName, COUNT(*) as intCount, MIN(id) AS minId
     FROM organizations
     GROUP BY orgName
     HAVING COUNT(*) > 1
) AS d
    INNER JOIN organizations o ON o.orgName = d.orgName
WHERE d.minId != o.id
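
Since the question's goal is to repoint users before deleting anything, the same MIN(id) idea can be used to pair each removable id with the id that would be kept. The sketch below is only an illustration: the users table and its orgId column are hypothetical names, because the question does not show that schema.

```sql
-- Step 1: list each duplicate id next to the id that would survive (lowest id per name).
SELECT o.id AS dupeId, d.keepId, o.orgName
FROM organizations o
INNER JOIN (
    SELECT orgName, MIN(id) AS keepId
    FROM organizations
    GROUP BY orgName
    HAVING COUNT(*) > 1
) AS d ON d.orgName = o.orgName
WHERE o.id <> d.keepId;

-- Step 2 (after reviewing the list): repoint users to the surviving organization.
-- Assumption: users.orgId is a hypothetical column referencing organizations.id.
UPDATE u
SET u.orgId = d.keepId
FROM users u
INNER JOIN organizations o ON o.id = u.orgId
INNER JOIN (
    SELECT orgName, MIN(id) AS keepId
    FROM organizations
    GROUP BY orgName
    HAVING COUNT(*) > 1
) AS d ON d.orgName = o.orgName
WHERE o.id <> d.keepId;
```

Running step 1 on its own gives a list that can be checked by hand before any row is changed.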

Answer 4

Answered by ecairol

The solution marked as correct didn't work for me, but I found this answer that worked just great: Get list of duplicate rows in MySql

SELECT n1.* 
FROM myTable n1
INNER JOIN myTable n2 
ON n2.repeatedCol = n1.repeatedCol
WHERE n1.id <> n2.id

Answer 5

Answered by code save

You can try this:

WITH CTE AS
(
    -- Number the rows within each orgName; ordering by id makes the numbering deterministic
    SELECT *, RN = ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id)
    FROM organizations
)
SELECT * FROM CTE WHERE RN > 1
GO

Answer 6

Answered by akd

If you want to delete duplicates:

WITH CTE AS(
   SELECT orgName,id,
       RN = ROW_NUMBER()OVER(PARTITION BY orgName ORDER BY Id)
   FROM organizations
)
DELETE FROM CTE WHERE RN > 1
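
Because the asker wants to avoid breaking anything, one cautious way to try a delete like this is inside a transaction that is rolled back while testing. This is a sketch, not part of the original answer; if another table still references the duplicate rows through a foreign key, the delete may fail until those references are repointed.

```sql
BEGIN TRANSACTION;

WITH CTE AS (
    SELECT orgName, id,
        RN = ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id)
    FROM organizations
)
DELETE FROM CTE WHERE RN > 1;

-- Check: after the delete this should return no rows.
SELECT orgName, COUNT(*) AS dupes
FROM organizations
GROUP BY orgName
HAVING COUNT(*) > 1;

-- Undo everything while testing; use COMMIT TRANSACTION instead once the result looks right.
ROLLBACK TRANSACTION;
```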

Answer 7

Answered by Debendra Dash

select * from [Employees]

To find duplicate records: 1) Using a CTE

with mycte
as
(
select Name,EmailId,ROW_NUMBER() over(partition by Name,EmailId order by id) as Duplicate from [Employees]
)
select * from mycte where Duplicate > 1

2) Using GROUP BY

select Name,EmailId,COUNT(*) as Duplicate from [Employees] group by Name,EmailId having COUNT(*) > 1

Answer 8

Answered by Mike Clark

Select * from (Select orgName,id,
ROW_NUMBER() OVER(Partition By OrgName ORDER by id DESC) Rownum
From organizations )tbl Where Rownum>1

So the records with Rownum > 1 are the duplicate records in your table. PARTITION BY first groups the records by orgName, and ROW_NUMBER() then numbers the rows within each group. Because the ordering is by id DESC, the row with the highest id in each group gets Rownum = 1; every row with Rownum > 1 is a duplicate that can be deleted.
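
To see the numbering on its own, the sketch below runs the same ROW_NUMBER() expression over a small table variable without the Rownum filter. The ids for ABC Corp and Widget Company come from the question's sample output; 'Solo Inc' is a made-up row added only to show a name without duplicates. The multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Illustrative data only; @organizations stands in for the real table.
DECLARE @organizations TABLE (id INT PRIMARY KEY, orgName VARCHAR(100));

INSERT INTO @organizations (id, orgName) VALUES
    (34, 'ABC Corp'),
    (5,  'ABC Corp'),
    (10, 'Widget Company'),
    (2,  'Widget Company'),
    (7,  'Solo Inc');   -- hypothetical non-duplicate row

-- Numbering restarts at 1 for every orgName; the highest id comes first.
SELECT orgName, id,
       ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id DESC) AS Rownum
FROM @organizations
ORDER BY orgName, Rownum;
```

With this data, ids 34, 10 and 7 get Rownum = 1, while ids 5 and 2 get Rownum = 2 and are the rows the filter in the answer would return.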

Answer 9

Answered by iCrazybest

select column_name, count(column_name)
from table_name
group by column_name
having count (column_name) > 1;

Source: https://stackoverflow.com/a/59242/1465252

Answer 10

Answered by user5336758

select a.orgName, b.duplicate, a.id
from organizations a
inner join (
    SELECT orgName, COUNT(*) AS duplicate
    FROM organizations
    GROUP BY orgName
    HAVING COUNT(*) > 1
) b on a.orgName = b.orgName
group by a.orgName, b.duplicate, a.id
