从没有主键的 SQL 表中删除重复记录
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/985384/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Delete duplicate records from a SQL table without a primary key
提问by Shyju
I have the below table with the below records in it
我有下表,其中包含以下记录
create table employee
(
EmpId number,
EmpName varchar2(10),
EmpSSN varchar2(11)
);
insert into employee values(1, 'Hyman', '555-55-5555');
insert into employee values (2, 'Joe', '555-56-5555');
insert into employee values (3, 'Fred', '555-57-5555');
insert into employee values (4, 'Mike', '555-58-5555');
insert into employee values (5, 'Cathy', '555-59-5555');
insert into employee values (6, 'Lisa', '555-70-5555');
insert into employee values (1, 'Hyman', '555-55-5555');
insert into employee values (4, 'Mike', '555-58-5555');
insert into employee values (5, 'Cathy', '555-59-5555');
insert into employee values (6 ,'Lisa', '555-70-5555');
insert into employee values (5, 'Cathy', '555-59-5555');
insert into employee values (6, 'Lisa', '555-70-5555');
I dont have any primary key in this table .But i have the above records in my table already. I want to remove the duplicate records which has the same value in EmpId and EmpSSN fields.
我在这个表中没有任何主键。但我的表中已经有上述记录。我想删除在 EmpId 和 EmpSSN 字段中具有相同值的重复记录。
Ex : Emp id 5
例如:Emp id 5
Can any one help me to frame a query to delete those duplicate records
任何人都可以帮助我构建一个查询以删除那些重复记录
Thanks in advance
提前致谢
回答by Anjib Rajkhowa
It is very simple. I tried in SQL Server 2008
这很简单。我在 SQL Server 2008 中尝试过
DELETE SUB FROM
(SELECT ROW_NUMBER() OVER (PARTITION BY EmpId, EmpName, EmpSSN ORDER BY EmpId) cnt
FROM Employee) SUB
WHERE SUB.cnt > 1
回答by cjk
Add a Primary Key (code below)
添加主键(下面的代码)
Run the correct delete (code below)
运行正确的删除(下面的代码)
Consider WHY you woudln't want to keep that primary key.
考虑为什么您不想保留该主键。
Assuming MSSQL or compatible:
假设 MSSQL 或兼容:
ALTER TABLE Employee ADD EmployeeID int identity(1,1) PRIMARY KEY;
WHILE EXISTS (SELECT COUNT(*) FROM Employee GROUP BY EmpID, EmpSSN HAVING COUNT(*) > 1)
BEGIN
DELETE FROM Employee WHERE EmployeeID IN
(
SELECT MIN(EmployeeID) as [DeleteID]
FROM Employee
GROUP BY EmpID, EmpSSN
HAVING COUNT(*) > 1
)
END
回答by Paul Morgan
Use the row number to differentiate between duplicate records. Keep the first row number for an EmpID/EmpSSN and delete the rest:
使用行号来区分重复记录。保留 EmpID/EmpSSN 的第一行号并删除其余行号:
DELETE FROM Employee a
WHERE ROW_NUMBER() <> ( SELECT MIN( ROW_NUMBER() )
FROM Employee b
WHERE a.EmpID = b.EmpID
AND a.EmpSSN = b.EmpSSN )
回答by Nirav Parikh
With duplicates
As
(Select *, ROW_NUMBER() Over (PARTITION by EmpID,EmpSSN Order by EmpID,EmpSSN) as Duplicate From Employee)
delete From duplicates
Where Duplicate > 1 ;
This will update Table and remove all duplicates from the Table!
这将更新表并从表中删除所有重复项!
回答by naga vara prasad
select distinct * into newtablename from oldtablename
Now, the newtablename
will have no duplicate records.
现在,newtablename
将没有重复的记录。
Simply change the table name(newtablename
) by pressing F2 in object explorer in sql server.
只需newtablename
在 sql server 的对象资源管理器中按 F2更改表名()。
回答by Daren Thomas
You could create a temporary table #tempemployee
containing a select distinct
of your employee
table.
Then delete from employee
.
Then insert into employee select from #tempemployee
.
您可以创建一个临时表#tempemployee
包含select distinct
您的employee
表。然后delete from employee
。然后insert into employee select from #tempemployee
。
Like Josh said - even if you know the duplicates, deleting them will be impossile since you cannot actually refer to a specific record if it is an exact duplicate of another record.
就像乔希说的 - 即使你知道重复项,删除它们也是不可能的,因为如果它是另一条记录的完全重复,你实际上无法引用特定记录。
回答by kamz kamarajan
Code
代码
DELETE DUP
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY Clientid ORDER BY Clientid ) AS Val
FROM ClientMaster
) DUP
WHERE DUP.Val > 1
Explanation
解释
Use an inner query to construct a view over the table which includes a field based on Row_Number()
, partitioned by those columns you wish to be unique.
使用内部查询来构建表的视图,该视图包括一个基于 的字段Row_Number()
,由您希望唯一的那些列分区。
Delete from the results of this inner query, selecting anything which does not have a row number of 1; i.e. the duplicates; not the original.
从这个内部查询的结果中删除,选择行号不为 1 的任何内容;即重复;不是原版。
The order by
clause of the row_number window function is needed for a valid syntax; you can put any column name here. If you wish to change which of the results is treated as a duplicate (e.g. keep the earliest or most recent, etc), then the column(s) used here do matter; i.e. you want to specify the order such that the record you wish to keep will come first in the result.
order by
有效语法需要 row_number 窗口函数的子句;您可以在此处输入任何列名称。如果您希望更改将哪些结果视为重复结果(例如保留最早的或最近的等),那么此处使用的列很重要;即您要指定顺序,以便您希望保留的记录在结果中排在第一位。
回答by Joe
If you don't want to create a new primary key you can use the TOP command in SQL Server:
如果不想创建新的主键,可以在 SQL Server 中使用 TOP 命令:
declare @ID int
while EXISTS(select count(*) from Employee group by EmpId having count(*)> 1)
begin
select top 1 @ID = EmpId
from Employee
group by EmpId
having count(*) > 1
DELETE TOP(1) FROM Employee WHERE EmpId = @ID
end
回答by Abhishek Jaiswal
ITS easy use below query
ITS 易于使用以下查询
WITH Dups AS
(
SELECT col1,col2,col3,
ROW_NUMBER() OVER(PARTITION BY col1,col2,col3 ORDER BY (SELECT 0)) AS rn
FROM mytable
)
DELETE FROM Dups WHERE rn > 1
回答by Sudhar P
delete sub from (select ROW_NUMBER() OVer(Partition by empid order by empid)cnt from employee)sub where sub.cnt>1
delete sub from (select ROW_NUMBER() Over(Partition by empid order by empid)cnt from employee)sub where sub.cnt>1