从没有主键的 SQL 表中删除重复记录
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/985384/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Delete duplicate records from a SQL table without a primary key
提问by Shyju
I have the below table with the below records in it
我有下表,其中包含以下记录
create table employee
(
EmpId number,
EmpName varchar2(10),
EmpSSN varchar2(11)
);
insert into employee values(1, 'Hyman', '555-55-5555');
insert into employee values (2, 'Joe', '555-56-5555');
insert into employee values (3, 'Fred', '555-57-5555');
insert into employee values (4, 'Mike', '555-58-5555');
insert into employee values (5, 'Cathy', '555-59-5555');
insert into employee values (6, 'Lisa', '555-70-5555');
insert into employee values (1, 'Hyman', '555-55-5555');
insert into employee values (4, 'Mike', '555-58-5555');
insert into employee values (5, 'Cathy', '555-59-5555');
insert into employee values (6 ,'Lisa', '555-70-5555');
insert into employee values (5, 'Cathy', '555-59-5555');
insert into employee values (6, 'Lisa', '555-70-5555');
I dont have any primary key in this table .But i have the above records in my table already. I want to remove the duplicate records which has the same value in EmpId and EmpSSN fields.
我在这个表中没有任何主键。但我的表中已经有上述记录。我想删除在 EmpId 和 EmpSSN 字段中具有相同值的重复记录。
Ex : Emp id 5
例如:Emp id 5
Can any one help me to frame a query to delete those duplicate records
任何人都可以帮助我构建一个查询以删除那些重复记录
Thanks in advance
提前致谢
回答by Anjib Rajkhowa
It is very simple. I tried in SQL Server 2008
这很简单。我在 SQL Server 2008 中尝试过
DELETE SUB FROM
(SELECT ROW_NUMBER() OVER (PARTITION BY EmpId, EmpName, EmpSSN ORDER BY EmpId) cnt
FROM Employee) SUB
WHERE SUB.cnt > 1
回答by cjk
Add a Primary Key (code below)
添加主键(下面的代码)
Run the correct delete (code below)
运行正确的删除(下面的代码)
Consider WHY you woudln't want to keep that primary key.
考虑为什么您不想保留该主键。
Assuming MSSQL or compatible:
假设 MSSQL 或兼容:
ALTER TABLE Employee ADD EmployeeID int identity(1,1) PRIMARY KEY;
WHILE EXISTS (SELECT COUNT(*) FROM Employee GROUP BY EmpID, EmpSSN HAVING COUNT(*) > 1)
BEGIN
DELETE FROM Employee WHERE EmployeeID IN
(
SELECT MIN(EmployeeID) as [DeleteID]
FROM Employee
GROUP BY EmpID, EmpSSN
HAVING COUNT(*) > 1
)
END
回答by Paul Morgan
Use the row number to differentiate between duplicate records. Keep the first row number for an EmpID/EmpSSN and delete the rest:
使用行号来区分重复记录。保留 EmpID/EmpSSN 的第一行号并删除其余行号:
DELETE FROM Employee a
WHERE ROW_NUMBER() <> ( SELECT MIN( ROW_NUMBER() )
FROM Employee b
WHERE a.EmpID = b.EmpID
AND a.EmpSSN = b.EmpSSN )
回答by Nirav Parikh
With duplicates
As
(Select *, ROW_NUMBER() Over (PARTITION by EmpID,EmpSSN Order by EmpID,EmpSSN) as Duplicate From Employee)
delete From duplicates
Where Duplicate > 1 ;
This will update Table and remove all duplicates from the Table!
这将更新表并从表中删除所有重复项!
回答by naga vara prasad
select distinct * into newtablename from oldtablename
Now, the newtablenamewill have no duplicate records.
现在,newtablename将没有重复的记录。
Simply change the table name(newtablename) by pressing F2 in object explorer in sql server.
只需newtablename在 sql server 的对象资源管理器中按 F2更改表名()。
回答by Daren Thomas
You could create a temporary table #tempemployeecontaining a select distinctof your employeetable.
Then delete from employee.
Then insert into employee select from #tempemployee.
您可以创建一个临时表#tempemployee包含select distinct您的employee表。然后delete from employee。然后insert into employee select from #tempemployee。
Like Josh said - even if you know the duplicates, deleting them will be impossile since you cannot actually refer to a specific record if it is an exact duplicate of another record.
就像乔希说的 - 即使你知道重复项,删除它们也是不可能的,因为如果它是另一条记录的完全重复,你实际上无法引用特定记录。
回答by kamz kamarajan
Code
代码
DELETE DUP
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY Clientid ORDER BY Clientid ) AS Val
FROM ClientMaster
) DUP
WHERE DUP.Val > 1
Explanation
解释
Use an inner query to construct a view over the table which includes a field based on Row_Number(), partitioned by those columns you wish to be unique.
使用内部查询来构建表的视图,该视图包括一个基于 的字段Row_Number(),由您希望唯一的那些列分区。
Delete from the results of this inner query, selecting anything which does not have a row number of 1; i.e. the duplicates; not the original.
从这个内部查询的结果中删除,选择行号不为 1 的任何内容;即重复;不是原版。
The order byclause of the row_number window function is needed for a valid syntax; you can put any column name here. If you wish to change which of the results is treated as a duplicate (e.g. keep the earliest or most recent, etc), then the column(s) used here do matter; i.e. you want to specify the order such that the record you wish to keep will come first in the result.
order by有效语法需要 row_number 窗口函数的子句;您可以在此处输入任何列名称。如果您希望更改将哪些结果视为重复结果(例如保留最早的或最近的等),那么此处使用的列很重要;即您要指定顺序,以便您希望保留的记录在结果中排在第一位。
回答by Joe
If you don't want to create a new primary key you can use the TOP command in SQL Server:
如果不想创建新的主键,可以在 SQL Server 中使用 TOP 命令:
declare @ID int
while EXISTS(select count(*) from Employee group by EmpId having count(*)> 1)
begin
select top 1 @ID = EmpId
from Employee
group by EmpId
having count(*) > 1
DELETE TOP(1) FROM Employee WHERE EmpId = @ID
end
回答by Abhishek Jaiswal
ITS easy use below query
ITS 易于使用以下查询
WITH Dups AS
(
SELECT col1,col2,col3,
ROW_NUMBER() OVER(PARTITION BY col1,col2,col3 ORDER BY (SELECT 0)) AS rn
FROM mytable
)
DELETE FROM Dups WHERE rn > 1
回答by Sudhar P
delete sub from (select ROW_NUMBER() OVer(Partition by empid order by empid)cnt from employee)sub where sub.cnt>1
delete sub from (select ROW_NUMBER() Over(Partition by empid order by empid)cnt from employee)sub where sub.cnt>1

