SQL 从 Oracle 表中删除重复行
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/529098/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Removing duplicate rows from table in Oracle
提问by juan
I'm testing something in Oracle and populated a table with some sample data, but in the process I accidentally loaded duplicate records, so now I can't create a primary key using some of the columns.
我正在 Oracle 中测试某些内容并使用一些示例数据填充表,但在此过程中我不小心加载了重复记录,因此现在我无法使用某些列创建主键。
How can I delete all duplicate rows and leave only one of them?
如何删除所有重复行并只保留其中之一?
回答by Bill the Lizard
Use the rowid
pseudocolumn.
使用rowid
伪列。
DELETE FROM your_table
WHERE rowid not in
(SELECT MIN(rowid)
FROM your_table
GROUP BY column1, column2, column3);
Where column1
, column2
, and column3
make up the identifying key for each record. You might list all your columns.
其中column1
、column2
、 和column3
组成每条记录的标识键。您可能会列出所有列。
回答by Dead Programmer
回答by Dead Programmer
DELETE FROM tablename a
WHERE a.ROWID > ANY (SELECT b.ROWID
FROM tablename b
WHERE a.fieldname = b.fieldname
AND a.fieldname2 = b.fieldname2)
回答by Mark
回答by DoOrDie
Solution 1)
解决方案 1)
delete from emp
where rowid not in
(select max(rowid) from emp group by empno);
Solution 2)
解决方案2)
delete from emp where rowid in
(
select rid from
(
select rowid rid,
row_number() over(partition by empno order by empno) rn
from emp
)
where rn > 1
);
Solution 3)
解决方案 3)
delete from emp e1
where rowid not in
(select max(rowid) from emp e2
where e1.empno = e2.empno );
回答by Mohammed khaled
create table t2 as select distinct * from t1;
创建表 t2 作为 select distinct * from t1;
回答by Nick
You should do a small pl/sql block using a cursor for loop and delete the rows you don't want to keep. For instance:
您应该使用游标 for 循环执行一个小的 pl/sql 块并删除您不想保留的行。例如:
declare
prev_var my_table.var1%TYPE;
begin
for t in (select var1 from my_table order by var 1) LOOP
-- if previous var equal current var, delete the row, else keep on going.
end loop;
end;
回答by user1799846
To select the duplicates only the query format can be:
要选择重复项,只有查询格式可以是:
SELECT GroupFunction(column1), GroupFunction(column2),...,
COUNT(column1), column1, column2...
FROM our_table
GROUP BY column1, column2, column3...
HAVING COUNT(column1) > 1
So the correct query as per other suggestion is:
因此,根据其他建议的正确查询是:
DELETE FROM tablename a
WHERE a.ROWID > ANY (SELECT b.ROWID
FROM tablename b
WHERE a.fieldname = b.fieldname
AND a.fieldname2 = b.fieldname2
AND ....so on.. to identify the duplicate rows....)
This query will keep the oldest record in the database for the criteria chosen in the WHERE CLAUSE
.
此查询将保留数据库中针对WHERE CLAUSE
.
Oracle Certified Associate (2008)
Oracle 认证助理(2008 年)
回答by user2158672
The Fastest way for really big tables
真正大桌子的最快方法
Create exception table with structure below: exceptions_table
ROW_ID ROWID OWNER VARCHAR2(30) TABLE_NAME VARCHAR2(30) CONSTRAINT VARCHAR2(30)
Try create a unique constraint or primary key which will be violated by the duplicates. You will get an error message because you have duplicates. The exceptions table will contain the rowids for the duplicate rows.
alter table add constraint unique --or primary key (dupfield1,dupfield2) exceptions into exceptions_table;
Join your table with exceptions_table by rowid and delete dups
delete original_dups where rowid in (select ROW_ID from exceptions_table);
If the amount of rows to delete is big, then create a new table (with all grants and indexes) anti-joining with exceptions_table by rowid and rename the original table into original_dups table and rename new_table_with_no_dups into original table
create table new_table_with_no_dups AS ( select field1, field2 ........ from original_dups t1 where not exists ( select null from exceptions_table T2 where t1.rowid = t2.row_id ) )
创建具有以下结构的异常表:exceptions_table
ROW_ID ROWID OWNER VARCHAR2(30) TABLE_NAME VARCHAR2(30) CONSTRAINT VARCHAR2(30)
尝试创建一个唯一约束或主键,重复项将违反该约束或主键。您将收到一条错误消息,因为您有重复项。例外表将包含重复行的 rowid。
alter table add constraint unique --or primary key (dupfield1,dupfield2) exceptions into exceptions_table;
通过 rowid 将您的表与 exceptions_table 连接起来并删除重复项
delete original_dups where rowid in (select ROW_ID from exceptions_table);
如果要删除的行数很大,则创建一个新表(包含所有授权和索引)通过rowid与exceptions_table反连接并将原始表重命名为original_dups表并将new_table_with_no_dups重命名为原始表
create table new_table_with_no_dups AS ( select field1, field2 ........ from original_dups t1 where not exists ( select null from exceptions_table T2 where t1.rowid = t2.row_id ) )
回答by Dnyaneshwar Tandale
Using rowid-
使用 rowid-
delete from emp
where rowid not in
(select max(rowid) from emp group by empno);
Using self join-
使用自连接
delete from emp e1
where rowid not in
(select max(rowid) from emp e2
where e1.empno = e2.empno );