Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/14046355/

Date: 2020-08-31 15:56:01  Source: igfitidea

How do I delete all the duplicate records in a MySQL table without temp tables

Tags: mysql, sql, duplicates, sql-delete, unique-index

Asked by MivaScott

I've seen a number of variations on this but nothing quite matches what I'm trying to accomplish.

I have a table, TableA, which contains the answers given by users to configurable questionnaires. The columns are member_id, quiz_num, question_num, and answer_num.

Somehow a few members got their answers submitted twice. So I need to remove the duplicated records, but make sure that one row is left behind.

There is no primary column, so there could be two or three rows all with the exact same data.

Is there a query to remove all the duplicates?

Answered by Saharsh Shah

Add a unique index on your table:

ALTER IGNORE TABLE `TableA`   
ADD UNIQUE INDEX (`member_id`, `quiz_num`, `question_num`, `answer_num`);

Another way to do this would be to add a primary key to your table. You can then remove the duplicates with the following query:

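-- Assumes TableA has been given a primary key column named id.
-- Each run deletes one row from every duplicated group; repeat until no rows are affected.
DELETE FROM TableA
WHERE id IN (SELECT id
             FROM (SELECT MAX(id) AS id FROM TableA
                   GROUP BY member_id, quiz_num, question_num, answer_num HAVING COUNT(*) > 1
                  ) AS A
            );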
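
A minimal sketch of the same primary-key idea that removes all extra copies in a single statement. It assumes that the four columns are declared NOT NULL (rows containing NULL never compare equal with =) and that TableA has no column named id yet; the surrogate id column is an assumption introduced here, not part of the original table:

```sql
-- Assumption: TableA has no column named id yet.
-- Existing rows are numbered automatically by AUTO_INCREMENT.
ALTER TABLE TableA
  ADD COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY;

-- Delete every row for which an identical row with a smaller id exists,
-- so the copy with the lowest id in each group survives.
DELETE t1
FROM TableA AS t1
JOIN TableA AS t2
  ON  t1.member_id    = t2.member_id
  AND t1.quiz_num     = t2.quiz_num
  AND t1.question_num = t2.question_num
  AND t1.answer_num   = t2.answer_num
  AND t1.id > t2.id;

-- Optional: drop the helper column to restore the original layout.
ALTER TABLE TableA DROP COLUMN id;
```

The self-join can be slow on a large table without an index on the compared columns, so it is worth trying on a copy first.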

Answered by jveirasv

Instead of dropping TableA (drop table TableA), you could delete all of its rows (delete from TableA;) and then repopulate the original table with the rows from TableA_Verify (insert into TableA select * from TableA_Verify). This way you won't lose everything attached to the original table (indexes, ...).

CREATE TABLE TableA_Verify AS SELECT DISTINCT * FROM TableA;

DELETE FROM TableA;

INSERT INTO TableA SELECT * FROM TableA_Verify;

DROP TABLE TableA_Verify;
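
The same sequence can be made a little safer on a transactional engine. The sketch below assumes that TableA uses InnoDB and that no other session writes to TableA while it runs. The DELETE and INSERT are wrapped in one transaction so that they take effect together, while CREATE TABLE and DROP TABLE stay outside it because DDL statements commit implicitly in MySQL. The duplicates_removed alias is only illustrative:

```sql
-- Build the de-duplicated copy first (DDL, commits implicitly).
CREATE TABLE TableA_Verify AS SELECT DISTINCT * FROM TableA;

-- Sanity check: how many rows will disappear from TableA?
SELECT (SELECT COUNT(*) FROM TableA)
     - (SELECT COUNT(*) FROM TableA_Verify) AS duplicates_removed;

-- Empty and refill the original table as one unit of work.
START TRANSACTION;
DELETE FROM TableA;
INSERT INTO TableA SELECT * FROM TableA_Verify;
COMMIT;

-- Remove the helper table once the refill has been committed.
DROP TABLE TableA_Verify;
```

If the count looks wrong, stop before START TRANSACTION and drop TableA_Verify; the original table has not been touched at that point.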

Answered by christoph

This doesn't use temporary tables, but real tables instead. If the restriction is only on temporary tables, and not on creating or dropping tables, this will work:

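-- MySQL does not support SELECT ... INTO <table>; CREATE TABLE ... AS SELECT is its equivalent.
CREATE TABLE TableA_Verify AS SELECT DISTINCT * FROM TableA;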

DROP TABLE TableA;

RENAME TABLE TableA_Verify TO TableA;

Answered by nikolais

Thanks to jveirasv for the answer above.

If you need to remove duplicates based on a specific set of columns, you can use this (for example, if the table has a timestamp column that varies between otherwise identical rows):

CREATE TABLE TableA_Verify AS SELECT * FROM TableA WHERE 1 GROUP BY [COLUMN TO remove duplicates BY];

DELETE FROM TableA;

INSERT INTO TableA SELECT * FROM TableA_Verify;

DROP TABLE TableA_Verify;
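
SELECT * combined with GROUP BY is rejected when the ONLY_FULL_GROUP_BY SQL mode is enabled, which is the default since MySQL 5.7.5. A sketch of the same idea with every non-grouped column aggregated explicitly follows. The submitted_at timestamp column is hypothetical and stands for whatever column differs between the otherwise identical rows:

```sql
-- Hypothetical schema:
--   TableA(member_id, quiz_num, question_num, answer_num, submitted_at)
CREATE TABLE TableA_Verify AS
SELECT member_id, quiz_num, question_num, answer_num,
       MIN(submitted_at) AS submitted_at   -- keep the earliest submission
FROM TableA
GROUP BY member_id, quiz_num, question_num, answer_num;

DELETE FROM TableA;

-- Explicit column lists avoid relying on the column order of either table.
INSERT INTO TableA (member_id, quiz_num, question_num, answer_num, submitted_at)
SELECT member_id, quiz_num, question_num, answer_num, submitted_at
FROM TableA_Verify;

DROP TABLE TableA_Verify;
```

Using MAX(submitted_at) instead would keep the latest submission of each group.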

Answered by Dina Elwy

Add a unique index on your table:

ALTER IGNORE TABLE TableA   
ADD UNIQUE INDEX (member_id, quiz_num, question_num, answer_num);

This works very well.

Answered by Sandesh Mhatre

If your table has no primary key, execute the following queries in one go, after replacing these placeholder values:

# table_name - Your Table Name
# column_name_of_duplicates - Name of column where duplicate entries are found

create table table_name_temp like table_name;
# value and type are example column names; with ONLY_FULL_GROUP_BY enabled they must be wrapped in an aggregate such as ANY_VALUE()
insert into table_name_temp select distinct(column_name_of_duplicates),value,type from table_name group by column_name_of_duplicates;
delete from table_name;
insert into table_name select * from table_name_temp;
drop table table_name_temp;

  1. Create a temporary copy of the table and store the distinct (non-duplicate) values in it.
  2. Empty the original table.
  3. Insert the values from the temporary copy into the original table.
  4. Drop the temporary copy.

It is always advisable to back up the database before you modify it.

Answered by juacala

As noted in the comments, the query in Saharsh Shah's answer must be run multiple times if items are duplicated more than once.

Here's a solution that doesn't delete any data, and keeps the data in the original table the entire time, allowing for duplicates to be deleted while keeping the table 'live':

alter table tableA add column duplicate tinyint(1) not null default '0';

update tableA set
duplicate=if(@member_id=member_id
             and @quiz_num=quiz_num
             and @question_num=question_num
             and @answer_num=answer_num,1,0),
member_id=(@member_id:=member_id),
quiz_num=(@quiz_num:=quiz_num),
question_num=(@question_num:=question_num),
answer_num=(@answer_num:=answer_num)
order by member_id, quiz_num, question_num, answer_num;

delete from tableA where duplicate=1;

alter table tableA drop column duplicate;

This basically checks whether the current row is the same as the previous row and, if it is, marks it as a duplicate (the ORDER BY clause ensures that duplicates end up next to each other). Then you delete the marked rows. I drop the duplicate column at the end to bring the table back to its original state.

It looks like ALTER IGNORE TABLE might also go away soon: http://dev.mysql.com/worklog/task/?id=7395
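
The IGNORE clause of ALTER TABLE was indeed removed in MySQL 5.7, so the unique-index answers fail there with a syntax error. A rough sketch of one way to get the same effect on newer servers is to build a copy that carries the unique index and fill it with INSERT IGNORE. The names TableA_dedup, TableA_old and uq_answer are hypothetical, and the four columns are assumed to be NOT NULL, since a unique index treats NULL values as distinct:

```sql
-- TableA_dedup, TableA_old and uq_answer are hypothetical names.
-- Copy the structure (columns and existing indexes), but no rows.
CREATE TABLE TableA_dedup LIKE TableA;

ALTER TABLE TableA_dedup
  ADD UNIQUE INDEX uq_answer (member_id, quiz_num, question_num, answer_num);

-- INSERT IGNORE skips rows that would violate the unique index,
-- so only the first copy of each answer is kept.
INSERT IGNORE INTO TableA_dedup SELECT * FROM TableA;

-- Swap both names in one atomic statement, then drop the old copy.
RENAME TABLE TableA TO TableA_old, TableA_dedup TO TableA;
DROP TABLE TableA_old;
```

Renaming both tables in a single RENAME TABLE statement means there is no moment at which TableA does not exist, and the unique index stays in place to prevent new duplicates.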

Answered by user1838915

An alternative way would be to create a new temporary table with the same structure:

CREATE TABLE temp_table AS SELECT * FROM original_table LIMIT 0;

Then create the primary key in the table.

ALTER TABLE temp_table ADD PRIMARY KEY (primary-key-field);

Finally copy all records from the original table while ignoring the duplicate records.

INSERT IGNORE INTO temp_table SELECT * FROM original_table;

Now you can delete the original table and rename the new table.

DROP TABLE original_table;
RENAME TABLE temp_table TO original_table;