MySQL 如何删除mysql数据库中的重复记录?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/659906/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-31 12:57:14  来源:igfitidea点击:

How to delete duplicate records in mysql database?

mysqlruby-on-railsruby

提问by nan

What's the best way to delete duplicate records in a mysql database using rails or mysql queries?

使用 rails 或 mysql 查询删除 mysql 数据库中重复记录的最佳方法是什么?

采纳答案by TStamper

What you can do is copy the distinct records into a new table by:

您可以通过以下方式将不同的记录复制到新表中:

 select distinct * into NewTable from MyTable

回答by krukid

Here's another idea in no particular language:

这是另一个没有特定语言的想法:

rs = `select a, b, count(*) as c from entries group by 1, 2 having c > 1`
rs.each do |a, b, c|
  `delete from entries where a=#{a} and b=#{b} limit #{c - 1}`
end


Edit:

编辑:

Kudos to Olaffor that "having" hint :)

荣誉对奥拉夫为“具有”提示:)

回答by Maximiliano Guzman

well, if it's a small table, from rails console you can do

好吧,如果它是一张小桌子,你可以从 rails 控制台做

class ActiveRecord::Base
  def non_id_attributes
    atts = self.attributes
    atts.delete('id')
    atts
  end
end

duplicate_groups = YourClass.find(:all).group_by { |element| element.non_id_attributes }.select{ |gr| gr.last.size > 1 }
redundant_elements = duplicate_groups.map { |group| group.last - [group.last.first] }.flatten
redundant_elements.each(&:destroy)

回答by DMin

Check for Duplicate entries :

检查重复条目:

SELECT DISTINCT(req_field) AS field, COUNT(req_field) AS fieldCount FROM 
table_name GROUP BY req_field HAVING fieldCount > 1



Remove Duplicate Queries :

删除重复查询:

DELETE FROM table_name 
USING table_name, table_name AS vtable 
WHERE 
    (table_name.id > vtable.id) 
AND (table_name.req_field=req_field)

Replace req_fieldand table_name- should work without any issues.

替换req_fieldtable_name- 应该没有任何问题。

回答by DMin

New to SQL :-) This is a classic question - often asked in interviews:-) I don't know whether it'll work in MYSQL but it works in most databases -

SQL 新手 :-) 这是一个经典问题 - 在面试中经常被问到 :-) 我不知道它是否适用于 MYSQL,但它适用于大多数数据库 -

> create table t(
>     a char(2),
>     b char(2),
>     c smallint )

> select a,b,c,count(*) from t
> group by a,b,c
> having count(*) > 1
a  b  c
-- -- ------ -----------
(0 rows affected)

> insert into t values ("aa","bb",1)
(1 row affected)

> insert into t values ("aa","bb",1)
(1 row affected)

> insert into t values ("aa","bc",1)
(1 row affected)

> select a,b,c,count(*) from t group by a,b,c having count(*) > 1
a  b  c 
-- -- ------ -----------
aa bb      1           2
(1 row affected)

回答by ravindra bhosale

If you have PK (id) in table (EMP) and want to older delete duplicate records with name column. For large data following query may be good approach.

如果您在表 (EMP) 中有 PK (id) 并且想要删除带有 name 列的重复记录。对于大数据,以下查询可能是一个好方法。

DELETE t3
FROM (
        SELECT t1.name, t1.id
        FROM (
                SELECT name
                FROM EMP
                GROUP BY name
                HAVING COUNT(name) > 1
        ) AS t0 INNER JOIN EMP t1 ON t0.name = t1.name
) AS t2 INNER JOIN EMP t3 ON t3.name = t2.name
WHERE t2.id < t3.id;

回答by Atul Maurya

suppose we have a table name tbl_productand there is duplicacy in the field p_pi_codeand p_nats_idin maximum no of count then first create a new table insert the data from existing table ...
ie from tbl_productto newtable1if anything else then newtable1to newtable2

假设我们有一个表名tbl_product并有duplicacy领域p_pi_codep_nats_id最大无计数然后先插入从现有的表...的数据创建一个新表
,即从tbl_productnewtable1如果别的然后newtable1newtable2

CREATE TABLE `newtable2` (                                  
            `p_id` int(10) unsigned NOT NULL auto_increment,         
            `p_status` varchar(45) NOT NULL,                         
            `p_pi_code` varchar(45) NOT NULL,                        
            `p_nats_id` mediumint(8) unsigned NOT NULL,              
            `p_is_special` tinyint(4) NOT NULL,                      
             PRIMARY KEY (`p_id`)                                   
      ) ENGINE=InnoDB;

INSERT INTO newtable1 (p_status, p_pi_code, p_nats_id, p_is_special) SELECT 
    p_status, p_pi_code, p_nats_id, p_is_special FROM tbl_product group by p_pi_code;

INSERT INTO newtable2 (p_status, p_pi_code, p_nats_id, p_is_special) SELECT 
    p_status, p_pi_code, p_nats_id, p_is_special FROM newtable1 group by p_nats_id;

after that we see all the duplicacy in the field is removed

之后我们看到该字段中的所有重复项都被删除了

回答by user474440

I am using Alter Table

我在用 Alter Table

ALTER IGNORE TABLE jos_city ADD UNIQUE INDEX(`city`);

回答by Abdo

I used @krukid's answer above to do the following on a table with around 70,000 entries:

我使用上面@krukid 的回答在一个包含大约 70,000 个条目的表上执行以下操作:

rs = 'select a, b, count(*) as c from table group by 1, 2 having c > 1'

# get a hashmap
dups = MyModel.connection.select_all(rs)

# convert to array
dupsarr = dups.map { |i|  [i.a, i.b, i.c] }

# delete dups
dupsarr.each do |a,b,c|
    ActiveRecord::Base.connection.execute("delete from table_name where a=#{MyModel.sanitize(a)} and b=#{MyModel.sanitize(b)} limit #{c-1}")
end

回答by Matthew Pautzke

Here is the rails solution I came up with. May not be the most efficient, but not a big deal if its a one time migration.

这是我想出的 rails 解决方案。可能不是最有效的,但如果是一次性迁移,这没什么大不了的。

distinct_records = MyTable.all.group(:distinct_column_1, :distinct_column_2).map {|mt| mt.id}
duplicates = MyTable.all.to_a.reject!{|mt| distinct_records.include? mt.id}
duplicates.each(&:destroy)

First, groups by all columns that determine uniqueness, the example shows 2 but you could have more or less

首先,按确定唯一性的所有列进行分组,示例显示 2,但您可以有更多或更少

Second, selects the inverse of that group...all other records

其次,选择该组的逆...所有其他记录

Third, Deletes all those records.

第三,删除所有这些记录。