MySQL 如何删除mysql数据库中的重复记录?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/659906/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How to delete duplicate records in mysql database?
提问by nan
What's the best way to delete duplicate records in a mysql database using rails or mysql queries?
使用 rails 或 mysql 查询删除 mysql 数据库中重复记录的最佳方法是什么?
采纳答案by TStamper
What you can do is copy the distinct records into a new table by:
您可以通过以下方式将不同的记录复制到新表中:
select distinct * into NewTable from MyTable
回答by krukid
回答by Maximiliano Guzman
well, if it's a small table, from rails console you can do
好吧,如果它是一张小桌子,你可以从 rails 控制台做
class ActiveRecord::Base
def non_id_attributes
atts = self.attributes
atts.delete('id')
atts
end
end
duplicate_groups = YourClass.find(:all).group_by { |element| element.non_id_attributes }.select{ |gr| gr.last.size > 1 }
redundant_elements = duplicate_groups.map { |group| group.last - [group.last.first] }.flatten
redundant_elements.each(&:destroy)
回答by DMin
Check for Duplicate entries :
检查重复条目:
SELECT DISTINCT(req_field) AS field, COUNT(req_field) AS fieldCount FROM
table_name GROUP BY req_field HAVING fieldCount > 1
Remove Duplicate Queries :
删除重复查询:
DELETE FROM table_name
USING table_name, table_name AS vtable
WHERE
(table_name.id > vtable.id)
AND (table_name.req_field=req_field)
Replace req_fieldand table_name- should work without any issues.
替换req_field和table_name- 应该没有任何问题。
回答by DMin
New to SQL :-) This is a classic question - often asked in interviews:-) I don't know whether it'll work in MYSQL but it works in most databases -
SQL 新手 :-) 这是一个经典问题 - 在面试中经常被问到 :-) 我不知道它是否适用于 MYSQL,但它适用于大多数数据库 -
> create table t(
> a char(2),
> b char(2),
> c smallint )
> select a,b,c,count(*) from t
> group by a,b,c
> having count(*) > 1
a b c
-- -- ------ -----------
(0 rows affected)
> insert into t values ("aa","bb",1)
(1 row affected)
> insert into t values ("aa","bb",1)
(1 row affected)
> insert into t values ("aa","bc",1)
(1 row affected)
> select a,b,c,count(*) from t group by a,b,c having count(*) > 1
a b c
-- -- ------ -----------
aa bb 1 2
(1 row affected)
回答by ravindra bhosale
If you have PK (id) in table (EMP) and want to older delete duplicate records with name column. For large data following query may be good approach.
如果您在表 (EMP) 中有 PK (id) 并且想要删除带有 name 列的重复记录。对于大数据,以下查询可能是一个好方法。
DELETE t3
FROM (
SELECT t1.name, t1.id
FROM (
SELECT name
FROM EMP
GROUP BY name
HAVING COUNT(name) > 1
) AS t0 INNER JOIN EMP t1 ON t0.name = t1.name
) AS t2 INNER JOIN EMP t3 ON t3.name = t2.name
WHERE t2.id < t3.id;
回答by Atul Maurya
suppose we have a table name tbl_productand there is duplicacy in the field p_pi_codeand p_nats_idin maximum no of count then
first create a new table insert the data from existing table ...
ie from tbl_productto newtable1if anything else then newtable1to newtable2
假设我们有一个表名tbl_product并有duplicacy领域p_pi_code和p_nats_id最大无计数然后先插入从现有的表...的数据创建一个新表
,即从tbl_product到newtable1如果别的然后newtable1到newtable2
CREATE TABLE `newtable2` (
`p_id` int(10) unsigned NOT NULL auto_increment,
`p_status` varchar(45) NOT NULL,
`p_pi_code` varchar(45) NOT NULL,
`p_nats_id` mediumint(8) unsigned NOT NULL,
`p_is_special` tinyint(4) NOT NULL,
PRIMARY KEY (`p_id`)
) ENGINE=InnoDB;
INSERT INTO newtable1 (p_status, p_pi_code, p_nats_id, p_is_special) SELECT
p_status, p_pi_code, p_nats_id, p_is_special FROM tbl_product group by p_pi_code;
INSERT INTO newtable2 (p_status, p_pi_code, p_nats_id, p_is_special) SELECT
p_status, p_pi_code, p_nats_id, p_is_special FROM newtable1 group by p_nats_id;
after that we see all the duplicacy in the field is removed
之后我们看到该字段中的所有重复项都被删除了
回答by user474440
I am using Alter Table
我在用 Alter Table
ALTER IGNORE TABLE jos_city ADD UNIQUE INDEX(`city`);
回答by Abdo
I used @krukid's answer above to do the following on a table with around 70,000 entries:
我使用上面@krukid 的回答在一个包含大约 70,000 个条目的表上执行以下操作:
rs = 'select a, b, count(*) as c from table group by 1, 2 having c > 1'
# get a hashmap
dups = MyModel.connection.select_all(rs)
# convert to array
dupsarr = dups.map { |i| [i.a, i.b, i.c] }
# delete dups
dupsarr.each do |a,b,c|
ActiveRecord::Base.connection.execute("delete from table_name where a=#{MyModel.sanitize(a)} and b=#{MyModel.sanitize(b)} limit #{c-1}")
end
回答by Matthew Pautzke
Here is the rails solution I came up with. May not be the most efficient, but not a big deal if its a one time migration.
这是我想出的 rails 解决方案。可能不是最有效的,但如果是一次性迁移,这没什么大不了的。
distinct_records = MyTable.all.group(:distinct_column_1, :distinct_column_2).map {|mt| mt.id}
duplicates = MyTable.all.to_a.reject!{|mt| distinct_records.include? mt.id}
duplicates.each(&:destroy)
First, groups by all columns that determine uniqueness, the example shows 2 but you could have more or less
首先,按确定唯一性的所有列进行分组,示例显示 2,但您可以有更多或更少
Second, selects the inverse of that group...all other records
其次,选择该组的逆...所有其他记录
Third, Deletes all those records.
第三,删除所有这些记录。