删除除 MySQL 中的一行之外的所有重复行？

Question

提问by Highway of Life

How would I delete all duplicate data from a MySQL Table?

如何从 MySQL 表中删除所有重复数据？

For example, with the following data:

例如，使用以下数据：

SELECT * FROM names;

+----+--------+
| id | name   |
+----+--------+
| 1  | google |
| 2  | yahoo  |
| 3  | msn    |
| 4  | google |
| 5  | google |
| 6  | yahoo  |
+----+--------+

I would use SELECT DISTINCT name FROM names;if it were a SELECTquery.

SELECT DISTINCT name FROM names;如果它是一个SELECT查询，我会使用。

How would I do this with DELETEto only remove duplicates and keep just one record of each?

我将如何执行此操作DELETE以仅删除重复项并仅保留每个记录？

Answer 1

回答by martin.masa

Editor warning: This solution is computationally inefficient and may bring down your connection for a large table.

编辑器警告：此解决方案在计算上效率低下，可能会导致大表的连接中断。

NB - You needto do this first on a test copyof your table!

注意 - 您需要先在表的测试副本上执行此操作！

When I did it, I found that unless I also included AND n1.id <> n2.id, it deleted every row in the table.

当我这样做时，我发现除非我还包含AND n1.id <> n2.id，否则它会删除表中的每一行。

If you want to keep the row with the lowest idvalue:

DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name

If you want to keep the row with the highest idvalue:

DELETE n1 FROM names n1, names n2 WHERE n1.id < n2.id AND n1.name = n2.name

如果要保留具有最低id值的行：

DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name

如果要保留具有最高id值的行：

DELETE n1 FROM names n1, names n2 WHERE n1.id < n2.id AND n1.name = n2.name

I used this method in MySQL 5.1

我在 MySQL 5.1 中使用了这种方法

Not sure about other versions.

不确定其他版本。

Update: Since people Googling for removing duplicates end up here
Although the OP's question is about DELETE, please be advised that using INSERTand DISTINCTis much faster. For a database with 8 million rows, the below query took 13 minutes, while using DELETE, it took more than 2 hours and yet didn't complete.

更新：由于人们在谷歌搜索中删除重复项最终在这里
虽然 OP 的问题是关于DELETE，但请注意使用INSERT和DISTINCT速度要快得多。对于一个有 800 万行的数据库，下面的查询用了 13 分钟，而使用时DELETE，用了 2 个多小时，但没有完成。

INSERT INTO tempTableName(cellId,attributeId,entityRowId,value)
    SELECT DISTINCT cellId,attributeId,entityRowId,value
    FROM tableName;

Answer 2

回答by OMG Ponies

If you want to keep the row with the lowest idvalue:

如果要保留具有最低id值的行：

DELETE FROM NAMES
 WHERE id NOT IN (SELECT * 
                    FROM (SELECT MIN(n.id)
                            FROM NAMES n
                        GROUP BY n.name) x)

If you want the idvalue that is the highest:

如果您想要id最高的值：

DELETE FROM NAMES
 WHERE id NOT IN (SELECT * 
                    FROM (SELECT MAX(n.id)
                            FROM NAMES n
                        GROUP BY n.name) x)

The subquery in a subquery is necessary for MySQL, or you'll get a 1093 error.

MySQL 需要子查询中的子查询，否则会出现 1093 错误。

删除除 MySQL 中的一行之外的所有重复行？

提问by Highway of Life

回答by martin.masa

回答by OMG Ponies

相关推荐

最近更新

标签

删除除 MySQL 中的一行之外的所有重复行？

提问by Highway of Life

回答by martin.masa

回答by OMG Ponies

相关推荐

MySQL SQL JOIN 多对多

MySQL 搜索引擎 Lucene 与数据库搜索

MySQL SQL 创建表错误

MySQL bash - SQL 查询输出到变量

相关推荐

最近更新

标签