SQL: Delete all but one duplicate record
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/5882649/
Delete all but one duplicate record
Asked by James P.
I have a table that is supposed to keep a trace of visitors to a given profile (user id to user id pair). It turns out my SQL query was a bit off and is producing multiple pairs instead of single ones as intended. With hindsight I should have enforced a unique constraint on each id+id pair.
Now, how could I go about cleaning up the table? What I want to do is delete all duplicate pairs and leave just one.
So for example change this:
23515 -> 52525 date_visited
23515 -> 52525 date_visited
23515 -> 52525 date_visited
12345 -> 54321 date_visited
12345 -> 54321 date_visited
12345 -> 54321 date_visited
12345 -> 54321 date_visited
23515 -> 52525 date_visited
...
Into this:
23515 -> 52525 date_visited
12345 -> 54321 date_visited
Update: Here is the table structure as requested:
Column        Type      Attributes  Null  Default            Extra
id            int(10)   UNSIGNED    No    None               AUTO_INCREMENT
profile_id    int(10)   UNSIGNED    No    0
visitor_id    int(10)   UNSIGNED    No    0
date_visited  timestamp             No    CURRENT_TIMESTAMP
Answered by Frank Schmitt
ANSI SQL Solution
Use group by in a subquery:
delete from my_tab where id not in
(select min(id) from my_tab group by profile_id, visitor_id);
You need some kind of unique identifier (here, I'm using id).
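To see the effect of the statement above on the data from the question, here is a minimal sketch. It assumes an in-memory SQLite database (through Python's standard sqlite3 module) as a stand-in for the real server, and the column types are SQLite approximations of the table structure shown in the question.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE my_tab (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        profile_id INTEGER NOT NULL DEFAULT 0,
        visitor_id INTEGER NOT NULL DEFAULT 0,
        date_visited TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
    )
    """
)

# Duplicate pairs in the same order as the example in the question.
pairs = [(23515, 52525)] * 3 + [(12345, 54321)] * 4 + [(23515, 52525)]
conn.executemany(
    "INSERT INTO my_tab (profile_id, visitor_id) VALUES (?, ?)", pairs
)

# Keep the row with the smallest id in each (profile_id, visitor_id) group
# and delete every other row.
conn.execute(
    """
    DELETE FROM my_tab WHERE id NOT IN
        (SELECT MIN(id) FROM my_tab GROUP BY profile_id, visitor_id)
    """
)
conn.commit()

remaining = conn.execute(
    "SELECT id, profile_id, visitor_id FROM my_tab ORDER BY id"
).fetchall()
print(remaining)  # [(1, 23515, 52525), (4, 12345, 54321)]
```

Using MIN(id) keeps the earliest inserted row of each pair; MAX(id) would keep the most recent one instead.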
MySQL Solution
As pointed out by @JamesPoulson, this statement fails in MySQL, which does not allow a subquery to select directly from the table being deleted from; the working solution (as shown in James' answer) wraps the subquery in a derived table:
delete from `my_tab` where id not in
( SELECT * FROM
(select min(id) from `my_tab` group by profile_id, visitor_id) AS temp_tab
);
Answered by James P.
Here's Frank Schmitt's solution with a small workaround that wraps the subquery in a temporary (derived) table, temp_tab, so that it works on MySQL:
delete from `my_tab` where id not in
( SELECT * FROM
(select min(id) from `my_tab` group by profile_id, visitor_id) AS temp_tab
)
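Once the duplicates are gone, the unique constraint mentioned in the question can be put in place so they cannot come back. The sketch below is only an illustration under stated assumptions: it uses an in-memory SQLite database through Python's sqlite3 module rather than MySQL (SQLite accepts the wrapped statement as well), the table is reduced to three columns, and the index name uq_profile_visitor is hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_tab ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " profile_id INTEGER NOT NULL,"
    " visitor_id INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO my_tab (profile_id, visitor_id) VALUES (?, ?)",
    [(23515, 52525), (23515, 52525), (12345, 54321), (12345, 54321)],
)

# Same statement as above: the inner SELECT is wrapped in a derived table.
conn.execute(
    """
    DELETE FROM my_tab WHERE id NOT IN
        (SELECT * FROM
            (SELECT MIN(id) FROM my_tab GROUP BY profile_id, visitor_id) AS temp_tab
        )
    """
)

# With the duplicates gone, a unique index can be created on the pair.
# The index name is hypothetical.
conn.execute(
    "CREATE UNIQUE INDEX uq_profile_visitor ON my_tab (profile_id, visitor_id)"
)

# A repeated pair is now rejected instead of being stored a second time.
try:
    conn.execute(
        "INSERT INTO my_tab (profile_id, visitor_id) VALUES (23515, 52525)"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM my_tab").fetchone()[0]
print(row_count, duplicate_rejected)  # 2 True
```

Creating the index before the cleanup would fail, because the existing duplicates already violate it, so the order of the two steps matters.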
Answered by Vik Wilder
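This will work on databases that allow deleting through a CTE (such as SQL Server):
-- Number the rows within each (profile_id, visitor_id) pair, lowest id first.
-- Partitioning by ID alone would never find duplicates, since ID is unique.
WITH NewCTE AS
(
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY profile_id, visitor_id ORDER BY id) AS RowNumber
    FROM table_name
)
-- Delete everything except the first row of each pair.
DELETE FROM NewCTE WHERE RowNumber > 1;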
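A row-numbering delete for the table in the question can be sketched as follows, with the rows partitioned by the (profile_id, visitor_id) pair so that only the first row of each pair survives. The assumptions here: an in-memory SQLite database through Python's sqlite3 module stands in for the real server, SQLite is at version 3.25 or later (needed for window functions), and the delete goes through an IN subquery because SQLite cannot delete through a CTE.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " profile_id INTEGER NOT NULL,"
    " visitor_id INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO table_name (profile_id, visitor_id) VALUES (?, ?)",
    [(23515, 52525)] * 3 + [(12345, 54321)] * 4 + [(23515, 52525)],
)

# Number the rows inside each (profile_id, visitor_id) group, lowest id first,
# then delete every row that is not the first of its group.
conn.execute(
    """
    DELETE FROM table_name WHERE id IN (
        SELECT id FROM (
            SELECT id,
                   ROW_NUMBER() OVER (
                       PARTITION BY profile_id, visitor_id ORDER BY id
                   ) AS row_num
            FROM table_name
        ) AS numbered
        WHERE row_num > 1
    )
    """
)
conn.commit()

kept_ids = [row[0] for row in conn.execute("SELECT id FROM table_name ORDER BY id")]
print(kept_ids)  # [1, 4]
```

Changing the ORDER BY inside the window (for example to id DESC) changes which row of each pair is kept.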
Answered by gmadd
Select all unique rows
Copy them to a new temp table
Truncate original table
Copy temp table data to original table
That's what I'd do. I'm not sure if there's a single query that would do all this for you.
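The four steps above can be sketched as follows. This is a rough illustration, assuming an in-memory SQLite database through Python's sqlite3 module; the temporary table name my_tab_unique is hypothetical, and keeping the lowest id and earliest date_visited of each pair is a choice made for the example, not something the answer specifies.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_tab ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " profile_id INTEGER NOT NULL,"
    " visitor_id INTEGER NOT NULL,"
    " date_visited TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP)"
)
conn.executemany(
    "INSERT INTO my_tab (profile_id, visitor_id) VALUES (?, ?)",
    [(23515, 52525)] * 3 + [(12345, 54321)] * 4 + [(23515, 52525)],
)

# First two steps: select one row per pair and copy it to a temporary table.
# my_tab_unique is a hypothetical name.
conn.execute(
    """
    CREATE TEMPORARY TABLE my_tab_unique AS
    SELECT MIN(id) AS id, profile_id, visitor_id,
           MIN(date_visited) AS date_visited
    FROM my_tab
    GROUP BY profile_id, visitor_id
    """
)

# Third step: empty the original table. SQLite has no TRUNCATE statement,
# so a plain DELETE is used here instead.
conn.execute("DELETE FROM my_tab")

# Fourth step: copy the unique rows back, then drop the temporary table.
conn.execute(
    "INSERT INTO my_tab (id, profile_id, visitor_id, date_visited) "
    "SELECT id, profile_id, visitor_id, date_visited FROM my_tab_unique"
)
conn.execute("DROP TABLE my_tab_unique")
conn.commit()

rows = conn.execute(
    "SELECT id, profile_id, visitor_id FROM my_tab ORDER BY id"
).fetchall()
print(rows)  # [(1, 23515, 52525), (4, 12345, 54321)]
```

On a live table, rows inserted between the copy and the truncate would be lost, so this approach is best run while writes are paused.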
Answered by akshay choukekar
If you are using SQL, you can manually delete the duplicate rows while keeping one entry. Just follow this procedure:
- Go into your table where you have duplicate data.
- Apply a filter to isolate the duplicate rows for each individual id.
- Select all the rows you want to delete.
- Press delete and save the result.
- Repeat the process for each id you have duplicate entries for.
It's a long procedure, but you can see the results immediately.
Hope this solution worked for you!!