如何编写 SQL 查询来识别特定字段中的重复值?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/15374902/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 14:12:34  来源:igfitidea点击:

How do I write an SQL query to identify duplicate values in a specific field?

sqlsql-serversql-server-2008tsql

提问by jbobbylopez

This is the table I'm working with:

这是我正在使用的表:

The table

I would like to identify only the ReviewIDs that have duplicate deduction IDs for different parameters.

桌子

我想仅识别针对不同参数具有重复扣减 ID 的 ReviewID。

For example, in the image above, ReviewID 114 has two different parameter IDs, but both records have the same deduction ID.

例如,在上图中,ReviewID 114 具有两个不同的参数 ID,但两条记录具有相同的扣减 ID。

For my purposes, this record (ReviewID 114) has an error. There should not be two or more unique parameter IDs that have the same deduction ID for a single ReviewID.

就我而言,此记录 (ReviewID 114) 有错误。对于单个 ReviewID,不应有两个或多个具有相同扣减 ID 的唯一参数 ID。

I would like write a query to identify these types of records, but my SQL skills aren't there yet. Help?

我想写一个查询来识别这些类型的记录,但我的 SQL 技能还没有。帮助?

Thanks!

谢谢!

Update 1:I'm using TSQL (SQL Server 2008) if that helps
Update 2:The output that I'm looking for would be the same as the image above, minus any records that do not match the criteria I've described.

更新 1:如果有帮助,我正在使用 TSQL (SQL Server 2008)
更新 2:我正在寻找的输出将与上图相同,减去与我描述的条件不匹配的任何记录。

Cheers!

干杯!

采纳答案by Ellesedil

SELECT * FROM table t1 INNER JOIN (
    SELECT review_id, deduction_id FROM table
    GROUP BY review_id, deduction_id
    HAVING COUNT(parameter_id) > 1
) t2 ON t1.review_id = t2.review_id AND t1.deduction_id = t2.deduction_id;

http://www.sqlfiddle.com/#!3/d858f/3

http://www.sqlfiddle.com/#!3/d858f/3

If it is possible to have exact duplicates and that is ok, you can modify the HAVING clause to COUNT(DISTINCT parameter_id).

如果可能有精确的重复项并且没问题,您可以将 HAVING 子句修改为 COUNT(DISTINCT parameter_id)。

回答by DeanOC

Select ReviewID, deduction_ID from Table
Group By ReviewID, deduction_ID
Having count(ReviewID) > 1

http://www.sqlfiddle.com/#!3/6e113/3has an example

http://www.sqlfiddle.com/#!3/6e113/3有一个例子

回答by Michael L.

If I understand the criteria: For each combination of ReviewIDand deduction_idyou can have only one parameter_id and you want a query that produces a result withoutthe ReviewIDs that break those rules (rather than identifying those rows that do). This will do that:

如果我理解的标准:对于每个组合ReviewIDdeduction_id你只能有一个parameter_id并且要产生结果的查询,而不在于打破这些规则(而非识别那些做行)ReviewIDs。这将做到这一点:

;WITH review_errors AS (
  SELECT  ReviewID
  FROM test
  GROUP BY ReviewID,deduction_ID
  HAVING COUNT(DISTINCT parameter_id) > 1
)
SELECT t.*
FROM test t
  LEFT JOIN review_errors r
    ON t.ReviewID = r.ReviewID
WHERE r.ReviewID IS NULL

To explain: review_errorsis a common table expression(think of it as a named sub-query that doesn't clutter up the main query). It selects the ReviewIDs that break the criteria. When you left join on it, it selects all rows from the left table regardless of whether they match the right table and only the rows from the right table that match the left table. Rows that do not match will have nulls in the columns for the right-hand table. By specifying WHERE r.ReviewID IS NULLyou eliminate the rows from the left hand table that match the right hand table.

解释一下:review_errors是一个公共表表达式(将其视为一个命名的子查询,不会使主查询混乱)。它选择违反标准的 ReviewID。当您对它进行左连接时,它会选择左表中的所有行,无论它们是否与右表匹配,并且仅选择右表中与左表匹配的行。不匹配的行将在右侧表的列中具有空值。通过指定,WHERE r.ReviewID IS NULL您可以从左侧表中消除与右侧表匹配的行。

SQL Fiddle

SQL小提琴