Removing duplicate rows (based on values from multiple columns) from SQL table

Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/30243945/

Time: 2020-09-01 03:37:18  Source: igfitidea

Tags: sql, sql-server, tsql, join, duplicate-removal

Asked by Vikram

I have the following SQL table:

AR_Customer_ShipTo

+--------------+------------+-------------------+------------+
| ARDivisionNo | CustomerNo |   CustomerName    | ShipToCode |
+--------------+------------+-------------------+------------+
|           00 | 1234567    | Test Customer     |          1 |
|           00 | 1234567    | Test Customer     |          2 |
|           00 | 1234567    | Test Customer     |          3 |
|           00 | ARACODE    | ARACODE Customer  |          1 |
|           00 | ARACODE    | ARACODE Customer  |          2 |
|           01 | CBE1EX     | Normal Customer   |          1 |
|           02 | ZOCDOC     | Normal Customer-2 |          1 |
+--------------+------------+-------------------+------------+

(ARDivisionNo, CustomerNo, ShipToCode) form the primary key for this table.

As you can see, the first 3 rows belong to the same customer (Test Customer), who has different ShipToCodes: 1, 2 and 3. The same is true of the second customer (ARACODE Customer). Normal Customer and Normal Customer-2 each have only 1 record with a single ShipToCode.

Now, I would like to query this table and get a result with only 1 record per customer. So, for any customer with more than 1 record, I would like to keep the record with the highest value of ShipToCode.

I tried various things:

(1) I can easily get the list of customers with only one record in the table.

(2) With the following query, I am able to get the list of all customers who have more than one record in the table.

[Query-1]

SELECT ARDivisionNo, CustomerNo
FROM AR_Customer_ShipTo 
GROUP BY ARDivisionNo, CustomerNo
HAVING COUNT(*) > 1;

(3) Now, I cannot figure out how to iterate through the records returned by the above query in order to select the proper ShipToCode for each of them.

If I do something like:

[Query-2]

SELECT TOP 1 ARDivisionNo, CustomerNo, CustomerName, ShipToCode  
FROM AR_Customer_ShipTo 
WHERE ARDivisionNo = '00' and CustomerNo = '1234567'
ORDER BY ShipToCode DESC

Then I can get the appropriate record for (00-1234567-Test Customer). Hence, if I could feed all the results from Query-1 into the above query (Query-2), I would get the desired single record for each customer with more than one record. This can be combined with the results from point (1) to achieve the desired end result.

Again, there may be an easier way than the approach I am following. Please let me know how I can do this.

[Note: I have to do this using SQL queries only. I cannot use stored procedures, as I will ultimately execute this using 'Scribe Insight', which only allows me to write queries.]
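One way to combine Query-1 and Query-2 in a single statement, sketched here as an assumption rather than something taken from the answers below, is to compute the highest ShipToCode per (ARDivisionNo, CustomerNo) in a derived table and join it back to the table. The aliases `s` and `m` and the column alias `MaxShipToCode` are illustrative names. Because customers with a single record also have a maximum, this would cover point (1) as well.

```sql
-- Sketch: keep, for every customer, the row holding the highest ShipToCode.
-- Assumes ShipToCode sorts numerically; if it is stored as text, '10' would sort before '2'.
SELECT s.ARDivisionNo, s.CustomerNo, s.CustomerName, s.ShipToCode
FROM AR_Customer_ShipTo AS s
INNER JOIN (
    -- One row per customer with that customer's highest ShipToCode
    SELECT ARDivisionNo, CustomerNo, MAX(ShipToCode) AS MaxShipToCode
    FROM AR_Customer_ShipTo
    GROUP BY ARDivisionNo, CustomerNo
) AS m
    ON  m.ARDivisionNo  = s.ARDivisionNo
    AND m.CustomerNo    = s.CustomerNo
    AND m.MaxShipToCode = s.ShipToCode;
```

Since (ARDivisionNo, CustomerNo, ShipToCode) is the primary key, the join can match at most one row per customer.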

Answered by HaveNoDisplayName

Sample SQL FIDDLE

1) Use a CTE to get the record with the maximum ShipToCode for each customer, based on ARDivisionNo and CustomerNo

WITH cte AS (
  SELECT *,
     ROW_NUMBER() OVER (PARTITION BY ARDivisionNo, CustomerNo ORDER BY ShipToCode DESC) AS [rn]
  FROM t
)
SELECT * FROM cte WHERE [rn] = 1

2) To delete the duplicate records, use a DELETE statement instead of SELECT and change the WHERE clause to rn > 1. Sample SQL FIDDLE

WITH cte AS (
  SELECT *,
     ROW_NUMBER() OVER (PARTITION BY ARDivisionNo, CustomerNo ORDER BY ShipToCode DESC) AS [rn]
  FROM t
)
DELETE FROM cte WHERE [rn] > 1;

SELECT * FROM t;
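The queries in this answer run against a table named `t` from the linked fiddle, whose definition is not shown here. A possible setup using the sample rows from the question might look like the following; the column types and the constraint name `PK_t` are assumptions, since the question does not state them.

```sql
-- Hypothetical setup for table t; column types are assumed, not given in the question.
CREATE TABLE t (
    ARDivisionNo VARCHAR(2)  NOT NULL,
    CustomerNo   VARCHAR(20) NOT NULL,
    CustomerName VARCHAR(50) NOT NULL,
    ShipToCode   INT         NOT NULL,
    CONSTRAINT PK_t PRIMARY KEY (ARDivisionNo, CustomerNo, ShipToCode)
);

-- Sample rows copied from the table in the question
-- (multi-row VALUES needs SQL Server 2008 or later).
INSERT INTO t (ARDivisionNo, CustomerNo, CustomerName, ShipToCode) VALUES
    ('00', '1234567', 'Test Customer',     1),
    ('00', '1234567', 'Test Customer',     2),
    ('00', '1234567', 'Test Customer',     3),
    ('00', 'ARACODE', 'ARACODE Customer',  1),
    ('00', 'ARACODE', 'ARACODE Customer',  2),
    ('01', 'CBE1EX',  'Normal Customer',   1),
    ('02', 'ZOCDOC',  'Normal Customer-2', 1);
```

With this data, the DELETE above should leave four rows: ShipToCode 3 for Test Customer, ShipToCode 2 for ARACODE Customer, and the single rows for Normal Customer and Normal Customer-2.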

Answered by Hart CO

ROW_NUMBER() is great for this:

;WITH cte AS (SELECT *,ROW_NUMBER() OVER(PARTITION BY ARDivisionNo,CustomerNo ORDER BY ShipToCode DESC) AS RN 
              FROM AR_Customer_ShipTo
              )
SELECT * 
FROM  cte
WHERE RN = 1

You mention removing the duplicates; if you want to DELETE, you can simply:

;WITH cte AS (SELECT *,ROW_NUMBER() OVER(PARTITION BY ARDivisionNo,CustomerNo ORDER BY ShipToCode DESC) AS RN 
              FROM AR_Customer_ShipTo
              )
DELETE cte
WHERE RN > 1
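Because the DELETE is destructive, it may be worth previewing its effect first. The following is a minimal sketch, assuming the tool in use accepts several statements in one batch (which may not hold for every environment): it wraps the same DELETE in a transaction, shows the surviving rows, and then rolls back.

```sql
BEGIN TRANSACTION;

WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY ARDivisionNo, CustomerNo
                              ORDER BY ShipToCode DESC) AS RN
    FROM AR_Customer_ShipTo
)
DELETE FROM cte
WHERE RN > 1;

-- Inspect what would remain after the deletion.
SELECT * FROM AR_Customer_ShipTo;

-- Undo the deletion; replace with COMMIT TRANSACTION to keep it.
ROLLBACK TRANSACTION;
```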

The ROW_NUMBER() function assigns a number to each row. PARTITION BY is optional, but is used to start the numbering over for each value in a given field or group of fields, i.e. if you PARTITION BY Some_Date, then for each unique date value the numbering starts over at 1. ORDER BY, of course, defines the order in which rows are numbered, and is required in the ROW_NUMBER() function.
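To see the numbering described above, it can help to select RN for every row without filtering. This sketch uses the question's table and sample data; the final ORDER BY is only there to make the output easy to read.

```sql
SELECT ARDivisionNo, CustomerNo, ShipToCode,
       ROW_NUMBER() OVER (PARTITION BY ARDivisionNo, CustomerNo
                          ORDER BY ShipToCode DESC) AS RN
FROM AR_Customer_ShipTo
ORDER BY ARDivisionNo, CustomerNo, RN;

-- With the sample data from the question, RN per row is:
--   00 | 1234567 | ShipToCode 3 | RN 1
--   00 | 1234567 | ShipToCode 2 | RN 2
--   00 | 1234567 | ShipToCode 1 | RN 3
--   00 | ARACODE | ShipToCode 2 | RN 1
--   00 | ARACODE | ShipToCode 1 | RN 2
--   01 | CBE1EX  | ShipToCode 1 | RN 1
--   02 | ZOCDOC  | ShipToCode 1 | RN 1
```

The numbering restarts at 1 for each (ARDivisionNo, CustomerNo) pair, so RN = 1 always marks the row with the highest ShipToCode for that customer.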

Answered by dnoeth

You didn't specify the version of SQL Server, but ROW_NUMBER is probably supported:

select *
from
 (
  select ...
     ,row_number() 
      over (partition by ARDivisionNo, CustomerNo
            order by ShipToCode desc) as rn 
  from tab
 ) as dt
where rn = 1

Answered by Giorgi Nakeuri

With the row_number function:

SELECT * FROM(
              SELECT ARDivisionNo, CustomerNo, CustomerName, ShipToCode,
              row_number() over(partition by CustomerNo order by ShipToCode desc) rn
              FROM AR_Customer_ShipTo) t
WHERE rn = 1
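Note that this query partitions by CustomerNo alone, while the primary key also includes ARDivisionNo. With the sample data this makes no difference, but if the same CustomerNo could exist under two divisions, the two partitionings would diverge. The sketch below illustrates this with hypothetical rows (customer 'ABC' in divisions '00' and '01'); the CTE name `demo` and both column aliases are made up for the example.

```sql
-- Hypothetical rows: CustomerNo 'ABC' exists in divisions '00' and '01'.
WITH demo AS (
    SELECT *
    FROM (VALUES
        ('00', 'ABC', 1),
        ('00', 'ABC', 2),
        ('01', 'ABC', 1)
    ) AS v (ARDivisionNo, CustomerNo, ShipToCode)
)
SELECT ARDivisionNo, CustomerNo, ShipToCode,
       -- Numbering restarts per CustomerNo only
       ROW_NUMBER() OVER (PARTITION BY CustomerNo
                          ORDER BY ShipToCode DESC) AS rn_by_customer,
       -- Numbering restarts per (ARDivisionNo, CustomerNo)
       ROW_NUMBER() OVER (PARTITION BY ARDivisionNo, CustomerNo
                          ORDER BY ShipToCode DESC) AS rn_by_division_and_customer
FROM demo;
```

Filtering on rn_by_customer = 1 would keep only ('00', 'ABC', 2), whereas filtering on rn_by_division_and_customer = 1 would keep ('00', 'ABC', 2) and ('01', 'ABC', 1), one row per division and customer.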