Oracle: how to select the last occurrence of a duplicated record in Oracle

Disclaimer: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/17105792/

Time: 2020-09-19 01:43:38  Source: igfitidea

How to select last occurrence of duplicating record in oracle

Tags: database, oracle

Asked by Jonathan Handoyo

I am having a problem with an Oracle query whose basic goal is to get the last row of each group of recurring rows, but there is a complication that you'll understand from the data:

Suppose I have a table that looks like this:

ID    | COL1 | COL2 | COL3 | UPDATED_DATE
------|------|------|------|-------------
001   | a    | b    | c    | 14/05/2013
002   | a    | b    | c    | 16/05/2013
003   | a    | b    | c    | 12/05/2013

Since COL1 to COL3 hold the same values in all three rows, these rows are recurring data. The problem is that I want to get the most recently updated row, which here is row #2.

I have an existing query that works if the table has no ID column, but I still need that column, so if anybody could help point out what I'm doing wrong, that'd be great.

select col1,
       col2,
       col3,
       max(updated_date)
  from tbl
 group by col1, col2, col3;

The above query returns me row #2, which is correct, but I still need the ID.

Note: I know that I could wrap the above query in another query that selects the ID column based on the 4 columns, but since I'm dealing with millions of records, the extra query would make the app very inefficient.

Answered by peterm

Try

WITH qry AS 
(
  SELECT ID, COL1, COL2, COL3, updated_date, 
         ROW_NUMBER() OVER (PARTITION BY COL1, COL2, COL3 ORDER BY updated_date DESC) rank
  FROM tbl
)
SELECT ID, COL1, COL2, COL3, updated_date
  FROM qry
 WHERE rank = 1

or

SELECT t1.ID, t2.COL1, t2.COL2, t2.COL3, t2.updated_date
  FROM tbl t1 JOIN
(
  SELECT COL1, COL2, COL3, MAX(updated_date) updated_date
    FROM tbl
   GROUP BY COL1, COL2, COL3
) t2 ON t1.COL1 = t2.COL1
    AND t1.COL2 = t2.COL2
    AND t1.COL3 = t2.COL3
    AND t1.updated_date = t2.updated_date

Output in both cases:

| ID | COL1 | COL2 | COL3 |               UPDATED_DATE |
--------------------------------------------------------
|  2 |    a |    b |    c | May, 16 2013 00:00:00+0000 |

Here is a SQLFiddle demo for both queries.
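
Editor's sketch, not part of the original answer: since the question worries about scanning millions of rows twice, a third option worth trying is Oracle's KEEP (DENSE_RANK LAST ...) aggregate clause, which picks the ID in the same GROUP BY pass that computes the maximum date. The CREATE TABLE and INSERT statements below are a hypothetical setup; the column types are assumptions, since the question only shows the data.

```sql
-- Hypothetical setup: column types are assumed, not given in the question.
CREATE TABLE tbl (
  id           VARCHAR2(3) PRIMARY KEY,
  col1         VARCHAR2(10),
  col2         VARCHAR2(10),
  col3         VARCHAR2(10),
  updated_date DATE
);

INSERT INTO tbl VALUES ('001', 'a', 'b', 'c', DATE '2013-05-14');
INSERT INTO tbl VALUES ('002', 'a', 'b', 'c', DATE '2013-05-16');
INSERT INTO tbl VALUES ('003', 'a', 'b', 'c', DATE '2013-05-12');

-- Single pass over tbl: for each (col1, col2, col3) group, return the id
-- of the row with the latest updated_date. If several rows share that
-- latest date, MAX(id) picks the highest id among them.
SELECT MAX(id) KEEP (DENSE_RANK LAST ORDER BY updated_date) AS id,
       col1,
       col2,
       col3,
       MAX(updated_date) AS updated_date
  FROM tbl
 GROUP BY col1, col2, col3;
```

For the three sample rows this should return the row with ID 002. The three approaches treat ties differently: the ROW_NUMBER() query returns exactly one row per group, but which of the tied rows it picks is arbitrary unless a tie-breaker such as ID is added to its ORDER BY; the join query returns every row that shares the latest date; the KEEP variant returns one row with the highest ID among the tied rows. Which one is fastest on a large table depends on indexes and data distribution, so it is worth comparing execution plans.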