如何从基于多个表的 Oracle 结果集中删除重复项

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/8221644/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-19 00:27:41  来源:igfitidea点击:

How to remove duplicates from an Oracle result set based on multiple tables

sqloracleduplicate-data

提问by user1058946

I have 2 tables: a main APPLICATION table that holds the core data, and a STATUSTRACKING table that reflects the status changes of the core data in the APPLICATION table.

我有 2 个表:一个包含核心数据的主 APPLICATION 表,以及一个反映 APPLICATION 表中核心数据状态变化的 STATUSTRACKING 表。

There is a 1:N relationship between the APPLICATION table and the STATUSTRACKING table. Examples follow:

APPLICATION 表和STATUSTRACKING 表之间存在1:N 关系。示例如下:

-- APPLICATION
SELECT A.ID, A.DECISIONON, A.APPLICATIONSTATUS_ID AS CURRENTSTATUS
FROM APPLICATION A
WHERE A.ID=1099;

Results:
ID      DECISIONON  CURRENTSTATUS
1099    5/05/2009   5

-- STATUS TRACKING
SELECT ST.ID, ST.APPLICATION_ID AS APP_ID, ST.APPLICATIONSTATUS_ID AS STATUS, ST.CREATEDATE 
FROM STATUSTRACKING ST
WHERE ST.APPLICATION_ID=1099
ORDER BY ST.CREATEDATE DESC;

Results:
ID      APP_ID  STATUS  CREATEDATE
44466   1099    5       5/05/2009
44458   1099    7       5/05/2009
10826   1099    8       21/07/2008
9770    1099    7       9/07/2008
4410    1099    3       9/05/2008
3814    1099    2       2/05/2008
3803    1099    1       2/05/2008

Problem: I need to select the STATUSTRACKING record with the MOST RECENT DATE for a particular status. This is my attempt:

问题:我需要为特定状态选择具有最近日期的 STATUSTRACKING 记录。这是我的尝试:

SELECT A.ID AS APP_ID, A.APPLICATIONSTATUS_ID AS CURR_ST, Z.ID AS ST_ID, Z.CREATEDATE, Z.APPLICATIONSTATUS_ID AS OLD_ST
FROM APPLICATION A, STATUSTRACKING Z
WHERE A.APPLICATIONSTATUS_ID in (5,6)
    AND A.ID IN (337,1099,1404,9441)
    AND Z.APPLICATIONSTATUS_ID = 7
    AND Z.APPLICATION_ID=A.ID
ORDER BY A.ID ASC, Z.CREATEDATE DESC;

Results:
APP_ID  CURR_ST ST_ID       CREATEDATE  OLD_ST
337     6       13978       17/08/2008  7
1099    5       44458       5/05/2009   7
1099    5       9770        9/07/2008   7
1404    6       15550       28/08/2008  7
9441    5       49271       3/06/2009   7
9441    5       46058       13/05/2009  7

The problem is the duplicated rows. I must only show the row with the MOST RECENT CreateDate. In the case of App_ID 1099, it would be this record:

问题是重复的行。我必须只显示最近的 CreateDate 行。在 App_ID 1099 的情况下,它将是以下记录:

1099    5       44458       5/05/2009   7

I obviously don't want to exclude the rows that are not duplicated.

我显然不想排除不重复的行。

I thought I was on the right track with this statement which gives the row I'm looking for:

我认为我在这条语句上走在正确的轨道上,它给出了我正在寻找的行:

SELECT A.ID, A.APPLICATION_ID, A.APPLICATIONSTATUS_ID, A.CREATEDATE
FROM    (SELECT * 
            FROM STATUSTRACKING 
            WHERE APPLICATIONSTATUS_ID=7 
                AND APPLICATION_ID=1099 
            ORDER BY CREATEDATE DESC) A
WHERE ROWNUM = 1;

... but I can't seem to get it to work with my main select statement.

...但我似乎无法让它与我的主要选择语句一起使用。

The result set I want should look like this:

我想要的结果集应该是这样的:

APP_ID  CURR_ST ST_ID       CREATEDATE  OLD_ST
337     6       13978       17/08/2008  7
1099    5       44458       5/05/2009   7
1404    6       15550       28/08/2008  7
9441    5       49271       3/06/2009   7
etc. ...

I'm no Oracle/SQL expert, so any help would be appreciated.

我不是 Oracle/SQL 专家,所以任何帮助将不胜感激。

采纳答案by Justin Cave

The easiest approach is probably to use an analytic function. Something like

最简单的方法可能是使用解析函数。就像是

SELECT *
  FROM (
    SELECT A.ID AS APP_ID, 
           A.APPLICATIONSTATUS_ID AS CURR_ST, 
           Z.ID AS ST_ID, 
           Z.CREATEDATE, 
           Z.APPLICATIONSTATUS_ID AS OLD_ST,
           rank() over (partition by a.id order by z.createDate desc) rnk
      FROM APPLICATION A, 
           STATUSTRACKING Z
     WHERE A.APPLICATIONSTATUS_ID in (5,6)
       AND A.ID IN (337,1099,1404,9441)
       AND Z.APPLICATIONSTATUS_ID = 7
       AND Z.APPLICATION_ID=A.ID
    )
 WHERE rnk = 1

If there can be ties, you may want to use the ROW_NUMBERor the DENSE_RANKanalytic function rather than RANK.

如果可能存在联系,您可能希望使用ROW_NUMBERDENSE_RANK解析函数而不是RANK

回答by Sérgio Michels

If I understood correctly, you want the most recent status for the app, right? Question: your create date have the time too (hours,minutes and seconds)? Maybe this query helps you:

如果我理解正确,您想要应用程序的最新状态,对吗?问题:您的创建日期也有时间(小时、分钟和秒)?也许这个查询可以帮助你:

select a.* , st.*
  from application a
     , statustracking st
 where a.id = st.app_id
   and st.createdate = ( select max(st1.createdate)
                           from statustracing st1
                          where st1.app_id = a.id
                            and st1.application_statusid = st.application_statusid  )
   -- filters goes there
   and a.id in (....)
   and st.application_statusid in (...)