How to get a top count for distinct records in Oracle?

Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and authors, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/13709382/

Time: 2020-09-19 01:18:31  Source: igfitidea

Tags: sql, oracle, count, distinct

Asked by thursdaysgeek

I have a table with a lot of records in which some field combinations are duplicated. For each Field1 value, I want the Field2 value that occurs most often.

So, if my table has data like below:

 ID     Field1     Field2  
  1      A          10  
  2      A          12 
  3      B          5  
  4      A          10  
  5      B          5  
  6      A          10  
  7      B          8
  8      B          5
  9      A          10

I can select distinct and get counts:

select distinct Field1, Field2, count(Field1)
from Table1
group by Field1, Field2
order by Field1, count(Field1) desc;

And that will give me

Field1    Field2     Count
A         10         4
A         12         1
B          5         3
B          8         1

However, I only want the records for each Field1 that have the highest count. I've been fighting with rank() over partition and subqueries, but haven't found the correct syntax for using two fields for uniqueness and selecting the top record by count. I've been searching, and I'm sure this has been asked, but I can't find it.

I want to get the following

Field1     Field2       (optional) Count 
 A          10           4
 B           5           3

The goal is to look at a table that has just a little bit of incorrect data (linking between field1 and field2 wrong) and determine what it SHOULD be based on what it usually is. I don't know how many bad records there are, so eliminating Count below a certain threshold would work, but seems a bit kludgy.

If it is better, I can make a temp table to put my distinct values into and then select from there, but it doesn't seem like that should be necessary.

Answered by ivanatpr

I think this is what you're looking for:

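select field1, field2, cnt
from (
  -- rank each (field1, field2) pair within its field1 group, most frequent first
  select field1, field2, cnt,
         rank() over (partition by field1 order by cnt desc) rnk
  from (
    -- group by already returns one row per pair, so DISTINCT
    -- and an ORDER BY inside the subquery are not needed
    select Field1, Field2, count(Field1) cnt
    from Table1
    group by Field1, Field2
  )
)
where rnk = 1;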

SQL Fiddle: http://sqlfiddle.com/#!4/fe96d/3
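
As a variation on the query above, here is a minimal sketch that drops one level of nesting. It relies on Oracle evaluating analytic functions after GROUP BY, so rank() can order by the aggregate directly. The table name Table1 is taken from the query above, and count(*) is used in place of count(Field1), which assumes Field1 is never NULL.

```sql
-- Sketch: aggregate and rank in the same query block.
-- Assumes a table Table1(Field1, Field2) as in the answer above.
select field1, field2, cnt
from (
  select field1,
         field2,
         count(*) as cnt,
         -- the analytic function runs after GROUP BY, so it can see count(*)
         rank() over (partition by field1 order by count(*) desc) as rnk
  from Table1
  group by field1, field2
)
where rnk = 1
order by field1;
```

The outer query is still required because an analytic function cannot be referenced in the WHERE clause of the block that computes it.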

Answered by Justin Cave

It's a bit inelegant thanks to the multiple layers of nested subqueries. However, it should be reasonably efficient, and the steps in the SQL should be reasonably easy to follow.

with x as (
  select 1 id, 'A' field1, 10 field2 from dual union all
  select 2, 'A', 12 from dual union all
  select 3, 'B', 5 from dual union all
  select 4, 'A', 10 from dual union all
  select 5, 'B', 5 from dual union all
  select 6, 'A', 10 from dual union all
  select 7, 'B', 8 from dual union all
  select 8, 'B', 5 from dual union all
  select 9, 'A', 10 from dual
)
select field1,
       field2,
       cnt
  from (select field1,
               field2,
               cnt,
               rank() over (partition by field1
                                order by cnt desc) rnk
          from (select field1, field2, count(*) cnt
                  from x
                 group by field1, field2))
 where rnk = 1;

F     FIELD2        CNT
- ---------- ----------
A         10          4
B          5          3
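
One thing to keep in mind with rank(): if two Field2 values tie for the highest count within a Field1 group, both get rank 1 and both rows are returned. If exactly one row per Field1 is wanted, a possible sketch is to switch to row_number() with an explicit tiebreaker. The sample data below is made up to force a tie, and picking the smaller Field2 on a tie is only an assumption for illustration.

```sql
-- Sketch: exactly one row per field1, even when counts tie.
-- The data in x is hypothetical; 'A' has a 2-2 tie between 10 and 12.
with x as (
  select 'A' field1, 10 field2 from dual union all
  select 'A', 10 from dual union all
  select 'A', 12 from dual union all
  select 'A', 12 from dual union all
  select 'B', 5 from dual
)
select field1, field2, cnt
from (
  select field1,
         field2,
         count(*) as cnt,
         -- tiebreaker (assumed): prefer the smaller field2 when counts are equal
         row_number() over (partition by field1
                                order by count(*) desc, field2) as rn
  from x
  group by field1, field2
)
where rn = 1
order by field1;
-- returns (A, 10, 2) and (B, 5, 1)
```

With rank() in place of row_number(), the same data would return both (A, 10, 2) and (A, 12, 2), which may actually be preferable when a tie should be reviewed by hand.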

Answered by a_horse_with_no_name

And a third approach ;)

select field1,
       field2,
       max_cnt
from (
  select field1, 
         field2, 
         cnt,
         max(cnt) over (partition by field1, field2) as max_cnt,
         row_number() over (partition by field1 order by cnt desc) as rn
  from (
      select field1, 
             field2, 
             count(*) over (partition by Field1, Field2) as cnt
      from idlist
  ) t1 
) t2
where max_cnt = cnt 
and rn = 1

SQLFiddle: http://sqlfiddle.com/#!4/8461f/1
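
Coming back to the goal stated in the question (finding the rows whose Field2 differs from what it usually is for that Field1), the following is a hedged sketch rather than part of any answer above. It uses Oracle's STATS_MODE aggregate, which returns the most frequent value in a group and picks one arbitrarily if several values tie. The table name Table1 and its ID column are assumptions based on the sample data in the question.

```sql
-- Sketch: list rows whose field2 deviates from the most common
-- field2 for their field1. Assumes Table1(ID, Field1, Field2).
select t.id,
       t.field1,
       t.field2,
       m.usual_field2
from Table1 t
join (
  -- stats_mode returns the most frequent field2 per field1
  select field1, stats_mode(field2) as usual_field2
  from Table1
  group by field1
) m on m.field1 = t.field1
where t.field2 <> m.usual_field2
order by t.id;
```

On the sample data from the question this should flag ID 2 (A, 12) and ID 7 (B, 8). Rows with a NULL Field2 are not flagged by the <> comparison and would need separate handling.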
