postgresql 试图找到列中的第二大值（postgres sql）

Question

提问by seb

I am trying to find the second largest value in a column and only the second largest value.

我试图找到列中的第二大值，并且只找到第二大值。

select a.name, max(a.word) as word
from apple a
where a.word < (select max(a.word) from apple a)
group by a.name;

For some reason, what I have now returns the second largest value AND all the lower values also but fortunately avoids the largest value.

出于某种原因，我现在所拥有的返回第二大值和所有较低的值，但幸运的是避免了最大值。

Is there a way to fix this?

有没有办法来解决这个问题？

Answer 1

回答by seb

Here's another conceptually simple solution that's been running for me in .1 milliseconds on a table of 21 million rows, according to EXPLAIN ANALYZE. It returns nothing in the case where there's only one value.

根据 EXPLAIN ANALYZE 的说法，这是另一个概念上简单的解决方案，它在 0.1 毫秒内在 2100 万行的表上为我运行。在只有一个值的情况下，它不返回任何内容。

SELECT a.name, 
(SELECT word FROM apple ap WHERE ap.name=a.name ORDER BY word ASC OFFSET 1 LIMIT 1) 
FROM apple a

Note that my table already had existing indices on name, word, and (name, word), which allows me to use ORDER BY like that.

请注意，我的表已经有关于 name、word 和 (name, word) 的现有索引，这允许我像这样使用 ORDER BY。

Answer 2

回答by Michael Buen

The simplest, albeit inefficient(array can exhaust memory):

最简单但效率低下的（数组会耗尽内存）：

select student, (array_agg(grade order by grade desc))[2]
from 
student_grades
group by student

The efficient one:

高效的一个：

create aggregate two_elements(anyelement)
(
sfunc = array_limit_two,
stype = anyarray,
initcond = '{}'
);

create or replace function array_limit_two(anyarray, anyelement) returns anyarray
as 
$$
begin
    if array_upper(,1) = 2 then
        return ;
    else
        return array_append(, );
    end if;
end;
$$ language 'plpgsql';

Test data:

测试数据：

create table student_grades
(
student text,
grade int
);



insert into student_grades values 
('john',70),
('john',80),
('john',90),
('john',100);


insert into student_grades values
('paul',20),
('paul',10),
('paul',50),
('paul',30);


insert into student_grades values
('george',40);

Test code:

测试代码：

-- second largest
select student, coalesce( (two_elements(grade order by grade desc))[2], max(grade) /* min would do too, since it's one element only */ )
from 
student_grades
group by student


-- second smallest
select student, coalesce( (two_elements(grade order by grade))[2], max(grade) /* min would do too, since it's one element only */ )
from 
student_grades
group by student

Output:

输出：

q_and_a=# -- second largest
q_and_a=# select student, coalesce( (two_elements(grade order by grade desc))[2], max(grade) /* min would do too, since it's one element only */ )
q_and_a-# from
q_and_a-# student_grades
q_and_a-# group by student;
 student | coalesce
---------+----------
 george  |       40
 john    |       90
 paul    |       30
(3 rows)


q_and_a=#
q_and_a=# -- second smallest
q_and_a=# select student, coalesce( (two_elements(grade order by grade))[2], max(grade) /* min would do too, since it's one element only */ )
q_and_a-# from
q_and_a-# student_grades
q_and_a-# group by student;
 student | coalesce
---------+----------
 george  |       40
 john    |       80
 paul    |       20
(3 rows)

EDIT@diesel The simplest(and efficient also):

编辑@diesel 最简单（也有效）：

-- second largest
select student, array_min(two_elements(grade order by grade desc))
from 
student_grades
group by student;

-- second smallest
select student, array_max(two_elements(grade order by grade))
from 
student_grades
group by student;

The array_max function:

array_max 函数：

create or replace function array_min(anyarray) returns anyelement
as
$$
select min(unnested) from( select unnest() unnested ) as x
$$ language sql;

create or replace function array_max(anyarray) returns anyelement
as
$$
select max(unnested) from( select unnest() unnested ) as x
$$ language sql;

EDIT

编辑

Could be the simplest and efficient of all, if only Postgresql would make array_max a built-in function and facilitates LIMIT clause on aggregations :-) LIMIT clause on aggregation is my dream feature on Postgresql

可能是最简单和有效的，如果只有 Postgresql 将 array_max 变成一个内置函数并促进聚合上的 LIMIT 子句:-) 聚合上的 LIMIT 子句是我在 Postgresql 上的梦想功能

select student, array_max( array_agg(grade order by grade limit 2) )
from 
student_grades
group by student;

While that LIMIT on aggregation is not yet available, use this:

虽然聚合的 LIMIT 尚不可用，但请使用：

-- second largest
select student, 

    array_min
    (

        array ( 
               select grade from student_grades 
               where student = x.student order by grade desc limit 2 )

    )

from 
student_grades x
group by student;


-- second smallest
select student, 

    array_max
    (

        array ( 
               select grade from student_grades 
               where student = x.student order by grade limit 2 )

    )

from 
student_grades x
group by student;

Answer 3

回答by Ken Downs

This is also brute force, but is guaranteed to only pass the table exactly and only once:

这也是蛮力，但保证只准确地传递表一次：

select name,word
  from (
         select name,word
              , row_number() over (partition by name 
                                       order by word desc)
                as rowNum
           from apple
       ) x
 where rowNum = 2

This version below may perform better if you have a covering index on (name,word) and there is a high count of word values per name:

如果您在 (name,word) 上有覆盖索引并且每个名称的单词值计数很高，则此版本的性能可能会更好：

with recursive myCte as
(
 select name,max(word) as word
      , 1 as rowNum
   from apple
  group by name
  union all
 select par.name
      , (select max(word) as word
           from apple 
          where name = par.name
            AND word < par.word
        ) as word
      , 2 as rowNum
   from myCte par
  where par.rowNum = 1
)
select * from myCte where rownum = 2

Answer 4

回答by a_horse_with_no_name

SELECT *
FROM (
  SELEC name, 
        dense_rank() over (partition by name order by word desc) as word_rank,
        count(*) over (partition by name) as name_count
  FROM apple
) t
WHERE (word_rank = 2 OR name_count = 1)

Edit:
The name_count = 1takes care of those cases where only a single row is present for a specific name.

编辑：
该name_count = 1负责的那些情况下，只有一行存在特定名称。

Using dense_rank()instead of rank()makes sure there isa row with word_rank = 2 as dense_rank makes sure there are no gaps

使用dense_rank()的，而不是rank()确保有是有word_rank行= 2的DENSE_RANK确保不存在任何差距

Answer 5

回答by ijw

Umm, you don't just mean:

嗯，你不仅仅是指：

select a.name, max(a.word) as word
from apple a
where a.word < (select max(b.word) from apple b WHERE a.name = b.name)
group by a.name;

do you? One row per name returning the second highest value per name (or no row if there is no second highest value).

你？每个名称一行返回每个名称的第二大值（如果没有第二大值，则没有行）。

If that's what you want, your query was just missing a constraint, although I suspect the above is probably two table scans if PostgreSQL has the sense to convert it to a JOIN.

如果这就是您想要的，那么您的查询只是缺少一个约束，尽管我怀疑如果 PostgreSQL 有将其转换为 JOIN 的意义，上述内容可能是两个表扫描。

Answer 6

回答by Michael Buen

Another approach, use RANK:

另一种方法，使用 RANK：

with ranking as
(
    select student, grade, rank() over(partition by student order by grade desc) as place
    from 
    student_grades
)
select * 
from
ranking
where 
    (student, place) 
    in 
    (
        select student, max(place)
        from ranking
        where place <= 2
        group by student
    )

Second to the MIN:

MIN 次之：

with ranking as
(
    select student, grade, 
        rank() 
        -- just change DESC to ASC
        over(partition by student order by grade ASC ) as place
    from 
    student_grades
)
select * 
from
ranking
where 
    (student, place) 
    in 
    (                
        select student, max(place) -- still max
        from ranking
        where place <= 2
        group by student
    )

Answer 7

回答by RichardTheKiwi

A very brute force query, but it works

一个非常暴力的查询，但它有效

select a.name, a.word
from apple a
where (select count(distinct b.word) from apple b
    where b.word > a.word) = 1

postgresql 试图找到列中的第二大值（postgres sql）

提问by seb

回答by seb

回答by Michael Buen

回答by Ken Downs

回答by a_horse_with_no_name

回答by ijw

回答by Michael Buen

回答by RichardTheKiwi

相关推荐

最近更新

标签

postgresql 试图找到列中的第二大值（postgres sql）

提问by seb

回答by seb

回答by Michael Buen

回答by Ken Downs

回答by a_horse_with_no_name

回答by ijw

回答by Michael Buen

回答by RichardTheKiwi

相关推荐

在 PostgreSQL 中查找重叠的日期范围

postgresql 当子查询可以为 NULL 时，SQL“IN subquery”

Hibernate，Postgresql：列“x”的类型为 oid 但表达式的类型为字节

postgresql Postgresql全文搜索部分单词

相关推荐

最近更新

标签