Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/14935608/

array of distinct values aggregated from an array column in Postgres

Tags: sql, arrays, postgresql, aggregate-functions

Asked by Michal Podlewski

Suppose we have (in PostgreSQL 9.1) a table with some identifier, a column of type integer[], and some other columns (at least one, although there might be more) of type integer (or any other type that can be summed).

The goal is an aggregate that returns, for each identifier, the sum of the "summable" column and an array of all distinct elements of the array column.

The only way I can find is to use the unnest function on the array column in a subquery and then join it with another subquery that aggregates the "summable" columns.

A simple example is as follows:

CREATE TEMP TABLE a (id integer, aint integer[], summable_val integer);
INSERT INTO a VALUES
    (1, array[1,2,3], 5),
    (2, array[2,3,4], 6),
    (3, array[3,4,5], 2),
    (1, array[7,8,9], 19);

WITH u AS (
    -- one row per distinct (id, array element) pair
    SELECT id, unnest(aint) AS t FROM a GROUP BY 1, 2
),
d AS (
    -- collect the distinct elements back into one array per id
    SELECT id, array_agg(DISTINCT t) AS ar FROM u GROUP BY 1
),
v AS (
    -- second pass over the table: sum the "summable" column per id
    SELECT id, sum(summable_val) AS val FROM a GROUP BY 1
)
SELECT v.id, v.val, d.ar
FROM v
JOIN d ON v.id = d.id;
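
To make the role of the unnest step easier to see, the intermediate result can be inspected on its own. The sketch below inlines the sample rows as a VALUES list (an assumption made only so the snippet runs without the temp table) and uses SELECT DISTINCT in place of GROUP BY 1,2, which should give the same set of rows:

```sql
-- Sample rows inlined so the snippet is self-contained (same data as table a).
WITH a(id, aint, summable_val) AS (
    VALUES (1, ARRAY[1,2,3], 5),
           (2, ARRAY[2,3,4], 6),
           (3, ARRAY[3,4,5], 2),
           (1, ARRAY[7,8,9], 19)
)
-- Expand every array into rows, then drop duplicate (id, element) pairs.
SELECT DISTINCT id, unnest(aint) AS t
FROM a
ORDER BY id, t;
-- 12 rows: id 1 -> 1,2,3,7,8,9   id 2 -> 2,3,4   id 3 -> 3,4,5
```

Feeding these rows into array_agg and joining with the per-id sums, the full query on the sample data should return (1, 24, {1,2,3,7,8,9}), (2, 6, {2,3,4}) and (3, 2, {3,4,5}), in no guaranteed row order.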

The code above does what I intended, but the question is: can we do any better? The main drawback of this solution is that it reads and aggregates the table twice, which might be troublesome for larger tables.

Another solution to the general problem is to avoid the array column altogether, aggregate the "summable" column for each array member, and then use array_agg in the aggregation. At least for now, though, I'd like to stick to the array approach.

Thanks in advance for any ideas.

Answered by klin

The query may be a little bit faster (I suppose) but I cannot see any remarkable optimizations:

select a.id, sum(summable_val) val, ar
from
    (select id, array_agg(distinct t) ar 
        from 
        (select id, unnest(aint) as t from a group by 1,2) u
    group by 1) x
    join a on x.id = a.id
group by 1,3
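
For comparison, here is a sketch of a different technique that is not part of the answer above: a user-defined aggregate that concatenates the arrays, so the sum and the array can be collected in one GROUP BY over the table. The aggregate name int_array_cat_agg is hypothetical, the sample rows are inlined as a VALUES list, and whether this is actually faster on a large table is an assumption that would need to be measured:

```sql
-- Hypothetical helper aggregate: concatenates integer[] values within a group.
-- Built on the standard array_cat function; starts from an empty array.
CREATE AGGREGATE int_array_cat_agg(integer[]) (
    SFUNC    = array_cat,
    STYPE    = integer[],
    INITCOND = '{}'
);

WITH a(id, aint, summable_val) AS (
    -- same sample data as in the question
    VALUES (1, ARRAY[1,2,3], 5),
           (2, ARRAY[2,3,4], 6),
           (3, ARRAY[3,4,5], 2),
           (1, ARRAY[7,8,9], 19)
),
s AS (
    -- single aggregation pass: sum and concatenated array per id
    SELECT id,
           sum(summable_val)      AS val,
           int_array_cat_agg(aint) AS all_vals
    FROM a
    GROUP BY id
)
SELECT id,
       val,
       -- de-duplicate (and sort) the concatenated elements per id
       ARRAY(SELECT DISTINCT e FROM unnest(all_vals) AS e ORDER BY e) AS ar
FROM s
ORDER BY id;
-- Expected on the sample data:
--   1 | 24 | {1,2,3,7,8,9}
--   2 |  6 | {2,3,4}
--   3 |  2 | {3,4,5}
```

The de-duplication still has to unnest each concatenated array, but it works on the already grouped result rather than scanning the base table a second time.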