PostgreSQL: aggregate by aggregate (ARRAY_AGG)?
Original question: http://stackoverflow.com/questions/15661952/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Aggregate by aggregate (ARRAY_AGG)?
Asked by Przemek
Let's say I have a simple table agg_test with 3 columns: id, column_1 and column_2. An example dataset:
id|column_1|column_2
--------------------
1| 1| 1
2| 1| 2
3| 1| 3
4| 1| 4
5| 2| 1
6| 3| 2
7| 4| 3
8| 4| 4
9| 5| 3
10| 5| 4
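For anyone who wants to reproduce the example, the table could be set up roughly as follows. This is only a sketch: the question does not state the column types, so integer columns and a primary key on id are assumptions.

```sql
-- Assumed schema: the question only names the columns, not their types.
CREATE TABLE agg_test (
    id       integer PRIMARY KEY,
    column_1 integer NOT NULL,
    column_2 integer NOT NULL
);

-- The ten rows from the dataset shown above.
INSERT INTO agg_test (id, column_1, column_2) VALUES
    (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 1, 4),
    (5, 2, 1), (6, 3, 2), (7, 4, 3), (8, 4, 4),
    (9, 5, 3), (10, 5, 4);
```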
A query like this (with self join):
SELECT
a1.column_1,
a2.column_1,
ARRAY_AGG(DISTINCT a1.column_2 ORDER BY a1.column_2)
FROM agg_test a1
JOIN agg_test a2 ON a1.column_2 = a2.column_2 AND a1.column_1 <> a2.column_1
WHERE a1.column_1 = 1
GROUP BY a1.column_1, a2.column_1
Will produce a result like this:
column_1|column_1|array_agg
---------------------------
1| 2| {1}
1| 3| {2}
1| 4| {3,4}
1| 5| {3,4}
We can see that for values 4 and 5 from the joined table, the last column holds the same result. So, is it possible to somehow group the results by that column, e.g.:
column_1|column_1|array_agg
---------------------------
1| {2}| {1}
1| {3}| {2}
1| {4,5}| {3,4}
Thanks for any answers. If anything isn't clear or can be presented in a better way - tell me in the comments and I'll try to make this question as readable as I can.
Accepted answer by Gordon Linoff
I'm not sure if you can aggregate by an array. If you can, here is one approach:
select col1, array_agg(col2), ar
from (SELECT a1.column_1 as col1, a2.column_1 as col2,
ARRAY_AGG(DISTINCT a1.column_2 ORDER BY a1.column_2) as ar
FROM agg_test a1 JOIN
agg_test a2
ON a1.column_2 = a2.column_2 AND a1.column_1 <> a2.column_1
WHERE a1.column_1 = 1
GROUP BY a1.column_1, a2.column_1
) t
group by col1, ar
The alternative is to use array_to_string to convert the array values into a string and group by that string.
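A minimal sketch of the string-based alternative, reusing the subquery above. It uses array_to_string rather than array_dims, which is a substitution on my part: array_dims returns only an array's dimensions (for example [1:2]), so two different arrays of the same length would fall into one group. The alias ar_text and the ',' separator are illustrative choices, not part of the original answer.

```sql
SELECT col1,
       ARRAY_AGG(col2 ORDER BY col2) AS col2_list,
       ar_text
FROM (
    SELECT a1.column_1 AS col1,
           a2.column_1 AS col2,
           -- Flatten the aggregated array into text, e.g. '3,4',
           -- so that the outer query groups on a plain string.
           ARRAY_TO_STRING(
               ARRAY_AGG(DISTINCT a1.column_2 ORDER BY a1.column_2),
               ','
           ) AS ar_text
    FROM agg_test a1
    JOIN agg_test a2
      ON a1.column_2 = a2.column_2 AND a1.column_1 <> a2.column_1
    WHERE a1.column_1 = 1
    GROUP BY a1.column_1, a2.column_1
) t
GROUP BY col1, ar_text;
```

Grouping on text only matters if the array's element type cannot be compared directly; for integer arrays like the ones here, grouping by the array itself also works in PostgreSQL.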
Answer by Andriy M
You could also try something like this:
SELECT DISTINCT
a1.column_1,
ARRAY_AGG(a2.column_1) OVER (
PARTITION BY
a1.column_1,
ARRAY_AGG(DISTINCT a1.column_2 ORDER BY a1.column_2)
) AS "a2.column_1 agg",
ARRAY_AGG(DISTINCT a1.column_2 ORDER BY a1.column_2)
FROM agg_test a1
JOIN agg_test a2 ON a1.column_2 = a2.column_2 AND a1.column_1 <> a2.column_1
WHERE a1.column_1 = 1
GROUP BY a1.column_1, a2.column_1
;
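(The parts that differ from the query you've posted in your question are the DISTINCT after SELECT and the window ARRAY_AGG(a2.column_1) OVER (...) column, which replaces the plain a2.column_1.)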
The above uses a window ARRAY_AGG to combine the values of a2.column_1 alongside the other ARRAY_AGG, using the latter's result as one of the partitioning criteria. Without the DISTINCT, it would produce two {4,5} rows for your example, so DISTINCT is needed to eliminate the duplicates.
Here's a SQL Fiddle demo: http://sqlfiddle.com/#!1/df5c3/4
Note, though, that the window ARRAY_AGG cannot have an ORDER BY inside its argument list like its "normal" counterpart. That means the order of a2.column_1 values in the list is indeterminate, although in the linked demo it does happen to match the one in your expected output.
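If a deterministic order is needed, one option is to fall back to a plain (non-window) aggregate over a subquery, where ORDER BY inside ARRAY_AGG is allowed. The following is a sketch along the lines of the accepted answer, not a variant of the window query; the aliases col1, col2, col2_list and ar are illustrative.

```sql
SELECT col1,
       -- A plain aggregate accepts ORDER BY, so the list order is fixed.
       ARRAY_AGG(col2 ORDER BY col2) AS col2_list,
       ar
FROM (
    SELECT a1.column_1 AS col1,
           a2.column_1 AS col2,
           ARRAY_AGG(DISTINCT a1.column_2 ORDER BY a1.column_2) AS ar
    FROM agg_test a1
    JOIN agg_test a2
      ON a1.column_2 = a2.column_2 AND a1.column_1 <> a2.column_1
    WHERE a1.column_1 = 1
    GROUP BY a1.column_1, a2.column_1
) t
GROUP BY col1, ar
ORDER BY ar;
```

With the sample data from the question, this should give the three rows of the expected output, with 4 and 5 combined as {4,5} against {3,4}.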