PostgreSQL: aggregate by an aggregate (ARRAY_AGG)?

Question

Asked by Przemek

Let's say I have a simple table agg_test with 3 columns: id, column_1 and column_2. An example dataset:

id|column_1|column_2
--------------------
 1|       1|       1
 2|       1|       2
 3|       1|       3
 4|       1|       4
 5|       2|       1
 6|       3|       2
 7|       4|       3
 8|       4|       4
 9|       5|       3
10|       5|       4
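
To reproduce the examples, the sample data can be loaded with a sketch like the one below. The integer column types and the primary key are assumptions; the question shows only the data, not the table definition.

```sql
-- Assumed definition: the question does not include the DDL.
CREATE TABLE agg_test (
  id       integer PRIMARY KEY,
  column_1 integer NOT NULL,
  column_2 integer NOT NULL
);

-- The ten rows from the dataset above.
INSERT INTO agg_test (id, column_1, column_2) VALUES
  (1, 1, 1),
  (2, 1, 2),
  (3, 1, 3),
  (4, 1, 4),
  (5, 2, 1),
  (6, 3, 2),
  (7, 4, 3),
  (8, 4, 4),
  (9, 5, 3),
  (10, 5, 4);
```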

A query like this (with self join):

SELECT
  a1.column_1,
  a2.column_1,
  ARRAY_AGG(DISTINCT a1.column_2 ORDER BY a1.column_2)
FROM agg_test a1
JOIN agg_test a2 ON a1.column_2 = a2.column_2 AND a1.column_1 <> a2.column_1
WHERE a1.column_1 = 1
GROUP BY a1.column_1, a2.column_1

Will produce a result like this:

column_1|column_1|array_agg
---------------------------
       1|       2|      {1}
       1|       3|      {2}
       1|       4|    {3,4}
       1|       5|    {3,4}

We can see that for values 4 and 5 from the joined table we get the same result in the last column. So, is it possible to somehow group the results by that column, e.g.:

column_1|column_1|array_agg
---------------------------
       1|     {2}|      {1}
       1|     {3}|      {2}
       1|   {4,5}|    {3,4}

Thanks for any answers. If anything isn't clear or could be presented in a better way, tell me in the comments and I'll try to make this question as readable as I can.

Answer 1

Accepted answer by Gordon Linoff

I'm not sure if you can aggregate by an array. If you can, here is one approach:

select col1, array_agg(col2), ar
from (SELECT a1.column_1 as col1, a2.column_1 as col2,
             ARRAY_AGG(DISTINCT a1.column_2 ORDER BY a1.column_2) as ar
      FROM agg_test a1 JOIN
           agg_test a2
           ON a1.column_2 = a2.column_2 AND a1.column_1 <> a2.column_1
      WHERE a1.column_1 = 1
      GROUP BY a1.column_1, a2.column_1
     ) t
group by col1, ar
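
As for whether grouping by an array is allowed: PostgreSQL does accept an array expression in GROUP BY when the element type is comparable, as integer is. A minimal check, using literal arrays rather than the agg_test table (the aliases v, ar and n are illustrative):

```sql
-- Standalone check, no table needed: group three literal integer arrays.
SELECT ar, count(*) AS n
FROM (VALUES (ARRAY[3,4]), (ARRAY[3,4]), (ARRAY[1])) AS v(ar)
GROUP BY ar
ORDER BY ar;
-- Two groups come back: {1} with n = 1, then {3,4} with n = 2.
```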

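The alternative is to convert the array values into a string, for example with array_to_string, and group by that string. (array_dims returns only the array's dimensions, such as [1:2], not its values, so it would not tell {3,4} apart from any other two-element array.)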
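
A sketch of the string-based alternative, assuming array_to_string is used for the conversion. The aliases col2_agg and ar_text are illustrative, and the last column comes back as text such as 3,4 rather than as an array:

```sql
SELECT col1,
       array_agg(col2) AS col2_agg,
       ar_text
FROM (SELECT a1.column_1 AS col1,
             a2.column_1 AS col2,
             -- array_to_string joins the elements with the delimiter, e.g. '3,4'
             array_to_string(
               ARRAY_AGG(DISTINCT a1.column_2 ORDER BY a1.column_2), ','
             ) AS ar_text
      FROM agg_test a1
      JOIN agg_test a2
        ON a1.column_2 = a2.column_2 AND a1.column_1 <> a2.column_1
      WHERE a1.column_1 = 1
      GROUP BY a1.column_1, a2.column_1
     ) t
GROUP BY col1, ar_text;
```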

Answer 2

Answer by Andriy M

You could also try something like this:

SELECT DISTINCT
  a1.column_1,
  ARRAY_AGG(a2.column_1) OVER (
    PARTITION BY
      a1.column_1,
      ARRAY_AGG(DISTINCT a1.column_2 ORDER BY a1.column_2)
  ) AS "a2.column_1 agg",
  ARRAY_AGG(DISTINCT a1.column_2 ORDER BY a1.column_2)
FROM agg_test a1
JOIN agg_test a2 ON a1.column_2 = a2.column_2 AND a1.column_1 <> a2.column_1
WHERE a1.column_1 = 1
GROUP BY a1.column_1, a2.column_1
;

(The parts that differ from the query in your question are the DISTINCT after SELECT and the second output column, the window ARRAY_AGG(a2.column_1) OVER (...).)

The above uses a window ARRAY_AGG to combine the values of a2.column_1 alongside the other ARRAY_AGG, using the latter's result as one of the partitioning criteria. Without the DISTINCT, it would produce two {4,5} rows for your example, so DISTINCT is needed to eliminate the duplicates.

Here's a SQL Fiddle demo: http://sqlfiddle.com/#!1/df5c3/4

Note, though, that the window ARRAY_AGG cannot have an ORDER BY inside its argument list like its "normal" counterpart. That means the order of the a2.column_1 values in the list is indeterminate, although in the linked demo it does happen to match the one in your expected output.
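
If a deterministic order matters, one option is to do the second aggregation in an outer query instead of a window, since the plain aggregate form does accept ORDER BY. This is the subquery approach rather than the window query above; a sketch, with t, col1, col2, col2_agg and ar as illustrative aliases:

```sql
SELECT col1,
       -- Plain (non-window) aggregate, so ORDER BY inside the call is allowed.
       array_agg(col2 ORDER BY col2) AS col2_agg,
       ar
FROM (SELECT a1.column_1 AS col1,
             a2.column_1 AS col2,
             ARRAY_AGG(DISTINCT a1.column_2 ORDER BY a1.column_2) AS ar
      FROM agg_test a1
      JOIN agg_test a2
        ON a1.column_2 = a2.column_2 AND a1.column_1 <> a2.column_1
      WHERE a1.column_1 = 1
      GROUP BY a1.column_1, a2.column_1
     ) t
GROUP BY col1, ar
ORDER BY ar;
-- With the sample data from the question this returns:
--   1 | {2}   | {1}
--   1 | {3}   | {2}
--   1 | {4,5} | {3,4}
```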
