Original question: http://stackoverflow.com/questions/24153498/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Concatenate/merge array values during grouping/aggregation
Asked by Yana K.
I have a table with an array-typed column:
title tags
"ridealong";"{comedy,other}"
"ridealong";"{comedy,tragedy}"
"freddyjason";"{horror,silliness}"
I would like to write a query that produces a single array per title (in an ideal world it would be a set, i.e. a deduplicated array).
For example:
select array_cat(tags),title from my_test group by title
The above query doesn't work of course, but I would like to produce 2 rows:
"ridealong";"{comedy,other,tragedy}"
"freddyjason";"{horror,silliness}"
Any help or pointers would be very much appreciated (I am using Postgres 9.1).
Based on Craig's help, I ended up with the following (the syntax is slightly altered, since 9.1 complains about the query exactly as he shows it):
SELECT t1.title, array_agg(DISTINCT tag.tag)
FROM my_test t1, (SELECT unnest(tags) AS tag, title FROM my_test) AS tag
WHERE tag.title = t1.title
GROUP BY t1.title;
Answered by Craig Ringer
Custom aggregate
Approach 1: define a custom aggregate. Here's one I wrote earlier.
CREATE TABLE my_test(title text, tags text[]);
INSERT INTO my_test(title, tags) VALUES
('ridealong', '{comedy,other}'),
('ridealong', '{comedy,tragedy}'),
('freddyjason', '{horror,silliness}');
CREATE AGGREGATE array_cat_agg(anyarray) (
SFUNC=array_cat,
STYPE=anyarray
);
select title, array_cat_agg(tags) from my_test group by title;
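Two caveats about the aggregate above, as a hedged sketch rather than part of the original answer. First, it only concatenates, so for the sample data 'comedy' shows up twice under ridealong. Second, on PostgreSQL 14 and later array_cat is declared with anycompatiblearray instead of anyarray, so the definition has to use that pseudo-type. The name array_cat_agg14 below is hypothetical, picked only so it does not clash with the aggregate defined above.

```sql
-- Assumption: PostgreSQL 14 or later, where array_cat takes anycompatiblearray.
-- array_cat_agg14 is a made-up name for illustration.
CREATE AGGREGATE array_cat_agg14(anycompatiblearray) (
    SFUNC = array_cat,
    STYPE = anycompatiblearray
);

-- Plain concatenation: duplicates are kept ('comedy' appears twice for ridealong).
SELECT title, array_cat_agg14(tags) AS all_tags
FROM my_test
GROUP BY title;
```

If a deduplicated array is needed, the unnest-based queries in this thread are probably the simpler route.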
LATERAL query
... or, since you don't want to preserve order and do want to deduplicate, you could use a LATERAL query like:
SELECT title, array_agg(DISTINCT tag ORDER BY tag)
FROM my_test, unnest(tags) tag
GROUP BY title;
In that case you don't need the custom aggregate. This one is probably a fair bit slower for big data sets because of the deduplication. Removing the ORDER BY, if it isn't required, may help, though.
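For readers who prefer to see the lateral join spelled out: a set-returning function in FROM is implicitly LATERAL, so the comma form is shorthand for an explicit CROSS JOIN LATERAL. A minimal sketch of the same query with the join written out and the ORDER BY dropped, assuming PostgreSQL 9.3 or later (the all_tags alias is just for illustration):

```sql
-- Assumption: PostgreSQL 9.3+; LATERAL is not available on 9.1.
-- Explicit lateral join, deduplicating without sorting inside the aggregate.
SELECT title, array_agg(DISTINCT tag) AS all_tags
FROM my_test
CROSS JOIN LATERAL unnest(tags) AS tag
GROUP BY title;
```

Note that a row whose tags array is empty or NULL produces no rows from unnest, so a title with only such rows disappears from the result. LEFT JOIN LATERAL ... ON true would keep it.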
Answered by pozs
The obvious solution would be the LATERAL join (which was also suggested by @CraigRinger), but that was only added to PostgreSQL in 9.3.
In 9.1 you cannot avoid the sub-query, but you can simplify it:
SELECT title, array_agg(DISTINCT tag)
FROM (SELECT title, unnest(tags) FROM my_test) AS t(title, tag)
GROUP BY title;
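As a quick check against the sample data from the question, here is the same sub-query approach with a deterministic ordering added. This is only a sketch: the ORDER BY inside the aggregate (available since 9.0), the outer ORDER BY title, and the all_tags alias are my additions, not part of the answer above.

```sql
-- Works without LATERAL, so it should run on 9.1.
SELECT title, array_agg(DISTINCT tag ORDER BY tag) AS all_tags
FROM (SELECT title, unnest(tags) FROM my_test) AS t(title, tag)
GROUP BY title
ORDER BY title;

-- Result for the three sample rows:
--  freddyjason | {horror,silliness}
--  ridealong   | {comedy,other,tragedy}
```

Without the two ORDER BY clauses the same tags come back, but neither the row order nor the element order is guaranteed.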