postgresql 如何在postgres中比较两个数组并仅选择不匹配的元素
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/6533029/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How to compare two arrays and pick only the non matching elements In postgres
提问by ggvvkk
How can I pick only the non matching elements between two arrays.
如何仅选择两个数组之间的不匹配元素。
Example:
例子:
base_array [12,3,5,7,8]
temp_array [3,7,8]
So here I want to compare both the arrays and remove the matching elements from the base array.
所以在这里我想比较两个数组并从基本数组中删除匹配的元素。
Now base_array should be like [12,5]
现在 base_array 应该像 [12,5]
回答by Denis de Bernardy
I'd approach this with the array operator.
我会用数组运算符来解决这个问题。
select array(select unnest(:arr1) except select unnest(:arr2));
If :arr1 and :arr2 don't intersect, using array_agg() leads to a null.
如果 :arr1 和 :arr2 不相交,则使用 array_agg() 会导致空值。
回答by a_horse_with_no_name
select array_agg(elements)
from (
select unnest(array[12,3,5,7,8])
except
select unnest(array[3,7,8])
) t (elements)
回答by Joshua Burns
I've constructed a set of functions to deal specifically with these types of issues: https://github.com/JDBurnZ/anyarray
我已经构建了一组函数来专门处理这些类型的问题:https: //github.com/JDBurnZ/anyarray
The greatest thing is these functions work across ALL data-types, not JUST integers, as intarray
is limited to.
最重要的是这些函数适用于所有数据类型,而不仅仅是整数,intarray
仅限于此。
After loading loading the functions defined in those SQL files from GitHub, all you'd need to do is:
从 GitHub 加载这些 SQL 文件中定义的函数后,您需要做的就是:
SELECT
ANYARRAY_DIFF(
ARRAY[12, 3, 5, 7, 8],
ARRAY[3, 7, 8]
)
Returns something similar to: ARRAY[12, 5]
返回类似于: ARRAY[12, 5]
If you also need to return the values sorted:
如果您还需要返回排序后的值:
SELECT
ANYARRAY_SORT(
ANYARRAY_DIFF(
ARRAY[12, 3, 5, 7, 8],
ARRAY[3, 7, 8]
)
)
Returns exactly: ARRAY[5, 12]
准确返回: ARRAY[5, 12]
回答by peufeu
Let's try the unnest() / except :
让我们试试 unnest() / except :
EXPLAIN ANALYZE SELECT array(select unnest(ARRAY[1,2,3,n]) EXCEPT SELECT unnest(ARRAY[2,3,4,n])) FROM generate_series( 1,10000 ) n;
Function Scan on generate_series n (cost=0.00..62.50 rows=1000 width=4) (actual time=1.373..140.969 rows=10000 loops=1)
SubPlan 1
-> HashSetOp Except (cost=0.00..0.05 rows=1 width=0) (actual time=0.011..0.011 rows=1 loops=10000)
-> Append (cost=0.00..0.04 rows=2 width=0) (actual time=0.002..0.008 rows=8 loops=10000)
-> Subquery Scan "*SELECT* 1" (cost=0.00..0.02 rows=1 width=0) (actual time=0.002..0.003 rows=4 loops=10000)
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=4 loops=10000)
-> Subquery Scan "*SELECT* 2" (cost=0.00..0.02 rows=1 width=0) (actual time=0.001..0.003 rows=4 loops=10000)
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=4 loops=10000)
Total runtime: 142.531 ms
And the intarray special operator :
和 intarray 特殊运算符:
EXPLAIN ANALYZE SELECT ARRAY[1,2,3,n] - ARRAY[2,3,4,n] FROM generate_series( 1,10000 ) n;
Function Scan on generate_series n (cost=0.00..15.00 rows=1000 width=4) (actual time=1.338..11.381 rows=10000 loops=1)
Total runtime: 12.306 ms
Baseline :
基线:
EXPLAIN ANALYZE SELECT ARRAY[1,2,3,n], ARRAY[2,3,4,n] FROM generate_series( 1,10000 ) n;
Function Scan on generate_series n (cost=0.00..12.50 rows=1000 width=4) (actual time=1.357..7.139 rows=10000 loops=1)
Total runtime: 8.071 ms
Time per array intersection :
每个阵列交点的时间:
intarray - : 0.4 μs
unnest() / intersect : 13.4 μs
Of course the intarray way is much faster, but I find it amazing that postgres can zap a dependent subquery (which contains a hash and other stuff) in 13.4 μs...
当然, intarray 方式要快得多,但我发现 postgres 可以在 13.4 μs 内快速处理依赖子查询(包含哈希和其他内容)令人惊讶......
回答by Flimzy
The contrib/intarraymodule provides this functionality--for arrays of integers, anyway. For other data types, you may have to write your own functions (or modify the ones provided with intarray).
所述的contrib / intarray模块提供此功能-为整数数组,无论如何。对于其他数据类型,您可能需要编写自己的函数(或修改 intarray 提供的函数)。
回答by MichaelG
An extension to Denis' answer that returns the difference, regardless of which array was entered first. It's not the most concise query, maybe someone has a tidier way.
Denis 答案的扩展返回差异,无论首先输入哪个数组。这不是最简洁的查询,也许有人有更整洁的方式。
select array_cat(
(select array(select unnest(a.b::int[]) except select unnest(a.c::int[]))),
(select array(select unnest(a.c::int[]) except select unnest(a.b::int[]))))
from (select '{1,2}'::int[] b,'{1,3}'::int[] c) as a;
Returns:
返回:
{2,3}
回答by danjuggler
I would create a function using the same except logic as described by @a_horse_with_no_name:
我将使用与@a_horse_with_no_name 描述的逻辑相同的逻辑创建一个函数:
CREATE FUNCTION array_subtract(a1 int[], a2 int[]) RETURNS int[] AS $$
DECLARE
ret int[];
BEGIN
IF a1 is null OR a2 is null THEN
return a1;
END IF;
SELECT array_agg(e) INTO ret
FROM (
SELECT unnest(a1)
EXCEPT
SELECT unnest(a2)
) AS dt(e);
RETURN ret;
END;
$$ language plpgsql;
Then you can use this function to change your base_array variable accordingly:
然后您可以使用此函数相应地更改 base_array 变量:
base_array := array_subtract(base_array, temp_array);
Using the @Denis's faster solution, and only SQL, we can express a generic function as
使用@Denis 更快的解决方案,并且只有 SQL,我们可以将泛型函数表示为
CREATE FUNCTION array_subtract(anyarray,anyarray) RETURNS anyarray AS $f$
SELECT array(
SELECT unnest()
EXCEPT
SELECT unnest()
)
$f$ language SQL IMMUTABLE;