PostgreSQL: how to compare two arrays in Postgres and select only the non-matching elements

Question

Asked by ggvvkk

How can I pick only the non-matching elements between two arrays?

Example:

base_array [12,3,5,7,8]
temp_array [3,7,8]

Here I want to compare the two arrays and remove the matching elements from the base array.

After that, base_array should be [12,5].

Answer 1

Answered by Denis de Bernardy

I'd approach this with the ARRAY constructor:

select array(select unnest(:arr1) except select unnest(:arr2));

If every element of :arr1 also appears in :arr2, so that nothing is left after the subtraction, array_agg() returns NULL, whereas the ARRAY constructor returns an empty array.
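
As a minimal sketch of that difference, the query below runs both forms on literal arrays chosen so that nothing is left after the subtraction. The literal values and the column aliases via_constructor and via_array_agg are illustrative and not part of the original answer.

```sql
-- Every element of the first array also appears in the second,
-- so the EXCEPT result has no rows.
SELECT
  array(
    SELECT unnest(ARRAY[3,7])
    EXCEPT
    SELECT unnest(ARRAY[3,7,8])
  ) AS via_constructor,          -- empty array: {}
  (
    SELECT array_agg(e)
    FROM (
      SELECT unnest(ARRAY[3,7])
      EXCEPT
      SELECT unnest(ARRAY[3,7,8])
    ) AS t(e)
  ) AS via_array_agg;            -- NULL, because array_agg saw no rows
```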

Answer 2

Answered by a_horse_with_no_name

-- Unnest both arrays into rows, subtract with EXCEPT, then aggregate back into an array.
select array_agg(elements)
from (
  select unnest(array[12,3,5,7,8])
  except
  select unnest(array[3,7,8])
) t (elements);
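
Note that EXCEPT works on sets: it drops duplicate elements and does not guarantee the order of the result. If the original order matters, one possible variant filters the unnested rows instead. This is a sketch that assumes PostgreSQL 9.4 or later (for WITH ORDINALITY); the aliases elem, ord, and remaining are illustrative.

```sql
-- Keep each element of the base array that is not found in the other array,
-- then rebuild the array in the original element order.
SELECT array_agg(elem ORDER BY ord) AS remaining
FROM unnest(ARRAY[12,3,5,7,8]) WITH ORDINALITY AS t(elem, ord)
WHERE elem <> ALL (ARRAY[3,7,8]);
-- remaining: {12,5}
```

Like the array_agg() query above, this returns NULL when no elements remain, and duplicates in the base array are kept rather than collapsed.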

Answer 3

Answered by Joshua Burns

I've constructed a set of functions to deal specifically with these types of issues: https://github.com/JDBurnZ/anyarray

The best part is that these functions work across all data types, not just integers, which is all that intarray supports.

After loading the functions defined in those SQL files from GitHub, all you'd need to do is:

SELECT
  ANYARRAY_DIFF(
    ARRAY[12, 3, 5, 7, 8],
    ARRAY[3, 7, 8]
  )

Returns something similar to: ARRAY[12, 5]

If you also need to return the values sorted:

SELECT
  ANYARRAY_SORT(
    ANYARRAY_DIFF(
      ARRAY[12, 3, 5, 7, 8],
      ARRAY[3, 7, 8]
    )
  )

Returns exactly: ARRAY[5, 12]

Answer 4

Answered by peufeu

Let's try the unnest() / EXCEPT approach:

EXPLAIN ANALYZE SELECT array(select unnest(ARRAY[1,2,3,n]) EXCEPT SELECT unnest(ARRAY[2,3,4,n])) FROM generate_series( 1,10000 ) n;
 Function Scan on generate_series n  (cost=0.00..62.50 rows=1000 width=4) (actual time=1.373..140.969 rows=10000 loops=1)
   SubPlan 1
     ->  HashSetOp Except  (cost=0.00..0.05 rows=1 width=0) (actual time=0.011..0.011 rows=1 loops=10000)
           ->  Append  (cost=0.00..0.04 rows=2 width=0) (actual time=0.002..0.008 rows=8 loops=10000)
                 ->  Subquery Scan "*SELECT* 1"  (cost=0.00..0.02 rows=1 width=0) (actual time=0.002..0.003 rows=4 loops=10000)
                       ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=4 loops=10000)
                 ->  Subquery Scan "*SELECT* 2"  (cost=0.00..0.02 rows=1 width=0) (actual time=0.001..0.003 rows=4 loops=10000)
                       ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=4 loops=10000)
 Total runtime: 142.531 ms

And the intarray special operator:

EXPLAIN ANALYZE SELECT ARRAY[1,2,3,n] - ARRAY[2,3,4,n] FROM generate_series( 1,10000 ) n;
 Function Scan on generate_series n  (cost=0.00..15.00 rows=1000 width=4) (actual time=1.338..11.381 rows=10000 loops=1)
 Total runtime: 12.306 ms

Baseline:

EXPLAIN ANALYZE SELECT ARRAY[1,2,3,n], ARRAY[2,3,4,n] FROM generate_series( 1,10000 ) n;
 Function Scan on generate_series n  (cost=0.00..12.50 rows=1000 width=4) (actual time=1.357..7.139 rows=10000 loops=1)
 Total runtime: 8.071 ms

Time per array difference (total runtime minus baseline, divided by the 10,000 rows):

intarray -         :  0.4 μs
unnest() / except  : 13.4 μs
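
For reference, the per-row figures follow from the three "Total runtime" values reported above. The arithmetic can be checked with a small query like this one (the column aliases are illustrative):

```sql
-- (total runtime - baseline runtime) in ms, divided by 10000 rows,
-- then multiplied by 1000 to convert milliseconds to microseconds.
SELECT
  round((12.306  - 8.071) / 10000 * 1000, 1) AS intarray_us,       -- 0.4
  round((142.531 - 8.071) / 10000 * 1000, 1) AS unnest_except_us;  -- 13.4
```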

Of course the intarray way is much faster, but I find it amazing that Postgres can zap a dependent subquery (which involves a hash and other stuff) in 13.4 μs...

Answer 5

Answered by Flimzy

The contrib/intarray module provides this functionality, for arrays of integers anyway. For other data types, you may have to write your own functions (or modify the ones provided with intarray).
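
A minimal sketch of the intarray route, assuming PostgreSQL 9.1 or later (for CREATE EXTENSION), that the contrib modules are installed on the server, and that you have the privileges to load them. The sample values come from the question; the alias remaining is illustrative.

```sql
-- Load the extension into the current database (no-op if already present).
CREATE EXTENSION IF NOT EXISTS intarray;

-- With intarray loaded, "-" between two integer arrays removes
-- the elements of the right-hand array from the left-hand one.
SELECT ARRAY[12,3,5,7,8] - ARRAY[3,7,8] AS remaining;
```

The result contains 5 and 12. Do not rely on it keeping the input order.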

Answer 6

Answered by MichaelG

An extension of Denis's answer that returns the symmetric difference, that is, the elements that appear in only one of the two arrays, regardless of which array is given first. It's not the most concise query; maybe someone has a tidier way.

select array_cat(
   (select array(select unnest(a.b::int[]) except select unnest(a.c::int[]))),
   (select array(select unnest(a.c::int[]) except select unnest(a.b::int[]))))
from (select '{1,2}'::int[] b,'{1,3}'::int[] c) as a;

Returns:

{2,3}
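
One possibly tidier formulation, offered as a sketch rather than a drop-in replacement, builds both one-sided differences inside a single ARRAY constructor with UNION ALL. It reuses the sample values from the query above; the alias sym_diff is illustrative.

```sql
SELECT array(
  (SELECT unnest(a.b) EXCEPT SELECT unnest(a.c))  -- elements only in b
  UNION ALL
  (SELECT unnest(a.c) EXCEPT SELECT unnest(a.b))  -- elements only in c
) AS sym_diff
FROM (SELECT '{1,2}'::int[] AS b, '{1,3}'::int[] AS c) AS a;
```

As with the other EXCEPT-based queries, the order of elements in the result is not guaranteed.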

Answer 7

Answered by danjuggler

I would create a function using the same EXCEPT logic as described by @a_horse_with_no_name:

CREATE FUNCTION array_subtract(a1 int[], a2 int[]) RETURNS int[] AS $$
DECLARE
    ret int[];
BEGIN
    IF a1 is null OR a2 is null THEN
        return a1;
    END IF;
    SELECT array_agg(e) INTO ret
    FROM (
        SELECT unnest(a1)
        EXCEPT
        SELECT unnest(a2)
    ) AS dt(e);
    RETURN ret;
END;
$$ language plpgsql;

Then you can use this function to change your base_array variable accordingly:

base_array := array_subtract(base_array, temp_array);

Using @Denis's faster solution, and plain SQL only, we can express a generic function as:

CREATE FUNCTION array_subtract(anyarray,anyarray) RETURNS anyarray AS $f$
  -- $1 and $2 are the two (unnamed) array arguments
  SELECT array(
    SELECT unnest($1)
    EXCEPT
    SELECT unnest($2)
  )
$f$ language SQL IMMUTABLE;
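
A short usage sketch for the generic version. The definition is repeated, with $1 and $2 referring to the two arguments, so that the block runs on its own; the sample text values and the alias remaining are illustrative.

```sql
-- Generic array subtraction: works for any array element type.
CREATE OR REPLACE FUNCTION array_subtract(anyarray, anyarray) RETURNS anyarray AS $f$
  SELECT array(
    SELECT unnest($1)
    EXCEPT
    SELECT unnest($2)
  )
$f$ LANGUAGE sql IMMUTABLE;

-- Text arrays: the result contains 'red' and 'blue'.
SELECT array_subtract(ARRAY['red','green','blue'], ARRAY['green']) AS remaining;

-- Integer arrays from the question: the result contains 12 and 5.
SELECT array_subtract(ARRAY[12,3,5,7,8], ARRAY[3,7,8]) AS remaining;
```

Because the body uses the ARRAY constructor rather than array_agg(), it returns an empty array instead of NULL when nothing remains. Element order in the result is not guaranteed.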
