PostgreSQL: compare arrays for equality, ignoring the order of elements

Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/12870105/

Date: 2020-10-21 00:25:12  Source: igfitidea

Compare arrays for equality, ignoring order of elements

Tags: arrays, postgresql

Asked by user766987

I have a table with 4 array columns. The results look like this:

ids     | signed_ids | new_ids | new_ids_signed
{1,2,3} | {2,1,3}    | {4,5,6} | {6,5,4}

Is there any way to compare ids and signed_ids so that they come out equal, ignoring the order of the elements?

Accepted answer by Craig Ringer

The simplest thing to do is to sort them and compare the sorted arrays. See "sorting arrays in PostgreSQL".

Given sample data:

CREATE TABLE aa(ids integer[], signed_ids integer[]);
INSERT INTO aa(ids, signed_ids) VALUES (ARRAY[1,2,3], ARRAY[2,1,3]);

If the array entries are always integers, the best thing to do is to use the intarray extension, as Erwin explains in his answer. It's a lot faster than any pure-SQL formulation.

Otherwise, for a general version that works for any data type, define an array_sort(anyarray):

CREATE OR REPLACE FUNCTION array_sort(anyarray) RETURNS anyarray AS $$
-- $1 is the input array; unnest it, then re-aggregate the elements in sorted order
SELECT array_agg(x ORDER BY x) FROM unnest($1) x;
$$ LANGUAGE sql;

and use it to sort both arrays and compare the sorted results:

SELECT array_sort(ids) = array_sort(signed_ids) FROM aa;

There's an important caveat:

SELECT array_sort( ARRAY[1,2,2,4,4] ) = array_sort( ARRAY[1,2,4] );

will be false, because sorting keeps duplicates. This may or may not be what you want, depending on your intentions.
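If duplicates should not count, one option is to de-duplicate while sorting. The following is only a sketch: the helper name array_sort_unique is an assumption made for illustration and is not part of the original answer.

```sql
-- Hypothetical helper (the name is an assumption): sort and drop duplicates.
CREATE OR REPLACE FUNCTION array_sort_unique(anyarray) RETURNS anyarray AS $$
SELECT ARRAY(SELECT DISTINCT x FROM unnest($1) x ORDER BY x);
$$ LANGUAGE sql IMMUTABLE;

-- Both sides reduce to {1,2,4}, so the comparison is true.
SELECT array_sort_unique(ARRAY[1,2,2,4,4]) = array_sort_unique(ARRAY[1,2,4]);
```

Note that ARRAY(subquery) returns an empty array for an input with no elements, whereas array_agg over no rows returns NULL.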



Alternatively, define a function array_compare_as_set:

CREATE OR REPLACE FUNCTION array_compare_as_set(anyarray, anyarray) RETURNS boolean AS $$
SELECT CASE
  -- Different dimensions or lengths: not equal
  WHEN array_dims($1) <> array_dims($2) THEN
    false
  WHEN array_length($1, 1) <> array_length($2, 1) THEN
    false
  ELSE
    -- Equal if no element on either side lacks a match on the other side
    NOT EXISTS (
        SELECT 1
        FROM unnest($1) a
        FULL JOIN unnest($2) b ON (a = b)
        WHERE a IS NULL OR b IS NULL
    )
  END
$$ LANGUAGE sql IMMUTABLE;

and then:

SELECT array_compare_as_set(ids, signed_ids) FROM aa;

This is subtly different from comparing two array_sort-ed values. For arrays of equal length, array_compare_as_set ignores duplicates, so array_compare_as_set(ARRAY[1,2,2,3], ARRAY[1,2,3,3]) is true, whereas array_sort(ARRAY[1,2,2,3]) = array_sort(ARRAY[1,2,3,3]) is false. Note that the array_dims and array_length guards make the function return false as soon as the two arrays differ in length, so array_compare_as_set(ARRAY[1,2,3,3], ARRAY[1,2,3]) is false as written; drop the guards if you want a pure set comparison.

Both of these approaches will have pretty bad performance. Consider ensuring that you always store your arrays sorted in the first place.
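One way to keep the stored arrays sorted is a BEFORE trigger that normalizes them on every write. This is a minimal sketch under assumptions: the function and trigger names are made up for illustration, and the sample table aa is repeated so the snippet runs on its own.

```sql
-- Sample table from the answer, repeated so this sketch is self-contained.
CREATE TABLE IF NOT EXISTS aa(ids integer[], signed_ids integer[]);

-- Hypothetical trigger function; the name is an assumption.
CREATE OR REPLACE FUNCTION aa_sort_arrays() RETURNS trigger AS $$
BEGIN
  -- ARRAY(subquery) keeps an empty array empty; NULL columns stay NULL.
  -- unnest flattens multi-dimensional arrays, so this assumes 1-D arrays.
  IF NEW.ids IS NOT NULL THEN
    NEW.ids := ARRAY(SELECT x FROM unnest(NEW.ids) x ORDER BY x);
  END IF;
  IF NEW.signed_ids IS NOT NULL THEN
    NEW.signed_ids := ARRAY(SELECT x FROM unnest(NEW.signed_ids) x ORDER BY x);
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER aa_sort_arrays_trg
  BEFORE INSERT OR UPDATE ON aa
  FOR EACH ROW EXECUTE PROCEDURE aa_sort_arrays();

-- Rows written after the trigger exists can be compared with plain equality.
SELECT ids = signed_ids FROM aa;
```

Rows that existed before the trigger was created are not touched; something like UPDATE aa SET ids = ids would pass them through the trigger once.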

Answer by Selman Tunc Yilmaz

You can use the "contained by" operator together with the "contains" operator:

(array1 <@ array2 AND array1 @> array2)
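A quick illustration of how mutual containment behaves, using literal arrays chosen here as examples (they are not from the original answer). Containment does not treat duplicates specially, so this test compares the arrays as sets:

```sql
SELECT ARRAY[1,2,3] <@ ARRAY[3,2,1] AND ARRAY[1,2,3] @> ARRAY[3,2,1] AS reordered,   -- true
       ARRAY[1,2,2] <@ ARRAY[2,1]   AND ARRAY[1,2,2] @> ARRAY[2,1]   AS duplicates,  -- true
       ARRAY[1,2]   <@ ARRAY[1,2,3] AND ARRAY[1,2]   @> ARRAY[1,2,3] AS subset_only; -- false
```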

Answer by Erwin Brandstetter

When dealing with arrays of integer, you can install the extension intarray.

Install it once per database (in Postgres 9.1 or later) with:

CREATE EXTENSION intarray;

Then you can just:

SELECT uniq(sort(ids)) = uniq(sort(signed_ids));

Or:

SELECT ids @> signed_ids AND ids <@ signed_ids;

Here uniq() and sort() are functions from intarray, and with the extension installed @> and <@ resolve to its specialized operators for integer arrays. Both expressions ignore the order of elements and any duplicates. More about these functions and operators can be found in the intarray chapter of the manual.
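To make the effect of uniq() concrete, here is a small sketch with literal arrays picked for illustration (they are not from the original answer). It assumes the intarray extension can be installed in the current database.

```sql
CREATE EXTENSION IF NOT EXISTS intarray;

-- Both sides become {1,2,3} after sort + uniq, so same_set is true.
-- Without uniq, {1,2,2,3} is compared with {1,2,3,3}, so same_multiset is false.
SELECT uniq(sort(ARRAY[3,1,2,2])) = uniq(sort(ARRAY[1,2,3,3])) AS same_set,
       sort(ARRAY[3,1,2,2])       = sort(ARRAY[1,2,3,3])       AS same_multiset;
```

Dropping uniq() is therefore the variant to use when duplicates should count.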

Notes:

  • intarray operators only work for arrays of integer, not bigint or smallint or any other data type.
  • You can use the containment operators @> and <@ without installing intarray because there are generic variants for array types in the standard Postgres distribution. intarray installs specialized operators for int[] only, which are typically faster.
  • Unlike the generic operators, the intarray ones do not accept NULL values in arrays, which can be confusing: once the extension is installed, you get an error message if you have NULL in any involved array.
    If you need to work with NULL values, you can default to the standard, generic operators by schema-qualifying the operator with the OPERATOR construct:

    SELECT ARRAY[1,4,null,3]::int[] OPERATOR(pg_catalog.@>) ARRAY[3,1]::int[];

  • The generic operators can't use indexes with an intarray operator class and vice versa.
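The last note about indexes can be sketched as follows. The index names are assumptions made for illustration, and the sample table aa is repeated so the snippet runs on its own; gin__int_ops is the GIN operator class that intarray provides, while a GIN index without an explicit operator class uses the default one for arrays.

```sql
CREATE EXTENSION IF NOT EXISTS intarray;
CREATE TABLE IF NOT EXISTS aa(ids integer[], signed_ids integer[]);

-- Default GIN operator class: a candidate for the generic array operators.
CREATE INDEX aa_ids_generic_idx ON aa USING gin (ids);
-- intarray operator class: a candidate for the intarray operators.
CREATE INDEX aa_ids_intarray_idx ON aa USING gin (ids gin__int_ops);

-- Resolves to the intarray operator for int[] once the extension is installed.
SELECT * FROM aa WHERE ids @> ARRAY[1,2];
-- Schema-qualified generic operator.
SELECT * FROM aa WHERE ids OPERATOR(pg_catalog.@>) ARRAY[1,2];
```

Whether the planner actually picks either index depends on table size and statistics; EXPLAIN shows what is used for a given query.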

Answer by Hardik

SELECT string_agg(a, ',' ORDER BY a) = string_agg(b, ',' ORDER BY b)
FROM (
  SELECT unnest(ARRAY[1,2,3,2])::text AS a,
         unnest(ARRAY[2,2,3,1])::text AS b
) A;