Array intersection in Bash

Question

Asked by dabest1

How do you compare two arrays in bash to find all intersecting values?

Let's say:
array1 contains values 1 and 2
array2 contains values 2 and 3

I should get back 2 as a result.

My own answer, which I can't post yet because my reputation is too low:

for item1 in $array1; do
    for item2 in $array2; do
        if [[ $item1 = $item2 ]]; then
            result=$result" "$item1
        fi
    done
done

I'm looking for alternate solutions as well.

Answer 1

Answered by Fritz G. Mehner

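Each element of list1 is looked up, framed by blanks, in list2 flattened to a single string (${list2[*]}). Because the right-hand side of =~ is quoted, Bash 3.2 and later match it as a literal substring rather than as a regular expression: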

list1=( 1 2 3 4   6 7 8 9 10 11 12 )
list2=( 1 2 3   5 6   8 9    11 )

result=()
l2=" ${list2[*]} "                      # add framing blanks
for item in "${list1[@]}"; do
  if [[ $l2 =~ " $item " ]]; then       # quoted, so matched as a literal string
    result+=("$item")
  fi
done
echo "${result[@]}"

The result is

1 2 3 6 8 9 11
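
To illustrate why the framing blanks matter, here is a minimal sketch with hypothetical data (list2=(11 12) and item=1 are not from the answer). Without the blanks, the lookup for 1 finds the "1" inside "11" and reports a false match:

```bash
list2=(11 12)            # hypothetical data
item=1                   # hypothetical lookup value

flat="${list2[*]}"       # "11 12"
framed=" ${list2[*]} "   # " 11 12 "

# Unframed: "1" is a substring of "11", so this matches by accident.
if [[ $flat =~ "$item" ]]; then unframed_hit=yes; else unframed_hit=no; fi

# Framed: " 1 " does not occur in " 11 12 ", so there is no match.
if [[ $framed =~ " $item " ]]; then framed_hit=yes; else framed_hit=no; fi

echo "unframed: $unframed_hit"
echo "framed: $framed_hit"
```

The same framing also means the approach assumes elements contain no spaces, since elements are joined with spaces before matching.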

Answer 2

Answered by nhed

This takes @Raihan's answer and makes it work without files (though file descriptors are still created by the process substitutions). I know it's a bit of a cheat, but it seemed like a good alternative.

A side effect is that the output array will be lexicographically sorted; I hope that's okay. (I also don't know what type of data you have, so I only tested with numbers. Additional work may be needed if your strings contain special characters, etc.)

result=($(comm -12 <(for X in "${array1[@]}"; do echo "${X}"; done|sort)  <(for X in "${array2[@]}"; do echo "${X}"; done|sort)))

Testing:

$ array1=(1 17 33 99 109)
$ array2=(1 2 17 31 98 109)

$ result=($(comm -12 <(for X in "${array1[@]}"; do echo "${X}"; done|sort)  <(for X in "${array2[@]}"; do echo "${X}"; done|sort)))

$ echo ${result[@]}
1 109 17

P.S. I'm sure there is a way to get the array to output one value per line without the for loop; I just forget it (IFS?).
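
One way to drop the for loop, offered as a sketch rather than as what the author had in mind: printf reuses its format string for every argument, so printf '%s\n' "${array[@]}" prints one element per line. Reading the result back with mapfile -t is an assumption on my part and needs Bash 4 or later; it also avoids the word splitting and filename expansion that result=($(...)) is subject to.

```bash
array1=(1 17 33 99 109)
array2=(1 2 17 31 98 109)

# printf repeats '%s\n' for each element, so no explicit loop is needed.
# mapfile -t (Bash 4+) stores each output line as one array element.
mapfile -t result < <(comm -12 <(printf '%s\n' "${array1[@]}" | sort) \
                               <(printf '%s\n' "${array2[@]}" | sort))

echo "${result[@]}"
```

Note that printf '%s\n' with an empty array still prints one empty line, so empty inputs may need separate handling.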

Answer 3

Answered by Raihan

If you were looking for intersecting lines in two files (instead of arrays), you could use the comm command. Note that comm expects both files to be sorted.

$ comm -12 file1 file2
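
A small end-to-end sketch of the file-based approach, using the values from the question. The temporary directory and the file names file1 and file2 inside it are illustrative assumptions; the files are sorted in place first because comm relies on sorted input:

```bash
tmpdir=$(mktemp -d)                      # scratch directory for the demo files

printf '%s\n' 2 1 > "$tmpdir/file1"      # values 1 and 2, deliberately unsorted
printf '%s\n' 3 2 > "$tmpdir/file2"      # values 2 and 3, deliberately unsorted

# Sort each file in place; sort -o may safely name its own input file.
sort -o "$tmpdir/file1" "$tmpdir/file1"
sort -o "$tmpdir/file2" "$tmpdir/file2"

# -1 and -2 suppress lines unique to each file, leaving only the common lines.
common=$(comm -12 "$tmpdir/file1" "$tmpdir/file2")
echo "$common"

rm -r "$tmpdir"
```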

Answer 4

Answered by ruakh

Your answer won't work, for two reasons:

1. $array1 just expands to the first element of array1. (At least, in my installed version of Bash that's how it works. The Bash manual describes this as well: referencing an array variable without a subscript is equivalent to referencing it with a subscript of 0.)
2. After the first element gets added to result, result will then contain a space, so the next run of result=$result" "$item1 will misbehave horribly. (Instead of appending to result, it will run the command consisting of the first two items, with the environment variable result being set to the empty string.) Correction: Turns out, I was wrong about this one: word-splitting doesn't take place inside assignments. (See comments below.)
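
The first point above can be checked with a short sketch (the sample values are taken from the question; the variable names first_only and all_items are illustrative):

```bash
array1=(1 2)

first_only=$array1           # no subscript: same as ${array1[0]}
all_items="${array1[*]}"     # every element, joined with the first character of IFS

echo "first_only=$first_only"
echo "all_items=$all_items"
```

With the default IFS, first_only holds only 1, while all_items holds both elements separated by a space.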

What you want is this:

result=()
for item1 in "${array1[@]}"; do
    for item2 in "${array2[@]}"; do
        if [[ $item1 = "$item2" ]]; then   # quote the right side so it is not treated as a glob pattern
            result+=("$item1")
        fi
    done
done

Answer 5

Answered by ruakh

Now that I understand what you mean by "array", I think, first of all, that you should consider using actual Bash arrays. They're much more flexible, in that (for example) array elements can contain whitespace, and you can avoid the risk that * and ? will trigger filename expansion.

But if you prefer to use your existing approach of whitespace-delimited strings, then I agree with RHT's suggestion to use Perl:

result=$(perl -e 'my %array2 = map +($_ => 1), split /\s+/, $ARGV[1];
                  print join " ", grep $array2{$_}, split /\s+/, $ARGV[0]
                 ' "$array1" "$array2")

(The line-breaks are just for readability; you can get rid of them if you want.)

In the above Bash command, the embedded Perl program creates a hash named %array2 containing the elements of the second array, and then it prints any elements of the first array that exist in %array2.

This will behave slightly differently from your code in how it handles duplicate values in the second array: in your code, if array1 contains x twice and array2 contains x three times, then result will contain x six times, whereas in my code, result will contain x only twice. I don't know if that matters, since I don't know your exact requirements.
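
The difference in duplicate handling can be reproduced in Bash alone. This is a sketch under stated assumptions: the data is hypothetical, and a Bash 4+ associative array (named seen here, my choice) stands in for the Perl hash %array2.

```bash
array1=(x x)        # hypothetical data: x twice
array2=(x x x)      # hypothetical data: x three times

# Nested loops: one hit for every (item1, item2) pair that matches.
nested=()
for item1 in "${array1[@]}"; do
    for item2 in "${array2[@]}"; do
        if [[ $item1 = "$item2" ]]; then
            nested+=("$item1")
        fi
    done
done

# Lookup table: record each value of array2 once, then filter array1.
declare -A seen=()
for item2 in "${array2[@]}"; do
    seen[$item2]=1
done
lookup=()
for item1 in "${array1[@]}"; do
    if [[ -n ${seen[$item1]:-} ]]; then
        lookup+=("$item1")
    fi
done

echo "nested: ${#nested[@]}"
echo "lookup: ${#lookup[@]}"
```

The nested version yields 2 x 3 = 6 entries and the lookup version yields 2. The lookup version also does one pass over each array instead of comparing every pair, though it assumes no element is the empty string, which Bash rejects as an associative array key.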