Compare/Difference of two arrays in bash
Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/2312762/
Asked by Kiran
Is it possible to take the difference of two arrays in bash?
It would be really great if you could suggest a way to do it.
Code:
Array1=( "key1" "key2" "key3" "key4" "key5" "key6" "key7" "key8" "key9" "key10" )
Array2=( "key1" "key2" "key3" "key4" "key5" "key6" )
Array3 =diff(Array1, Array2)
Array3 should ideally be:
Array3=( "key7" "key8" "key9" "key10" )
I appreciate your help.
Accepted answer by ephemient
If you strictly want Array1 - Array2, then
Array1=( "key1" "key2" "key3" "key4" "key5" "key6" "key7" "key8" "key9" "key10" )
Array2=( "key1" "key2" "key3" "key4" "key5" "key6" )
Array3=()
for i in "${Array1[@]}"; do
    skip=
    for j in "${Array2[@]}"; do
        # quote "$j" so it is compared literally rather than as a glob pattern
        [[ $i == "$j" ]] && { skip=1; break; }
    done
    [[ -n $skip ]] || Array3+=("$i")
done
declare -p Array3
Runtime might be improved with associative arrays, but I personally wouldn't bother. If you're manipulating enough data for that to matter, shell is the wrong tool.
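For readers who do want the associative-array variant mentioned above, it might look like the following minimal sketch. It assumes bash 4 or later and non-empty elements (bash rejects an empty associative-array key); the name seen is made up for illustration and is not part of the original answer.

```bash
#!/usr/bin/env bash
# Sketch: Array1 - Array2 with an associative array used as a lookup set.
Array1=( "key1" "key2" "key3" "key4" "key5" "key6" "key7" "key8" "key9" "key10" )
Array2=( "key1" "key2" "key3" "key4" "key5" "key6" )

declare -A seen=()              # hypothetical name: the set of Array2 elements
for j in "${Array2[@]}"; do
    seen[$j]=1
done

Array3=()
for i in "${Array1[@]}"; do
    # keep $i only if it was never recorded as a member of Array2
    [[ -n ${seen[$i]:-} ]] || Array3+=("$i")
done

declare -p Array3
```

This replaces the inner loop with a single lookup per element and keeps the order of Array1.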
For a symmetric difference like Dennis's answer, existing tools like comm work, as long as we massage the input and output a bit (since they work on line-based files, not shell variables).
Here, we tell the shell to use newlines to join the array into a single string, and discard tabs when reading lines from comm back into an array.
$ oldIFS=$IFS IFS=$'\n\t'
$ Array3=($(comm -3 <(echo "${Array1[*]}") <(echo "${Array2[*]}")))
comm: file 1 is not in sorted order
$ IFS=$oldIFS
$ declare -p Array3
declare -a Array3='([0]="key7" [1]="key8" [2]="key9" [3]="key10")'
It complains because, by lexicographical sorting, key1 < … < key9 > key10. But since both input arrays are sorted similarly, it's fine to ignore that warning. You can use --nocheck-order to get rid of the warning, or add a | sort -u inside the <(…) process substitution if you can't guarantee order and uniqueness of the input arrays.
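The sort -u variant described above could be sketched as follows. This is only an illustration, assuming bash 4 (for mapfile) and elements that contain no tabs or newlines; it uses printf and tr instead of changing IFS, and forces the C locale so that sort and comm agree on ordering.

```bash
#!/usr/bin/env bash
# Sketch: symmetric difference with comm, sorting and de-duplicating the inputs.
export LC_ALL=C   # sort and comm must use the same collation order

Array1=( "key1" "key2" "key3" "key4" "key5" "key6" "key7" "key8" "key9" "key10" )
Array2=( "key1" "key2" "key3" "key4" "key5" "key6" )

# comm -3 drops lines common to both inputs; lines unique to the second
# input are indented with a tab, which tr removes again.
mapfile -t Array3 < <(
    comm -3 <(printf '%s\n' "${Array1[@]}" | sort -u) \
            <(printf '%s\n' "${Array2[@]}" | sort -u) | tr -d '\t'
)

declare -p Array3
```

Because the inputs are sorted first, the result comes back in sorted order rather than in the order of the source arrays.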
Answer by Ilya Bystrov
echo ${Array1[@]} ${Array2[@]} | tr ' ' '\n' | sort | uniq -u
Output:
key10
key7
key8
key9
You can add sorting if you need it.
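The one-liner above splits on spaces and treats a value repeated inside a single array as "not unique". A hedged variant that tolerates spaces in elements and de-duplicates each array first might look like this; the sample values are invented for illustration, and elements are assumed to contain no newlines.

```bash
#!/usr/bin/env bash
# Sketch: symmetric difference via sort | uniq -u, one element per line.
Array1=( "key 1" "key 2" "key 2" "key 3" )   # sample data with spaces and a duplicate
Array2=( "key 1" )

mapfile -t Array3 < <(
    {
        printf '%s\n' "${Array1[@]}" | sort -u   # each array contributes a value once
        printf '%s\n' "${Array2[@]}" | sort -u
    } | sort | uniq -u                           # keep values seen in exactly one array
)

printf '%s\n' "${Array3[@]}"
```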
Answer by SiegeX
Anytime a question pops up dealing with unique values that may not be sorted, my mind immediately goes to awk. Here is my take on it.
Code:
#!/bin/bash
diff(){
  awk 'BEGIN{RS=ORS=" "}
       {NR==FNR?a[$0]++:a[$0]--}
       END{for(k in a)if(a[k])print k}' <(echo -n "${!1}") <(echo -n "${!2}")
}
Array1=( "key1" "key2" "key3" "key4" "key5" "key6" "key7" "key8" "key9" "key10" )
Array2=( "key1" "key2" "key3" "key4" "key5" "key6" )
Array3=($(diff Array1[@] Array2[@]))
echo ${Array3[@]}
Output:
$ ./diffArray.sh
key10 key7 key8 key9
Note: Like other answers given, if there are duplicate keys in an array they will only be reported once; this may or may not be the behavior you are looking for. The awk code to handle that is messier.
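If duplicates and source order matter, one possible awk sketch is a one-way difference (first array minus second) that passes elements one per line. This is not SiegeX's code: array_minus is a hypothetical helper, it computes a one-sided rather than symmetric difference, and it assumes elements contain no newlines.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the elements of the first array that do not
# occur in the second, one per line, keeping order and duplicates.
# Arguments use the same indirect form as above, e.g. 'Array1[@]'.
array_minus() {
    awk 'NR == FNR { skip[$0]; next }   # first input: remember elements to drop
         !($0 in skip)                  # second input: print everything else
        ' <(printf '%s\n' "${!2}") <(printf '%s\n' "${!1}")
}

Array1=( a a b c )   # sample data, invented for illustration
Array2=( b )
mapfile -t Array3 < <(array_minus 'Array1[@]' 'Array2[@]')
printf '%s\n' "${Array3[@]}"
```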
Answer by Alex Offshore
Having ARR1 and ARR2 as arguments, use comm to do the job and mapfile to put it back into the RESULT array:
ARR1=("key1" "key2" "key3" "key4" "key5" "key6" "key7" "key8" "key9" "key10")
ARR2=("key1" "key2" "key3" "key4" "key5" "key6")
mapfile -t RESULT < \
    <(comm -23 \
        <(IFS=$'\n'; echo "${ARR1[*]}" | sort) \
        <(IFS=$'\n'; echo "${ARR2[*]}" | sort) \
    )
echo "${RESULT[@]}" # outputs "key10 key7 key8 key9"
Note that the result may not preserve the source order.
Bonus aka "that's what you are here for":
function array_diff {
    # $1: name of the result array; $2 and $3: names of the input arrays.
    # The locals use names unlikely to clash with the caller's array names.
    eval local _diff_a=\(\"\${$2[@]}\"\)
    eval local _diff_b=\(\"\${$3[@]}\"\)
    local IFS=$'\n'
    mapfile -t "$1" < <(comm -23 <(echo "${_diff_a[*]}" | sort) <(echo "${_diff_b[*]}" | sort))
}
# usage:
array_diff RESULT ARR1 ARR2
echo "${RESULT[@]}" # outputs "key10 key7 key8 key9"
Using those tricky evals is the least bad option among the ways of passing array parameters in bash.
Also, take a look at the comm manpage; based on this code it's very easy to implement, for example, array_intersect: just use -12 as the comm options.
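The array_intersect idea mentioned above might be sketched as follows. Note the swap: this version uses namerefs (local -n, bash 4.3 or later) instead of eval, so it is an alternative to the answer's technique, not a copy of it. The function and the names _out, _in1 and _in2 are hypothetical, and elements are assumed to contain no newlines.

```bash
#!/usr/bin/env bash
# Hypothetical array_intersect RESULT_NAME ARRAY1_NAME ARRAY2_NAME.
# Uses namerefs (bash 4.3+) in place of eval; the caller's array names must
# not be _out, _in1 or _in2.
array_intersect() {
    local -n _out=$1 _in1=$2 _in2=$3
    # comm -12 suppresses columns 1 and 2, leaving only lines common to both
    mapfile -t _out < <(
        LC_ALL=C comm -12 \
            <(printf '%s\n' "${_in1[@]}" | LC_ALL=C sort -u) \
            <(printf '%s\n' "${_in2[@]}" | LC_ALL=C sort -u)
    )
}

ARR1=("key1" "key2" "key3" "key4" "key5" "key6" "key7" "key8" "key9" "key10")
ARR2=("key1" "key2" "key3" "key4" "key5" "key6")
RESULT=()
array_intersect RESULT ARR1 ARR2
echo "${RESULT[@]}"
```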
Answer by Paused until further notice.
In Bash 4:
declare -A temp # associative array
for element in "${Array1[@]}" "${Array2[@]}"
do
    ((temp[$element]++))
done
for element in "${!temp[@]}"
do
    if (( ${temp[$element]} > 1 ))
    then
        unset "temp[$element]"
    fi
done
Array3=("${!temp[@]}") # retrieve the keys as values
Edit:
ephemient pointed out a potentially serious bug. If an element exists in one array with one or more duplicates and doesn't exist at all in the other array, it will be incorrectly removed from the list of unique values. The version below attempts to handle that situation.
declare -A temp1 temp2 # associative arrays
for element in "${Array1[@]}"
do
    ((temp1[$element]++))
done
for element in "${Array2[@]}"
do
    ((temp2[$element]++))
done
for element in "${!temp1[@]}"
do
    if (( ${temp1[$element]} >= 1 && ${temp2[$element]-0} >= 1 ))
    then
        unset "temp1[$element]" "temp2[$element]"
    fi
done
Array3=("${!temp1[@]}" "${!temp2[@]}")
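The duplicate problem described in this answer can be reproduced with a small sketch. It is an illustration only, assuming bash 4 and simple elements without spaces or special characters; the arrays and the names count, in1, in2, naive and fixed are made up for the demonstration.

```bash
#!/usr/bin/env bash
# Sketch: "a" appears twice in Array1 and never in Array2.
Array1=( a a b )
Array2=( b )

# Single shared counter: "a" reaches a count of 2 and is wrongly discarded.
declare -A count=()
for e in "${Array1[@]}" "${Array2[@]}"; do (( count[$e]++ )); done
naive=()
for e in "${!count[@]}"; do (( count[$e] > 1 )) || naive+=("$e"); done

# One membership set per array: "a" is kept because Array2 never contains it.
declare -A in1=() in2=()
for e in "${Array1[@]}"; do in1[$e]=1; done
for e in "${Array2[@]}"; do in2[$e]=1; done
fixed=()
for e in "${!in1[@]}"; do [[ -n ${in2[$e]:-} ]] || fixed+=("$e"); done
for e in "${!in2[@]}"; do [[ -n ${in1[$e]:-} ]] || fixed+=("$e"); done

echo "naive: ${naive[*]}"
echo "fixed: ${fixed[*]}"
```

Tracking membership per array, rather than one combined count, is what separates "seen twice in one array" from "seen in both arrays".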
Answer by Denis Gois
It is possible to use regex too (based on another answer: Array intersection in bash):
list1=( 1 2 3 4 6 7 8 9 10 11 12 )
list2=( 1 2 3 5 6 8 9 11 )
l2=" ${list2[*]} " # add framing blanks
result=()
for item in "${list1[@]}"; do
    if ! [[ $l2 =~ " $item " ]]; then # quoted right-hand side: $item is matched literally
        result+=("$item")
    fi
done
echo "${result[@]}"
Result:
$ bash diff-arrays.sh
4 7 10 12
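Answer by ghostdog74
Array1=( "key1" "key2" "key3" "key4" "key5" "key6" "key7" "key8" "key9" "key10" )
Array2=( "key1" "key2" "key3" "key4" "key5" "key6" )
Array3=( "key1" "key2" "key3" "key4" "key5" "key6" "key11" )
a1=${Array1[@]}; a2=${Array2[@]}; a3=${Array3[@]}
diff(){
    # $1 and $2 are space-joined element lists; prints the elements of $1 missing from $2
    local a1="$1"
    local a2="$2"
    awk -v a1="$a1" -v a2="$a2" '
    BEGIN{
        m = split(a1, A1, " ")
        n = split(a2, t, " ")
        for (i=1; i<=n; i++) { A2[t[i]] }
        for (i=1; i<=m; i++) {
            if ( !(A1[i] in A2) ) {
                printf "%s ", A1[i]
            }
        }
    }'
}
Array4=( $(diff "$a1" "$a2") ) # compare a1 against a2
echo "Array4: ${Array4[@]}"
Array4=( $(diff "$a3" "$a1") ) # compare a3 against a1
echo "Array4: ${Array4[@]}"
Output:
$ ./shell.sh
Array4: key7 key8 key9 key10
Array4: key11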