Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow address: http://stackoverflow.com/questions/3578584/

Time: 2020-09-09 19:37:17  Source: igfitidea

bash: how to delete elements from an array based on a pattern

arrays bash list

Asked by kynan

Say I have a bash array (e.g. the array of all parameters) and want to delete all parameters matching a certain pattern or alternatively copy all remaining elements to a new array. Alternatively, the other way round, keep elements matching a pattern.

An example for illustration:

举例说明:

x=(preffoo bar foo prefbaz baz prefbar)

and I want to delete everything starting with pref in order to get

y=(bar foo baz)

(the order is not relevant)

What if I want the same thing for a list of words separated by whitespace?

x="preffoo bar foo prefbaz baz prefbar"

and again delete everything starting with pref in order to get

y="bar foo baz"

Accepted answer by camh

To strip a flat string (Hulk has already given the answer for arrays), you can turn on the extglob shell option and run the following expansion:

$ shopt -s extglob
$ unset x
$ x="preffoo bar foo prefbaz baz prefbar"
$ echo ${x//pref*([^ ])?( )}
bar foo baz

The extglob option is needed for the *(pattern-list) and ?(pattern-list) forms. This allows you to use regular expressions (although in a different form to most regular expressions) instead of just the pathname expansion characters (*, ? and [).
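
As a hedged sketch of the same expansion inside a script: quoting the expansion keeps the result as one string, and a second expansion trims the trailing space that the pattern can leave behind. The variable name y and the trimming step are my own additions for illustration, not part of the original answer.

```bash
shopt -s extglob                      # enables the *(...), ?(...) and +(...) patterns

x="preffoo bar foo prefbaz baz prefbar"

# Remove every word that starts with "pref", plus one following space if present.
y="${x//pref*([^ ])?( )}"

# The last word has no following space, so the space before it survives; trim it.
y="${y%%+( )}"

echo "$y"
```

Keeping the result in a quoted variable avoids relying on the unquoted echo to squeeze out leftover whitespace.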

The answer that Hulk has given for arrays will work only on arrays. If it appears to work on flat strings, it's only because the array was not unset first during testing.

e.g.

$ x=(preffoo bar foo prefbaz baz prefbar)
$ echo ${x[@]//pref*/}
bar foo baz
$ x="preffoo bar foo prefbaz baz prefbar"
$ echo ${x[@]//pref*/}
bar foo baz
$ unset x
$ x="preffoo bar foo prefbaz baz prefbar"
$ echo ${x[@]//pref*/}

$
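
One way to see why the second test in the transcript appears to work: assigning a plain string to a variable that is already an array only overwrites element 0, and the remaining elements stay in place. A minimal sketch (the helper variables count_before, second_before, count_after and result are my own, added only to make the state visible):

```bash
x=(preffoo bar foo prefbaz baz prefbar)

# Assigning a string to an existing array only replaces element 0.
x="preffoo bar foo prefbaz baz prefbar"
count_before=${#x[@]}            # the array still has all its elements
second_before=${x[1]}            # element 1 is untouched

# After unset, x is a plain string variable holding a single value.
unset x
x="preffoo bar foo prefbaz baz prefbar"
count_after=${#x[@]}
result=$(echo ${x[@]//pref*/})   # the whole string matches pref*, so nothing is left

echo "before unset: $count_before elements, x[1]=$second_before"
echo "after unset:  $count_after element, result='$result'"
```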

Answer by Adam Badura

Filtering an array is tricky if you consider the possibility of elements containing spaces (not to mention even "weirder" characters). In particular, the answers given so far (referring to the various forms of ${x[@]//pref*/}) will fail with such arrays.
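
To make the failure concrete, here is a small sketch with an element that contains a space (the array contents and the names unquoted and quoted are made up for illustration). Unquoted, the expansion splits "keep me" into two words; quoted, it preserves the words but leaves an empty element where the match was removed. Neither gives the two-element array one would want.

```bash
x=("pref with spaces" "keep me" bar)

# Unquoted: word splitting breaks "keep me" apart.
unquoted=(${x[@]//pref*/})

# Quoted: elements stay intact, but the removed one remains as an empty string.
quoted=("${x[@]//pref*/}")

echo "unquoted: ${#unquoted[@]} elements, first is '${unquoted[0]}'"
echo "quoted:   ${#quoted[@]} elements, first is '${quoted[0]}'"
```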

I have investigated this issue somewhat and found a solution; however, it is not a nice one-liner. But at least it is a solution.

For the illustrations, let's assume ARR names the array we want to filter. We shall start with the core expression:

for index in "${!ARR[@]}" ; do [[ …condition… ]] && unset -v 'ARR[$index]' ; done
ARR=("${ARR[@]}")

There are already a few points worth mentioning:

  1. "${!ARR[@]}" evaluates to the indexes of the array (as opposed to its elements).
  2. The form "${!ARR[@]}" is a must. You must not skip the quotes or change @ to *. Otherwise the expression will break on, for example, associative arrays whose keys contain spaces.
  3. The part after do can be whatever you want. The only requirement is that you unset, as shown, the elements that you don't want to have in the array.
  4. It is advised, or even needed, to use -v and quotes with unset, or else bad things may happen. For example, an unquoted ARR[$index] is subject to pathname expansion before unset ever sees it.
  5. If the part after do is as suggested above, you can use either && or || to filter out the elements that either pass or fail the condition.
  6. The second line, the reassignment of ARR, is needed only with non-associative arrays and will break associative arrays. (I did not quickly come up with a generic expression that handles both, and I don't need one…) For ordinary arrays it is needed if you want consecutive indexes, because unset on an array element does not shift down the elements with higher indexes; it just leaves a hole in the indexes. If you only iterate over the array (or expand it as a whole), this is no problem, but in other cases you need to reassign the indexes. Note also that any hole that was already in the indexes will be removed as well. So if you need to preserve existing holes, more logic is required besides the unset and the final reassignment.
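
As a sketch of points 5 and 6, the variant below swaps && for || so that only the matching elements are kept, and records the indexes before the reassignment to show the holes that unset leaves. The sample array and the helper variable holes are assumptions made for illustration.

```bash
ARR=(preffoo bar foo prefbaz baz "pref with spaces")

# Drop every element that does NOT match, i.e. keep only the matches.
for index in "${!ARR[@]}"; do
    [[ ${ARR[$index]} =~ ^pref ]] || unset -v 'ARR[$index]'
done

holes=("${!ARR[@]}")     # indexes that survive the unset calls (not consecutive)
ARR=("${ARR[@]}")        # reassign to get consecutive indexes again

echo "indexes before reassignment: ${holes[*]}"
echo "indexes after reassignment:  ${!ARR[*]}"
declare -p ARR
```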

Now for the condition. The [[ ]] expression is an easy way if you can use it. In particular, it supports regular expression matching using Extended Regular Expressions. Also be careful with using grep or any other line-based tool for this if you expect that array elements can contain not only spaces but also newlines. (A very nasty file name could contain a newline character, I think…)
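
For a simple prefix test, a glob pattern inside [[ ]] is an alternative to =~. The sketch below is my own variation: it collects the survivors into a new array (hypothetically named kept) instead of using unset. The match is made against the whole element, so embedded newlines do not cause the trouble they would with a line-based tool.

```bash
ARR=(preffoo bar $'pref\nwith\nnew line' $'\npref after newline')

kept=()
for elem in "${ARR[@]}"; do
    # Glob-style match: the unquoted right-hand side is a pattern, not a regex.
    if [[ $elem == pref* ]]; then
        continue              # starts with "pref": filter it out
    fi
    kept+=("$elem")
done

echo "${#kept[@]} elements kept"
```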

Referring to the question itself, the [[ ]] expression would have to be:

[[ ${ARR[$index]} =~ ^pref ]]

(with && unset as above)

Let's finally see how this works with those difficult cases. First we construct the array:

declare -a ARR='([0]="preffoo" [1]="bar" [2]="foo" [3]="prefbaz" [4]="baz" [5]="prefbar" [6]="pref with spaces")'
ARR+=($'pref\nwith\nnew line')
ARR+=($'\npref with new line before')

We can see that we have all the complex cases by running declare -p ARR and getting:

declare -a ARR='([0]="preffoo" [1]="bar" [2]="foo" [3]="prefbaz" [4]="baz" [5]="prefbar" [6]="pref with spaces" [7]="pref
with
new line" [8]="
pref with new line before")'

Now we run the filter expression:

for index in "${!ARR[@]}" ; do [[ ${ARR[$index]} =~ ^pref ]] && unset -v 'ARR[$index]' ; done

and another test (declare -p ARR) gives the expected result:

declare -a ARR='([1]="bar" [2]="foo" [4]="baz" [8]="
pref with new line before")'

Note how all elements starting with pref were removed but the indexes did not change. Note also that ${ARR[8]} is still there, since it starts with a newline rather than pref.

Now for the final reassignment:

ARR=("${ARR[@]}")

and check (declare -p ARR):

declare -a ARR='([0]="bar" [1]="foo" [2]="baz" [3]="
pref with new line before")'

which is exactly what was expected.

As closing notes: it would be nice if this could be turned into a flexible one-liner. But I don't think there is a way to make it shorter and simpler than it is now without defining functions or the like.

As for the function, it would be nice to have it accept an array, return an array, and have an easily configurable test to exclude or keep elements. But I'm not good enough with Bash to do it now.
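
One possible shape for such a function, offered only as a sketch and not as the author's solution: it takes the array by name through a nameref (which assumes bash 4.3 or later), a mode, and a regular expression. The name filter_array, the mode words exclude and keep, and the _fa_ prefixed locals are all hypothetical. It rebuilds the array instead of calling unset, so it is meant for indexed arrays only.

```bash
# filter_array NAME MODE REGEX   (hypothetical helper; needs bash 4.3+ for namerefs)
#   MODE "exclude" drops elements matching REGEX, MODE "keep" drops all others.
filter_array() {
    local -n _fa_ref=$1          # nameref to the caller's indexed array
    local _fa_mode=$2 _fa_regex=$3 _fa_elem
    local -a _fa_result=()
    for _fa_elem in "${_fa_ref[@]}"; do
        if [[ $_fa_elem =~ $_fa_regex ]]; then
            [[ $_fa_mode == keep ]] && _fa_result+=("$_fa_elem")
        else
            [[ $_fa_mode == exclude ]] && _fa_result+=("$_fa_elem")
        fi
    done
    _fa_ref=("${_fa_result[@]}") # write back with consecutive indexes
}

ARR=(preffoo bar "pref with spaces" baz)
filter_array ARR exclude '^pref'
declare -p ARR
```

The prefixed local names are there to reduce the chance of clashing with the caller's array name, which a nameref would otherwise resolve to the local variable.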

Answer by Paused until further notice.

Another way to strip a flat string is to convert it to an array then use the array method:

x="preffoo bar foo prefbaz baz prefbar"
x=($x)
x=${x[@]//pref*}

Contrast this with starting and ending with an array:

x=(preffoo bar foo prefbaz baz prefbar)
x=(${x[@]//pref*})
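
A caveat on the first snippet: once x has been turned into an array, a later plain assignment to x only replaces element 0, and the unquoted ($x) is also subject to pathname expansion. A hedged variant that sidesteps both by using separate variables (the names words and y are my own):

```bash
x="preffoo bar foo prefbaz baz prefbar"

# Split the string on whitespace into an array; read does no pathname expansion.
read -r -a words <<< "$x"

# Delete matching elements; the unquoted expansion drops the empty leftovers.
words=(${words[@]//pref*})

# Join the remaining elements back into one space-separated string.
y="${words[*]}"
echo "$y"
```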

Answer by Hulk

You can do this:

Delete all occurrences of substring.

# Not specifying a replacement defaults to 'delete' ...
echo ${x[@]//pref*/}      # bar foo baz
#          ^^             # Applied to all elements of the array.

Edit:

For whitespace-separated strings it is much the same:

x="preffoo bar foo prefbaz baz prefbar"
echo ${x[@]//pref*/}

Output:

bar foo baz

Answer by Marcin

Here's a way using grep:

(IFS=$'\n' && echo "${MY_ARR[*]}") | grep '[^.]*.pattern/[^.]*.txt'

The meat here is that IFS=$'\n' causes "${MY_ARR[*]}" to expand with newlines separating the items, so it can be piped through grep.

In particular, this will handle spaces embedded inside the items of the array.
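
The pipeline above only prints the matching items. To get the result back into an array, one option (my assumption; mapfile needs bash 4 or later) is process substitution. This sketch uses grep -v '^pref' to reproduce the filter from the question, with a made-up sample array and the hypothetical result name FILTERED. Like any line-based approach, it still assumes that no element contains a newline.

```bash
MY_ARR=(preffoo bar "foo with spaces" prefbaz baz)

# Print one element per line, drop the lines starting with "pref",
# and read the surviving lines back into an array.
mapfile -t FILTERED < <(printf '%s\n' "${MY_ARR[@]}" | grep -v '^pref')

declare -p FILTERED
```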

Answer by Kshitiz Sharma

I defined and used the following function:

# Removes elements from an array based on a given regex pattern.
# Usage: filter_arr pattern array
# Usage: filter_arr pattern element1 element2 ...
filter_arr() {
    arr=($@)
    arr=(${arr[@]:1})              # drop the pattern, keep the elements
    dirs=($(for i in ${arr[@]}
        do echo $i
    done | grep -v "$1"))          # keep only the elements NOT matching the pattern
    echo ${dirs[@]}
}

Example usage:

$ arr=(chicken egg hen omelette)
$ filter_arr "n$" ${arr[@]}

Output:

egg omelette

The output of the function is a string. To convert it back to an array:

$ arr2=(`filter_arr "n$" ${arr[@]}`)