Original question: http://stackoverflow.com/questions/17338863/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Split a string (stored in a variable) into multiple words using spaces but not the spaces within double quotes
Asked by Zettt
I'm trying to write a, for me, complicated script where my goal is to do the following. I have a string coming in that looks like this:
2012 2013 "multiple words"
My goal is to put each of these onto an array split by spaces, but only for single word matches, not those surrounded by double quotes. Those should be considered one word. So my idea was to do this in two steps. First match those words that are multiples, remove those from the string, then in another iteration split by white space.
Unfortunately, I can't find help on how to echo only the match. So far I have this:
array=$(echo $tags | sed -nE 's/"(.+)"/\1/p')
But this would result in (on OS X):
2012 2013 multiple words
Expected result:
array[1]="2012"
array[2]="2013"
array[3]="multiple words"
How would I go about this sort of problem?
Thanks.
Answered by iruvar
eval is evil, but this may be one of those cases where it comes in handy:
str='2012 2013 "multiple words"'
eval x=($str)
echo ${x[2]}
multiple words
Or, with more recent versions of bash (tested on 4.3):
s='2012 2013 "multiple words"'
declare -a 'a=('"$s"')'
printf "%s\n" "${a[@]}"
2012
2013
multiple words
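To illustrate why eval is called evil here, this is a minimal sketch, assuming bash, of what happens when the string is not fully under your control. The input value is hypothetical; the point is only that eval runs any command substitution it finds in the string.

```shell
#!/usr/bin/env bash
# Hypothetical input: imagine this string came from an untrusted source.
str='2012 "multiple words" $(echo injected)'

# eval re-parses the string as shell code, so the quotes group the words
# as wanted, but the $(...) part is also executed as a command.
eval "x=($str)"

printf '%s\n' "${x[@]}"
# prints:
# 2012
# multiple words
# injected
```

Here the injected command is a harmless echo, but it could be anything, so this approach is probably best kept for strings you build yourself.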
Answered by l0b0
$ grep -Eo '"[^"]*"|[^" ]*' <<< '2012 2013 "multiple words"'
2012
2013
"multiple words"
That is, print only the strings matching either
- a quote followed by any number (even zero) of non-quotes followed by a quote, or
- a series of characters not containing a quote or space.
Of course, this does not handle complicated cases like quotes spanning multiple lines or escaped quotes (using either doubled quotes like SQL or a backslash like the shell).
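The grep output above still has to get into an array, and the quoted word keeps its quotes. A minimal sketch of one way to do both, assuming bash (process substitution and += are bash features); the names tags, array, and word are illustrative, and the pattern uses + instead of * so it never matches an empty string:

```shell
#!/usr/bin/env bash
tags='2012 2013 "multiple words"'

array=()
# Read one grep match per line; IFS= and -r keep each line intact.
while IFS= read -r word; do
    word=${word#\"}   # drop a leading double quote, if any
    word=${word%\"}   # drop a trailing double quote, if any
    array+=("$word")
done < <(grep -Eo '"[^"]*"|[^" ]+' <<< "$tags")

printf '%s\n' "${array[@]}"
# prints:
# 2012
# 2013
# multiple words
```

The same limitations apply as for the plain grep command: escaped quotes and quotes spanning multiple lines are not handled.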
Answered by anubhava
You can directly do:
arr=(2012 2013 "multiple words")
echo ${#arr[@]} # gives 3
echo ${arr[2]} # gives "multiple words"
EDIT: Not sure if it helps the OP, but the following will also work:
str='2012 2013 "multiple\ words"'
read -a arr <<< $str
echo ${#arr[@]} # gives 3
echo ${arr[2]} # gives "multiple words"
Answered by zekus
The following will produce the result you want:
tags='2012 2013 "multiple words"'
IFS=$'\n'; array=($(echo $tags | egrep -o '"[^"]*"|\S+'))
Result in zsh:
echo ${array[1]} # 2012
echo ${array[2]} # 2013
echo ${array[3]} # "multiple words"
Result in bash:
echo ${array[0]} # 2012
echo ${array[1]} # 2013
echo ${array[2]} # "multiple words"
Works on OS X.
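If changing IFS globally or calling an external tool is a concern, roughly the same pattern can be applied with bash's own =~ operator. This is only a sketch under the same assumptions as above (no escaped quotes, a single line of input); the names re, rest, and array are illustrative.

```shell
#!/usr/bin/env bash
tags='2012 2013 "multiple words"'

# Skip leading whitespace, then capture either a quoted run (group 2,
# without the quotes) or a run of non-quote, non-space characters
# (group 3). Group 4 is whatever is left to parse.
re='^[[:space:]]*("([^"]*)"|([^"[:space:]]+))(.*)$'

array=()
rest=$tags
while [[ $rest =~ $re ]]; do
    if [[ ${BASH_REMATCH[1]} == \"* ]]; then
        array+=("${BASH_REMATCH[2]}")   # quoted token, quotes removed
    else
        array+=("${BASH_REMATCH[3]}")   # plain token
    fi
    rest=${BASH_REMATCH[4]}
done

printf '%s\n' "${array[@]}"
# prints:
# 2012
# 2013
# multiple words
```

The loop stops at the first thing it cannot parse, such as an unbalanced quote, so anything after that point is silently dropped.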
Answered by dawg
Here is a small Python script to parse space-delimited CSV while respecting quoted fields:
$ python -c '
import csv, fileinput
for line in csv.reader(fileinput.input(), delimiter=" "):
    for word in line:
        print(word)
' test.csv
2012
2013
multiple words
Since this uses the fileinput module, it also works in a pipeline (or with a string in a variable):
$ str='2012 2013 "multiple words"'
$ echo "$str" | python -c '
import csv, fileinput
for line in csv.reader(fileinput.input(), delimiter=" "):
    for word in line:
        print(word)
'
2012
2013
multiple words

