bash: split a string (stored in a variable) into multiple words on spaces, but not on spaces inside double quotes

Question

Asked by Zettt

I'm trying to write a, for me, complicated script where my goal is to do the following. I have a string coming in that looks like this:

2012 2013 "multiple words"

My goal is to put each of these into an array, split on spaces, but only for single-word matches, not for those surrounded by double quotes. Those should be treated as one word. So my idea was to do this in two steps: first match the multi-word quoted strings and remove them from the string, then in a second pass split the rest on whitespace.
Unfortunately I can't find help on how to echo only the match. So far I have this:

array=$(echo $tags | sed -nE 's/"(.+)"//p')

But this would result in (on OS X):

2012 2013 multiple words

Expected result:

array[1]="2012"
array[2]="2013"
array[3]="multiple words"

How would I go about this sort of problem?

Thanks.

Answer 1

Answered by iruvar

eval is evil, but this may be one of those cases where it comes in handy

str='2012 2013 "multiple words"'
eval "x=($str)"   # the shell re-parses $str, so only use this on trusted input
echo "${x[2]}"
multiple words

Or with more recent versions of bash (tested on 4.3)

s='2012 2013 "multiple words"'
declare -a 'a=('"$s"')'
printf "%s\n" "${a[@]}"
2012
2013
multiple words
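
If the string is not fully trusted, the same split can probably be done without eval. Here is a minimal sketch in plain bash, assuming that only spaces and double quotes matter (no backslash escapes, no single quotes). The names split_quoted and result are hypothetical helpers made up for this illustration, not part of the answer above:

```bash
#!/usr/bin/env bash

# split_quoted (hypothetical helper): split $1 on spaces, treating
# double-quoted runs as part of one word. Fills the global array "result".
split_quoted() {
    local input=$1 word='' char i
    local in_quotes=0 has_word=0
    result=()
    for (( i = 0; i < ${#input}; i++ )); do
        char=${input:i:1}
        if [[ $char == '"' ]]; then
            # Toggle quote state; the quote character itself is dropped.
            in_quotes=$(( 1 - in_quotes ))
            has_word=1
        elif [[ $char == ' ' && $in_quotes -eq 0 ]]; then
            # Unquoted space ends the current word, if there is one.
            if (( has_word )); then
                result+=("$word")
            fi
            word=''
            has_word=0
        else
            word+=$char
            has_word=1
        fi
    done
    # Flush the last word.
    if (( has_word )); then
        result+=("$word")
    fi
}

split_quoted '2012 2013 "multiple words"'
printf '%s\n' "${result[@]}"
```

This prints 2012, 2013 and multiple words on separate lines. Unlike eval, nothing in the input is ever executed, so a value such as $(rm -rf ~) stays a literal string.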

Answer 2

Answered by l0b0

$ grep -Eo '"[^"]*"|[^" ]*' <<< '2012 2013 "multiple words"'
2012
2013
"multiple words"

That is, print only the strings matching either

a quote followed by any number (even zero) of non-quotes followed by a quote, or
a series of characters not containing a quote or space.

Of course, this does not handle complicated cases like quotes spanning multiple lines or escaped quotes (using either doubled quotes as in SQL or a backslash as in the shell).
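
To get from the grep output to the array the question asks for, one option is to read the matches line by line and strip the surrounding quotes. This is a sketch under the same limitations (no escaped or multi-line quotes); the variable names tags, token and array are assumptions for the example, and the pattern uses + instead of * so that empty matches can never occur:

```bash
#!/usr/bin/env bash

tags='2012 2013 "multiple words"'
array=()

# grep -o prints one match per line; read each line into the array.
while IFS= read -r token; do
    token=${token#\"}   # drop a leading double quote, if any
    token=${token%\"}   # drop a trailing double quote, if any
    array+=("$token")
done < <(grep -Eo '"[^"]*"|[^" ]+' <<< "$tags")

printf '%s\n' "${array[@]}"
```

Note that bash arrays start at index 0, so "multiple words" ends up in ${array[2]}.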

Answer 3

Answered by anubhava

You can directly do:

arr=(2012 2013 "multiple words")

echo ${#arr[@]} # gives 3
echo ${arr[2]} # gives "multiple words"

EDIT: Not sure if it helps the OP, but the following will also work:

str='2012 2013 "multiple\ words"'
read -a arr <<< $str
echo ${#arr[@]} # gives 3
echo ${arr[2]} # gives "multiple words"

Answer 4

Answered by zekus

The following will produce the result you want:

tags='2012 2013 "multiple words"'
IFS=$'\n'; array=($(echo $tags | egrep -o '"[^"]*"|\S+'))

Result in zsh:

echo ${array[1]} # 2012
echo ${array[2]} # 2013
echo ${array[3]} # "multiple words"

Result in bash:

echo ${array[0]} # 2012
echo ${array[1]} # 2013
echo ${array[2]} # "multiple words"

Works on OS X.

Answer 5

Answered by dawg

Here is a small Python script to parse space-delimited CSV while respecting quoted fields:

$ python -c '
import csv, fileinput
for line in csv.reader(fileinput.input(), delimiter=" "):
   for word in line:
      print(word)
' test.csv
2012
2013
multiple words

Since this uses the fileinput module, it also works in a pipeline (or with a string in a variable):

$ str='2012 2013 "multiple words"'
$ echo $str | python -c '
import csv, fileinput
for line in csv.reader(fileinput.input(), delimiter=" "):
   for word in line:
      print(word)
' 
2012
2013
multiple words
