Notice: This page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/38364261/

Time: 2020-09-03 18:30:09  Source: igfitidea

Parse JSON to array in a shell script

Tags: json, bash, shell, parsing

Asked by unconditionalcoder

I'm trying to parse a JSON object within a shell script into an array.

e.g.: [Amanda, 25, http://mywebsite.com]

The JSON looks like:

{
  "name"       : "Amanda", 
  "age"        : "25",
  "websiteurl" : "http://mywebsite.com"
}

I do not want to use any libraries; it would be best if I could use a regular expression or grep. I have tried:

grep name myfile.json

This gives me "name" : "Amanda". I could do this in a loop for each line in the file, and add it to an array but I only need the right side and not the entire line.

Answered by mklement0

If you really cannot use a proper JSON parser such as jq[1], try an awk-based solution:

Bash 4.x:

readarray -t values < <(awk -F\" 'NF>=3 {print $4}' myfile.json)

Bash 3.x:

IFS=$'\n' read -d '' -ra values < <(awk -F\" 'NF>=3 {print $4}' myfile.json)

This stores all property values in the Bash array ${values[@]}, which you can inspect with declare -p values.
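
To make the field splitting concrete, here is a minimal sketch that runs awk on a single property line copied from the question. It prints every field that results from using the double quote as the separator, so you can see that the value sits in field 4. The variable names are only illustrative.

```shell
#!/usr/bin/env bash
# Sketch: show how awk splits one property line on double quotes.
# The sample line is copied from the question's JSON.
line='  "name"       : "Amanda",'

# Print every field with its index; the value shows up as field 4.
fields=$(printf '%s\n' "$line" |
  awk -F\" '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')
printf '%s\n' "$fields"

# Field 4 on its own. NF>=3 skips lines without a quoted key, such as { and }.
value=$(printf '%s\n' "$line" | awk -F\" 'NF>=3 {print $4}')
printf 'value: %s\n' "$value"
```

Fields 1, 3 and 5 hold the text between the quoted parts (indentation, the colon, the trailing comma), which is why only the even-numbered fields carry the key and the value.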

These solutions have limitations:

  • each property must be on its own line,
  • all values must be double-quoted,
  • embedded escaped double quotes are not supported.

All these limitations reinforce the recommendation to use a proper JSON parser.
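
The following sketch shows two of these limitations in action. The input is made up for the illustration (it is not from the question): one value contains escaped double quotes and the other is an unquoted number. It assumes Bash 4.x+ for readarray.

```shell
#!/usr/bin/env bash
# Sketch: feed the awk approach input that violates its assumptions.
# The file contents below are hypothetical and only serve as a demonstration.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
{
  "name" : "Amanda \"Mandy\" Smith",
  "age"  : 25
}
EOF

readarray -t values < <(awk -F\" 'NF>=3 {print $4}' "$tmpfile")
rm -f "$tmpfile"

# The first value is cut off at the escaped quote; the unquoted number is
# lost and leaves an empty element behind.
declare -p values
```

Neither failure produces an error message, so the array silently contains wrong data. A real JSON parser avoids this class of problem.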



Note: The following alternative solutions use the Bash 4.x+ readarray -t values command, but they also work with the Bash 3.x alternative, IFS=$'\n' read -d '' -ra values.

grep + cut combination: A single grep command won't do (unless you use GNU grep, see below), but adding cut helps:

readarray -t values < <(grep '"' myfile.json | cut -d '"' -f4)
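
As a rough, self-contained illustration, the pipeline can be tried against a temporary copy of the question's JSON. The temporary file is an assumption made so that the sketch runs anywhere; readarray requires Bash 4.x+.

```shell
#!/usr/bin/env bash
# Sketch: the grep + cut pipeline run against a temporary copy of the
# question's JSON.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
{
  "name"       : "Amanda",
  "age"        : "25",
  "websiteurl" : "http://mywebsite.com"
}
EOF

# grep keeps only lines containing a double quote (dropping the brace lines);
# cut then splits each line on '"' and keeps the 4th field, the value.
readarray -t values < <(grep '"' "$tmpfile" | cut -d '"' -f4)
rm -f "$tmpfile"

declare -p values
```

The same limitations as for the awk solution apply, because cut also splits blindly on every double quote.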


GNU grep: Using -P to support PCREs, which support \K to drop everything matched so far (a more flexible alternative to a look-behind assertion) as well as look-ahead assertions ((?=...)):

readarray -t values < <(grep -Po ':\s*"\K.+(?="\s*,?\s*$)' myfile.json)
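
The sketch below tries this pattern on sample data and assumes a GNU grep built with PCRE support (grep on macOS/BSD typically lacks -P). The input is hypothetical; one value contains a space to show that the match covers the whole string between the quotes.

```shell
#!/usr/bin/env bash
# Sketch, assuming GNU grep with -P support. Sample data is made up.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
{
  "name"       : "Amanda Smith",
  "websiteurl" : "http://mywebsite.com"
}
EOF

# :\s*"          match the colon, optional whitespace and the opening quote
# \K             forget what has been matched so far, so -o will not print it
# .+             the value itself
# (?="\s*,?\s*$) stop before the closing quote, an optional comma and line end
readarray -t values < <(grep -Po ':\s*"\K.+(?="\s*,?\s*$)' "$tmpfile")
rm -f "$tmpfile"

declare -p values
```

Because the look-ahead is anchored to the end of the line, the colon inside the URL does not confuse the match.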


Finally, here's a pure Bash (3.x+) solution:

What makes this a viable alternative in terms of performance is that no external utilities are called in each loop iteration; however, for larger input files, a solution based on external utilities will be much faster.

#!/usr/bin/env bash

declare -a values # declare the array

# Read each line and use regex parsing (with Bash's `=~` operator)
# to extract the value.
while read -r line; do
  # Extract the value from between the double quotes
  # and add it to the array.
  [[ $line =~ :[[:blank:]]+\"(.*)\" ]] && values+=( "${BASH_REMATCH[1]}" )
done < myfile.json

declare -p values # print the array
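
To see what the regex in the loop captures, the following sketch applies it to two individual lines. Storing the pattern in a variable (here called re, a name chosen for this example) is a common way to avoid quoting problems with =~.

```shell
#!/usr/bin/env bash
# Sketch: apply the loop's regex to single sample lines.
re=':[[:blank:]]+"(.*)"'

line='  "websiteurl" : "http://mywebsite.com"'
url=''
if [[ $line =~ $re ]]; then
  # BASH_REMATCH[0] is the whole match, [1] the first capture group.
  url=${BASH_REMATCH[1]}
fi
printf 'captured: %s\n' "$url"

# A line without a quoted value (here, the opening brace) does not match,
# so the loop skips it.
if [[ '{' =~ $re ]]; then matched=yes; else matched=no; fi
printf 'brace line matched: %s\n' "$matched"
```

Note that [[:blank:]]+ requires at least one space or tab after the colon, so a compact line such as "name":"Amanda" would not match this pattern.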


[1] Here's what a robust jq-based solution would look like (Bash 4.x):
readarray -t values < <(jq -r '.[]' myfile.json)

Answered by Dr_Hope

jq is good enough to solve this problem

paste -s <(jq '.files[].name' YourJsonString) <(jq '.files[].age' YourJsonString) <(jq '.files[].websiteurl' YourJsonString)

This gives you a table, from which you can grep any rows or use awk to print any columns you want.

Answered by Dr_Hope

You can use a sed one-liner to achieve this:

array=( $(sed -n "/{/,/}/{s/[^:]*:[[:blank:]]*//p;}" json ) )

Result:

$ echo ${array[@]}
"Amanda" "25" "http://mywebsite.com"

If you do not need/want the quotation marks then the following sed will do away with them:

array=( $(sed -n '/{/,/}/{s/[^:]*:[^"]*"\([^"]*\).*/\1/p;}' json) )

Result:

$ echo ${array[@]}
Amanda 25 http://mywebsite.com

It will also work if you have multiple entries, like

$ cat json
{
  "name"       : "Amanda" 
  "age"        : "25"
  "websiteurl" : "http://mywebsite.com"
}

{
   "name"       : "samantha"
   "age"        : "31"
   "websiteurl" : "http://anotherwebsite.org"
}

$ echo ${array[@]}
Amanda 25 http://mywebsite.com samantha 31 http://anotherwebsite.org

UPDATE:

As pointed out by mklement0 in the comments, there might be an issue if the file contains embedded whitespace, e.g., "name" : "Amanda lastname". In this case, Amanda and lastname would be read into separate array fields. To avoid this, you can use readarray, e.g.,

readarray -t array < <(sed -n '/{/,/}/{s/[^:]*:[^"]*"\([^"]*\).*/\1/p;}' json2)

This will also take care of any globbing issues, also mentioned in the comments.
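
The difference can be checked with a small sketch that runs the same sed script both ways on a value containing a space. The temporary file and the variable names (sed_script, split, whole) are assumptions made for this illustration; readarray requires Bash 4.x+.

```shell
#!/usr/bin/env bash
# Sketch: compare word splitting with readarray on a value containing a space.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
{
  "name"       : "Amanda lastname"
  "age"        : "25"
}
EOF

sed_script='/{/,/}/{s/[^:]*:[^"]*"\([^"]*\).*/\1/p;}'

# Unquoted command substitution: the shell splits the output on whitespace
# (and would also expand glob characters), so the name ends up in two elements.
split=( $(sed -n "$sed_script" "$tmpfile") )

# readarray -t: one element per output line, no word splitting or globbing.
readarray -t whole < <(sed -n "$sed_script" "$tmpfile")
rm -f "$tmpfile"

printf 'word splitting: %d elements\n' "${#split[@]}"
printf 'readarray:      %d elements\n' "${#whole[@]}"
```

With word splitting the array has three elements (Amanda, lastname, 25), while readarray yields the intended two.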
