Using jq or alternative command line tools to compare JSON files
Original question: http://stackoverflow.com/questions/31930041/
This content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors and distribute it under the same license.
Asked by Amelio Vazquez-Reina
Are there any command line utilities that can be used to find if two JSON files are identical with invariance to within-dictionary-key and within-list-element ordering?
Could this be done with jq or some other equivalent tool?
Examples:
These two JSON files are identical
A:
{
"People": ["John", "Bryan"],
"City": "Boston",
"State": "MA"
}
B:
{
"People": ["Bryan", "John"],
"State": "MA",
"City": "Boston"
}
but these two JSON files are different:
A:
{
"People": ["John", "Bryan", "Carla"],
"City": "Boston",
"State": "MA"
}
C:
{
"People": ["Bryan", "John"],
"State": "MA",
"City": "Boston"
}
That would be:
$ some_diff_command A.json B.json
$ some_diff_command A.json C.json
The files are not structurally identical
Accepted answer by Amelio Vazquez-Reina
Since jq's comparison already compares objects without taking key ordering into account, all that's left is to sort all lists inside the object before comparing them. Assuming your two files are named a.json and b.json, on the latest jq nightly:
jq --argfile a a.json --argfile b b.json -n '($a | (.. | arrays) |= sort) as $a | ($b | (.. | arrays) |= sort) as $b | $a == $b'
This program should return "true" or "false" depending on whether or not the objects are equal using the definition of equality you ask for.
EDIT: The (.. | arrays) |= sort construct doesn't actually work as expected in some edge cases. This GitHub issue explains why and provides some alternatives, such as:
def post_recurse(f): def r: (f | select(. != null) | r), .; r; def post_recurse: post_recurse(.[]?); (post_recurse | arrays) |= sort
Applied to the jq invocation above:
jq --argfile a a.json --argfile b b.json -n 'def post_recurse(f): def r: (f | select(. != null) | r), .; r; def post_recurse: post_recurse(.[]?); ($a | (post_recurse | arrays) |= sort) as $a | ($b | (post_recurse | arrays) |= sort) as $b | $a == $b'
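For anyone without jq at hand, the same idea (sort every array recursively, then compare) can be sketched in Python. This is only an illustration of the approach, not a port of the jq program: the names normalize and equivalent are made up here, and arrays are ordered by their canonical JSON text so that mixed-type arrays can still be sorted.

```python
import json


def normalize(value):
    """Recursively sort every array so element order stops mattering."""
    if isinstance(value, dict):
        return {key: normalize(item) for key, item in value.items()}
    if isinstance(value, list):
        items = [normalize(item) for item in value]
        # Order by canonical JSON text so mixed-type arrays stay sortable.
        return sorted(items, key=lambda item: json.dumps(item, sort_keys=True))
    return value


def equivalent(left, right):
    """Compare parsed JSON values, ignoring key order and array order."""
    # Comparing canonical text keeps JSON true and 1 distinct,
    # which plain Python equality would not.
    left_text = json.dumps(normalize(left), sort_keys=True)
    right_text = json.dumps(normalize(right), sort_keys=True)
    return left_text == right_text


# The documents from the question (hypothetical variable names).
a = json.loads('{"People": ["John", "Bryan"], "City": "Boston", "State": "MA"}')
b = json.loads('{"People": ["Bryan", "John"], "State": "MA", "City": "Boston"}')
c = json.loads('{"People": ["John", "Bryan", "Carla"], "City": "Boston", "State": "MA"}')

print(equivalent(a, b))  # True
print(equivalent(a, c))  # False
```

One caveat of this sketch: numbers written differently, such as 1 and 1.0, serialize differently and therefore count as different here.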
Answer by Erik
In principle, if you have access to bash or some other advanced shell, you could do something like
cmp <(jq -cS . A.json) <(jq -cS . B.json)
using process substitution. This will format the JSON with sorted keys and a consistent representation of floating-point numbers. Those are the only two reasons I can think of for why JSON with the same content would be printed differently. Therefore a simple string comparison afterwards will result in a proper test. It's probably also worth noting that if you can't use bash you can get the same results with temporary files; it's just not as clean.
This doesn't quite answer your question, because the way you stated it, you wanted ["John", "Bryan"] and ["Bryan", "John"] to compare identically. Since JSON doesn't have the concept of a set, only a list, those should be considered distinct. Order is important for lists. You would have to write some custom comparison if you wanted them to compare as equal, and to do that you would need to define what you mean by equality. Does order matter for all lists or only some? What about duplicate elements? Alternatively, if you want them to be represented as a set, and the elements are strings, you could put them in objects like {"John": null, "Bryan": null}. Order will not matter when comparing those for equality.
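To make the "what about duplicate elements?" point concrete, here is a small Python sketch of two possible definitions of order-insensitive equality for a single list. The helper names are hypothetical; which definition is right depends on what the data means.

```python
import json
from collections import Counter


def canonical(item):
    """Canonical JSON text for one element, used as a hashable stand-in."""
    return json.dumps(item, sort_keys=True)


def same_as_sets(left, right):
    """Order and duplicates are both ignored."""
    return {canonical(i) for i in left} == {canonical(i) for i in right}


def same_as_multisets(left, right):
    """Order is ignored, but each element must occur equally often."""
    left_counts = Counter(canonical(i) for i in left)
    right_counts = Counter(canonical(i) for i in right)
    return left_counts == right_counts


once = ["John", "Bryan"]
twice = ["Bryan", "John", "John"]

print(same_as_sets(once, twice))       # True
print(same_as_multisets(once, twice))  # False
```

Sorting the lists before comparing, as in the accepted answer, corresponds to the multiset definition.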
Update
From the comment discussion: If you want to get a better idea of why the JSON isn't the same, then
diff <(jq -S . A.json) <(jq -S . B.json)
will produce more interpretable output. vimdiff might be preferable to diff, depending on taste.
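If neither jq nor bash is available, roughly the same canonicalize-then-diff idea can be sketched with Python's standard library. The function names and the b_text document (same content as A with only the keys reordered) are made up for illustration; like the commands above, this keeps array order significant.

```python
import difflib
import json


def canonical_lines(text):
    """Parse JSON text and re-serialize it with sorted keys, one item per line."""
    return json.dumps(json.loads(text), sort_keys=True, indent=2).splitlines()


def json_diff(left_text, right_text):
    """Return unified-diff lines between two canonicalized JSON documents."""
    return list(difflib.unified_diff(
        canonical_lines(left_text),
        canonical_lines(right_text),
        fromfile="A.json",
        tofile="B.json",
        lineterm="",
    ))


a_text = '{"People": ["John", "Bryan"], "City": "Boston", "State": "MA"}'
# Hypothetical document: same content as a_text, only the keys are reordered.
b_text = '{"State": "MA", "City": "Boston", "People": ["John", "Bryan"]}'
# Same keys, but the array elements are swapped.
c_text = '{"People": ["Bryan", "John"], "State": "MA", "City": "Boston"}'

print(json_diff(a_text, b_text))             # [] because only key order differs
print("\n".join(json_diff(a_text, c_text)))  # non-empty: array order differs
```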
Answer by Joe Burnett
Use jd with the -set option:
No output means no difference.
$ jd -set A.json B.json
Differences are shown as an @ path and + or -.
$ jd -set A.json C.json
@ ["People",{}]
+ "Carla"
The output diffs can also be used as patch files with the -p option.
$ jd -set -o patch A.json C.json; jd -set -p patch B.json
{"City":"Boston","People":["John","Carla","Bryan"],"State":"MA"}
Answer by peak
Here is a solution using the generic function walk/1:
# Apply f to composite entities recursively, and to atoms
def walk(f):
  . as $in
  | if type == "object" then
      reduce keys[] as $key
        ( {}; . + { ($key): ($in[$key] | walk(f)) } ) | f
    elif type == "array" then map( walk(f) ) | f
    else f
    end;
def normalize: walk(if type == "array" then sort else . end);
# Test whether the input and argument are equivalent
# in the sense that ordering within lists is immaterial:
def equiv(x): normalize == (x | normalize);
Example:
{"a":[1,2,[3,4]]} | equiv( {"a": [[4,3], 2,1]} )
produces:
true
And wrapped up as a bash script:
#!/bin/bash
JQ=/usr/local/bin/jq
BN=$(basename "$0")
function help {
cat <<EOF
Syntax: $0 file1 file2
The two files are assumed each to contain one JSON entity. This
script reports whether the two entities are equivalent in the sense
that their normalized values are equal, where normalization of all
component arrays is achieved by recursively sorting them, innermost first.
This script assumes that the jq of interest is $JQ if it exists and
otherwise that it is on the PATH.
EOF
exit
}
if [ ! -x "$JQ" ] ; then JQ=jq ; fi
function die { echo "$BN: $@" >&2 ; exit 1 ; }
if [ $# != 2 -o "$1" = -h -o "$1" = --help ] ; then help ; exit ; fi
test -f "$1" || die "unable to find $1"
test -f "$2" || die "unable to find $2"
$JQ -r -n --argfile A "$1" --argfile B "$2" -f <(cat<<"EOF"
# Apply f to composite entities recursively, and to atoms
def walk(f):
  . as $in
  | if type == "object" then
      reduce keys[] as $key
        ( {}; . + { ($key): ($in[$key] | walk(f)) } ) | f
    elif type == "array" then map( walk(f) ) | f
    else f
    end;
def normalize: walk(if type == "array" then sort else . end);
# Test whether the input and argument are equivalent
# in the sense that ordering within lists is immaterial:
def equiv(x): normalize == (x | normalize);
if $A | equiv($B) then empty else "\($A) is not equivalent to \($B)" end
EOF
)
POSTSCRIPT: walk/1 is a built-in in versions of jq > 1.5, and can therefore be omitted if your jq includes it, but there is no harm in including it redundantly in a jq script.
POST-POSTSCRIPT: The builtin version of walk has recently been changed so that it no longer sorts the keys within an object. Specifically, it uses keys_unsorted. For the task at hand, the version using keys should be used.
Answer by Maikon
There's an answer for this here that would be useful.
Essentially you can use the git diff functionality (even for files not tracked by Git), which also includes colour in the output:
git diff --no-index payload_1.json payload_2.json
Answer by Shivraj
Perhaps you could use this sort and diff tool: http://novicelab.org/jsonsortdiff/ which first sorts the objects semantically and then compares them. It is based on https://www.npmjs.com/package/jsonabc
Answer by Andrew
Pulling in the best from the top two answers to get a jq-based JSON diff:
diff \
<(jq -S 'def post_recurse(f): def r: (f | select(. != null) | r), .; r; def post_recurse: post_recurse(.[]?); (. | (post_recurse | arrays) |= sort)' "$original_json") \
<(jq -S 'def post_recurse(f): def r: (f | select(. != null) | r), .; r; def post_recurse: post_recurse(.[]?); (. | (post_recurse | arrays) |= sort)' "$changed_json")
This takes the elegant array sorting solution from https://stackoverflow.com/a/31933234/538507 (which allows us to treat arrays as sets) and the clean bash redirection into diff from https://stackoverflow.com/a/37175540/538507. This addresses the case where you want a diff of two JSON files and the order of the array contents is not relevant.
Answer by Acapulco
One more tool for those for whom the previous answers are not a good fit: you can try jdd.
It's HTML based, so you can either use it online at www.jsondiff.com or, if you prefer running it locally, just download the project and open index.html.
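Answer by tokland
If you also want to see the differences, using @Erik's answer as inspiration and js-beautify:
$ echo '[{"name": "John", "age": 56}, {"name": "Mary", "age": 67}]' > file1.json
$ echo '[{"age": 56, "name": "John"}, {"name": "Mary", "age": 61}]' > file2.json
$ diff -u --color \
<(jq -cS . file1.json | js-beautify -f -) \
<(jq -cS . file2.json | js-beautify -f -)
--- /dev/fd/63 2016-10-18 13:03:59.397451598 +0200
+++ /dev/fd/62 2016-10-18 13:03:59.397451598 +0200
@@ -2,6 +2,6 @@
"age": 56,
"name": "John Smith"
}, {
- "age": 67,
+ "age": 61,
"name": "Mary Stuart"
}]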
