bash: Sort JSON/JavaScript tuples using command-line tools

Note: this content comes from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/22739800/

Date: 2020-09-18 10:03:28  Source: igfitidea

Tags: json, bash, sorting, awk

Asked by Philippe

I have a list of JavaScript tuples in a file, one per line, as such:

{ x : 12, y : -1.0, as : [ 2, 0, 0 ], str : "xxx", d : 0.041 },
{ x : 27, y : 11.4, as : [ 1, 1, 7 ], str : "yyy", d : 0.235 },
{ x : -4, y :  2.0, as : [ 7, 8, 3 ], str : "zzz", d : 0.002 },
{ x : 44, y :  5.4, as : [ 9, 4, 6 ], str : "kkk", d : 0.176 },

I would like to sort them according to the value of a given field (the d field in my example), preferably using command-line tools (this is part of a process with many steps).

If it makes any difference, we can assume that all lines have exactly the same length (I can know the start and end index of the d value), although I would prefer a solution that doesn't rely on this.

Accepted answer by Teemu Ikonen

If you can guarantee that all fields are the same size, you can use the sort command. For example, this sorts numerically by the value of x:

# With whitespace-separated fields, the x value is field 4
sort -n -k 4,4 your_file.dat

The data you show as an example is not valid JSON but JavaScript syntax. One way is to wrap the file so that it becomes a valid JavaScript program and run it with the node.js command line:

var l = [
    { x : 12, y : -1.0, as : [ 2, 0, 0 ], str : "xxx", d : 0.041 },
    { x : 27, y : 11.4, as : [ 1, 1, 7 ], str : "yyy", d : 0.235 },
    ...
]
// Sort ascending by d; a numeric difference also returns 0 for equal values
l.sort(function(o1, o2) { return o1.d - o2.d; });
console.log(l);
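
As a rough sketch of the wrapping step, the shell function below builds that program on the fly and pipes it to node. The names sort_with_node and tuples.txt are made up for illustration. It assumes node is on the PATH and that the file contains only object literals, one per line. Each sorted object is printed with JSON.stringify, so the output is valid JSON rather than the original layout.

```shell
# Sample data from the question, saved under a hypothetical file name.
cat > tuples.txt <<'EOF'
{ x : 12, y : -1.0, as : [ 2, 0, 0 ], str : "xxx", d : 0.041 },
{ x : 27, y : 11.4, as : [ 1, 1, 7 ], str : "yyy", d : 0.235 },
{ x : -4, y :  2.0, as : [ 7, 8, 3 ], str : "zzz", d : 0.002 },
{ x : 44, y :  5.4, as : [ 9, 4, 6 ], str : "kkk", d : 0.176 },
EOF

# Wrap the lines in "var l = [ ... ];", sort by d, and print one object
# per line. The generated script is read by node from standard input.
sort_with_node() {
    {
        printf 'var l = [\n'
        cat "$1"
        printf '];\n'
        printf 'l.sort(function (a, b) { return a.d - b.d; });\n'
        printf 'l.forEach(function (o) { console.log(JSON.stringify(o)); });\n'
    } | node
}

# Usage (requires node): sort_with_node tuples.txt > sorted.jsonl
```

Because the file is executed as code, this should only be used on input you trust.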

Answer by Ashley Coolman

Some time has passed since this question was asked and answered.

These days, a non-hacky way would be to use something like jq:

jq 'sort_by(.d)' data.json > data_sorted.json

For more info check the site:

jq is like sed for JSON data - you can use it to slice and filter and map and transform structured data with the same ease that sed, awk, grep and friends let you play with text.

- https://stedolan.github.io/jq/

If for some reason you don't like jq, there are many alternatives.
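
Note that jq expects valid JSON, while the sample in the question uses unquoted keys and a trailing comma on every line. One possible way to bridge that gap, sketched here under the assumption that each line is a flat object literal and that no string value contains text that looks like ", key :", is to quote the keys with sed first. The function name to_sorted_json and the file name tuples.txt are hypothetical.

```shell
# Sample data from the question, saved under a hypothetical file name.
cat > tuples.txt <<'EOF'
{ x : 12, y : -1.0, as : [ 2, 0, 0 ], str : "xxx", d : 0.041 },
{ x : 27, y : 11.4, as : [ 1, 1, 7 ], str : "yyy", d : 0.235 },
{ x : -4, y :  2.0, as : [ 7, 8, 3 ], str : "zzz", d : 0.002 },
{ x : 44, y :  5.4, as : [ 9, 4, 6 ], str : "kkk", d : 0.176 },
EOF

to_sorted_json() {
    # First expression: put double quotes around bare keys that follow
    # "{" or ",". Second expression: drop the trailing comma of each line.
    sed -E -e 's/([{,][[:space:]]*)([A-Za-z_][A-Za-z0-9_]*)[[:space:]]*:/\1"\2":/g' \
           -e 's/,[[:space:]]*$//' "$1" |
    # -s slurps all objects into one array, sort_by orders that array by d,
    # and -c prints each element compactly on its own line.
    jq -s -c 'sort_by(.d)[]'
}

# Usage (requires jq): to_sorted_json tuples.txt > data_sorted.json
```

The result is one JSON object per line, not the original JavaScript layout.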

Answer by j_random_hacker

It's a hack, but if each JSON record is one line, and you know that the value for d begins after the same number of whitespace-separated tokens on each line, then you can just use

sort -g -k 20 < in > out

which will compare lines numerically based on the 20th whitespace-separated component. For increased comfort you could specify a different delimiter with -t (perhaps :) and adjust the argument to -k as necessary, but it's still a hack :)

sort is generally carefully optimised for speed, so you're unlikely to find something faster.

Answer by kmundnic

You could also use GNU sort as follows:

$ sort -t: -k6 -n test.csv
{ x : -4, y :  2.0, as : [ 7, 8, 3 ], str : "zzz", d : 0.002 },
{ x : 12, y : -1.0, as : [ 2, 0, 0 ], str : "xxx", d : 0.041 },
{ x : 44, y :  5.4, as : [ 9, 4, 6 ], str : "kkk", d : 0.176 },
{ x : 27, y : 11.4, as : [ 1, 1, 7 ], str : "yyy", d : 0.235 },

The -k flag takes the column index, -t: uses : as the field separator, and -n sorts numerically.

Of course, this solution as is would not work if you add another field after d. In that case, you could change the value of -k to consider only specific characters, such as -k6.2,6.6, but this assumes that the number of digits after the . is exactly 3.
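
If the position and width of the value cannot be relied on, one alternative is a decorate-sort-undecorate pipeline: extract the value by field name, put it in front of the line, sort on it, then strip it again. The sketch below is not from any of the answers. The function name sort_by_field and the file name tuples.txt are hypothetical, and it assumes plain decimal numbers, no tab characters in the data, and a sort that supports the g key modifier.

```shell
# Sample data from the question, saved under a hypothetical file name.
cat > tuples.txt <<'EOF'
{ x : 12, y : -1.0, as : [ 2, 0, 0 ], str : "xxx", d : 0.041 },
{ x : 27, y : 11.4, as : [ 1, 1, 7 ], str : "yyy", d : 0.235 },
{ x : -4, y :  2.0, as : [ 7, 8, 3 ], str : "zzz", d : 0.002 },
{ x : 44, y :  5.4, as : [ 9, 4, 6 ], str : "kkk", d : 0.176 },
EOF

# sort_by_field NAME FILE: sort the lines of FILE by the number after "NAME :".
sort_by_field() {
    awk -v name="$1" '
        {
            key = ""
            # Find "name : number" where name follows "{", ",", or a blank.
            if (match($0, "(^|[{, \t])" name "[ \t]*:[ \t]*-?[0-9.]+")) {
                key = substr($0, RSTART, RLENGTH)
                sub(/^.*:[ \t]*/, "", key)   # keep only the number
            }
            print key "\t" $0                # decorate: key, tab, original line
        }' "$2" |
    LC_ALL=C sort -t "$(printf '\t')" -k 1,1g |   # sort on the key column
    cut -f 2-                                     # undecorate: drop the key
}

# Usage: sort_by_field d tuples.txt
```

The lines come out unchanged, only reordered, and the same function works for x or y without counting columns.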