unix sort for 2 fields numeric order
Original question: http://stackoverflow.com/questions/11443815/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by jdex
I need to sort some data with unix sort but I can't figure out exactly the right syntax. The data looks like:
3.9.1 Step 10:
3.9.1 Step 20:
3.8.10 Step 20:
3.10.2 Step 10:
3.8.4 Step 90:
3.8.4 Step 100:
3.8.4 Step 10:
I want to sort it first by the major number, then by the step number, e.g. the data above, once sorted, would look like:
3.8.4 Step 10:
3.8.4 Step 90:
3.8.4 Step 100:
3.8.10 Step 20:
3.9.1 Step 10:
3.9.1 Step 20:
3.10.2 Step 10:
I have found the way to sort by the first number on this site:
sort -t. -k 1,1n -k 2,2n -k 3,3n
but I am now struggling to sort by the 3rd column (the Step number) without disturbing the first sort.
Accepted answer by Jonathan Leffler
There's a fascinating article on re-engineering the Unix sort ('Theory and Practice in the Construction of a Working Sort Routine', J P Linderman, AT&T Bell Labs Tech Journal, Oct 1984) which is not, unfortunately, available on the internet, AFAICT (I looked a year or so ago and did not find it; I looked again just now, and can find references to it, but not the article itself).  Amongst other things, the article demonstrated that for Unix sort, the comparison time far outweighs the cost of moving data (not very surprising when you consider that the comparison has to compare fields determined per row, but moving 'data' is simply a question of switching pointers around).  One upshot of that was that they recommend doing what danfuzz suggests: mapping keys to make comparisons easy.  They showed that even a simple scripted solution could save time compared with making sort work really hard.
So, you could think in terms of using a character that's unlikely to appear in the data file naturally (such as Control-A) as the key field separator.
sed 's/^\([^.]*\)[.]\([^.]*\)[.]\([^ ]*\) Step \([0-9]*\):.*/\1^A\2^A\3^A\4^A&/' file |
sort -t'^A' -k1,1n -k2,2n -k3,3n -k4,4n |
sed 's/^.*^A//'
The first command is the hard one.  It identifies the 4 numeric fields and outputs them separated by the chosen character (written ^A above, typed as Control-A), and then outputs a copy of the original line.  The sort then works on the first four fields numerically, and the final sed command strips off the front of each line up to and including the last Control-A, giving you the original line back again.
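The pipeline described above can be sketched as a self-contained script. This is only an illustration under stated assumptions: the sample data is copied from the question, and the names SOH and decorated_sort are made up here, not part of the answer. The Control-A byte is produced with printf so that it does not have to be typed literally.

```shell
# Sample input from the question.
data='3.9.1 Step 10:
3.9.1 Step 20:
3.8.10 Step 20:
3.10.2 Step 10:
3.8.4 Step 90:
3.8.4 Step 100:
3.8.4 Step 10:'

# Control-A (byte 0x01), used as a separator unlikely to occur in the data.
# SOH is a hypothetical variable name chosen for this sketch.
SOH=$(printf '\001')

# decorated_sort (hypothetical name): prefix each line with its four numeric
# keys, sort on those keys, then strip the prefix again.
decorated_sort() {
    sed "s/^\([^.]*\)[.]\([^.]*\)[.]\([^ ]*\) Step \([0-9]*\):.*/\1${SOH}\2${SOH}\3${SOH}\4${SOH}&/" |
    sort -t"$SOH" -k1,1n -k2,2n -k3,3n -k4,4n |
    sed "s/^.*${SOH}//"
}

result=$(printf '%s\n' "$data" | decorated_sort)
printf '%s\n' "$result"
```

Because the keys are computed once per line before sorting, sort only has to compare four plain numeric fields.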
Answer by danfuzz
How about transforming the Step and : on the way into sort, and then transforming back afterwards? I believe this gets the results you're looking for:
cat your-file.txt \
    | sed -e 's/ Step \(.*\):$/.\1/g' \
    | sort -t. -k1,1n -k2,2n -k3,3n -k4,4n \
    | sed -e 's/\(.*\)\.\(.*\)$/\1 Step \2:/g'
(Just using cat here for expository purposes. If it's just a regular file, then it could be passed to the first sed.)
Answer by potong
This might work for you:
 sort -k3,3n file | sort -nst. -k1,1 -k2,2 -k3,3
or a very iffy:
 sort -nt. -k1,1 -k2,2 -k3,3 -k3.7 file
The first uses two sorts:
sort -k3,3n sorts by steps
sort -nst. -k1,1 -k2,2 -k3,3 sorts by major numbers but keeps the step order
The second works but only if the 3rd major number remains below 100.
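The two-pass variant relies on the -s (stable) flag of the second sort. A minimal sketch, assuming the sample data from the question; the variable names by_step and result are chosen here for illustration only:

```shell
# Sample input from the question.
data='3.9.1 Step 10:
3.9.1 Step 20:
3.8.10 Step 20:
3.10.2 Step 10:
3.8.4 Step 90:
3.8.4 Step 100:
3.8.4 Step 10:'

# Pass 1: order by the step number (whitespace-separated field 3).
by_step=$(printf '%s\n' "$data" | sort -k3,3n)

# Pass 2: order by the dotted version numbers. -s makes the sort stable,
# so lines with equal version keys keep their pass-1 (step) order.
result=$(printf '%s\n' "$by_step" | sort -s -n -t. -k1,1 -k2,2 -k3,3)

printf '%s\n' "$result"
```

Without -s, sort falls back to comparing whole lines when all keys tie, so the step order from the first pass would not be guaranteed to survive.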
or perhaps:
sed 's/ /./2' file | sort -nt. -k1,1 -k2,2 -k3,3 -k4,4 | sed 's/\./ /3'
Answer by Levon
UPDATED:
This will generate the output you specified:
sed 's/Step /Step./' data|sort -t. -n -k1,1 -k2,2 -k3,3 -k4|sed 's/Step./Step /'
result:
3.8.4 Step 10:
3.8.4 Step 90:
3.8.4 Step 100:
3.8.10 Step 20:
3.9.1 Step 10:
3.9.1 Step 20:
3.10.2 Step 10:
The challenge with this sort is that the sorting fields are defined by both '.' (for the version numbers) and the default whitespace (for the Step numbers). You can't specify several/different field separators for the same sort command. Combining several sorts with different field separators did not yield the right output.
This solution works by temporarily replacing the blank space after the Step field with a '.' so that all sorting fields can be separated with the same character ('.'). After the sort is done, the '.' is replaced with a blank again.
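To see what sort actually receives, the substitution can be tried on a single line. This is just an illustrative sketch; the variable names line, joined, step_field and restored are invented here:

```shell
line='3.8.4 Step 100:'

# Temporarily turn the blank after "Step" into a '.'.
joined=$(printf '%s\n' "$line" | sed 's/Step /Step./')
printf '%s\n' "$joined"          # 3.8.4 Step.100:

# With '.' as the only separator, the step number is now field 4.
step_field=$(printf '%s\n' "$joined" | cut -d. -f4)
printf '%s\n' "$step_field"      # 100:

# After sorting, the substitution is undone.
restored=$(printf '%s\n' "$joined" | sed 's/Step\./Step /')
printf '%s\n' "$restored"        # 3.8.4 Step 100:
```

The third field becomes "4 Step", which a numeric sort still reads as 4 because it only looks at the leading number.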

