unix sorting, with primary and secondary keys
Original question: http://stackoverflow.com/questions/3193720/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must attribute it to the original authors on StackOverflow.
Asked by zseder
I would like to sort a file on multiple fields. A sample tab-separated file:
a 1 1.0
b 2 0.1
c 3 0.3
a 4 0.001
c 5 0.5
a 6 0.01
b 7 0.01
a 8 0.35
b 9 2.3
c 10 0.1
c 11 1.0
b 12 3.1
a 13 2.1
I would like to sort it alphabetically by field 1 (with -d) and, when field 1 is the same, by field 3 (with the -g option).
I didn't succeed in doing this. My attempts were (with a real TAB character instead of <TAB>):
cat tst | sort -t"<TAB>" -k1 -k3n
cat tst | sort -t"<TAB>" -k1d -k3n
cat tst | sort -t"<TAB>" -k3n -k1d
None of these work. I'm not sure whether sort can even do this. I'll write a script as a workaround, so I'm just curious whether there is a solution using only sort.
Accepted answer by Janick Bernet
The manual shows some examples.
In accordance with zseder's comment, this works:
sort -t"<TAB>" -k1,1d -k3,3g
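The difference from the attempts in the question is the end position of each key. A hedged sketch of why that matters: -k1 without an end field runs from field 1 to the end of the line, so two lines never tie on it and the second key is never consulted, whereas -k1,1 stops at field 1. The three-line sample and the LC_ALL=C setting below are assumptions chosen only to keep the demo small and reproducible.

```shell
# Sketch only: LC_ALL=C and the three-line sample are assumptions for a reproducible demo.
TAB=$(printf '\t')          # a literal tab character
sample=$(printf 'a\t4\t0.001\na\t1\t1.0\na\t6\t0.01\n')

# -k1 spans field 1 through end of line, so every key is unique
# and the -k3n key is never consulted.
open_ended=$(printf '%s\n' "$sample" | LC_ALL=C sort -t"$TAB" -k1 -k3n)

# -k1,1 stops at field 1; ties there fall through to -k3,3g.
bounded=$(printf '%s\n' "$sample" | LC_ALL=C sort -t"$TAB" -k1,1d -k3,3g)

printf 'open-ended key:\n%s\n\nbounded keys:\n%s\n' "$open_ended" "$bounded"
```

With the open-ended key the rows come out ordered by the text after the first tab (field 2 reads 1, 4, 6); with bounded keys they are ordered by field 3 (0.001, 0.01, 1.0).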
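In bash, the tab can also be written as sort -t$'\t'. A plain sort -t"\t" does not work, because the shell passes a literal backslash followed by "t", and GNU sort rejects multi-character delimiters.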
If none of the above manages to delimit by tab, this is an ugly workaround:
TAB=`echo -e "\t"`
sort -t"$TAB"
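For reference, here is a minimal end-to-end sketch that combines the tab variable with the bounded keys and runs them on the sample data from the question. It builds the tab with printf rather than echo -e, since echo -e is not available in every shell. The variable names sample and sorted and the LC_ALL=C setting are assumptions made for the demo, not part of the original answer.

```shell
# Sketch only: variable names and LC_ALL=C are assumptions for a reproducible demo.
TAB=$(printf '\t')   # printf handles \t portably; echo -e is not available in every shell

# Recreate the question's sample as tab-separated lines
# (printf reuses the format for every group of three arguments).
sample=$(printf '%s\t%s\t%s\n' \
  a 1 1.0  b 2 0.1  c 3 0.3  a 4 0.001  c 5 0.5  a 6 0.01  b 7 0.01 \
  a 8 0.35  b 9 2.3  c 10 0.1  c 11 1.0  b 12 3.1  a 13 2.1)

# Primary key: field 1 in dictionary order; secondary key: field 3, general numeric.
sorted=$(printf '%s\n' "$sample" | LC_ALL=C sort -t"$TAB" -k1,1d -k3,3g)
printf '%s\n' "$sorted"
```

The a rows are printed first, ordered by their third field (0.001, 0.01, 0.35, 1.0, 2.1), followed by the b rows and then the c rows, each group ordered the same way.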
Answer by Philipp
Here is a Python script that you might use as a starting point:
#!/usr/bin/env python3
import string
import sys


def main():
    fname = sys.argv[1]
    data = []
    with open(fname, "rt") as stream:
        for line in stream:
            line = line.strip()
            a, b, c = line.split()
            data.append((a, int(b), float(c)))
    data.sort(key=my_key)
    print(data)


def my_key(item):
    a, b, c = item
    # primary key: field 1 in dictionary order; secondary key: field 3 as a number
    return lexicographical_key(a), c


def lexicographical_key(a):
    # poor man's attempt, should use Unicode classification etc.
    return a.translate(str.maketrans("", "", string.punctuation))


if __name__ == "__main__":
    main()

