bash: Finding gaps in sequential numbers
Notice: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/15867557/
Finding gaps in sequential numbers
Asked by Shaun
I don't do this stuff for a living, so forgive me if it's a simple question (or more complicated than I think). I've been digging through the archives and found a lot of tips that are close, but being a novice I'm either not sure how to tweak them for my needs or they are way beyond my understanding.
I have some large data files that I can parse to generate a list of coordinates that are mostly sequential:
5
6
7
8
15
16
17
25
26
27
What I want is a list of the gaps:
1-4
9-14
18-24
I don't know perl, SQL or anything fancy, but thought I might be able to do something that would subtract one number from the next. I could then at least grep the output where the difference was not 1 or -1 and work with that to get the gaps.
Answered by Gilles Quenot
With awk:
awk '$1!=p+1{print p+1"-"$1-1}{p=$1}' file.txt
Explanation:
$1 is the first column of the current input line
p is the value from the previous line
$1!=p+1 is a condition: if $1 differs from the previous value + 1, the following block is executed
{print p+1 "-" $1-1} prints the previous value + 1, the - character, and the first column - 1
{p=$1} is executed for every line: p is assigned the current first column
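As a rough sketch, the same logic can be spelled out with comments and run against the sample data from the question. The wrapper name gaps_awk is made up for this illustration and is not part of the original answer.

```shell
#!/bin/bash
# gaps_awk is a hypothetical wrapper name, not part of the original answer.
gaps_awk() {
    awk '
        # p holds the previous value; it starts out empty, which awk treats as 0.
        $1 != p + 1 { print p + 1 "-" $1 - 1 }   # gap: previous+1 .. current-1
        { p = $1 }                                # remember the current value
    ' "$@"
}

# Sample data from the question, fed on standard input.
printf '%s\n' 5 6 7 8 15 16 17 25 26 27 | gaps_awk
```

With the sample data this should print 1-4, 9-14 and 18-24, one per line. Because p starts at 0, a list that does not begin at 1 reports a leading gap as well.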
Answered by Todd A. Jacobs
A Ruby Answer
Perhaps someone else can give you the Bash or Awk solution you asked for. However, I think any shell-based answer is likely to be extremely localized for your data set, and not very extendable. Solving the problem in Ruby is fairly simple, and provides you with flexible formatting and more options for manipulating the data set in other ways down the road. YMMV.
#!/usr/bin/env ruby
# You could read from a file if you prefer,
# but this is your provided corpus.
nums = [5, 6, 7, 8, 15, 16, 17, 25, 26, 27]
# Find gaps between zero and first digit.
nums.unshift 0
# Create array of arrays containing missing digits.
missing_nums = nums.each_cons(2).map do |array|
  (array.first.succ...array.last).to_a unless
    array.first.succ == array.last
end.compact
# => [[1, 2, 3, 4], [9, 10, 11, 12, 13, 14], [18, 19, 20, 21, 22, 23, 24]]
# Format the results any way you want.
puts missing_nums.map { |ary| "#{ary.first}-#{ary.last}" }
Given your current corpus, this yields the following on standard output:
1-4
9-14
18-24
Answered by choroba
Just remember the previous number and verify that the current one is the previous plus one:
#!/bin/bash
previous=0
while read -r n ; do
    if (( n != previous + 1 )) ; then
        echo "$(( previous + 1 ))-$(( n - 1 ))"
    fi
    previous=$n
done
You might need to add some checking to prevent lines like 28-28 for single-number gaps.
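One possible way to add that check is sketched below, assuming the input is one ascending integer per line on standard input. The function name print_gaps and the variables first and last are invented for this example.

```shell
#!/bin/bash
# print_gaps is a hypothetical helper name used only for this sketch.
# It reads one ascending integer per line from standard input.
print_gaps() {
    previous=0
    while read -r n; do
        first=$(( previous + 1 ))   # first missing number, if any
        last=$(( n - 1 ))           # last missing number, if any
        if [ "$first" -eq "$last" ]; then
            echo "$first"           # single missing number, e.g. 28
        elif [ "$first" -lt "$last" ]; then
            echo "$first-$last"     # range of missing numbers
        fi
        previous=$n
    done
}

# The question's sample data plus 29, so that 28 is a single-number gap.
printf '%s\n' 5 6 7 8 15 16 17 25 26 27 29 | print_gaps
```

With this input the sketch should print 1-4, 9-14, 18-24 and then 28 on its own. Consecutive or repeated numbers give first greater than last, so nothing is printed for them.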
Answered by Kent
Interesting question.
sputnick's awk one-liner is nice. I cannot write a simpler one than his. I'll just add another way, using diff:
seq $(tail -1 file)|diff - file|grep -Po '.*(?=d)'
The output with your example would be:
1,4
9,14
18,24
I know there is a comma in the output instead of -. You could replace the grep with sed to get -, since grep cannot change the input text, but the idea is the same.
Hope it helps.
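The sed substitution mentioned above might look roughly like the following. This is a sketch rather than the answerer's own command; it assumes diff's default "normal" output format (lines such as 1,4d0) and uses a made-up function name, gaps_diff.

```shell
#!/bin/bash
# gaps_diff is a hypothetical name; it expects a file of ascending
# integers, one per line, as its only argument.
gaps_diff() {
    seq "$(tail -n 1 "$1")" | diff - "$1" |
        # Keep only the "N,Md..." delete commands, drop everything from
        # the d onward, and turn the comma into a hyphen.
        sed -n '/^[0-9,]*d/{s/d.*//;s/,/-/;p;}'
}

sample=$(mktemp)
printf '%s\n' 5 6 7 8 15 16 17 25 26 27 > "$sample"
gaps_diff "$sample"
rm -f "$sample"
```

For the sample data this should print 1-4, 9-14 and 18-24. A single missing number appears in the diff output without a comma (for example 28d27), so it comes out as just 28.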
Answered by Chris Koknat
A Perl solution similar to the awk solution from StardustOne:
perl -ane 'if ($F[0] != $p+1) {printf "%d-%d\n",$p+1,$F[0]-1}; $p=$F[0]' file.txt
These command-line options are used:
-n  loop around every line of the input file; do not automatically print every line
-a  autosplit mode: split input lines into the @F array. Defaults to splitting on whitespace. Fields are indexed starting with 0.
-e  execute the perl code
Answered by agc
Given an input file, use the numinterval util and paste its output beside the file, then munge it with tr, xargs, sed and printf:
gaps() { paste <(echo; numinterval "$1" | tr 1 '-' | tr -d '[02-9]') "$1" |
         tr -d '[:blank:]' | xargs echo |
         sed 's/ -/-/g;s/-[^ ]*-/-/g' | xargs printf "%s\n" ; }
Output of gaps file:
5-8
15-17
25-27
How it works: the output of paste <(echo; numinterval file) file looks like:
5
1 6
1 7
1 8
7 15
1 16
1 17
8 25
1 26
1 27
From there we mainly replace things in column #1, and tweak the spacing. The 1s are replaced with -s, and the higher numbers are blanked. Remove some blanks with tr. Replace runs of hyphens like "5-6-7-8" with a single hyphen "5-8", and that's the output.
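If numinterval is not installed, the two-column view above can be approximated with awk. This is only a sketch: it assumes numinterval prints the difference between each number and the one before it, as the pasted output suggests, and the function name intervals is invented here.

```shell
#!/bin/bash
# intervals is a hypothetical stand-in for numinterval: it prints the
# difference between each number and the one before it.
intervals() {
    awk 'NR > 1 { print $1 - p } { p = $1 }' "$@"
}

sample=$(mktemp)
printf '%s\n' 5 6 7 8 15 16 17 25 26 27 > "$sample"
# Same layout as the paste output described above: interval, tab, value.
# The leading echo shifts the intervals down one row so each lines up
# with the value it leads to.
{ echo; intervals "$sample"; } | paste - "$sample"
rm -f "$sample"
```

An interval of 1 marks a value that continues a run, and anything larger marks the start of a new run after a gap.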

