How can I do a grep count by using a timestamp?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/13955466/

Date: 2020-09-18 04:03:12  Source: igfitidea

Tags: bash, shell, unix, terminal, grep

Asked by user1916191

How can I do a grep count by using a timestamp?

Example: I have a file in which I search for the value xyz every time. The file gets updated regularly.

20121912-07:15:55 abc cbfr xyz
20121912-07:16:40 mni cbfr xyz
-----------
-----------
-----------


20121912-08:15:55 gty cbfr xyz
20121912-08:20:55 jui uio xyz

I want to find the number of occurrences of xyz after 20121912-08:15:55, which in this case should be 2.

Doing a grep -c "xyz" filename reads the entire file and gives the result. I want to count only the lines added after the last update, or the lines from a given timestamp onward.

Answered by Kent

Try this one-liner:

awk '$NF=="xyz" && $1>="20121912-08:15:55" {x++} END {print x+0}' file
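One caveat with comparing the raw timestamps as strings: the file uses a year-day-month layout (20121912 is 19 December 2012), so string order matches chronological order only as long as the day-before-month ordering does not matter, for example when all lines fall on the same day. A minimal sketch, assuming every line starts with a stamp in that exact layout and using a hypothetical helper name count_since, reorders each stamp to YYYYMMDD before comparing. It relies only on POSIX awk features:

```shell
# count_since CUTOFF PATTERN FILE
# Hypothetical helper (not from the answers above): counts lines whose last
# field equals PATTERN and whose timestamp is at or after CUTOFF.
count_since() {
    awk -v cutoff="$1" -v pat="$2" '
        # rebuild YYYYDDMM-HH:MM:SS as YYYYMMDD-HH:MM:SS so that
        # plain string comparison follows chronological order
        function key(ts) {
            return substr(ts, 1, 4) substr(ts, 7, 2) substr(ts, 5, 2) substr(ts, 9)
        }
        BEGIN { c = key(cutoff) }
        $NF == pat && key($1) >= c { n++ }
        END { print n + 0 }   # print 0 instead of an empty line when nothing matches
    ' "$3"
}

# usage: count_since '20121912-08:15:55' xyz file
```

Unlike a range-based approach, this does not need the cutoff timestamp to appear literally in the file, and it does not require the file to be sorted.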

Answered by Steve

I'm assuming you want to find the occurrences of the pattern 'xyz' where the date/time value is greater than or equal to a specified date/time, '20121912-08:15:55'. Here's what I'd do using GNU awk. Run like:

awk -v pattern="xyz" -v time="20121912-08:15:55" -f script.awk file

Contents of script.awk:

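BEGIN {
    # convert the cutoff once, before any input is read
    stamp = convert(time)
}

# last field matches the pattern and the timestamp is at or after the cutoff
$NF ~ pattern && convert($1) >= stamp {
    i++
}

END {
    print i
}

# YYYYDDMM-HH:MM:SS -> epoch seconds (gensub and mktime are GNU awk extensions)
function convert(var) {
    x = "(....)(..)(..)-(..):(..):(..)"
    y = "\\1 \\3 \\2 \\4 \\5 \\6"
    return mktime(gensub(x, y, 1, var))
}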

Results:

2

Alternatively, here's the one-liner:

awk -v pattern="xyz" -v time="20121912-08:15:55" 'BEGIN { stamp = convert(time) } $NF ~ pattern && convert($1) >= stamp { i++ } END { print i } function convert(var) { return mktime(gensub(/(....)(..)(..)-(..):(..):(..)/,"\\1 \\3 \\2 \\4 \\5 \\6",1,var)) }' file

Answered by glenn jackman

Taking inspiration from Kent's answer, here's some Perl that manipulates the odd timestamp into YYYYMMDD format:

ts="20121912-08:15:55" patt="xyz" perl -lane '
    BEGIN {
        ($wanted_ts = $ENV{ts}) =~ s/^(....)(..)(..)/$1$3$2/;
        $pattern = qr{$ENV{patt}};
    }
    ($this_ts = $F[0]) =~ s/^(....)(..)(..)/$1$3$2/;
    $count++ if $this_ts ge $wanted_ts and /$pattern/;
    END {print $count}
' file

Answered by holygeek

You can tell sed to print lines from a file given a range (a start and a stop point). Each end of the range can be a regex or a line number.

For your need, this should do it:

$ sed -n '/20121912-08:15:55/,$p' input.txt | grep -c xyz

Here the start point is given by the date, treated as a regular expression, and the end point is the last-line symbol $. The p command tells sed to print the lines within the given range. The -n option tells sed not to print each line it processes by default, so only the lines selected by p are output.

Answered by Chris Seymour

This is kind of a hack, but just grep for the earliest date you want, print all lines after it using -A, and then pipe to grep -c xyz:

$ fgrep -A 100 '20121912-08:15:55' file | fgrep -c 'xyz'
2

Note: fgrep is just fixed-string grep, since you're not using regex patterns. It's the same as grep -F.

A less hacky way would be to use sed to print all lines from the date onward. This way you don't need to make sure the value given to -A covers the length of the file:

$ sed -n '/20121912-08:15:55/,$p' file | fgrep -c 'xyz'
2

This assumes, of course, that your file is sorted by timestamp. If it's not, then:

$ sort file | sed -n '/20121912-08:15:55/,$p' | fgrep -c 'xyz'
2

Answered by psycho

Hmmm, a quickly written one:

grep xyz filename | sed -r 's/^([^ ]+).*/ 20121912-08:15:55 <= \1/' | sed -r 's/([0-9]{4})([0-9]{2})([0-9]{2})/\1\3\2/g' | sed 's/[-:]//g' | bc | grep 1 | wc -l

It's pretty ugly (I'm neither a sed nor a command-line master) and could probably be shortened, but it's a way to do it. Explanation below:

  grep xyz filename                                  //gets all interesting lines
| sed -r 's/^([^ ]+).*/ 20121912-08:15:55 <= \1/'    //transforms them into
                                                     //a comparison with the
                                                     //date you want
| sed -r 's/([0-9]{4})([0-9]{2})([0-9]{2})/\1\3\2/g' //inverts day and month
| sed 's/[-:]//g'                                    //removes separators
| bc                                                 //asks bc for the result
| grep 1                                             //keeps true results only
| wc -l                                              //and finally counts them

For the last line of your example, the steps would give:

20121912-08:20:55 jui uio xyz                  //grep 'xyz'
20121912-08:15:55 <= 20121912-08:20:55         //sed
20121219-08:15:55 <= 20121219-08:20:55
20121219081555 <= 20121219082055
1                                              //result from bc

HTH
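For anyone who wants to check the intermediate steps of the sed pipeline in the last answer, here is a minimal sketch that stops just before bc. The helper name to_comparison is hypothetical, the cutoff is passed as a parameter instead of being hard-coded, and sed -E is used in place of -r on the assumption that your sed accepts it (both GNU and BSD sed do):

```shell
# to_comparison CUTOFF
# Hypothetical helper: reads log lines on stdin and prints, for each one,
# the numeric comparison "CUTOFF <= LINE_TIMESTAMP" that bc would evaluate.
to_comparison() {
    # 1) keep only the first field and prefix it with "CUTOFF <= "
    sed -E 's/^([^ ]+).*/'"$1"' <= \1/' |
    # 2) swap day and month: YYYYDDMM becomes YYYYMMDD on both sides
    sed -E 's/([0-9]{4})([0-9]{2})([0-9]{2})/\1\3\2/g' |
    # 3) drop the separators so each side is a plain integer
    sed 's/[-:]//g'
}

printf '%s\n' '20121912-08:20:55 jui uio xyz' | to_comparison '20121912-08:15:55'
```

Piping that output into bc should yield 1 for lines at or after the cutoff and 0 otherwise, which is what the final grep 1 | wc -l stage counts.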