How to search for lines in a file between two timestamps using Bash

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/23697958/

Date: 2020-09-10 00:51:13  Source: igfitidea

How to search for lines in a file between two timestamps using Bash

bash

Asked by herder

In bash I am trying to read a log file and print only the lines that have a timestamp between two specific times. The time format is hh:mm:ss. For example, I would be searching for lines that fall between 12:52:33 and 12:59:33.

I want to use a regular expression because I can use it in the grep function.

Each log line begins with some_nr 2014-05-15 21:58:00,000000 rest_of_line.

My solution gives me lines with a 1 min margin. I cut out ss and take all lines with hh:mm:[0-9]{2}. $2 has the format filename_hh:mm:; for example: "24249_16:05:;24249_16:05:;24249_16:07:;24249_16:07:;24249_16:08:"

My code:

B=$2

for line in ${B//;/ } ;
do
    TENT=`echo $line | awk '{split($0,numbers,"_"); print numbers[1]}'`"_logs.txt"
    TIME=`echo $line | awk '{split($0,numbers,"_"); print numbers[2]}'`"[0-9]{2}"
    grep -iE ${TIME} ${TENT} >> ${FILE1}
done

I need a solution with a 15 sec margin for any time, not 60. I want to take input in the format filename_hh:mm:ss and get lines for hh:mm:ss +/- 15s, or in the format filename_hh:mm:ss(1)_hh:mm:ss(2) and get lines between hh:mm:ss(1) and hh:mm:ss(2). For some times there are no lines, so the solution should 'recognize' whether any lines match the given interval or not.

Log files look like this:

1002143 1002143 2014/15/05 22:09:52.937004 bla 
1002130         2014/15/05 22:09:44.786002 bla bla
1001667         2014/15/05 22:09:44.592009 bl a bla
1001667 1001667 2014/15/05 22:09:44.592009 bl a bla

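Answered by Tiago Lopo

I believe sed is the best option:

sed -rne '/<timestamp>/,/<timestamp>/ p' <file>

ex:

tiago@dell:~$ sed -rne '/08:17:38/,/08:24:36/ p' /var/log/syslog 
May 16 08:17:38 dell AptDaemon.Worker: INFO: Processing transaction /org/debian/apt/transaction/08a244f7b8ce4fad9f6b304aca9eae7a
May 16 08:17:50 dell AptDaemon.Worker: INFO: Finished transaction /org/debian/apt/transaction/08a244f7b8ce4fad9f6b304aca9eae7a
May 16 08:18:50 dell AptDaemon.PackageKit: INFO: Initializing PackageKit transaction
May 16 08:18:50 dell AptDaemon.Worker: INFO: Simulating trans: /org/debian/apt/transaction/37c3ef54a6ba4933a561c49b3fac5f6e
May 16 08:18:50 dell AptDaemon.Worker: INFO: Processing transaction /org/debian/apt/transaction/37c3ef54a6ba4933a561c49b3fac5f6e
May 16 08:18:51 dell AptDaemon.PackageKit: INFO: Get updates()
May 16 08:18:52 dell AptDaemon.Worker: INFO: Finished transaction /org/debian/apt/transaction/37c3ef54a6ba4933a561c49b3fac5f6e
May 16 08:24:36 dell AptDaemon: INFO: Quitting due to inactivity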
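
One thing worth checking before relying on a sed address range: the range only opens on a line that actually matches the start pattern, and it only closes on a line that matches the end pattern. The sketch below illustrates this on a made-up three-line log; sed_range is a hypothetical wrapper name, and plain -n is used instead of -rne because the patterns here are literal timestamps.

```shell
#!/usr/bin/env bash
# sed_range is a hypothetical wrapper around the range command suggested above.
sed_range() {
    # $1 = start pattern, $2 = end pattern, $3 = file
    sed -n "/$1/,/$2/p" "$3"
}

# Made-up sample data, only for illustration.
log=$(mktemp)
printf '%s\n' \
    'May 16 08:17:38 dell first' \
    'May 16 08:18:50 dell second' \
    'May 16 08:24:36 dell third' > "$log"

sed_range 08:17:38 08:24:36 "$log"   # both stamps exist: prints all three lines
sed_range 08:17:40 08:24:36 "$log"   # start stamp absent: prints nothing
sed_range 08:18:50 08:30:00 "$log"   # end stamp absent: prints through end of file
rm -f "$log"
```

So the range form fits best when both boundary timestamps are known to occur in the file; for arbitrary boundaries a value comparison is probably safer.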

Answered by Kent

Log files are usually sorted by timestamp. Assuming the timestamp is in the first column, you could:

awk -v from="12:52:33" -v to="12:59:33" '$1>=from && $1<=to' foo.log

This way, you can change from and to to get a different set of log entries. Regex is not a good tool for number calculation or comparison.
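
In the question's log the time is not in the first column, and its position shifts when the second number is missing. A minimal sketch of the same from/to comparison, assuming the first field that looks like hh:mm:ss is the timestamp, might look like this. filter_by_time is a hypothetical helper name, not part of any answer.

```shell
#!/usr/bin/env bash
# filter_by_time is a hypothetical helper: prints the lines of a file whose
# first hh:mm:ss-looking field lies in the closed interval [from, to].
filter_by_time() {
    local from="$1" to="$2" file="$3"
    awk -v from="$from" -v to="$to" '
    {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]/) {
                # Zero-padded hh:mm:ss sorts correctly as a plain string,
                # so fractional seconds are simply cut off before comparing.
                t = substr($i, 1, 8)
                if (t >= from && t <= to) print
                break
            }
        }
    }' "$file"
}
```

Called as filter_by_time 22:09:40 22:09:50 A_logs.txt (the file name is only an example), it prints nothing when no line falls inside the interval, which the caller can test for. It does not handle an interval that crosses midnight.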

Answered by anubhava

You can use this regex in egrep:

egrep '12:5[2-9]:33' file.log

Answered by Stefan Schmiedl

You are using the wrong tool for this task. Once you have a regular expression like the one given by @anubhava, you can easily find a time interval that is not matched by it. grep and regexps might work for a few special cases, but they do not scale to the general case.

Can you use some tool that can actually "understand" the timestamps? Probably every scripting language out there (perl, python, ruby, lua) has builtin or library support for parsing time and date.

However, you might be able to employ the powers of GNU date:

% date --date="2014-05-15 21:58:00 15 sec ago" +'%Y-%m-%d %H:%M:%S'
2014-05-15 21:57:45
% date --date="2014-05-15 21:58:00 15 sec" +'%Y-%m-%d %H:%M:%S' 
2014-05-15 21:58:15

and plug that into Tiago's sed filter idea.
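
As a rough sketch of how GNU date could feed a filter for the +/- 15 s case: the code below computes the window bounds with date and then compares the time field as a string in awk. It swaps in awk for the sed range, since a sed range needs both boundary timestamps to be present in the file. Both function names are hypothetical, and the code assumes the "some_nr date time rest_of_line" layout from the question (time in the third field) and a window that does not cross midnight.

```shell
#!/usr/bin/env bash
# Both function names are hypothetical; GNU date is required for --date.

# Print "from to" (hh:mm:ss): the given time minus and plus a margin in seconds.
window_bounds() {
    local day="$1" time="$2" margin="${3:-15}" from to
    from=$(date --date="$day $time $margin sec ago" +%H:%M:%S) || return 1
    to=$(date --date="$day $time $margin sec" +%H:%M:%S) || return 1
    printf '%s %s\n' "$from" "$to"
}

# Print the lines of a file whose third field (the time) is inside the window.
lines_in_window() {
    local file="$1" day="$2" time="$3" margin="${4:-15}" bounds from to
    bounds=$(window_bounds "$day" "$time" "$margin") || return 1
    from=${bounds% *}   # part before the space
    to=${bounds#* }     # part after the space
    awk -v from="$from" -v to="$to" '
        { t = substr($3, 1, 8) }   # drop fractional seconds
        t >= from && t <= to
    ' "$file"
}
```

For example, lines_in_window 24249_logs.txt 2014-05-15 16:05:30 would print the lines stamped between 16:05:15 and 16:05:45; the file name and time here are illustrative only.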

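Answered by Håkon Hægland

You can try the following perl script:

#! /usr/bin/perl

use warnings;
use strict;
use Time::Piece;
use autodie;

# Argument format: filename_hh:mm:ss or filename_hh:mm:ss_hh:mm:ss
my $arg=shift;
my @a =split("_",$arg);
my $fn=shift @a;

# The log dates are written as year/day/month
my $dfmt='%Y/%d/%m';
my $fmt=$dfmt.' %H:%M:%S';
my $t = localtime;
my $date=$t->strftime($dfmt);
my $t1; my $t2;
if (@a == 1) {
   # One time given: use a window of +/- 15 seconds around it
   my $d=$date.' '.$a[0];
   my $tt=Time::Piece->strptime($d, $fmt);
   $t1=$tt-15;
   $t2=$tt+15;
} elsif (@a == 2) {
   # Two times given: use them as the interval bounds
   $t1=Time::Piece->strptime($date.' '.$a[0], $fmt);
   $t2=Time::Piece->strptime($date.' '.$a[1], $fmt);
} else {
   die "Unexpected input argument!";
}

$fn=$fn.'_logs.txt';
doGrep($fn,$t1,$t2,$fmt);

# Print every line whose timestamp lies strictly between $t1 and $t2
sub doGrep { 
   my ($fn,$t1,$t2,$fmt) = @_;

   open (my $fh, "<", $fn);
   while (my $line=<$fh>) {
      my ($d1,$d2) = $line=~/\S+\s+(\S+)\s+(\d\d:\d\d:\d\d)/;
      my $d=$d1.' '.$d2;
      my $t=Time::Piece->strptime($d, $fmt);
      print $line if ($t>$t1 && $t<$t2);
   }
   close ($fh);
}

Run it from the command line using the syntax: ./p.pl A_22:09:14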