How to search for lines in a file between two timestamps using Bash
Original source: http://stackoverflow.com/questions/23697958/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by herder
In bash I am trying to read a log file and print only the lines whose timestamp falls between two specific times. The time format is hh:mm:ss. For example, I would be searching for lines that fall between 12:52:33 and 12:59:33.
I want to use a regular expression because I can use it with grep.
Each log line begins with some_nr 2014-05-15 21:58:00,000000 rest_of_line.
My solution gives me lines with a 1 minute margin. I cut out ss and take all lines matching hh:mm:[0-9]{2}. $2 has the format filename_hh:mm:; for example: "24249_16:05:;24249_16:05:;24249_16:07:;24249_16:07:;24249_16:08:"
My code:
B=$2
for line in ${B//;/ } ;
do
TENT=`echo $line | awk '{split($0,numbers,"_"); print numbers[1]}'`"_logs.txt"
TIME=`echo $line | awk '{split($0,numbers,"_"); print numbers[2]}'`"[0-9]{2}"
grep -iE ${TIME} ${TENT} >> ${FILE1}
done
I need a solution with a 15 second margin around any given time, not 60. I want to pass input in the format filename_hh:mm:ss and get the lines for hh:mm:ss +/- 15 s, or in the format filename_hh:mm:ss(1)_hh:mm:ss(2) and get the lines between hh:mm:ss(1) and hh:mm:ss(2). For some times there are no lines, so the solution should 'recognize' whether any lines match the given interval or not.
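The two input formats described in the question could be turned into a time window along these lines. This is a minimal sketch, not the asker's code: the helper names to_seconds, to_hms and parse_spec are made up for illustration, the _logs.txt suffix is taken from the question's script, the file name is assumed to contain no underscore, and a window that would cross midnight is simply clamped to the same day.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; none of these names come from the original post.

# Convert hh:mm:ss to seconds since midnight.
to_seconds() {
    local h m s
    IFS=: read -r h m s <<< "$1"
    # 10# forces base 10 so that "08" and "09" are not read as octal.
    echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# Convert seconds since midnight back to hh:mm:ss.
to_hms() {
    printf '%02d:%02d:%02d\n' $(( $1 / 3600 )) $(( $1 % 3600 / 60 )) $(( $1 % 60 ))
}

# Print "logfile from to" for a spec of the form
# name_hh:mm:ss (15 s either side) or name_hh:mm:ss_hh:mm:ss.
parse_spec() {
    local name t1 t2 from to
    IFS=_ read -r name t1 t2 <<< "$1"
    if [ -z "$t2" ]; then
        from=$(( $(to_seconds "$t1") - 15 ))
        to=$(( $(to_seconds "$t1") + 15 ))
        # Clamp to the same day; crossing midnight is not handled here.
        if (( from < 0 )); then from=0; fi
        if (( to > 86399 )); then to=86399; fi
    else
        from=$(to_seconds "$t1")
        to=$(to_seconds "$t2")
    fi
    echo "${name}_logs.txt $(to_hms "$from") $(to_hms "$to")"
}

parse_spec "24249_16:05:10"
parse_spec "24249_16:05:10_16:07:00"
```

Because the bounds come back as zero-padded hh:mm:ss strings, they can be compared directly against the time column of each log line.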
Log files look like this:
1002143 1002143 2014/15/05 22:09:52.937004 bla
1002130 2014/15/05 22:09:44.786002 bla bla
1001667 2014/15/05 22:09:44.592009 bl a bla
1001667 1001667 2014/15/05 22:09:44.592009 bl a bla
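Answered by Tiago Lopo
I believe sed is the best option:
sed -rne '/<timestamp>/,/<timestamp>/ p' <file>
For example:
tiago@dell:~$ sed -rne '/08:17:38/,/08:24:36/ p' /var/log/syslog
May 16 08:17:38 dell AptDaemon.Worker: INFO: Processing transaction /org/debian/apt/transaction/08a244f7b8ce4fad9f6b304aca9eae7a
May 16 08:17:50 dell AptDaemon.Worker: INFO: Finished transaction /org/debian/apt/transaction/08a244f7b8ce4fad9f6b304aca9eae7a
May 16 08:18:50 dell AptDaemon.PackageKit: INFO: Initializing PackageKit transaction
May 16 08:18:50 dell AptDaemon.Worker: INFO: Simulating trans: /org/debian/apt/transaction/37c3ef54a6ba4933a561c49b3fac5f6e
May 16 08:18:50 dell AptDaemon.Worker: INFO: Processing transaction /org/debian/apt/transaction/37c3ef54a6ba4933a561c49b3fac5f6e
May 16 08:18:51 dell AptDaemon.PackageKit: INFO: Get updates()
May 16 08:18:52 dell AptDaemon.Worker: INFO: Finished transaction /org/debian/apt/transaction/37c3ef54a6ba4933a561c49b3fac5f6e
May 16 08:24:36 dell AptDaemon: INFO: Quitting due to inactivity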
Answered by Kent
Log files are usually sorted by timestamp. Assuming the timestamp is in the first column, you could use:
awk -v from="12:52:33" -v to="12:59:33" '$1>=from && $1<=to' foo.log
In this way, you can change from and to to get a different set of log entries. Regex is not a good tool for numeric calculation or comparison.
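Kent's string comparison can be adapted to the log format given in the question, where the time sits in the third field and carries a fractional part. The following is only a sketch under those assumptions: filter_window is a made-up helper name, the sample lines are invented for the demonstration, and the non-zero exit status is one possible way to 'recognize' an interval with no matching lines.

```shell
#!/usr/bin/env bash
# filter_window is a hypothetical helper name.
# Usage: filter_window FROM TO FILE   (FROM and TO as hh:mm:ss)
filter_window() {
    awk -v from="$1" -v to="$2" '
        { t = substr($3, 1, 8) }                  # keep hh:mm:ss, drop the fraction
        t >= from && t <= to { print; found = 1 } # string comparison, bounds inclusive
        END { exit !found }                       # status 1 when no line matched
    ' "$3"
}

# Invented sample data in the "some_nr date time,fraction rest" layout.
log=$(mktemp)
cat > "$log" <<'EOF'
1 2014-05-15 21:57:40,000000 too early
2 2014-05-15 21:57:45,000000 first match
3 2014-05-15 21:58:10,500000 second match
4 2014-05-15 21:58:16,000000 too late
EOF

filter_window 21:57:45 21:58:15 "$log"
filter_window 23:00:00 23:00:30 "$log" || echo "no lines in that interval"
rm -f "$log"
```

Comparing the times as strings works only because hh:mm:ss is zero-padded and fixed-width. Unlike the sed range approach, this does not require the boundary timestamps themselves to appear in the file.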
Answered by anubhava
You can use this regex in egrep:
egrep '12:5[2-9]:33' file.log
Answered by Stefan Schmiedl
You are using the wrong tool for this task. Once you have a regular expression like the one given by @anubhava, you can easily find a time interval that it does not match. grep and regexps might work for a few special cases, but they do not scale to the general case.
Can you use some tool that can actually "understand" the timestamps? Probably every scripting language out there (perl, python, ruby, lua) has builtin or library support for parsing time and date.
However, you might be able to employ the powers of GNU date:
% date --date="2014-05-15 21:58:00 15 sec ago" +'%Y-%m-%d %H:%M:%S'
2014-05-15 21:57:45
% date --date="2014-05-15 21:58:00 15 sec" +'%Y-%m-%d %H:%M:%S'
2014-05-15 21:58:15
and plug that into Tiago's sed filter idea.
Answered by Håkon Hægland
You can try the following perl script:
#! /usr/bin/perl
use warnings;
use strict;
use Time::Piece;
use autodie;
# Argument: name_hh:mm:ss or name_hh:mm:ss_hh:mm:ss
my $arg=shift;
my @a =split("_",$arg);
my $fn=shift @a;
# The sample log writes dates as year/day/month.
my $dfmt='%Y/%d/%m';
my $fmt=$dfmt.' %H:%M:%S';
my $t = localtime;
my $date=$t->strftime($dfmt);
my $t1; my $t2;
if (@a == 1) {
    # One time given: use a window of 15 seconds either side.
    my $d=$date.' '.$a[0];
    my $tt=Time::Piece->strptime($d, $fmt);
    $t1=$tt-15;
    $t2=$tt+15;
} elsif (@a == 2) {
    # Two times given: use them as the bounds.
    $t1=Time::Piece->strptime($date.' '.$a[0], $fmt);
    $t2=Time::Piece->strptime($date.' '.$a[1], $fmt);
} else {
    die "Unexpected input argument!";
}
$fn=$fn.'_logs.txt';
doGrep($fn,$t1,$t2,$fmt);
sub doGrep {
    my ($fn,$t1,$t2,$fmt) = @_;
    open (my $fh, "<", $fn);
    while (my $line=<$fh>) {
        # Capture the date and the hh:mm:ss part of the timestamp.
        my ($d1,$d2) = $line=~/\S+\s+(\S+)\s+(\d\d:\d\d:\d\d)/;
        my $d=$d1.' '.$d2;
        my $t=Time::Piece->strptime($d, $fmt);
        print $line if ($t>$t1 && $t<$t2);
    }
    close ($fh);
}
Run it from the command line using the syntax: ./p.pl A_22:09:14
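For comparison with the perl script, Stefan Schmiedl's GNU date idea can also be kept in the shell. This is a rough sketch that assumes GNU date (the --date option is not available in every date implementation); shift_time is a made-up helper name, and the awk line in the closing comment assumes the time is the third field, as in the question's log format.

```shell
#!/usr/bin/env bash
# Sketch only: requires GNU date; shift_time is a hypothetical helper name.

# Print the hh:mm:ss of a timestamp shifted by a signed number of seconds.
shift_time() {
    local stamp=$1 offset=$2
    if (( offset < 0 )); then
        date --date="$stamp $(( -offset )) sec ago" +'%H:%M:%S'
    else
        date --date="$stamp $offset sec" +'%H:%M:%S'
    fi
}

# Bounds for a 15 second margin around the timestamp used in the answer above.
from=$(shift_time "2014-05-15 21:58:00" -15)
to=$(shift_time "2014-05-15 21:58:00" 15)
echo "$from $to"

# The bounds can then be compared as strings against the time column, e.g.:
# awk -v from="$from" -v to="$to" \
#     'substr($3,1,8)>=from && substr($3,1,8)<=to' 24249_logs.txt
```

Letting date do the arithmetic means minute and hour rollover is handled without any manual carrying.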