bash: Get/extract the data from log file of last 3 minutes?
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Original question: http://stackoverflow.com/questions/21042534/
Asked by Gaurav Katkamwar
I have an agent.log file. This file is updated at regular intervals.
Entries look like this:
2014-01-07 03:43:35,223 INFO ...some data
I want to extract the data from the last 3 minutes. Is there a way to get this data using a bash script?
Answered by Vlad.Bachurin
Try the solution below:
awk \
-v start="$(date +"%F %R" --date=@$(expr `date +%s` - 180))" \
-v end="$(date "+%F %R")" \
'$0 ~ start, $0 ~ end' \
agent.log
The start variable holds the timestamp 3 minutes (180 seconds) before the current time.
The end variable holds the current time.
$0 ~ start, $0 ~ end selects the lines between start and end.
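To make the behaviour of the range pattern concrete, here is a minimal Python sketch of how awk evaluates `$0 ~ start, $0 ~ end`. It is an illustration, not part of the answer: `awk_range` is a hypothetical helper, the log lines are made-up data in the question's format, and plain substring tests stand in for awk's regex match (which is equivalent here only because these timestamp strings contain no regex metacharacters).

```python
def awk_range(lines, start, end):
    """Yield the lines an awk range pattern `$0 ~ start, $0 ~ end` would print.

    Assumption: substring tests stand in for awk's regex match.
    """
    in_range = False
    for line in lines:
        # The range opens on the first line that contains `start`.
        if not in_range and start in line:
            in_range = True
        if in_range:
            yield line
            # The range closes right after the first line that contains `end`.
            if end in line:
                in_range = False


# Made-up sample data in the question's log format.
log = [
    "2014-01-07 03:39:10,000 INFO ...some data",
    "2014-01-07 03:40:35,223 INFO ...some data",
    "2014-01-07 03:41:35,223 INFO ...some data",
    "2014-01-07 03:43:35,223 INFO ...some data",
]

for line in awk_range(log, "2014-01-07 03:40", "2014-01-07 03:43"):
    print(line)
```

Under these semantics the range opens only when some line carries the exact start minute, so a log with no entry in that minute prints nothing.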
Answered by dargaud
date +"%F %R" gives you the current time down to the minute.
grep '^'"$(date +"%F %R")" agent.log selects the entries of the current minute from the file.
The previous two minutes are trickier. I have developed some scripts that can do complete time manipulation, relative or absolute, and that may be simpler than fiddling with date.
The time 2 minutes ago, in the right format: date --date="@$(($(date +"%s") - 2*60))" +"%F %R"
Merging all three:
NOW=$(date +"%F %R")
M1=$(date --date="@$(($(date +"%s") - 1*60))" +"%F %R")
M2=$(date --date="@$(($(date +"%s") - 2*60))" +"%F %R")
# -E with a group anchors all three alternatives to the start of the line
grep -E "^($NOW|$M1|$M2)" agent.log
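The minute-prefix matching used in this answer can also be sketched in Python, which avoids the nested date calls. This is only an illustration under stated assumptions: `minute_prefixes` and `select_recent` are hypothetical names, the log lines are sample data in the question's format, and `now` is fixed so the result is repeatable.

```python
from datetime import datetime, timedelta


def minute_prefixes(now, minutes):
    """Return 'YYYY-MM-DD HH:MM' prefixes for the current minute and the ones before it."""
    return [
        (now - timedelta(minutes=i)).strftime("%Y-%m-%d %H:%M")
        for i in range(minutes)
    ]


def select_recent(lines, now, minutes=3):
    """Keep lines that start with one of the last `minutes` minute prefixes."""
    prefixes = tuple(minute_prefixes(now, minutes))
    return [line for line in lines if line.startswith(prefixes)]


# Sample data; in real use, pass datetime.now() and the lines of agent.log.
now = datetime(2014, 1, 7, 3, 43, 50)
log = [
    "2014-01-07 03:43:35,223 INFO ...some data",
    "2014-01-07 03:42:35,223 INFO ...some data",
    "2014-01-07 03:41:35,223 INFO ...some data",
    "2014-01-07 03:40:35,223 INFO ...some data",
]

for line in select_recent(log, now):
    print(line)
```

Because whole clock minutes are matched, the window covers between two and three minutes of entries, depending on when the command runs.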
Answered by csikos.balint
My answer considers the following:
- using bash and UNIX/Linux commands
- the last log line is the start time, not the actual server time
- there is no expectation about the lines' dates (minutes, days, years, etc.)
The whole script should be extensible to the inverse order, or to a specified from-to interval.
#!/bin/bash
# This script expects descending dates in the log file (the reverse of real-life logs)!
FILE=$1   # log file path, passed as the first argument
INTV=180  # interval in seconds
while read -r LINE
do
    if [ -z "$LAST_LOG_LINE" ]
    then
        # interval start line: timestamp of the first (newest) entry
        LAST_LOG_LINE=$(date --date="$( echo "$LINE" | sed -e 's/INFO.*//')" +%s)
        # mod
        #continue
    fi
    ACT_LOG_LINE=$(date --date="$( echo "$LINE" | sed -e 's/INFO.*//')" +%s)
    # print the line if it is not more than $INTV (180 s) older than the start line;
    # otherwise stop reading and exit
    if [ $(($LAST_LOG_LINE-$ACT_LOG_LINE)) -gt $INTV ]
    then
        break
    fi
    # actual print
    echo "$LINE"
done < "$FILE"
Testing:
2014-01-07 03:43:35,223 INFO ...some data
2014-01-07 03:42:35,223 INFO ...some data
2014-01-07 03:41:35,223 INFO ...some data
2014-01-07 03:40:35,223 INFO ...some data
2014-01-07 02:43:35,223 INFO ...some data
2014-01-07 01:43:35,223 INFO ...some data
2014-01-06 03:43:35,223 INFO ...some data
$ /tmp/stack.sh /tmp/log
2014-01-07 03:42:35,223 INFO ...some data
2014-01-07 03:41:35,223 INFO ...some data
2014-01-07 03:40:35,223 INFO ...some data
$
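The idea behind the script above (take the first line as the reference time and stop at the first line that is more than the interval older) can be sketched in Python as follows. This is a rough equivalent under assumptions, not the author's code: `parse_ts` and `newest_window` are hypothetical names, and the input is assumed to be newest-first, as the script expects.

```python
from datetime import datetime


def parse_ts(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' timestamp (text before the first comma)."""
    return datetime.strptime(line.split(",")[0], "%Y-%m-%d %H:%M:%S")


def newest_window(lines, interval=180):
    """Return the leading lines no more than `interval` seconds older than the first line.

    Assumes newest-first order, as the script above does.
    """
    selected = []
    newest = None
    for line in lines:
        current = parse_ts(line)
        if newest is None:
            newest = current  # the first line defines the reference time
        if (newest - current).total_seconds() > interval:
            break  # every later line is older still, so stop reading
        selected.append(line)
    return selected


# The sample log from the answer's test.
log = [
    "2014-01-07 03:43:35,223 INFO ...some data",
    "2014-01-07 03:42:35,223 INFO ...some data",
    "2014-01-07 03:41:35,223 INFO ...some data",
    "2014-01-07 03:40:35,223 INFO ...some data",
    "2014-01-07 02:43:35,223 INFO ...some data",
    "2014-01-07 01:43:35,223 INFO ...some data",
    "2014-01-06 03:43:35,223 INFO ...some data",
]

for line in newest_window(log):
    print(line)
```

Like the script as written (with its continue left commented out), the sketch keeps the reference line itself, and a line exactly `interval` seconds older still counts as inside the window.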
Answered by benjwadams
I think you may be somewhat better off using Python in this case. Even if this script doesn't find an entry dated exactly 3 minutes ago, it will still return every log entry between 3 minutes ago and the time the script was called. This is both more concise and more robust than some of the previous solutions offered.
#!/usr/bin/env python
from datetime import datetime, timedelta
with open('agent.log') as f:
    for line in f:
        # the timestamp is everything before the first comma
        logdate = datetime.strptime(line.split(',')[0], '%Y-%m-%d %H:%M:%S')
        if logdate >= datetime.now() - timedelta(minutes=3):
            # the line already ends with a newline
            print(line.rstrip('\n'))
Answered by simi
A Ruby solution (tested on ruby 1.9.3)
You can pass days, hours, minutes or seconds as a parameter, and it will search for the expression in the specified file (or directory, in which case it appends '/*' to the name).
In your case just call the script like so: $0 -m 3 "expression" log_file
Note: if you know the location of 'ruby', also change the shebang (the first line of the script), for security reasons.
#! /usr/bin/env ruby
require 'date'
require 'pathname'
if ARGV.length != 4
  $stderr.print "usage: #{$0} -d|-h|-m|-s time expression log_file\n"
  exit 1
end
begin
  total_amount = Integer ARGV[1]
rescue ArgumentError
  $stderr.print "error: parameter 'time' must be an Integer\n"
  $stderr.print "usage: #{$0} -d|-h|-m|-s time expression log_file\n"
  exit 1 # without this, total_amount stays nil and the loop below fails
end
# gap is one step expressed as a fraction of a day; time_str is the matching timestamp prefix
if ARGV[0] == "-m"
  gap = Rational(60, 86400)
  time_str = "%Y-%m-%d %H:%M"
elsif ARGV[0] == "-s"
  gap = Rational(1, 86400)
  time_str = "%Y-%m-%d %H:%M:%S"
elsif ARGV[0] == "-h"
  gap = Rational(3600, 86400)
  time_str = "%Y-%m-%d %H"
elsif ARGV[0] == "-d"
  time_str = "%Y-%m-%d"
  gap = 1
else
  $stderr.print "usage: #{$0} -d|-h|-m|-s time expression log_file\n"
  exit 1
end
pn = Pathname.new(ARGV[3])
if pn.exist?
  log = (pn.directory?) ? ARGV[3] + "/*" : ARGV[3]
else
  $stderr.print "error: file '" << ARGV[3] << "' does not exist\n"
  $stderr.print "usage: #{$0} -d|-h|-m|-s time expression log_file\n"
  exit 1 # without this, log stays nil and the grep command cannot be built
end
search_str = ARGV[2]
now = DateTime.now
# step back one gap at a time and grep for each timestamp prefix
total_amount.times do
  now -= gap
  system "cat " << log << " | grep '" << now.strftime(time_str) << ".*" << search_str << "'"
end