bash: get/extract the data from the last 3 minutes of a log file?

Note: this page is based on a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/21042534/

Time: 2020-09-18 09:12:59  Source: igfitidea

Tags: linux, bash

Asked by Gaurav Katkamwar

I have an agent.log file. This file is updated at regular intervals.

Entries look like this:

    2014-01-07 03:43:35,223 INFO ...some data

I want to extract the data from the last 3 minutes. Is there a way to get this data using a bash script?

Answered by Vlad.Bachurin

Try the solution below:

    awk \
    -v start="$(date +"%F %R" --date=@$(expr `date +%s` - 180))" \
    -v end="$(date "+%F %R")" \
    '$0 ~ start, $0 ~ end' \
    agent.log

The start variable holds the timestamp from 3 minutes (180 seconds) before the current time, truncated to the minute.

The end variable holds the current time, also truncated to the minute.

$0 ~ start, $0 ~ end selects the lines between start and end.
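One caveat with the range pattern: awk only starts printing at a line that matches start, so if no entry was logged during that exact minute, nothing is selected. A minimal sketch of an alternative, assuming every line begins with a YYYY-MM-DD HH:MM:SS timestamp, compares the timestamp prefix as a string instead. The function name lines_since is hypothetical and not part of the original answer.

```shell
# Hypothetical helper (not from the original answer).
# Prints every line whose leading "YYYY-MM-DD HH:MM:SS" prefix is at or after
# the cutoff. Timestamps in this format sort correctly as plain strings.
lines_since() {
    cutoff=$1   # e.g. "2014-01-07 03:41:00"
    file=$2
    awk -v cutoff="$cutoff" 'substr($0, 1, 19) >= cutoff' "$file"
}

# Possible usage with GNU date (assumed to be available):
#   lines_since "$(date --date='3 minutes ago' +'%F %T')" agent.log
```

Because the comparison does not depend on any line matching a particular minute, gaps in the log do not affect the result.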

Answered by dargaud

date +"%F %R" gives you the current time down to the minute.

grep '^'"$(date +"%F %R")" agent.log will select the last minute from the file.

The previous two minutes are trickier... I have developed some scripts that can do complete time manipulation, relative or absolute, and they may be simpler than fiddling with date...

2 minutes ago in the right format: date --date="@$(($(date +"%s") - 2*60))" +"%F %R"

Merge all 3:

    NOW=$(date +"%F %R")
    M1=$(date --date="@$(($(date +"%s") - 1*60))" +"%F %R")
    M2=$(date --date="@$(($(date +"%s") - 2*60))" +"%F %R")
    # -E with a group so that ^ anchors all three alternatives, not only the first
    grep -E "^($NOW|$M1|$M2)" agent.log
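The per-minute grep approach in this answer can be generalized from three minutes to N. The sketch below is one way to do that, assuming GNU date; the function name minute_pattern and its arguments are hypothetical, not part of the original answer.

```shell
# Hypothetical helper (not from the original answer).
# Builds an anchored extended regex matching the last N minute prefixes,
# counting back from a reference time given in epoch seconds.
minute_pattern() {
    now=$1
    n=$2
    i=0
    pat=""
    while [ "$i" -lt "$n" ]; do
        stamp=$(date --date="@$(( now - i * 60 ))" +"%F %R") || return 1
        pat="${pat:+$pat|}$stamp"   # join the prefixes with "|"
        i=$(( i + 1 ))
    done
    printf '^(%s)\n' "$pat"
}

# Possible usage:
#   grep -E "$(minute_pattern "$(date +%s)" 3)" agent.log
```

Like the original, this matches whole calendar minutes, so the window is between N-1 and N minutes long depending on the current second.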

Answered by csikos.balint

My answer takes the following into account:

  1. it uses bash and UNIX/Linux commands
  2. the last log line is the start time, not the actual server time
  3. there is no expectation about the lines' dates (minutes, days, years, etc.)
  4. the whole script should be extensible to the inverse, or to a specified from-to interval

    #!/bin/bash
    # This script expects descending dates in the log file (the reverse of real-life logs)!
    FILE=$1
    INTV=180 # sec

    while read -r LINE
    do
        if [ -z "$LAST_LOG_LINE" ]
        then
            # interval start line: the first line read is the reference timestamp
            LAST_LOG_LINE=$(date --date="$( echo "$LINE" | sed -e 's/INFO.*//')" +%s)
            # mod
            #continue
        fi
        ACT_LOG_LINE=$(date --date="$( echo "$LINE" | sed -e 's/INFO.*//')" +%s)
        # print the line if it is not more than $INTV (180s) older than the reference line,
        # else stop reading and exit
        if [ $((LAST_LOG_LINE-ACT_LOG_LINE)) -gt $INTV ]
        then
            break
        fi
        # actual print
        echo "$LINE"
    done < "$FILE"

Testing:

    2014-01-07 03:43:35,223 INFO ...some data
    2014-01-07 03:42:35,223 INFO ...some data
    2014-01-07 03:41:35,223 INFO ...some data
    2014-01-07 03:40:35,223 INFO ...some data
    2014-01-07 02:43:35,223 INFO ...some data
    2014-01-07 01:43:35,223 INFO ...some data
    2014-01-06 03:43:35,223 INFO ...some data

    $ /tmp/stack.sh /tmp/log
    2014-01-07 03:42:35,223 INFO ...some data
    2014-01-07 03:41:35,223 INFO ...some data
    2014-01-07 03:40:35,223 INFO ...some data
    $
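Since this script expects the newest entry first, a real log written in ascending order would have to be reversed before being fed to it. The sketch below shows one possible way to do that with tac from GNU coreutils, assuming each line starts with a timestamp followed by a comma and milliseconds as in the question. The function name tail_window is hypothetical.

```shell
# Hypothetical helper (not from the original answer).
# Prints the trailing window of an ascending log: every line that is at most
# $2 seconds (default 180) older than the timestamp on the last line.
tail_window() {
    file=$1
    intv=${2:-180}
    tac "$file" | {
        last=""
        while IFS= read -r line; do
            # keep only "YYYY-MM-DD HH:MM:SS" (drop ",mmm INFO ...")
            ts=$(date --date="${line%%,*}" +%s 2>/dev/null) || continue
            [ -n "$last" ] || last=$ts          # newest line is the reference
            [ $(( last - ts )) -le "$intv" ] || break
            printf '%s\n' "$line"
        done
    } | tac   # restore the original ascending order
}

# Possible usage:
#   tail_window agent.log 180
```

Reading from the end means the loop stops as soon as it leaves the window, so only the tail of a large file is parsed.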
    

Answered by benjwadams

I think you may be somewhat better off using Python in this case. Even if this script doesn't find a date exactly 3 minutes ago, it will still get any log entries between the time the script was called and 3 minutes before that. This is both more concise and more robust than some of the previous solutions offered.

    #!/usr/bin/env python
    from datetime import datetime, timedelta

    # anything logged at or after this moment is printed
    cutoff = datetime.now() - timedelta(minutes=3)

    with open('agent.log') as f:
        for line in f:
            # the timestamp is everything before the first comma
            logdate = datetime.strptime(line.split(',')[0], '%Y-%m-%d %H:%M:%S')
            if logdate >= cutoff:
                print(line.rstrip('\n'))

Answered by simi

A Ruby solution (tested on Ruby 1.9.3).

You can pass days, hours, minutes or seconds as a parameter, and it will search for the expression in the specified file (or directory, in which case it appends '/*' to the name).

In your case, just call the script like so: $0 -m 3 "expression" log_file

Note: if you know the location of 'ruby', change the shebang (the first line of the script) for security reasons.

    #! /usr/bin/env ruby

    require 'date'
    require 'pathname'

    if ARGV.length != 4
      $stderr.print "usage: #{$0} -d|-h|-m|-s time expression log_file\n"
      exit 1
    end
    begin
      total_amount = Integer ARGV[1]
    rescue ArgumentError
      $stderr.print "error: parameter 'time' must be an Integer\n"
      $stderr.print "usage: #{$0} -d|-h|-m|-s time expression log_file\n"
      exit 1
    end

    # gap is the step size as a fraction of a day; time_str is the matching prefix format
    if ARGV[0] == "-m"
      gap = Rational(60, 86400)
      time_str = "%Y-%m-%d %H:%M"
    elsif ARGV[0] == "-s"
      gap = Rational(1, 86400)
      time_str = "%Y-%m-%d %H:%M:%S"
    elsif ARGV[0] == "-h"
      gap = Rational(3600, 86400)
      time_str = "%Y-%m-%d %H"
    elsif ARGV[0] == "-d"
      time_str = "%Y-%m-%d"
      gap = 1
    else
      $stderr.print "usage: #{$0} -d|-h|-m|-s time expression log_file\n"
      exit 1
    end

    pn = Pathname.new(ARGV[3])
    if pn.exist?
      log = (pn.directory?) ? ARGV[3] + "/*" : ARGV[3]
    else
      $stderr.print "error: file '" << ARGV[3] << "' does not exist\n"
      $stderr.print "usage: #{$0} -d|-h|-m|-s time expression log_file\n"
      exit 1
    end

    search_str = ARGV[2]
    now = DateTime.now

    # step back one unit at a time and grep for that timestamp prefix
    total_amount.times do
      now -= gap
      system "cat " << log << " | grep '" << now.strftime(time_str) << ".*" << search_str << "'"
    end