Extract data from log file in specified range of time
Original question: http://stackoverflow.com/questions/7575267/
Note: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not to this page): StackOverflow
Asked by ham raaz _e
I want to extract information from a log file using a shell script (bash), based on a time range. A line in the log file looks like this:
172.16.0.3 - - [31/Mar/2002:19:30:41 +0200] "GET / HTTP/1.1" 200 123 "" "Mozilla/5.0 (compatible; Konqueror/2.2.2-2; Linux)"
I want to extract the data for specific intervals. For example, I need to look only at the events that happened during the last X minutes or X days, counted back from the last recorded entry. I'm new to shell scripting, but I have tried to use the grep command.
Answer by ychaouche
You can use sed for this. For example:
$ sed -n '/Feb 23 13:55/,/Feb 23 14:00/p' /var/log/mail.log
Feb 23 13:55:01 messagerie postfix/smtpd[20964]: connect from localhost[127.0.0.1]
Feb 23 13:55:01 messagerie postfix/smtpd[20964]: lost connection after CONNECT from localhost[127.0.0.1]
Feb 23 13:55:01 messagerie postfix/smtpd[20964]: disconnect from localhost[127.0.0.1]
Feb 23 13:55:01 messagerie pop3d: Connection, ip=[::ffff:127.0.0.1]
...
How it works
The -n switch tells sed not to output each line of the file it reads (which is its default behaviour).
The final p after the regular expressions tells it to print the lines selected by the preceding expression.
The expression '/pattern1/,/pattern2/' selects everything from the first line matching pattern1 through the next line matching pattern2. In this case it prints every line from the first one containing the string Feb 23 13:55 through the first one containing the string Feb 23 14:00. Both strings have to occur in the log: if no line matches the first pattern nothing is printed, and if no line matches the second one sed prints through to the end of the file.
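To make the range behaviour concrete, here is a small sketch that runs the same sed expression on a few made-up lines. The helper name print_range and the sample lines are only illustrative, and it assumes the patterns contain no "/" characters.

```shell
# print_range START END: hypothetical helper that reads a log on stdin and
# prints from the first line matching START through the next line matching END.
# Assumes the patterns contain no "/" (it is the sed delimiter here).
print_range() {
    sed -n "/$1/,/$2/p"
}

# Made-up sample lines in the same syslog format as above
print_range 'Feb 23 13:55' 'Feb 23 14:00' <<'EOF'
Feb 23 13:54:59 host app: before the range
Feb 23 13:55:01 host app: first line matching the start pattern
Feb 23 13:57:10 host app: inside the range
Feb 23 14:00:30 host app: first line matching the end pattern
Feb 23 14:00:45 host app: also 14:00, but the range has already closed
EOF
```

This prints the three lines stamped 13:55:01, 13:57:10 and 14:00:30. The 14:00:45 line is left out because the range ends at the first line that matches the second pattern.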
Answer by ztank1013
Use grep and regular expressions. For example, if you want an interval of a few minutes of logs:
grep "31/Mar/2002:19:3[1-5]" logfile
will return all log lines stamped from 19:31 through 19:35 (that is, 19:31:00 to 19:35:59) on 31/Mar/2002. Supposing you need the last 5 days, counting back from today, 27/Sep/2011, you may use the following:
grep "2[3-7]/Sep/2011" logfile
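A character class such as 2[3-7] only works while the whole window stays inside one month and one tens digit. As a rough sketch, assuming GNU date (for its -d option), the pattern could instead be generated as an alternation of full dates. The function name last_days_pattern and its arguments are made up for this example.

```shell
# last_days_pattern N [REF]: hypothetical helper that prints an extended
# regular expression matching the dates of the N days ending at REF
# (default: today), e.g. "27/Sep/2011|26/Sep/2011|25/Sep/2011".
# Assumes GNU date; LC_ALL=C keeps the month names in English.
# The body runs in a subshell so its variables do not leak into the caller.
last_days_pattern() (
    n=$1
    ref=${2:-today}
    pat=
    i=0
    while [ "$i" -lt "$n" ]; do
        day=$(LC_ALL=C date -d "$ref $i days ago" +%d/%b/%Y) || exit 1
        pat="${pat:+$pat|}$day"
        i=$((i + 1))
    done
    printf '%s\n' "$pat"
)

# Possible usage (logfile is a placeholder name):
#   grep -E "$(last_days_pattern 5)" logfile
```

Because every date is spelled out in full, the same call keeps working when the five days straddle a month boundary.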
Answer by Kent
Well, I spent some time on your date format, but I finally worked it out.
Let's take an example file (named logFile), which I have shortened a bit. Say you want to get the last 5 minutes of log entries in this file:
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
### lines below are what you want (5 mins till the last record)
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:30:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:30:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:30:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:30:41 +0200] "GET
Here is the solution:
# Customize this variable; the important thing is to express it in seconds,
# e.g. 5days=$((5*24*3600))
x=$((5*60))   # here we take 5 minutes as an example
# Get the timestamp, in seconds since the epoch, of the last line of the log file
last=$(tail -n1 logFile|awk -F'[][]' '{ gsub(/\//," ",$2); sub(/:/," ",$2); "date +%s -d \""$2"\""|getline d; print d;}' )
# This awk prints the lines you need; close(cmd) releases each date pipe after use
awk -F'[][]' -v last=$last -v x=$x '{ gsub(/\//," ",$2); sub(/:/," ",$2); cmd="date +%s -d \""$2"\""; cmd|getline d; close(cmd); if (last-d<=x)print $0 }' logFile
Output:
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:30:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:30:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:30:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:30:41 +0200 "GET
EDIT
You may notice that the [ and ] have disappeared from the output. This happens because awk rebuilds the line once $2 is modified. If you want them back, change print $0 in the last awk line to print $1 "[" $2 "]" $3
Answer by sdeva
I used this command to find the last 5 minutes of logs for a particular event, "DHCPACK". Try the following:
$ grep "DHCPACK" /var/log/messages | grep "$(date +%h\ %d) [$(date --date='5 min ago' +%H)-$(date +%H)]:*:*"
Be aware that [HH-HH] is a bracket expression over single characters rather than a numeric hour range, so this is at best a rough filter.
Answer by nick
You can use this to get the current time and the log time:
#!/bin/bash
log="log_file_name"
while IFS= read -r line
do
# Fields 4-6 of the default `date` output (e.g. "Tue Sep 27 14:30:00 UTC 2011")
# are hours, minutes and seconds when splitting on spaces and colons
current_hours=`date | awk 'BEGIN{FS="[ :]+"}; {print $4}'`
current_minutes=`date | awk 'BEGIN{FS="[ :]+"}; {print $5}'`
current_seconds=`date | awk 'BEGIN{FS="[ :]+"}; {print $6}'`
# Splitting the log line on spaces, "[", "/" and ":" puts the time in fields 7-9
log_file_hours=`echo "$line" | awk 'BEGIN{FS="[ [/:]+"}; {print $7}'`
log_file_minutes=`echo "$line" | awk 'BEGIN{FS="[ [/:]+"}; {print $8}'`
log_file_seconds=`echo "$line" | awk 'BEGIN{FS="[ [/:]+"}; {print $9}'`
done < "$log"
Then compare the log_file_* and current_* variables.
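The comparison itself is left open above. One possible way to do it, sketched here under the assumption that GNU date is available, is to convert each timestamp to seconds since the epoch and compare a single number instead of separate hour, minute and second variables. The function names apache_ts_to_epoch and last_seconds are made up for this example, and the window is measured back from the last line of the file, as in the question.

```shell
# Hypothetical helper: convert "31/Mar/2002:19:30:41 +0200" to epoch seconds.
# Relies on GNU date's -d option.
apache_ts_to_epoch() {
    # "31/Mar/2002:19:30:41 +0200" -> "31 Mar 2002 19:30:41 +0200"
    LC_ALL=C date -d "$(printf '%s\n' "$1" | sed -e 's|/| |g' -e 's|:| |')" +%s
}

# Hypothetical helper: print the lines of FILE whose timestamp lies within
# SECONDS of the timestamp on the last line of FILE.
# The body runs in a subshell so its variables do not leak into the caller.
last_seconds() (
    window=$1
    file=$2
    extract='s/^[^[]*\[\([^]]*\)].*/\1/'   # keeps only the text between [ and ]
    last=$(apache_ts_to_epoch "$(tail -n 1 "$file" | sed "$extract")") || exit 1
    while IFS= read -r line; do
        # Skip lines whose timestamp cannot be parsed
        t=$(apache_ts_to_epoch "$(printf '%s\n' "$line" | sed "$extract")") || continue
        if [ $((last - t)) -le "$window" ]; then
            printf '%s\n' "$line"
        fi
    done < "$file"
)

# Possible usage: last_seconds $((5*60)) logFile
```

Like the awk version earlier on this page, this starts one date process per line, so it is only a reasonable choice for small logs.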