bash 如何在bash脚本中使用awk过滤2个日期之间的数据

Question

提问by Gheorghe Frunza

Hi I have the following log file structure:

嗨，我有以下日志文件结构：

####<19-Jan-2015 07:16:47 o'clock UTC> <Notice> <Stdout> <example.com>
####<20-Jan-2015 07:16:43 o'clock UTC> <Notice> <Stdout> <example2.com>
####<21-Jan-2015 07:16:48 o'clock UTC> <Notice> <Stdout> <example3.com>

How can I filter this file by a date interval, for example: Show all data between 19'th and 20'th of January 2015

如何按日期间隔过滤此文件，例如：显示 2015 年 1 月 19 日和 20 日之间的所有数据

I tried to use awkbut I have problems converting 19-Jan-2015to 2015-01-19to continue comparison of dates.

我试图用awk，但我有问题转换19-Jan-2015到2015-01-19继续比较日期。

Answer 1

回答by Wintermute

For an oddball date format like that, I'd outsource the date parsing to the dateutility.

对于这种奇怪的日期格式，我会将日期解析外包给date实用程序。

#!/usr/bin/awk -f

# Formats the timestamp as a number, so that higher numbers represent
# a later timestamp. This will not handle the time zone because date
# can't handle the o'clock notation. I hope all your timestamps use the
# same time zone, otherwise you'll have to hack support for it in here.
function datefmt(d) {
  # make d compatible with singly-quoted shell strings
  gsub(/'/, "'\''", d)

  # then run the date command and get its output
  command = "date -d '" d "' +%Y%m%d%H%M%S"
  command | getline result
  close(command)

  # that's our result.
  return result;
}

BEGIN {
  # Field separator, so the part of the timestamp we'll parse is in  and 
  FS = "[< >]+"

  # start, end set here.
  start = datefmt("19-Jan-2015 00:00:00")
  end   = datefmt("20-Jan-2015 23:59:59")
}

{
  # convert the timestamp into an easily comparable format
  stamp = datefmt( " " )

  # then print only lines in which the time stamp is in the range.
  if(stamp >= start && stamp <= end) {
    print
  }
}

Answer 2

回答by LogicIO

If the name of the file is example.txt, the the below script should work

如果文件名是example.txt，下面的脚本应该可以工作

 for i in `awk -F'<' {'print '} example.txt| awk {'print "_"'}`; do date=`echo $i | sed 's/_/ /g'`;  dunix=`date -d "$date" +%s`; if [[ (($dunix -ge 1421605800)) && (($dunix -le 1421778599)) ]]; then  grep "$date" example.txt;fi;  done

The script just converts the time provided in to unix timestamp, then compares the time and print the lines that meets the condition from the file.

该脚本只是将提供的时间转换为 unix 时间戳，然后比较时间并打印文件中满足条件的行。

Answer 3

回答by glenn Hymanman

Using string comparisons jwill be faster than creating date objects:

使用字符串比较 jwill 比创建日期对象更快：

awk -F '<' '
    {split(, d, /[- ]/)} 
    d[3]=="2015" && d[2]=="Jan" && 19<=d[1] && d[1]<=20
' file

Answer 4

回答by glenn Hymanman

Another way using mktime all in awk

在 awk 中使用 mktime 的另一种方法

awk '

BEGIN{
        From=mktime("2015 01 19 00 00 00")
        To=mktime("2015 01 20 00 00 00")
}
{Time=0}
match(####<19-Jan-2015 07:16:47 o'clock UTC> <Notice> <Stdout> <example.com>
,/<([^ ]+) ([^ ]+)/,a){
        split(a[1],b,"-")
        split(a[2],c,":")
        b[2]=(index("JanFebMarAprMayJunJulAugSepOctNovDec",b[2])+2)/3
        Time=mktime(b[3]" "b[2]" "b[1]" "c[1]" "c[2]" "c[3])
}
Time<To&&Time>From

' file

Output

输出

BEGIN{
        From=mktime("2015 01 19 00 00 00")
        To=mktime("2015 01 20 00 00 00")
}

How it works

这个怎么运作

{time=0}

Before processing the lines set the dates To and From where the data we want will be between the two.
This format is required for mktimeto work.
The format is YYYY MM DD HH MM SS.

在处理这些行之前，设置日期 To 和 From ，我们想要的数据将在两者之间。
这种格式是mktime工作所必需的。
格式为YYYY MM DD HH MM SS.

match(    split(a[1],b,"-")
    split(a[2],c,":")
,/<([^ ]+) ([^ ]+)/,a)

Reset time so further lines that don't match are not printed

重置时间，以便不打印更多不匹配的行

b[2]=(index("JanFebMarAprMayJunJulAugSepOctNovDec",b[2])+2)/3

Matches the first two words after the <and stores them in a. Executes the next block if this is successful.

匹配 the 之后的前两个单词<并将它们存储在 a 中。如果成功，则执行下一个块。

 Time=mktime(b[3]" "b[2]" "b[1]" "c[1]" "c[2]" "c[3])

Splits the date and time into individual numbers/Month.

将日期和时间拆分为单独的数字/月。

Time<To&&Time>From

Converts month to number using the fact that all of them are three characters and then dividing by 3.

使用所有月份都是三个字符然后除以 3 的事实将月份转换为数字。

##代码##

makes time with collected values

为收集的值腾出时间

##代码##

if the time is more than Fromand less than Toit is inside the desired range and the default action for awk is to print.

如果时间大于From或小于To它在所需范围内并且 awk 的默认操作是打印。

Resources

资源

https://www.gnu.org/software/gawk/manual/html_node/Time-Functions.html

bash 如何在bash脚本中使用awk过滤2个日期之间的数据

提问by Gheorghe Frunza

回答by Wintermute

回答by LogicIO

回答by glenn Hymanman

回答by glenn Hymanman

Output

输出

How it works

这个怎么运作

Resources

资源

相关推荐

最近更新

标签

bash 如何在bash脚本中使用awk过滤2个日期之间的数据

提问by Gheorghe Frunza

回答by Wintermute

回答by LogicIO

回答by glenn Hymanman

回答by glenn Hymanman

Output

输出

How it works

这个怎么运作

Resources

资源

相关推荐

bash 在 Linux 中复制具有特定日期的文件夹及其文件

bash SED：删除两个字符串之间的文本，跨行重复

bash 第 1 行：?#!/usr/bin/sh：尝试执行 shell 脚本时未找到

bash 使用 cURL 获取当前会话的 cookie

相关推荐

最近更新

标签