Bash: how to extract only the formatted date fields using sed or grep

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must comply with the same license, cite the original address and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/14372305/

Date: 2020-09-18 04:16:37  Source: igfitidea

How can I extract just the formatted date fields using sed or grep?

Tags: regex, bash, sed, awk, grep

Asked by user1985920

I need a grep or sed command that extracts only the dates from this input:

echo 'asdfdsfa asdfs 12-Dec-13 asdasd asdf 11-Jan-12 asdasd' 

The output should look like this:

12-Dec-13 11-Jan-12

I have gotten as far as producing 12-Dec-13 asdasd asdf 11-Jan-12, but I can't remove the content between the dates. Is it possible to use a sed statement that keeps only the first and last words, using spaces to tell which word is the last? The result should remain the same.

Answered by Todd A. Jacobs

Use POSIX Character Classes

A set of POSIX character classes would match your desired text. For example:

\b[[:digit:]]{2}-[[:upper:]][[:lower:]]{2}-[[:digit:]]{2}\b

Sample Input/Output

The following pipeline will extract just the relevant text using GNU Grep, then concatenate the dates:

$ echo 'asdfdsfa asdfs 12-Dec-13 asdasd asdf 11-Jan-12 asdasd' |
    grep -Eo '\b[[:digit:]]{2}-[[:upper:]][[:lower:]]{2}-[[:digit:]]{2}\b' |
    xargs
12-Dec-13 11-Jan-12

Answered by loganaayahee

grep -o "[0-9]\{2\}-[A-Za-z]\{3\}-[0-9]\{2\}" file | sed "N;s/\n/ /g"

12-Dec-13 11-Jan-12

Answered by SHAILESH PATEL

Try this one:

echo 'asdfdsfa asdfs 12-Dec-13 asdasd asdf 11-Jan-12 asdasd' | sed 's: :\n:g' | grep '^[0-9]'

Answered by tsmets

I had access logs where the dates were stupidly formatted: [30/Jun/2013:08:00:45 +0200]

but I needed to display them as: 30/Jun/2013 08:00:45

The problem was that, with an alternation ("OR") in my grep pattern, the two matching parts came out on two separate lines.

Here is the solution:

grep -in myURL_of_interest *access.log |
    grep -Eo '(\b[[:digit:]]{2}/[[:upper:]][[:lower:]]{2}/[[:digit:]]{4}|[[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2}\b)' |
    paste -d" " - - > MyAccess.log

I hope it helps :)
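As an alternative to pairing up two grep matches with paste, the same reshaping can be sketched in a single sed substitution. This is only an illustration: the function name reformat_access_dates and the sample log line are made up, and the log layout (a bracketed timestamp followed by a time-zone offset) is an assumption based on the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer): for every input line
# containing a timestamp like [30/Jun/2013:08:00:45 +0200], print it as
# "30/Jun/2013 08:00:45". Lines without such a timestamp are skipped (-n ... p).
reformat_access_dates() {
    sed -nE 's|.*\[([0-9]{2}/[A-Za-z]{3}/[0-9]{4}):([0-9]{2}:[0-9]{2}:[0-9]{2}) [^]]*\].*|\1 \2|p'
}

# Made-up log line in an assumed access-log layout.
printf '%s\n' '192.0.2.1 - - [30/Jun/2013:08:00:45 +0200] "GET /index.html HTTP/1.1" 200 512' |
    reformat_access_dates
```

Because the date and time are captured in one pass, they stay on one line and no paste step is needed.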

Answered by potong

This might work for you (GNU sed):

sed -r 'H;g;:a;s/\s*\n$//;t;s/\n(..-...-..)\b/\1 \n/;ta;s/\n([^0-9]+)/\n/;ta' file

Answered by Guru

One way:

$ echo 'asdfdsfa asdfs 12-Dec-13 asdasd asdf 11-Jan-12 asdasd' | sed 's/.*\(..-...-..\).*\(..-...-..\).*/\1 \2/'
12-Dec-13 11-Jan-12

To make the search pattern more specific about digits and letters:

$ echo 'asdfdsfa asdfs 12-Dec-13 asdasd asdf 11-Jan-12 asdasd' | sed 's/.*\([0-9][0-9]-[a-zA-Z]\{3\}-[0-9][0-9]\).*\([0-9][0-9]-[a-zA-Z]\{3\}-[0-9][0-9]\).*/\1 \2/'
12-Dec-13 11-Jan-12

Answered by Babasaheb Gosavi

Use the following:

echo 'asdfdsfa asdfs 12-Dec-13 asdasd asdf 11-Jan-12 asdasd' | sed 's/ /\n/g' |grep '-' | tr -d '\n' |sed 's/$/ \n/g'

The output is:

12-Dec-1311-Jan-12
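Note that the pipeline above deletes the newlines outright, so the two dates run together. A minimal sketch of a variant that keeps a single space between them, assuming a POSIX paste is available; the function name join_hyphenated_words is made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer): put each word on its own
# line, keep the words that contain a hyphen, then join them with single spaces.
join_hyphenated_words() {
    tr ' ' '\n' | grep -e '-' | paste -sd' ' -
}

echo 'asdfdsfa asdfs 12-Dec-13 asdasd asdf 11-Jan-12 asdasd' | join_hyphenated_words
```

Like the original, this keeps any word containing a hyphen, so it is only as precise as the input is clean.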

Answered by Mirage

Try with awk:

awk '{for (i = 1; i <= NF; ++i) if ($i ~ /[0-9]+-[A-Za-z]+-[0-9]+/) print $i}' temp.txt

This works with any number of lines and columns.

Answered by Vijay

perl -lne '@a=/([\d]+-[a-zA-Z]{3}-[\d]+)/g;print "@a"'

Tested:

> echo 'asdfdsfa 12-Dec-13 asdf 11-Jan-12 asdasd' | perl -lne '@a=/([\d]+-[a-zA-Z]{3}-[\d]+)/g;print "@a"'
12-Dec-13 11-Jan-12

Answered by Suku

I suggest date -d, which will even validate the date.

$ cat string 
asdfdsfa asdfs 12-Dec-13 asdasd asdf 11-Jan-12 asdasd

$ for i in $(cat string); do date -d "$i" &>/dev/null && echo "$i"; done
12-Dec-13
11-Jan-12
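On its own, date -d also accepts words that are not in the DD-Mon-YY layout (for example now), so one option is to combine it with the pattern match from the grep answers. The following is a sketch under the assumption that GNU grep and GNU date are in use; the function name valid_dates and the sample input are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer): print, one per line,
# every DD-Mon-YY token read from stdin that GNU date is also able to parse.
valid_dates() {
    local candidate
    grep -Eo '\b[[:digit:]]{2}-[[:upper:]][[:lower:]]{2}-[[:digit:]]{2}\b' |
        while IFS= read -r candidate; do
            # Keep the token only if date -d accepts it as a real date.
            date -d "$candidate" >/dev/null 2>&1 && printf '%s\n' "$candidate"
        done
}

# "now" is dropped by the pattern; "45-Foo-99" is dropped by date -d.
echo 'asdfdsfa 12-Dec-13 now 45-Foo-99 asdf 11-Jan-12 asdasd' | valid_dates
```

The pattern filters by shape and date -d filters by validity, so a token must pass both checks to be printed.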