Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must comply with the same license, cite the original URL, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/21910200/

Time: 2020-09-18 09:38:50  Source: igfitidea

Match datetime format with Bash REGEX

Tags: regex, bash, datetime

Asked by araujophillips

I have data with this datetime format in bash:

28/11/13 06:20:05 (dd/mm/yy hh:mm:ss)

I need to reformat it like:

2013-11-28 06:20:05 (MySQL datetime format)

I am using the following regex:

regex='([0-9][0-9])/([0-9][0-9])/([0-9][0-9])\s([0-9][0-9]/:[0-9][0-9]:[0-9][0-9])'

if [[$line=~$regex]]
then
   $line='20$3-$2-$1 $4';
fi

This produces an error:

./filename: line 10: [[09:34:38=~([0-9][0-9])/([0-9][0-9])/([0-9][0-9])\s([0-9][0-9]/:[0-9][0-9]:[0-9][0-9])]]: No such file or directory

UPDATE:

I want to read this file "line by line", parse it and insert data in mysql database:

'filenameX':

27/11/13 12:20:05 9984 2885 260 54 288 94 696 1852 32 88 27 7 154
27/11/13 13:20:05 9978 2886 262 54 287 93 696 1854 32 88 27 7 154
27/11/13 14:20:05 9955 2875 262 54 287 93 696 1860 32 88 27 7 154
27/11/13 15:20:04 9921 2874 261 54 284 93 692 1868 32 88 27 7 154
27/11/13 16:20:09 9896 2864 260 54 283 92 689 1880 32 88 27 7 154
27/11/13 17:20:05 9858 2858 258 54 279 92 683 1888 32 88 27 7 154
27/11/13 18:20:04 9849 2853 258 54 279 92 683 1891 32 88 27 7 154
27/11/13 19:20:04 9836 2850 257 54 279 93 683 1891 32 88 27 7 154
27/11/13 20:20:05 9826 2845 257 54 279 93 683 1892 32 88 27 7 154
27/11/13 21:20:05 9820 2847 257 54 278 93 682 1892 32 88 27 7 154
27/11/13 22:20:04 9810 2844 257 54 277 93 681 1892 32 88 27 7 154
27/11/13 23:20:04 9807 2843 257 54 276 93 680 1892 32 88 27 7 154
28/11/13 00:20:05 9809 2843 257 54 276 93 680 1747 29 87 17 6 139
28/11/13 01:20:04 9809 2842 257 54 276 93 680 1747 29 87 17 6 139
28/11/13 02:20:05 9809 2843 256 54 276 93 679 1747 29 87 17 6 139
28/11/13 03:20:04 9808 2842 256 54 276 93 679 1747 29 87 17 6 139
28/11/13 04:20:05 9808 2842 256 54 276 93 679 1747 29 87 17 6 139
28/11/13 05:20:39 9807 2842 256 54 276 93 679 1747 29 87 17 6 139
28/11/13 06:20:05 9804 2840 256 54 276 93 679 1747 29 87 17 6 139

Script:

#!/bin/bash

echo "Start!"

while IFS='     ' read -ra ADDR;
do
   for line in $(cat results)
   do
      regex='([0-9][0-9])/([0-9][0-9])/([0-9][0-9]) ([0-9][0-9]:[0-9][0-9]:[0-9]$
      if [[ $line =~ $regex ]]; then
         $line="20${BASH_REMATCH[3]}-${BASH_REMATCH[2]}-${BASH_REMATCH[1]} ${BASH_REMATCH[4]}"
      fi
      echo "insert into table(time, total, caracas, anzoategui) values('$line', '$line', '$line', '$line', '$line');"
   done | mysql -user -password database;
done < filenameX

Result:

time | total | caracas | anzoategui |
0000-00-00 00:00:00 | 9 | 9 | 9 |
2027-11-13 00:00:00 | 15 | 15 | 15 |

Answered by mklement0

Note: This answer was accepted based on fixing the bash-focused approach in the OP. For a simpler, awk-based solution see the last section of this answer.

Try the following:

line='28/11/13 06:20:05' # sample input

regex='([0-9][0-9])/([0-9][0-9])/([0-9][0-9]) ([0-9][0-9]:[0-9][0-9]:[0-9][0-9])'

if [[ $line =~ $regex ]]; then
  line="20${BASH_REMATCH[3]}-${BASH_REMATCH[2]}-${BASH_REMATCH[1]} ${BASH_REMATCH[4]}"
fi

echo "$line"  # -> '2013-11-28 06:20:05'
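
If the conversion is needed in more than one place, the snippet above could be wrapped in a function. The following is only a sketch: the name to_mysql_datetime is made up for illustration, and the ^ and $ anchors are an addition so that input with extra leading or trailing characters is rejected instead of being partially matched. Like the snippet above, it assumes every two-digit year belongs to the 2000s.

```shell
#!/usr/bin/env bash

# to_mysql_datetime is a hypothetical helper name, not part of the original answer.
# Prints "20yy-mm-dd hh:mm:ss" for a "dd/mm/yy hh:mm:ss" argument.
# Returns 1 and prints nothing if the argument does not have exactly that shape.
to_mysql_datetime() {
  local regex='^([0-9][0-9])/([0-9][0-9])/([0-9][0-9]) ([0-9][0-9]:[0-9][0-9]:[0-9][0-9])$'
  if [[ $1 =~ $regex ]]; then
    # Groups 1-3 are day, month, year; group 4 is the time of day.
    printf '20%s-%s-%s %s\n' "${BASH_REMATCH[3]}" "${BASH_REMATCH[2]}" \
      "${BASH_REMATCH[1]}" "${BASH_REMATCH[4]}"
  else
    return 1
  fi
}

to_mysql_datetime '28/11/13 06:20:05'                  # -> 2013-11-28 06:20:05
to_mysql_datetime 'not a date' || echo 'no match' >&2  # reports the failure on stderr
```

Returning a non-zero status on a mismatch lets the caller decide whether to skip the line or abort, rather than silently passing the unconverted string on.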

As for why your code didn't work:

  • As @anubhava pointed out, you need at least 1 space to the right of [[ and to the left of ]].
  • Whether \s works in a bash regex is platform-dependent (Linux: yes; OSX: no), so a single, literal space is the safer choice here.
  • Your variable assignment was incorrect ($line = ...) - when assigning to a variable, never prefix the variable name with $.
  • Your backreferences were incorrect ($1, ...): to refer to capture groups (subexpressions) in a bash regex you have to use the special ${BASH_REMATCH[@]} array variable; ${BASH_REMATCH[0]} contains the entire string that matched, ${BASH_REMATCH[1]} contains what the first capture group matched, and so on; by contrast, $1, $2, ... refer to the 1st, 2nd, ... argument passed to a shell script or function.
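
To see what the points above mean in practice, the contents of BASH_REMATCH can be printed after a successful match. This is a minimal sketch; show_groups is a made-up name used only for this illustration.

```shell
#!/usr/bin/env bash

# show_groups is a hypothetical helper for illustration only.
# After a successful match it prints every BASH_REMATCH element as "index: value".
show_groups() {
  local regex='([0-9][0-9])/([0-9][0-9])/([0-9][0-9]) ([0-9][0-9]:[0-9][0-9]:[0-9][0-9])'
  local i
  # The spaces after [[ and before ]] are required. $regex is left unquoted so
  # that it is interpreted as a regular expression, not as a literal string.
  if [[ $1 =~ $regex ]]; then
    for i in "${!BASH_REMATCH[@]}"; do
      printf '%s: %s\n' "$i" "${BASH_REMATCH[i]}"
    done
  fi
}

show_groups '28/11/13 06:20:05'
# 0: 28/11/13 06:20:05
# 1: 28
# 2: 11
# 3: 13
# 4: 06:20:05
```

Index 0 is the whole match and indices 1 to 4 are the capture groups in the order their opening parentheses appear, which is why the reformatted date is assembled from elements 3, 2 and 1.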


Update, to address the OP's updated question:

I think the following does what you want:

# Read input file and store each col. value in separate variables.
while read -r f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15; do

    # Concatenate the first 2 cols. to form a date + time string.
    dt="$f1 $f2"

    # Parse and reformat the date + time string.
    regex='([0-9][0-9])/([0-9][0-9])/([0-9][0-9]) ([0-9][0-9]:[0-9][0-9]:[0-9][0-9])'
    if [[ "$dt" =~ $regex ]]; then
      dt="20${BASH_REMATCH[3]}-${BASH_REMATCH[2]}-${BASH_REMATCH[1]} ${BASH_REMATCH[4]}"
    fi

    # Echo the SQL command; all of them are piped into a `mysql` command
    # at the end of the loop.
    # !! Fill the $f<n> variables in as needed - I don't know which ones you need.
    # !! Make sure the number of column names matches the number of values.
    # !! Your original code had 4 column names, but 5 values, causing an error.
    echo "insert into table(time, total, caracas, anzoategui) values('$dt', '$f3', '$f4', '$f5');"

done < filenameX | mysql -user -password database
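
Before piping anything into mysql, it may help to look at the generated statements first. The sketch below repeats the loop as a function that reads from stdin and only prints; the name emit_inserts is made up, and the mapping of f3, f4 and f5 to the total, caracas and anzoategui columns is the same placeholder guess as in the code above.

```shell
#!/usr/bin/env bash

# emit_inserts is a hypothetical name. It reads whitespace-separated rows on
# stdin and prints one INSERT statement per row; nothing is sent to MySQL.
emit_inserts() {
  local regex='([0-9][0-9])/([0-9][0-9])/([0-9][0-9]) ([0-9][0-9]:[0-9][0-9]:[0-9][0-9])'
  local f1 f2 f3 f4 f5 rest dt
  # Fields beyond the fifth are collected in $rest and ignored in this sketch.
  while read -r f1 f2 f3 f4 f5 rest; do
    dt="$f1 $f2"
    if [[ $dt =~ $regex ]]; then
      dt="20${BASH_REMATCH[3]}-${BASH_REMATCH[2]}-${BASH_REMATCH[1]} ${BASH_REMATCH[4]}"
    fi
    echo "insert into table(time, total, caracas, anzoategui) values('$dt', '$f3', '$f4', '$f5');"
  done
}

# Dry run on two rows taken from the question.
emit_inserts <<'EOF'
27/11/13 12:20:05 9984 2885 260 54 288 94 696 1852 32 88 27 7 154
28/11/13 06:20:05 9804 2840 256 54 276 93 679 1747 29 87 17 6 139
EOF
```

Once the printed statements look right, the same function could be fed from the real file and piped into mysql. Note that a row whose first two fields do not match the pattern is still emitted with its date unchanged; adding a continue in an else branch would skip such rows instead.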


Afterthought: The above solution is based on improvements to the OP's code; below is a streamlined solution that is a one-liner based on awk (spread across multiple lines for readability - tip of the hat to @twalberg for the awk-based date reformatting):

awk -v sq=\' '{
 split($1, tkns, "/");
 dt=sprintf("20%s-%s-%s", tkns[3], tkns[2], tkns[1]); 
 printf "insert into table(time,total,caracas,anzoategui) values(%s,%s,%s,%s);\n", 
   sq dt " " $2 sq, sq $3 sq, sq $4 sq, sq $5 sq
}' filenameX | mysql -user -password database

Note: To make quoting inside the awk program simpler, a single quote is passed in via variable sq (-v sq=\').
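
The effect of the sq variable is easiest to see by running the awk program on a single sample row without the mysql step. The sketch below assumes that fields $1 to $5 of each row are the date, the time and the first three numeric columns; awk_inserts is a made-up wrapper name.

```shell
#!/usr/bin/env bash

# awk_inserts is a hypothetical wrapper name; it reads rows on stdin and prints
# the generated SQL instead of sending it to MySQL.
awk_inserts() {
  awk -v sq=\' '{
    split($1, tkns, "/")                                   # dd/mm/yy -> tkns[1..3]
    dt = sprintf("20%s-%s-%s", tkns[3], tkns[2], tkns[1])  # -> 20yy-mm-dd
    # awk concatenates adjacent values, so  sq $3 sq  yields the field in quotes.
    printf "insert into table(time,total,caracas,anzoategui) values(%s,%s,%s,%s);\n",
      sq dt " " $2 sq, sq $3 sq, sq $4 sq, sq $5 sq
  }'
}

echo '28/11/13 06:20:05 9804 2840 256 54' | awk_inserts
# insert into table(time,total,caracas,anzoategui) values('2013-11-28 06:20:05','9804','2840','256');
```

Passing the quote character in with -v avoids having to close and reopen the shell's single-quoted string every time the awk program needs a literal single quote.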

Answered by glenn jackman

Perl is handy here.

dt="28/11/13 06:20:05"
perl -MTime::Piece -E "say Time::Piece->strptime('$dt', '%d/%m/%y %T')->strftime('%Y-%m-%d %T')"
2013-11-28 06:20:05

Answered by twalberg

This does the trick without any overly complicated regex invocations:

echo "28/11/13 06:20:05" | awk -F'[/ ]' \
    '{printf "20%s-%s-%s %s\n", $3, $2, $1, $4}'

Or, as suggested by @fedorqui in the comments, if the source of your timestamp is the date command, you can just give it the formatting options you want...
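
For example, if the timestamps are produced by date in the first place, a format string along these lines would emit the MySQL layout directly. This is only a sketch; the variable name now is illustrative, and it assumes you control the command that writes the timestamps.

```shell
#!/usr/bin/env bash

# Ask date for "YYYY-MM-DD hh:mm:ss" directly, so no reformatting is needed later.
# %Y, %m, %d, %H, %M and %S are standard strftime conversion specifiers.
now=$(date '+%Y-%m-%d %H:%M:%S')
echo "$now"
```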

Answered by anubhava

Spaces are mandatory in BASH so use:

[[ "$line" =~ $regex ]] && echo "${line//\//-}"

Also you cannot use \s in BASH, so use this regex:

regex='([0-9][0-9])/([0-9][0-9])/([0-9][0-9]) ([0-9][0-9]:[0-9][0-9]:[0-9][0-9])'
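
If the separator between date and time might be a tab or several spaces rather than exactly one space, the POSIX character class [[:space:]] is a portable stand-in for \s. The sketch below is a variation on the regex above, not part of the original answer, and the tab-separated sample line is made up.

```shell
#!/usr/bin/env bash

# Same pattern as above, but with [[:space:]]+ between the date and the time
# instead of a single literal space.
regex='([0-9][0-9])/([0-9][0-9])/([0-9][0-9])[[:space:]]+([0-9][0-9]:[0-9][0-9]:[0-9][0-9])'

line=$'28/11/13\t06:20:05'   # tab-separated sample, invented for this sketch
result=''
if [[ $line =~ $regex ]]; then
  result="20${BASH_REMATCH[3]}-${BASH_REMATCH[2]}-${BASH_REMATCH[1]} ${BASH_REMATCH[4]}"
fi
echo "$result"   # -> 2013-11-28 06:20:05
```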