Original URL: http://stackoverflow.com/questions/21910200/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Match datetime format with Bash REGEX
Asked by araujophillips
I have data with this datetime format in bash:
28/11/13 06:20:05
(dd/mm/yy hh:mm:ss)
I need to reformat it like:
2013-11-28 06:20:05
(MySQL datetime format)
I am using the following regex:
regex='([0-9][0-9])/([0-9][0-9])/([0-9][0-9])\s([0-9][0-9]/:[0-9][0-9]:[0-9][0-9])'
if [[$line=~$regex]]
then
$line='20$3-$2-$1 $4';
fi
This produces an error:
./filename: line 10: [[09:34:38=~([0-9][0-9])/([0-9][0-9])/([0-9][0-9])\s([0-9][0-9]/:[0-9][0-9]:[0-9][0-9])]]: No such file or directory
UPDATE:
I want to read this file "line by line", parse it and insert data in mysql database:
'filenameX':
27/11/13 12:20:05 9984 2885 260 54 288 94 696 1852 32 88 27 7 154
27/11/13 13:20:05 9978 2886 262 54 287 93 696 1854 32 88 27 7 154
27/11/13 14:20:05 9955 2875 262 54 287 93 696 1860 32 88 27 7 154
27/11/13 15:20:04 9921 2874 261 54 284 93 692 1868 32 88 27 7 154
27/11/13 16:20:09 9896 2864 260 54 283 92 689 1880 32 88 27 7 154
27/11/13 17:20:05 9858 2858 258 54 279 92 683 1888 32 88 27 7 154
27/11/13 18:20:04 9849 2853 258 54 279 92 683 1891 32 88 27 7 154
27/11/13 19:20:04 9836 2850 257 54 279 93 683 1891 32 88 27 7 154
27/11/13 20:20:05 9826 2845 257 54 279 93 683 1892 32 88 27 7 154
27/11/13 21:20:05 9820 2847 257 54 278 93 682 1892 32 88 27 7 154
27/11/13 22:20:04 9810 2844 257 54 277 93 681 1892 32 88 27 7 154
27/11/13 23:20:04 9807 2843 257 54 276 93 680 1892 32 88 27 7 154
28/11/13 00:20:05 9809 2843 257 54 276 93 680 1747 29 87 17 6 139
28/11/13 01:20:04 9809 2842 257 54 276 93 680 1747 29 87 17 6 139
28/11/13 02:20:05 9809 2843 256 54 276 93 679 1747 29 87 17 6 139
28/11/13 03:20:04 9808 2842 256 54 276 93 679 1747 29 87 17 6 139
28/11/13 04:20:05 9808 2842 256 54 276 93 679 1747 29 87 17 6 139
28/11/13 05:20:39 9807 2842 256 54 276 93 679 1747 29 87 17 6 139
28/11/13 06:20:05 9804 2840 256 54 276 93 679 1747 29 87 17 6 139
Script:
#!/bin/bash
echo "Start!"
while IFS=' ' read -ra ADDR;
do
for line in $(cat results)
do
regex='([0-9][0-9])/([0-9][0-9])/([0-9][0-9]) ([0-9][0-9]:[0-9][0-9]:[0-9]$
if [[ $line =~ $regex ]]; then
$line="20${BASH_REMATCH[3]}-${BASH_REMATCH[2]}-${BASH_REMATCH[1]} ${BASH_REMATCH[4]}"
fi
echo "insert into table(time, total, caracas, anzoategui) values('$line', '$line', '$line', '$line', '$line');"
done | mysql -user -password database;
done < filenameX
Result:
time | total | caracas | anzoategui |
0000-00-00 00:00:00 | 9 | 9 | 9 |
2027-11-13 00:00:00 | 15 | 15 | 15 |
Answered by mklement0
Note: This answer was accepted based on fixing the bash-focused approach in the OP. For a simpler, awk-based solution see the last section of this answer.
Try the following:
line='28/11/13 06:20:05' # sample input
regex='([0-9][0-9])/([0-9][0-9])/([0-9][0-9]) ([0-9][0-9]:[0-9][0-9]:[0-9][0-9])'
if [[ $line =~ $regex ]]; then
line="20${BASH_REMATCH[3]}-${BASH_REMATCH[2]}-${BASH_REMATCH[1]} ${BASH_REMATCH[4]}"
fi
echo "$line" # -> '2013-11-28 06:20:05'
As for why your code didn't work:
- As @anubhava pointed out, you need at least 1 space to the right of [[ and to the left of ]].
- Whether \s works in a bash regex is platform-dependent (Linux: yes; OSX: no), so a single, literal space is the safer choice here.
- Your variable assignment was incorrect ($line = ...): when assigning to a variable, never prefix the variable name with $.
- Your backreferences were incorrect ($1, ...): to refer to capture groups (subexpressions) in a bash regex you have to use the special ${BASH_REMATCH[@]} array variable; ${BASH_REMATCH[0]} contains the entire string that matched, ${BASH_REMATCH[1]} contains what the first capture group matched, and so on; by contrast, $1, $2, ... refer to the 1st, 2nd, ... argument passed to a shell script or function.
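The points in this list can be checked in a short script. This is a minimal sketch, not part of the original answer: it uses the POSIX character class [[:space:]] as a portable stand-in for \s (an assumption on my part; a literal space works just as well) and prints each element of BASH_REMATCH so the indexing is visible.

```shell
#!/bin/bash
line='28/11/13 06:20:05'   # sample input from the question

# [[:space:]] is a POSIX character class, used here instead of \s.
regex='([0-9][0-9])/([0-9][0-9])/([0-9][0-9])[[:space:]]([0-9][0-9]:[0-9][0-9]:[0-9][0-9])'

# Note the spaces after [[ and before ]], and that $regex is left unquoted
# so it is treated as a regex rather than a literal string.
if [[ $line =~ $regex ]]; then
  echo "whole match: ${BASH_REMATCH[0]}"
  echo "day:         ${BASH_REMATCH[1]}"
  echo "month:       ${BASH_REMATCH[2]}"
  echo "year:        ${BASH_REMATCH[3]}"
  echo "time:        ${BASH_REMATCH[4]}"
  # Assignment: no $ in front of the variable name being assigned.
  line="20${BASH_REMATCH[3]}-${BASH_REMATCH[2]}-${BASH_REMATCH[1]} ${BASH_REMATCH[4]}"
fi

echo "$line"
```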
Update, to address the OP's updated question:
I think the following does what you want:
# Read input file and store each col. value in separate variables.
while read -r f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15; do
# Concatenate the first 2 cols. to form a date + time string.
dt="$f1 $f2"
# Parse and reformat the date + time string.
regex='([0-9][0-9])/([0-9][0-9])/([0-9][0-9]) ([0-9][0-9]:[0-9][0-9]:[0-9][0-9])'
if [[ "$dt" =~ $regex ]]; then
dt="20${BASH_REMATCH[3]}-${BASH_REMATCH[2]}-${BASH_REMATCH[1]} ${BASH_REMATCH[4]}"
fi
# Echo the SQL command; all of them are piped into a `mysql` command
# at the end of the loop.
# !! Fill the $f<n> variables in as needed - I don't know which ones you need.
# !! Make sure the number of column names matches the number of values.
# !! Your original code had 4 column names, but 5 values, causing an error.
echo "insert into table(time, total, caracas, anzoategui) values('$dt', '$f3', '$f4', '$f5');"
done < filenameX | mysql -user -password database
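Before piping anything into mysql, it may help to run the same loop as a dry run and inspect the generated statements. The sketch below assumes the table and column names from the question; the function name emit_inserts, the anchored regex, and the skip-on-mismatch behavior are illustrative choices, not part of the original answer.

```shell
#!/bin/bash
# Hypothetical helper: read whitespace-separated records on stdin and
# print one INSERT statement per well-formed line on stdout.
emit_inserts() {
  local regex='^([0-9][0-9])/([0-9][0-9])/([0-9][0-9]) ([0-9][0-9]:[0-9][0-9]:[0-9][0-9])$'
  local f1 f2 f3 f4 f5 rest dt
  while read -r f1 f2 f3 f4 f5 rest; do
    dt="$f1 $f2"
    if [[ $dt =~ $regex ]]; then
      dt="20${BASH_REMATCH[3]}-${BASH_REMATCH[2]}-${BASH_REMATCH[1]} ${BASH_REMATCH[4]}"
    else
      # Report lines whose first two columns are not dd/mm/yy hh:mm:ss.
      echo "skipping malformed line: $f1 $f2" >&2
      continue
    fi
    echo "insert into table(time, total, caracas, anzoategui) values('$dt', '$f3', '$f4', '$f5');"
  done
}

# Dry run on two lines from the sample file; nothing is sent to mysql.
emit_inserts <<'EOF'
27/11/13 12:20:05 9984 2885 260 54 288 94 696 1852 32 88 27 7 154
28/11/13 06:20:05 9804 2840 256 54 276 93 679 1747 29 87 17 6 139
EOF
```

Once the output looks right, the here-document can be replaced with a redirect from the real input file and the result piped into mysql as in the answer.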
Afterthought: The above solution is based on improvements to the OP's code; below is a streamlined solution that is a one-liner based on awk (spread across multiple lines for readability - tip of the hat to @twalberg for the awk-based date reformatting):
awk -v sq=\' '{
split($1, tkns, "/");
dt=sprintf("20%s-%s-%s", tkns[3], tkns[2], tkns[1]);
printf "insert into table(time,total,caracas,anzoategui) values(%s,%s,%s,%s);",
sq dt " " $2 sq, sq $3 sq, sq $4 sq, sq $5 sq
}' filenameX | mysql -user -password database
Note: To make quoting inside the awk program simpler, a single quote is passed in via variable sq (-v sq=\').
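The effect of sq is easier to see in isolation. The following is a small sketch with made-up input; the function name quote_fields is hypothetical and only wraps the first two fields of each line in single quotes.

```shell
#!/bin/bash
# Hypothetical helper: print the first two fields of each input line,
# each wrapped in single quotes supplied through the awk variable sq.
quote_fields() {
  awk -v sq=\' '{ print sq $1 sq ", " sq $2 sq }'
}

echo "28/11/13 9804" | quote_fields
```

Because the awk program itself is enclosed in single quotes, writing a literal single quote inside it would require awkward escaping; passing it in as a variable avoids that.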
Answered by glenn jackman
Perl is handy here.
dt="28/11/13 06:20:05"
perl -MTime::Piece -E "say Time::Piece->strptime('$dt', '%d/%m/%y %T')->strftime('%Y-%m-%d %T')"
2013-11-28 06:20:05
Answered by twalberg
This does the trick without any overly complicated regex invocations:
echo "28/11/13 06:20:05" | awk -F'[/ ]' \
'{printf "20%s-%s-%s %s\n", $3, $2, $1, $4}'
Or, as suggested by @fedorqui in the comments, if the source of your timestamp is date, you can just give it the formatting options you want...
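For instance, assuming the timestamp is simply the current system time, date can emit the MySQL datetime layout directly with a +FORMAT argument. The variable name now is illustrative.

```shell
#!/bin/bash
# Print the current time in MySQL datetime layout (YYYY-MM-DD HH:MM:SS).
# %Y, %m, %d, %H, %M and %S are standard date conversion specifiers.
now=$(date '+%Y-%m-%d %H:%M:%S')
echo "$now"
```

This avoids any reformatting step, but it only applies when you control the command that produces the timestamp, not when the data already sits in a file as in the question.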
Answered by anubhava
Spaces are mandatory in BASH so use:
[[ "$line" =~ $regex ]] && echo "${line//\//-}"
Also, you cannot use \s in BASH, so use this regex:
regex='([0-9][0-9])/([0-9][0-9])/([0-9][0-9]) ([0-9][0-9]:[0-9][0-9]:[0-9][0-9])'