Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/4709652/

Date: 2020-08-18 17:07:25  Source: igfitidea

Python regex to match dates

python regex date

Asked by clumpter

What regular expression in Python do I use to match dates like this: "11/12/98"?

Accepted answer by unutbu

Instead of using regex, it is generally better to parse the string as a datetime.datetime object:

In [140]: datetime.datetime.strptime("11/12/98","%m/%d/%y")
Out[140]: datetime.datetime(1998, 11, 12, 0, 0)

In [141]: datetime.datetime.strptime("11/12/98","%d/%m/%y")
Out[141]: datetime.datetime(1998, 12, 11, 0, 0)

You could then access the day, month, and year (and hour, minutes, and seconds) as attributes of the datetime.datetime object:

In [142]: date = datetime.datetime.strptime("11/12/98","%m/%d/%y")

In [143]: date.year
Out[143]: 1998

In [144]: date.month
Out[144]: 11

In [145]: date.day
Out[145]: 12

To test whether a sequence of digits separated by forward slashes represents a valid date, you could use a try..except block. Invalid dates will raise a ValueError:

In [159]: try:
   .....:     datetime.datetime.strptime("99/99/99","%m/%d/%y")
   .....: except ValueError as err:
   .....:     print(err)
   .....:     
   .....:     
time data '99/99/99' does not match format '%m/%d/%y'
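
The same check can be wrapped in a small helper. This is only a minimal sketch: the function name is_valid_date and its default format are assumptions made for illustration, not part of the original answer.

```python
import datetime


def is_valid_date(text, fmt="%m/%d/%y"):
    """Return True if text parses as a date in the given strptime format."""
    try:
        datetime.datetime.strptime(text, fmt)
    except ValueError:
        # strptime raises ValueError for both malformed and impossible dates
        return False
    return True


print(is_valid_date("11/12/98"))  # True
print(is_valid_date("99/99/99"))  # False
```

Returning a boolean keeps the try..except in one place, so callers do not need to handle the exception themselves.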


If you need to search a longer string for a date, you could use regex to search for digits separated by forward-slashes:

In [146]: import re
In [152]: match = re.search(r'(\d+/\d+/\d+)','The date is 11/12/98')

In [153]: match.group(1)
Out[153]: '11/12/98'

Of course, invalid dates will also match:

In [154]: match = re.search(r'(\d+/\d+/\d+)','The date is 99/99/99')

In [155]: match.group(1)
Out[155]: '99/99/99'

To check that match.group(1) returns a valid date string, you could then parse it using datetime.datetime.strptime as shown above.
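
Combining the two steps, searching with a regex and then validating with strptime, might look like the sketch below. The helper name find_first_valid_date and the default format are assumptions for illustration only.

```python
import datetime
import re


def find_first_valid_date(text, fmt="%m/%d/%y"):
    """Return the first slash-separated candidate that parses, else None."""
    for candidate in re.findall(r"\d+/\d+/\d+", text):
        try:
            return datetime.datetime.strptime(candidate, fmt)
        except ValueError:
            continue  # looks like a date but is not one; keep searching
    return None


print(find_first_valid_date("The date is 11/12/98"))  # 1998-11-12 00:00:00
print(find_first_valid_date("The date is 99/99/99"))  # None
```

The regex only narrows down candidates; the decision about whether a candidate is a real date is left to strptime.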

Answer by aditya Prakash

I find that the regex below works well for dates in the following formats:

  1. 14-11-2017
  2. 14.11.2017
  3. 14|11|2017

It accepts years from 2000 to 2099.

Do not forget to add $ at the end; without it, the pattern also accepts input with trailing characters, such as 14-11-20177.

import re

date = "13-11-2017"

# Day 1-31, month 1-12, year 2000-2099; separators are ".", "-" or "|"
x = re.search(r"^([1-9]|1[0-9]|2[0-9]|3[0-1])([.|-])([1-9]|1[0-2])([.|-])20[0-9][0-9]$", date)

x.group()

output = '13-11-2017'
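
The effect of the trailing $ can be sketched as follows. The pattern here is a tidied variant of the one in this answer (an assumption: no stray spaces, separators restricted to ".", "-" and "|"), and the names anchored and unanchored are illustrative only.

```python
import re

# Day 1-31, month 1-12, year 2000-2099, without the trailing "$".
body = r"^([1-9]|1[0-9]|2[0-9]|3[0-1])[.|-]([1-9]|1[0-2])[.|-]20[0-9][0-9]"
anchored = re.compile(body + "$")
unanchored = re.compile(body)

print(bool(anchored.search("14-11-2017")))     # True
print(bool(anchored.search("14-11-20177")))    # False
print(bool(unanchored.search("14-11-20177")))  # True: the trailing "7" is ignored
```

An alternative with the same intent is re.fullmatch, which requires the whole string to match without writing the anchors by hand.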

Answer by Mohammad Hossein Shojaeinia

With this regular expression you can validate different kinds of date/time samples; only a small change is needed for other formats.

^\d\d\d\d/(0?[1-9]|1[0-2])/(0?[1-9]|[12][0-9]|3[01]) (00|[0-9]|1[0-9]|2[0-3]):([0-9]|[0-5][0-9]):([0-9]|[0-5][0-9])$ --> validates this: 2018/7/12 13:00:00

For your format you can change it to:

^(0?[1-9]|[12][0-9]|3[01])/(0?[1-9]|1[0-2])/\d\d$ --> validates this: 11/12/98
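
As a rough illustration, the day/month/two-digit-year pattern can be compiled and tried against a few samples. The name DATE_RE and the sample strings are assumptions chosen for this sketch.

```python
import re

# Day 1-31 / month 1-12 / two-digit year, as in the adapted pattern.
DATE_RE = re.compile(r"^(0?[1-9]|[12][0-9]|3[01])/(0?[1-9]|1[0-2])/\d\d$")

for sample in ["11/12/98", "31/1/05", "32/01/98", "11/13/98"]:
    # Prints True, True, False, False for the four samples
    print(sample, bool(DATE_RE.match(sample)))
```

Note that the pattern only checks numeric ranges, so a string such as 31/02/98 still matches even though February has no 31st day.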

Answer by J.Melody

As I understand it, if you simply need to match this format within a given string, I prefer this regular expression:

pattern='[0-9|/]+'

To match the format more strictly, the following works:

pattern='(?:[0-9]{2}/){2}[0-9]{2}'

Personally, I cannot agree with unutbu's answer, since sometimes we use regular expressions for "finding" and "extracting", not only for "validating".
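
For the "finding" and "extracting" use case, a minimal sketch with re.findall and the stricter pattern could look like this. The sample text is made up for illustration.

```python
import re

# Hypothetical input containing several dates in the dd/dd/dd form
text = "Invoices dated 11/12/98 and 01/02/99 were paid on 03/04/99."
pattern = r"(?:[0-9]{2}/){2}[0-9]{2}"

# The group is non-capturing, so findall returns the full matches
print(re.findall(pattern, text))  # ['11/12/98', '01/02/99', '03/04/99']
```

Because the pattern is not anchored, it can also match inside longer runs of digits, so the extracted strings may still need validation afterwards.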

Answer by Mukundhan

Sometimes we need to get the date from a string. One example with grouping:

import re

record = '1518-09-06 00:57 some-alphanumeric-character'
# Capture "YYYY-MM-DD HH:MM" at the start of the record
pattern_date_time = r'([0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}) .+'
match = re.match(pattern_date_time, record)
if match is not None:
    date = match.group(1)  # text of the first capture group
    print(date)  # outputs 1518-09-06 00:57

Answer by Heribert

I built my solution on top of @aditya Prakash's approach:

import re

print(re.search(r"^([1-9]|0[1-9]|1[0-9]|2[0-9]|3[0-1])(\.|-|/)([1-9]|0[1-9]|1[0-2])(\.|-|/)([0-9][0-9]|19[0-9][0-9]|20[0-9][0-9])$|^([0-9][0-9]|19[0-9][0-9]|20[0-9][0-9])(\.|-|/)([1-9]|0[1-9]|1[0-2])(\.|-|/)([1-9]|0[1-9]|1[0-9]|2[0-9]|3[0-1])$",'01/01/2018'))

The first part (^([1-9]|0[1-9]|1[0-9]|2[0-9]|3[0-1])(\.|-|/)([1-9]|0[1-9]|1[0-2])(\.|-|/)([0-9][0-9]|19[0-9][0-9]|20[0-9][0-9])$) can handle the following formats:

  • 01.10.2019
  • 1.1.2019
  • 1.1.19
  • 12/03/2020
  • 01.05.1950

The second part (^([0-9][0-9]|19[0-9][0-9]|20[0-9][0-9])(\.|-|/)([1-9]|0[1-9]|1[0-2])(\.|-|/)([1-9]|0[1-9]|1[0-9]|2[0-9]|3[0-1])$) can basically do the same, but in inverse order, where the year comes first, followed by month, and then day.

  • 2020/02/12

As delimiters it allows ".", "/" and "-". For years it allows everything from 1900 to 2099; a two-digit year is also accepted.
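
To make the long pattern easier to read, it can be assembled from named pieces. This is a sketch only: the names DAY, MONTH, YEAR, SEP and DATE_RE are assumptions for illustration, and the day alternatives 1[0-9]|2[0-9] are written as the equivalent [12][0-9].

```python
import re

DAY = r"([1-9]|0[1-9]|[12][0-9]|3[01])"
MONTH = r"([1-9]|0[1-9]|1[0-2])"
YEAR = r"([0-9][0-9]|19[0-9][0-9]|20[0-9][0-9])"
SEP = r"(\.|-|/)"

# Day-month-year first, then the year-month-day alternative
DATE_RE = re.compile(
    f"^{DAY}{SEP}{MONTH}{SEP}{YEAR}$|^{YEAR}{SEP}{MONTH}{SEP}{DAY}$"
)

for sample in ["01.10.2019", "1.1.19", "12/03/2020", "2020/02/12", "32.01.2019"]:
    # Prints True for the first four samples and False for the last one
    print(sample, bool(DATE_RE.match(sample)))
```

Since each separator is matched independently, mixed delimiters such as 1.1/19 are also accepted by this pattern.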

If you have suggestions for improvement please let me know in the comments, so I can update the answer.
