Python: check if a string has a date, in any format
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/25341945/
Check if string has date, any format
Asked by zack_falcon
How do I check if a string can be parsed to a date?
- Jan 19, 1990
- January 19, 1990
- Jan 19,1990
- 01/19/1990
- 01/19/90
- 1990
- Jan 1990
- January1990
These are all valid dates. If there's any concern regarding the lack of space in between stuff in item #3 and the last item above, that can be easily remedied via automatically inserting a space in between letters/characters and numbers, if so needed.
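One way the space insertion mentioned above could be done is with two regular-expression substitutions. This is a minimal sketch, assuming only ASCII letters need handling; the function name normalise_spacing is hypothetical and not part of the original question:

```python
import re

def normalise_spacing(text):
    """Insert the spaces that the fixed strptime formats expect (hypothetical helper)."""
    # "Jan 19,1990" -> "Jan 19, 1990": add a space after a comma if it is missing
    text = re.sub(r',(?=\S)', ', ', text)
    # "January1990" -> "January 1990": split a letter directly followed by a digit
    text = re.sub(r'(?<=[A-Za-z])(?=\d)', ' ', text)
    return text

print(normalise_spacing("Jan 19,1990"))
print(normalise_spacing("January1990"))
```

Strings that already have the expected spacing, such as 01/19/1990, pass through unchanged.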
But first, the basics:
I tried putting it in an if statement:
if datetime.strptime(item, '%Y') or datetime.strptime(item, '%b %d %y') or datetime.strptime(item, '%b %d %Y') or datetime.strptime(item, '%B %d %y') or datetime.strptime(item, '%B %d %Y'):
But that's in a try-except block, and keeps returning something like this:
16343 time data 'JUNE1890' does not match format '%Y'
Unless it met the first condition in the if statement.
To clarify, I don't actually need the value of the date - I just want to know if it is. Ideally, it would've been something like this:
if item is date:
    print date
else:
    print "Not a date"
Is there any way to do this?
Accepted answer by Alex Riley
The parse function in dateutil.parser is capable of parsing many date string formats to a datetime object.
If you simply want to know whether a particular string could represent or contain a valid date, you could try the following simple function:
from dateutil.parser import parse

def is_date(string, fuzzy=False):
    """
    Return whether the string can be interpreted as a date.

    :param string: str, string to check for date
    :param fuzzy: bool, ignore unknown tokens in string if True
    """
    try:
        parse(string, fuzzy=fuzzy)
        return True
    except ValueError:
        return False
Then you have:
>>> is_date("1990-12-1")
True
>>> is_date("2005/3")
True
>>> is_date("Jan 19, 1990")
True
>>> is_date("today is 2019-03-27")
False
>>> is_date("today is 2019-03-27", fuzzy=True)
True
>>> is_date("Monday at 12:01am")
True
>>> is_date("xyz_not_a_date")
False
>>> is_date("yesterday")
False
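To see which parts of a string fuzzy parsing ignored, parse also accepts fuzzy_with_tokens=True, which returns the datetime together with the skipped text. A minimal sketch, assuming python-dateutil is installed; the helper name find_date is made up for illustration:

```python
from dateutil.parser import parse

def find_date(string):
    """Return (datetime, skipped_text) if a date is found, else None (hypothetical helper)."""
    try:
        # fuzzy_with_tokens implies fuzzy parsing and also reports the ignored text
        return parse(string, fuzzy_with_tokens=True)
    except (ValueError, OverflowError):
        return None

result = find_date("today is 2019-03-27")
if result is not None:
    when, skipped = result
    print(when, skipped)
```

Inspecting the skipped text is one way to decide whether a fuzzy match is trustworthy or whether too much of the string was thrown away.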
Custom parsing
parse might recognise some strings as dates which you don't want to treat as dates. For example:
- Parsing "12" and "1999" will return a datetime object representing the current date with the day and year substituted for the number in the string.
- "23, 4" and "23 4" will be parsed as datetime.datetime(2023, 4, 16, 0, 0).
- "Friday" will return the date of the nearest Friday in the future.
- Similarly, "August" corresponds to the current date with the month changed to August.
Also, parse is not locale aware, so it does not recognise months or days of the week in languages other than English.
Both of these issues can be addressed to some extent by using a custom parserinfo class, which defines how month and day names are recognised:
from dateutil.parser import parserinfo

class CustomParserInfo(parserinfo):
    # three months in Spanish for illustration
    MONTHS = [("Enero", "Enero"), ("Feb", "Febrero"), ("Marzo", "Marzo")]
An instance of this class can then be used with parse:
>>> parse("Enero 1990")
# ValueError: Unknown string format
>>> parse("Enero 1990", parserinfo=CustomParserInfo())
datetime.datetime(1990, 1, 27, 0, 0)
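For the first issue (parse filling in missing fields from the current date), one possible check is to parse the string twice with different default values and see which fields stay the same. This is a sketch rather than part of the original answer; it assumes python-dateutil is installed, and the names explicit_fields and is_full_date are hypothetical:

```python
from datetime import datetime
from dateutil.parser import parse

# Two defaults that differ in year, month and day. Both years are leap years
# and both months have 31 days, so any valid day in the string still fits.
DEFAULT_A = datetime(2000, 1, 1)
DEFAULT_B = datetime(2004, 3, 2)

def explicit_fields(string):
    """Return which of year/month/day appear to come from the string itself."""
    a = parse(string, default=DEFAULT_A)
    b = parse(string, default=DEFAULT_B)
    # A field that is identical under both defaults was not copied from them.
    return {name for name in ("year", "month", "day")
            if getattr(a, name) == getattr(b, name)}

def is_full_date(string):
    """True only if the string supplies year, month and day."""
    try:
        return explicit_fields(string) == {"year", "month", "day"}
    except (ValueError, OverflowError):
        return False

print(explicit_fields("August"))
print(is_full_date("Jan 19, 1990"))
```

Whether a partial date such as "1990" or "Jan 1990" should count as a date is a design choice; comparing against a smaller set of required fields would relax the check.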
Answer by dawg
If you want to parse those particular formats, you can just match against a list of formats:
import datetime as dt

txt = '''\
Jan 19, 1990
January 19, 1990
Jan 19,1990
01/19/1990
01/19/90
1990
Jan 1990
January1990'''

# candidate formats, tried in order for each line
fmts = ('%Y', '%b %d, %Y', '%B %d, %Y', '%B %d %Y', '%m/%d/%Y', '%m/%d/%y', '%b %Y', '%B%Y', '%b %d,%Y')

parsed = []
for e in txt.splitlines():
    for fmt in fmts:
        try:
            t = dt.datetime.strptime(e, fmt)
            parsed.append((e, fmt, t))
            break  # stop at the first format that matches
        except ValueError:
            pass

# check that all the cases are handled
success = {t[0] for t in parsed}
for e in txt.splitlines():
    if e not in success:
        print(e)

for t in parsed:
    print('"{:20}" => "{:20}" => {}'.format(*t))
Prints:
"Jan 19, 1990        " => "%b %d, %Y           " => 1990-01-19 00:00:00
"January 19, 1990    " => "%B %d, %Y           " => 1990-01-19 00:00:00
"Jan 19,1990         " => "%b %d,%Y            " => 1990-01-19 00:00:00
"01/19/1990          " => "%m/%d/%Y            " => 1990-01-19 00:00:00
"01/19/90            " => "%m/%d/%y            " => 1990-01-19 00:00:00
"1990                " => "%Y                  " => 1990-01-01 00:00:00
"Jan 1990            " => "%b %Y               " => 1990-01-01 00:00:00
"January1990         " => "%B%Y                " => 1990-01-01 00:00:00

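Since the question only needs a yes/no answer, the same list of formats could be wrapped in a small predicate. A minimal sketch using only the standard library; the name matches_any_format is hypothetical, and month names are matched according to the current locale (English is assumed here):

```python
from datetime import datetime

# Formats covering the examples in the question; extend as needed.
FORMATS = ('%Y', '%b %d, %Y', '%B %d, %Y', '%B %d %Y', '%m/%d/%Y',
           '%m/%d/%y', '%b %Y', '%B%Y', '%b %d,%Y')

def matches_any_format(text, formats=FORMATS):
    """Return True if text parses under at least one strptime format."""
    for fmt in formats:
        try:
            datetime.strptime(text, fmt)
            return True
        except ValueError:
            # strptime raises instead of returning a falsy value,
            # so each attempt needs its own try/except
            continue
    return False

for item in ("Jan 19,1990", "January1990", "Not a date"):
    print(item, matches_any_format(item))
```

Unlike dateutil's parse, this accepts only the formats listed, which avoids surprising matches at the cost of maintaining the list by hand.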
