Source: Stack Overflow, http://stackoverflow.com/questions/1703546/. This content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.

Parsing date/time string with timezone abbreviated name in Python?

Asked by gct

I'm trying to parse timestamp strings like "Sat, 11/01/09 8:00PM EST" in Python, but I'm having trouble finding a solution that will handle the abbreviated timezone.

I'm using dateutil's parse() function, but it doesn't parse the timezone. Is there an easy way to do this?

Accepted answer by Hank Gay

That probably won't work because those abbreviations aren't unique. See this page for details. If you're working with a known set of inputs, you might wind up just having to handle it manually.
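
As a rough sketch of what handling it manually could look like for a known set of inputs, the snippet below uses only the standard library. The KNOWN_OFFSETS table and the parse_known helper are hypothetical names invented for this illustration, and the table deliberately pins each abbreviation to a single meaning, which is an assumption you would have to check against your own data.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical table: only the abbreviations this application expects,
# each pinned to one meaning (UTC offset in hours).
KNOWN_OFFSETS = {"EST": -5, "EDT": -4, "CST": -6, "CDT": -5}


def parse_known(stamp):
    """Parse strings shaped like 'Sat, 11/01/09 8:00PM EST'."""
    # The abbreviation is assumed to be the last space-separated token.
    body, abbr = stamp.rsplit(" ", 1)
    if abbr not in KNOWN_OFFSETS:
        raise ValueError("unexpected timezone abbreviation: %r" % abbr)
    # Drop the leading weekday name; the date itself carries the information.
    if ", " in body:
        body = body.split(", ", 1)[1]
    naive = datetime.strptime(body, "%m/%d/%y %I:%M%p")
    tz = timezone(timedelta(hours=KNOWN_OFFSETS[abbr]), abbr)
    return naive.replace(tzinfo=tz)


print(parse_known("Sat, 11/01/09 8:00PM EST").isoformat())
```

Anything outside the table raises ValueError instead of being guessed, which seems the safer default when abbreviations are ambiguous.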

Answer by Nas Banov

dateutil's parser.parse() accepts, as the keyword argument tzinfos, a dictionary of the kind {'EST': -5*3600} (that is, mapping the zone name to its GMT offset in seconds). So assuming we have that, we can do:

>>> import dateutil.parser as dp
>>> s = 'Sat, 11/01/09 8:00PM'
>>> # tzd maps abbreviation -> GMT offset in seconds; it is built further below
>>> for tz_code in ('PST','PDT','MST','MDT','CST','CDT','EST','EDT'):
...     dt = s + ' ' + tz_code
...     print(dt, '=', dp.parse(dt, tzinfos=tzd))
...

Sat, 11/01/09 8:00PM PST = 2009-11-01 20:00:00-08:00
Sat, 11/01/09 8:00PM PDT = 2009-11-01 20:00:00-07:00
Sat, 11/01/09 8:00PM MST = 2009-11-01 20:00:00-07:00
Sat, 11/01/09 8:00PM MDT = 2009-11-01 20:00:00-06:00
Sat, 11/01/09 8:00PM CST = 2009-11-01 20:00:00-06:00
Sat, 11/01/09 8:00PM CDT = 2009-11-01 20:00:00-05:00
Sat, 11/01/09 8:00PM EST = 2009-11-01 20:00:00-05:00
Sat, 11/01/09 8:00PM EDT = 2009-11-01 20:00:00-04:00

Regarding the content of tzinfos, here is how I populated mine:

tz_str = '''-12 Y
-11 X NUT SST
-10 W CKT HAST HST TAHT TKT
-9 V AKST GAMT GIT HADT HNY
-8 U AKDT CIST HAY HNP PST PT
-7 T HAP HNR MST PDT
-6 S CST EAST GALT HAR HNC MDT
-5 R CDT COT EASST ECT EST ET HAC HNE PET
-4 Q AST BOT CLT COST EDT FKT GYT HAE HNA PYT
-3 P ADT ART BRT CLST FKST GFT HAA PMST PYST SRT UYT WGT
-2 O BRST FNT PMDT UYST WGST
-1 N AZOT CVT EGT
0 Z EGST GMT UTC WET WT
1 A CET DFT WAT WEDT WEST
2 B CAT CEDT CEST EET SAST WAST
3 C EAT EEDT EEST IDT MSK
4 D AMT AZT GET GST KUYT MSD MUT RET SAMT SCT
5 E AMST AQTT AZST HMT MAWT MVT PKT TFT TJT TMT UZT YEKT
6 F ALMT BIOT BTT IOT KGT NOVT OMST YEKST
7 G CXT DAVT HOVT ICT KRAT NOVST OMSST THA WIB
8 H ACT AWST BDT BNT CAST HKT IRKT KRAST MYT PHT SGT ULAT WITA WST
9 I AWDT IRKST JST KST PWT TLT WDT WIT YAKT
10 K AEST ChST PGT VLAT YAKST YAPT
11 L AEDT LHDT MAGT NCT PONT SBT VLAST VUT
12 M ANAST ANAT FJT GILT MAGST MHT NZST PETST PETT TVT WFT
13 FJST NZDT
11.5 NFT
10.5 ACDT LHST
9.5 ACST
6.5 CCT MMT
5.75 NPT
5.5 SLT
4.5 AFT IRDT
3.5 IRST
-2.5 HAT NDT
-3.5 HNT NST NT
-4.5 HLV VET
-9.5 MART MIT'''

tzd = {}
for tz_descr in map(str.split, tz_str.split('\n')):
    tz_offset = int(float(tz_descr[0]) * 3600)
    for tz_code in tz_descr[1:]:
        tzd[tz_code] = tz_offset
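
To make the arithmetic in that loop concrete, here is a small sketch of the same hours-to-seconds conversion, including the fractional offsets from the end of the table. The format_offset helper is a hypothetical name added for this illustration only, to display each result in the +HH:MM form that appears in the parsed output above.

```python
def format_offset(seconds):
    """Render an offset in seconds as a signed HH:MM string."""
    sign = "-" if seconds < 0 else "+"
    hours, rest = divmod(abs(seconds), 3600)
    return "%s%02d:%02d" % (sign, hours, rest // 60)


# Same conversion as in the loop above: hours (possibly fractional) -> seconds.
for hours_text in ("-5", "5.75", "-3.5"):
    seconds = int(float(hours_text) * 3600)
    print(hours_text, "->", seconds, "->", format_offset(seconds))
```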

P.S. Per @Hank Gay, time zone naming is not clearly defined. To build my table I used http://www.timeanddate.com/library/abbreviations/timezones/ and http://en.wikipedia.org/wiki/List_of_time_zone_abbreviations. I looked at each conflict and resolved those between obscure and popular names in favor of the popular (more widely used) ones. There was one, IST, that was not as clear-cut (it can mean Indian Standard Time, Iran Standard Time, Irish Standard Time or Israel Standard Time), so I left it out of the table; you may need to choose what to add for it based on your location. Oh, and I left out the Republic of Kiribati with their absurd "look at me, I am first to celebrate New Year" GMT+13 and GMT+14 time zones.
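
If you start from an unresolved table, the conflicts described above can be surfaced mechanically before deciding which meaning to keep. The sketch below is one possible way to do that, not the procedure actually used for the table in this answer. The build_offsets name and the SAMPLE rows are made up for illustration.

```python
def build_offsets(table):
    """Split a table of 'hours CODE CODE ...' lines into unique and conflicting codes."""
    seen = {}
    for line in table.strip().splitlines():
        fields = line.split()
        offset = int(float(fields[0]) * 3600)
        for code in fields[1:]:
            seen.setdefault(code, set()).add(offset)
    # Codes listed under exactly one offset are safe to use directly.
    offsets = {code: next(iter(v)) for code, v in seen.items() if len(v) == 1}
    # Codes listed under several offsets need a human decision.
    conflicts = {code: sorted(v) for code, v in seen.items() if len(v) > 1}
    return offsets, conflicts


# Illustrative rows only; IST is listed under several offsets on purpose.
SAMPLE = """-5 EST
1 CET IST
2 IST
5.5 IST"""

offsets, conflicts = build_offsets(SAMPLE)
print(offsets)
print(conflicts)
```

Only the unambiguous codes end up in offsets, so a dictionary built this way can be passed as tzinfos without silently picking one meaning of IST.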

Answer by Drake Guan

You might try the pytz module: http://pytz.sourceforge.net/

pytz brings the Olson tz database into Python. This library allows accurate and cross platform timezone calculations using Python 2.3 or higher. It also solves the issue of ambiguous times at the end of daylight savings, which you can read more about in the Python Library Reference (datetime.tzinfo).

Almost all of the Olson timezones are supported.

Answer by Mike DeSimone

The parse() function in dateutil doesn't resolve most time zone abbreviations by default. What I've been using instead is the %Z format code with the time.strptime() function. I have no idea how it deals with the ambiguity in time zones, but it seems to tell the difference between CDT and CST, which is all I needed.

Background: I store backup images in directories whose names are timestamps using local time, since I don't have GMT clocks handy at home. So I use time.strptime(d, r"%Y-%m-%dT%H:%M:%S_%Z") to parse the directory names back into an actual time for age analysis.
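
A minimal sketch of that approach follows. One caveat worth stating: according to the Python documentation, %Z in strptime() only accepts the hard-coded names UTC and GMT plus the values in time.tzname for the local machine, so whether CST or CDT parses depends on the local timezone settings. The parse_dirname helper is a hypothetical name, and the example uses UTC so that it behaves the same everywhere.

```python
import time

FMT = "%Y-%m-%dT%H:%M:%S_%Z"


def parse_dirname(name):
    """Return a struct_time for a timestamped directory name, or None."""
    try:
        return time.strptime(name, FMT)
    except ValueError:
        # Raised when the layout is wrong or the zone name is not one
        # that %Z recognizes on this machine.
        return None


parsed = parse_dirname("2009-11-01T20:00:00_UTC")
print(parsed.tm_year, parsed.tm_mon, parsed.tm_mday, parsed.tm_hour)

# The local names that %Z will additionally accept on this machine:
print(time.tzname)
```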

Answer by reubano

I used pytz to generate a TZINFOS mapping:

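from datetime import datetime as dt

import pytz
from dateutil import parser
from dateutil.tz import gettz
from pytz import utc


def gen_tzinfos():
    """Yield (abbreviation, tzinfo) pairs for every common pytz zone."""
    for zone in pytz.common_timezones:
        try:
            # is_dst=None makes localize() raise on ambiguous or non-existent times
            tzdate = pytz.timezone(zone).localize(dt.utcnow(), is_dst=None)
        except pytz.InvalidTimeError:
            # covers both NonExistentTimeError and AmbiguousTimeError
            pass
        else:
            tzinfo = gettz(zone)

            if tzinfo:
                yield tzdate.tzname(), tzinfo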

TZINFOS usage

>>> TZINFOS = dict(gen_tzinfos())
>>> TZINFOS
{'+02': tzfile('/usr/share/zoneinfo/Antarctica/Troll'),
 '+03': tzfile('/usr/share/zoneinfo/Europe/Volgograd'),
 '+04': tzfile('Europe/Ulyanovsk'),
 '+05': tzfile('/usr/share/zoneinfo/Indian/Kerguelen'),              
...
 'WGST': tzfile('/usr/share/zoneinfo/America/Godthab'),
 'WIB': tzfile('/usr/share/zoneinfo/Asia/Pontianak'),
 'WIT': tzfile('/usr/share/zoneinfo/Asia/Jayapura'),
 'WITA': tzfile('/usr/share/zoneinfo/Asia/Makassar'),
 'WSDT': tzfile('/usr/share/zoneinfo/Pacific/Apia'),
 'XJT': tzfile('/usr/share/zoneinfo/Asia/Urumqi')}

parser usage

>>> date_str = 'Sat, 11/01/09 8:00PM EST'
>>> tzdate = parser.parse(date_str, tzinfos=TZINFOS)
>>> tzdate.astimezone(utc)
datetime.datetime(2009, 11, 2, 1, 0, tzinfo=<UTC>)

The UTC conversion is needed because many timezones share each abbreviation. Since TZINFOS is a dict, it keeps only the last timezone seen per abbreviation, so before the conversion you may not get the one you were expecting:

>>> tzdate
datetime.datetime(2009, 11, 1, 20, 0, tzinfo=tzfile('/usr/share/zoneinfo/America/Port-au-Prince'))
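
The last-one-wins behavior comes from dict() itself and can be seen without pytz. The sketch below uses a short hand-written list of (abbreviation, zone name) pairs as a stand-in for what gen_tzinfos() yields. The pairs are illustrative only, and collecting every candidate per abbreviation is offered as one possible alternative, not something the original answer does.

```python
from collections import defaultdict

# Stand-in for the (abbreviation, tzinfo) pairs a generator would yield;
# plain zone-name strings are used here instead of tzinfo objects.
pairs = [
    ("EST", "America/New_York"),
    ("EST", "America/Panama"),
    ("EST", "America/Port-au-Prince"),
]

# dict() keeps only the last value seen for a repeated key.
last_wins = dict(pairs)
print(last_wins["EST"])

# Alternative: keep every candidate so the choice can be made explicitly.
candidates = defaultdict(list)
for abbr, zone in pairs:
    candidates[abbr].append(zone)
print(candidates["EST"])
```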