How do I parse an HTTP date-string in Python?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me), citing the original address: http://stackoverflow.com/questions/1471987/

Date: 2020-11-03 22:20:56  Source: igfitidea

Tags: python, http, datetime, parsing

Asked by Troels Arvin

Is there an easy way to parse HTTP date-strings in Python? According to the standard, there are several ways to format HTTP date strings; the method should be able to handle this.

In other words, I want to convert a string like "Wed, 23 Sep 2009 22:15:29 GMT" to a python time-structure.

Answer by tzot

>>> import email.utils as eut
>>> eut.parsedate('Wed, 23 Sep 2009 22:15:29 GMT')
(2009, 9, 23, 22, 15, 29, 0, 1, -1)

If you want a datetime.datetime object, you can do:

import datetime

def my_parsedate(text):
    return datetime.datetime(*eut.parsedate(text)[:6])
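A minimal sketch, assuming Python 3, of a variant that also honours a numeric UTC offset and returns None for unparseable input. email.utils.parsedate returns None in that case, so slicing its result directly would raise TypeError. The name parse_http_date_utc is hypothetical; parsedate_tz and mktime_tz are existing email.utils functions.

```python
import datetime
import email.utils


def parse_http_date_utc(text):
    """Return an aware UTC datetime, or None if the text cannot be parsed."""
    parsed = email.utils.parsedate_tz(text)  # 10-tuple, last item is the UTC offset
    if parsed is None:
        return None
    timestamp = email.utils.mktime_tz(parsed)  # seconds since the epoch, in UTC
    return datetime.datetime.fromtimestamp(timestamp, tz=datetime.timezone.utc)


print(parse_http_date_utc('Wed, 23 Sep 2009 22:15:29 GMT'))
```

Returning None instead of raising is only one possible design; raising ValueError may suit callers that treat a bad header as an error.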

Answer by SilentGhost

>>> import datetime
>>> datetime.datetime.strptime('Wed, 23 Sep 2009 22:15:29 GMT', '%a, %d %b %Y %H:%M:%S GMT')
datetime.datetime(2009, 9, 23, 22, 15, 29)
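A single strptime pattern like this matches only one of the date formats the HTTP standard allows, while the question asks for all of them. The sketch below tries the three formats in turn (the preferred fixed-length form, the obsolete RFC 850 form, and the asctime() form). It assumes an English or C locale, because %a, %A and %b are locale-dependent. The names parse_http_date and HTTP_DATE_FORMATS are illustrative, not part of any library.

```python
from datetime import datetime

# The three HTTP date layouts; all of them denote GMT.
HTTP_DATE_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",  # preferred form, e.g. Sun, 06 Nov 1994 08:49:37 GMT
    "%A, %d-%b-%y %H:%M:%S GMT",  # obsolete RFC 850 form, e.g. Sunday, 06-Nov-94 08:49:37 GMT
    "%a %b %d %H:%M:%S %Y",       # asctime() form, e.g. Sun Nov  6 08:49:37 1994
)


def parse_http_date(text):
    """Try each known layout and return a naive datetime (implicitly GMT)."""
    for fmt in HTTP_DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this layout did not match, try the next one
    raise ValueError("not a recognised HTTP date: %r" % (text,))


print(parse_http_date("Sunday, 06-Nov-94 08:49:37 GMT"))
```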

Answer by user237419

httplib.HTTPMessage(filehandle).getdate(headername)
httplib.HTTPMessage(filehandle).getdate_tz(headername)
mimetools.Message(filehandle).getdate()
rfc822.parsedate(datestr)
rfc822.parsedate_tz(datestr)
  • If you have a raw data stream, you can build an HTTPMessage or a mimetools.Message from it. It may offer additional help when querying the response object for information.
  • If you are using urllib2, you already have an HTTPMessage object hidden in the file handle returned by urlopen.
  • It can probably parse many date formats.
  • httplib is in the core.

NOTE:

  • I had a look at the implementation: HTTPMessage inherits from mimetools.Message, which inherits from rfc822.Message. Two module-level functions may be of interest to you, parsedate and parsedate_tz (both in rfc822).
  • parsedate(_tz) from email.utils has a different implementation, although it looks much the same.

If you only have that string and want to parse it, you can do this:

>>> from rfc822 import parsedate, parsedate_tz
>>> parsedate('Wed, 23 Sep 2009 22:15:29 GMT')
(2009, 9, 23, 22, 15, 29, 0, 1, 0)

But let me illustrate with MIME messages:

>>> import mimetools
>>> import StringIO
>>> m = mimetools.Message(
...     StringIO.StringIO('Date:Wed, 23 Sep 2009 22:15:29 GMT\r\n\r\n'))
>>> m
<mimetools.Message instance at 0x7fc259146710>
>>> m.getdate('Date')
(2009, 9, 23, 22, 15, 29, 0, 1, 0)

Or via HTTP messages (responses):

>>> from httplib import HTTPMessage
>>> from StringIO import StringIO
>>> http_response = HTTPMessage(StringIO('Date:Wed, 23 Sep 2009 22:15:29 GMT\r\n\r\n'))
>>> #http_response can be grabbed via urllib2.urlopen(url).info(), right?
>>> http_response.getdate('Date')
(2009, 9, 23, 22, 15, 29, 0, 1, 0)

right?

>>> import urllib2
>>> urllib2.urlopen('https://fw.io/').info().getdate('Date')
(2014, 2, 19, 18, 53, 26, 0, 1, 0)
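The modules used in this answer (httplib, mimetools, rfc822, urllib2, StringIO) exist only in Python 2. A rough Python 3 counterpart, offered as a sketch rather than a drop-in replacement, parses the header block with the email package and reads the Date header from the resulting message. The raw_headers string is made-up sample data.

```python
import email
import email.utils

# Sample header block (an assumption for illustration), as it might appear on the wire.
raw_headers = 'Date: Wed, 23 Sep 2009 22:15:29 GMT\r\n\r\n'

message = email.message_from_string(raw_headers)     # an email.message.Message
date_tuple = email.utils.parsedate(message['Date'])  # 9-tuple usable with time.mktime()
print(date_tuple[:6])
```

In Python 3, urllib.request.urlopen(url).headers is an http.client.HTTPMessage, which subclasses email.message.Message, so the same ['Date'] lookup should work on a real response.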

There, now we know more about date formats, MIME messages, MIME tools and their Pythonic implementation ;-)

Whatever the case, this looks better than using email.utils for parsing HTTP headers.

Answer by saaj

Since Python 3.3 there's email.utils.parsedate_to_datetime, which can parse RFC 5322 timestamps (aka IMF-fixdate, Internet Message Format fixed-length format, a subset of HTTP-date of RFC 7231).

>>> from email.utils import parsedate_to_datetime
>>> s = 'Sun, 06 Nov 1994 08:49:37 GMT'
>>> parsedate_to_datetime(s)
datetime.datetime(1994, 11, 6, 8, 49, 37, tzinfo=datetime.timezone.utc)
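Header values coming from the network are not always well formed, so a caller may want to guard the call. The sketch below is one hedged way to do that: it catches both TypeError and ValueError, on the assumption that older Python 3 releases raise the former and newer ones the latter for unparseable input. The name parse_http_date_or_none is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_http_date_or_none(value):
    """Return a datetime for a parseable date header, otherwise None."""
    try:
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Unparseable input: the exception type depends on the Python version.
        return None


print(parse_http_date_or_none('Sun, 06 Nov 1994 08:49:37 GMT'))
```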

There's also the undocumented http.cookiejar.http2time, which can achieve the same as follows:

>>> from datetime import datetime, timezone
>>> from http.cookiejar import http2time
>>> s = 'Sun, 06 Nov 1994 08:49:37 GMT'
>>> datetime.fromtimestamp(http2time(s), tz=timezone.utc)
datetime.datetime(1994, 11, 6, 8, 49, 37, tzinfo=datetime.timezone.utc)

It was introduced in Python 2.4 as cookielib.http2time for dealing with the Cookie Expires directive, which is expressed in the same format.