Python: how to read the response headers with pycurl

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/472179/

Time: 2020-11-03 20:11:37  Source: igfitidea

How to read the header with pycurl

Tags: python, curl, pycurl

Asked by sverrejoh

How do I read the response headers returned from a PyCurl request?

Answered by bortzmeyer

By default the response headers are dropped, but there are several ways to capture them. Here is an example using the HEADERFUNCTION option, which lets you register a callback function to handle them.

Other solutions are the WRITEHEADER option (not compatible with WRITEFUNCTION) or setting HEADER to True so that the headers are delivered together with the body.

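#!/usr/bin/env python3

import pycurl

class Storage:
    """Accumulates every chunk passed to store(), prefixed with a running count."""

    def __init__(self):
        self.contents = ''
        self.line = 0

    def store(self, buf):
        # Under Python 3 pycurl passes bytes to callbacks; decode before formatting.
        # iso-8859-1 never fails to decode; adjust if the body uses another charset.
        self.line = self.line + 1
        self.contents = "%s%i: %s" % (self.contents, self.line, buf.decode('iso-8859-1'))

    def __str__(self):
        return self.contents

retrieved_body = Storage()
retrieved_headers = Storage()
c = pycurl.Curl()
c.setopt(c.URL, 'http://www.demaziere.fr/eve/')
c.setopt(c.WRITEFUNCTION, retrieved_body.store)      # called for each chunk of the body
c.setopt(c.HEADERFUNCTION, retrieved_headers.store)  # called once per header line
c.perform()
c.close()
print(retrieved_headers)
print(retrieved_body)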
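
If you would rather have the headers as a dictionary than as one numbered string, the callback can parse each line as it arrives. The following is only a minimal sketch: the HeaderCollector class and its attribute names are made up for illustration and are not part of pycurl, and it assumes header lines arrive as bytes (as under Python 3). The sample lines at the bottom stand in for a real transfer so the snippet runs without a network connection.

```python
class HeaderCollector:
    """Hypothetical helper: collects header lines into a dict keyed by lower-cased name."""

    def __init__(self):
        self.status_line = None
        self.headers = {}

    def __call__(self, line):
        # Each header line is delivered as bytes; iso-8859-1 decoding never fails.
        text = line.decode('iso-8859-1').strip()
        if not text:
            return  # the blank line marks the end of a header block
        if text.startswith('HTTP/'):
            # A new status line (e.g. after a redirect) starts a fresh set of headers.
            self.status_line = text
            self.headers = {}
            return
        if ':' not in text:
            return  # ignore anything that is not "Name: value"
        name, value = text.split(':', 1)
        self.headers[name.strip().lower()] = value.strip()


# Simulated header lines, in the form the callback would receive them.
collector = HeaderCollector()
for raw in (b'HTTP/1.1 200 OK\r\n',
            b'Content-Type: text/html; charset=utf-8\r\n',
            b'Content-Length: 42\r\n',
            b'\r\n'):
    collector(raw)

print(collector.status_line)
print(collector.headers)
```

With a real request, the instance would be registered the same way as retrieved_headers.store above, i.e. c.setopt(c.HEADERFUNCTION, collector). Note that this sketch keeps only the last value when a header name is repeated.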

Answered by vontrapp

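import pycurl
from io import BytesIO

url = 'http://example.com/'  # placeholder (not in the original answer); use your own URL

headers = BytesIO()  # pycurl passes bytes to callbacks under Python 3

c = pycurl.Curl()
c.setopt(c.URL, url)
c.setopt(c.HEADER, 1)
c.setopt(c.NOBODY, 1) # header only, no body
c.setopt(c.HEADERFUNCTION, headers.write)

c.perform()
c.close()

print(headers.getvalue().decode('iso-8859-1'))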

Add any other curl setopts as necessary/desired, such as FOLLOWLOCATION.
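
One thing to keep in mind with FOLLOWLOCATION: the header callback is invoked for every response in the redirect chain, so the buffer ends up holding several header blocks, not just the final one. Below is a minimal sketch of separating them. The function name split_header_blocks and the sample bytes are made up for illustration; in practice the input would be something like headers.getvalue() from the snippet above.

```python
def split_header_blocks(raw):
    """Hypothetical helper: split accumulated header bytes into one list of lines per response."""
    text = raw.decode('iso-8859-1')
    blocks = []
    # Each response's headers end with an empty line, i.e. two consecutive CRLFs.
    for chunk in text.split('\r\n\r\n'):
        lines = [ln for ln in chunk.split('\r\n') if ln]
        if lines:
            blocks.append(lines)
    return blocks


# Simulated buffer contents after one redirect followed by the final response.
sample = (b'HTTP/1.1 301 Moved Permanently\r\n'
          b'Location: http://example.com/new\r\n'
          b'\r\n'
          b'HTTP/1.1 200 OK\r\n'
          b'Content-Type: text/html\r\n'
          b'\r\n')

blocks = split_header_blocks(sample)
final_headers = blocks[-1]  # headers of the last response in the chain
print(len(blocks))
print(final_headers)
```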

Answered by Alexandr

Another alternative is human_curl (install it with: pip install human_curl). Usage:

In [1]: import human_curl as hurl

In [2]: r = hurl.get("http://stackoverflow.com")

In [3]: r.headers
Out[3]: 
{'cache-control': 'public, max-age=45',
 'content-length': '198515',
 'content-type': 'text/html; charset=utf-8',
 'date': 'Thu, 01 Sep 2011 11:53:43 GMT',
 'expires': 'Thu, 01 Sep 2011 11:54:28 GMT',
 'last-modified': 'Thu, 01 Sep 2011 11:53:28 GMT',
 'vary': '*'}

Answered by PEZ

This might or might not be an alternative for you:

import urllib.request
# Python 3 form: .headers is a message object; .items() yields (name, value) pairs
headers = urllib.request.urlopen('http://www.pythonchallenge.com').headers.items()