Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/27050399/

Time: 2020-08-19 01:22:29  Source: igfitidea

Make an HTTP POST request to upload a file using Python urllib/urllib2

python http post urllib2 urllib

Asked by Ying Xiong

I would like to make a POST request to upload a file to a web service (and get the response) using Python. For example, I can do the following POST request with curl:

curl -F "file=@style.css" -F output=json http://jigsaw.w3.org/css-validator/validator

How can I make the same request with python urllib/urllib2? The closest I got so far is the following:

import urllib
import urllib2

with open("style.css", 'r') as f:
    content = f.read()
# urlencode() builds an application/x-www-form-urlencoded body, not multipart/form-data
post_data = {"file": content, "output": "json"}
request = urllib2.Request("http://jigsaw.w3.org/css-validator/validator",
                          data=urllib.urlencode(post_data))
response = urllib2.urlopen(request)

I got an HTTP Error 500 from the code above. But since my curl command succeeds, there must be something wrong with my Python request?

I am quite new to this topic, so please forgive me if this rookie question has a very simple answer or contains mistakes. Thanks in advance for all your help!

Accepted answer by Ying Xiong

After some digging around, it seems this post solved my problem. It turns out I need to have the multipart encoder set up properly.

from poster.encode import multipart_encode
from poster.streaminghttp import register_openers
import urllib2

register_openers()

with open("style.css", 'r') as f:
    datagen, headers = multipart_encode({"file": f})
    request = urllib2.Request("http://jigsaw.w3.org/css-validator/validator", \
                              datagen, headers)
    response = urllib2.urlopen(request)
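
For readers who want to see roughly what a multipart encoder produces, here is a minimal standard-library sketch. It is written for Python 3 (an assumption; the answer above targets Python 2 with the poster package), and the helper name encode_multipart is hypothetical. It is not how poster is implemented, only an illustration of the multipart/form-data layout that curl -F also sends.

```python
import io
import mimetypes
import uuid


def encode_multipart(fields, files):
    """Build a multipart/form-data body (hypothetical helper, not part of poster).

    fields: dict mapping form field name -> text value
    files:  dict mapping form field name -> (filename, bytes content)
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()

    # Plain form fields, e.g. output=json
    for name, value in fields.items():
        buf.write(("--%s\r\n" % boundary).encode("ascii"))
        buf.write(('Content-Disposition: form-data; name="%s"\r\n\r\n' % name).encode("utf-8"))
        buf.write(value.encode("utf-8"))
        buf.write(b"\r\n")

    # File parts carry a filename and their own Content-Type
    for name, (filename, content) in files.items():
        ctype = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        buf.write(("--%s\r\n" % boundary).encode("ascii"))
        buf.write(('Content-Disposition: form-data; name="%s"; filename="%s"\r\n'
                   % (name, filename)).encode("utf-8"))
        buf.write(("Content-Type: %s\r\n\r\n" % ctype).encode("ascii"))
        buf.write(content)
        buf.write(b"\r\n")

    # Closing delimiter
    buf.write(("--%s--\r\n" % boundary).encode("ascii"))
    return buf.getvalue(), "multipart/form-data; boundary=%s" % boundary


body, content_type = encode_multipart(
    {"output": "json"},
    {"file": ("style.css", b"body { color: red; }\n")},
)
```

The resulting body and content_type could then be passed to urllib.request.Request(url, data=body, headers={"Content-Type": content_type}); that step is left out here so the sketch runs without network access.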

Answer by Wolph

Personally, I think you should consider the requests library to post files.

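import requests

url = 'http://jigsaw.w3.org/css-validator/validator'
# Open in binary mode so the bytes are sent unchanged, and close the file afterwards
with open('style.css', 'rb') as f:
    response = requests.post(url, files={'file': f})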

Uploading files using urllib2 is not impossible, but it is quite a complicated task: http://pymotw.com/2/urllib2/#uploading-files

Answer by real4x

Well, there are multiple ways to do it. As mentioned above, you can send the file as "multipart/form-data". However, the target service may not be expecting this type, in which case you may try some other approaches.

Pass the file object

urllib2 can accept a file object as data. When you pass this type, the library reads the file as a binary stream and sends it out. However, it will not set the proper Content-Type header. Moreover, if the Content-Length header is missing, it will try to call len() on the object, which file objects do not support. That said, you must provide both the Content-Type and the Content-Length headers for the method to work:

import os
import urllib2

filename = '/var/tmp/myfile.zip'
headers = {
    'Content-Type': 'application/zip',
    'Content-Length': os.stat(filename).st_size,
}
request = urllib2.Request('http://localhost', open(filename, 'rb'),
                          headers=headers)
response = urllib2.urlopen(request)
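
On Python 3 the same idea might look like the sketch below, assuming urllib.request as the replacement for urllib2. The helper name build_upload_request and the throwaway demo file are hypothetical. Header values are passed as strings, and the request is only constructed, not sent.

```python
import os
import tempfile
import urllib.request


def build_upload_request(url, filename, content_type):
    """Return (request, file_object); the caller is responsible for closing the file."""
    f = open(filename, "rb")
    headers = {
        "Content-Type": content_type,
        # Supplied explicitly, since a file object has no len()
        "Content-Length": str(os.path.getsize(filename)),
    }
    request = urllib.request.Request(url, data=f, headers=headers, method="POST")
    return request, f


# Demo with a throwaway file; nothing is sent over the network here.
with tempfile.NamedTemporaryFile(suffix=".zip", delete=False) as tmp:
    tmp.write(b"PK\x05\x06" + b"\x00" * 18)  # 22 bytes: an empty zip archive
    demo_path = tmp.name

request, fileobj = build_upload_request("http://localhost", demo_path, "application/zip")
try:
    content_length = request.get_header("Content-length")
    stored_type = request.get_header("Content-type")
    # response = urllib.request.urlopen(request)  # uncomment to actually send
finally:
    fileobj.close()
    os.remove(demo_path)
```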

Wrap the file object

To avoid dealing with the length yourself, you may create a simple wrapper object. With just a little change, you can adapt it to get the content from a string if you already have the file loaded in memory.

import os


class BinaryFileObject:
  """Simple wrapper for a binary file for urllib2."""

  def __init__(self, filename):
    self.__size = int(os.stat(filename).st_size)
    self.__f = open(filename, 'rb')

  def read(self, blocksize):
    return self.__f.read(blocksize)

  def __len__(self):
    return self.__size
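
The in-memory adaptation mentioned above could be sketched as follows. The class name BinaryStringObject is hypothetical, and the sketch assumes the content is already available as a byte string. It exposes the same read() and __len__ interface as the file wrapper; how a particular HTTP library uses __len__ should be verified for the version in use.

```python
import io


class BinaryStringObject:
    """Hypothetical in-memory counterpart of BinaryFileObject."""

    def __init__(self, content):
        # content is assumed to be a byte string already loaded in memory
        self._size = len(content)
        self._f = io.BytesIO(content)

    def read(self, blocksize=-1):
        # Return up to blocksize bytes; an empty result signals end of data
        return self._f.read(blocksize)

    def __len__(self):
        return self._size


payload = BinaryStringObject(b"hello world")
first_chunk = payload.read(5)
rest = payload.read(1024)
```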

Encode the content as base64

Another way is to encode the data via base64.b64encode and provide a Content-Transfer-Type: base64 header. However, this method requires support on the server side. Depending on the implementation, the service may either accept the file and store it incorrectly, or return HTTP 400. For example, the GitHub API won't throw an error, but the uploaded file will be corrupted.
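
A minimal sketch of preparing such a payload, assuming Python 3 and a made-up byte string in place of real file content. The header name is copied from the paragraph above; whether a given server honours it is service-specific, so treat it as an assumption to check against the target API.

```python
import base64

# Made-up binary content standing in for the file's bytes
raw = b"\x00\x01binary\xffdata"

# Base64 turns arbitrary bytes into ASCII-safe text (about 4/3 the size)
encoded = base64.b64encode(raw)

headers = {
    "Content-Type": "application/octet-stream",
    "Content-Transfer-Type": "base64",  # header name as given above; server support is assumed
    "Content-Length": str(len(encoded)),
}

# The server would have to reverse the encoding to recover the original bytes
decoded = base64.b64decode(encoded)
```

If the server stores the body without decoding it, the stored file holds the base64 text rather than the original bytes, which matches the corruption described above.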
