Make an http POST request to upload a file using python urllib/urllib2
Source: http://stackoverflow.com/questions/27050399/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by Ying Xiong
I would like to make a POST request to upload a file to a web service (and get the response) using python. For example, I can make the following POST request with curl:
curl -F "file=@style.css" -F output=json http://jigsaw.w3.org/css-validator/validator
How can I make the same request with python urllib/urllib2? The closest I got so far is the following:
with open("style.css", 'r') as f:
    content = f.read()
post_data = {"file": content, "output": "json"}
request = urllib2.Request("http://jigsaw.w3.org/css-validator/validator",
                          data=urllib.urlencode(post_data))
response = urllib2.urlopen(request)
I got an HTTP Error 500 from the code above. But since my curl command succeeds, there must be something wrong with my python request?
I am quite new to this topic, so please forgive me if this rookie question has a very simple answer or contains mistakes. Thanks in advance for all your help!
Accepted answer by Ying Xiong
After some digging around, it seems this post solved my problem. It turns out I need to have the multipart encoder set up properly.
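from poster.encode import multipart_encode
from poster.streaminghttp import register_openers
import urllib2

# Install poster's streaming HTTP handlers into urllib2.
register_openers()

# Open in binary mode so the file is sent byte for byte.
with open("style.css", 'rb') as f:
    # datagen yields the encoded body; headers describes the multipart body.
    datagen, headers = multipart_encode({"file": f})
    request = urllib2.Request("http://jigsaw.w3.org/css-validator/validator",
                              datagen, headers)
    response = urllib2.urlopen(request)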
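For readers without the poster package, the multipart body can also be assembled by hand. The following is only a minimal sketch, written for Python 3 (urllib.request instead of urllib2) and not taken from the original answer; the helper names encode_multipart and post_multipart are hypothetical, and the whole file is held in memory rather than streamed.

```python
import mimetypes
import urllib.request
import uuid


def encode_multipart(fields, files):
    """Build a multipart/form-data body (hypothetical helper).

    fields: dict mapping form field name -> str value
    files:  dict mapping form field name -> (filename, bytes content)
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    delimiter = ("--" + boundary).encode("ascii")
    parts = []
    for name, value in fields.items():
        parts += [
            delimiter,
            'Content-Disposition: form-data; name="{}"'.format(name).encode("utf-8"),
            b"",
            value.encode("utf-8"),
        ]
    for name, (filename, content) in files.items():
        # Guess the MIME type from the file name, with a generic fallback.
        mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        parts += [
            delimiter,
            'Content-Disposition: form-data; name="{}"; filename="{}"'.format(
                name, filename).encode("utf-8"),
            "Content-Type: {}".format(mime).encode("ascii"),
            b"",
            content,
        ]
    # The closing delimiter carries two extra dashes.
    parts += [delimiter + b"--", b""]
    return b"\r\n".join(parts), "multipart/form-data; boundary=" + boundary


def post_multipart(url, fields, files):
    """Send the encoded body with urllib.request and return the response bytes."""
    body, content_type = encode_multipart(fields, files)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type})
    with urllib.request.urlopen(request) as response:
        return response.read()
```

With this sketch, the curl command from the question would roughly correspond to post_multipart(url, {"output": "json"}, {"file": ("style.css", css_bytes)}), where css_bytes is the file content read in binary mode.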
Answer by Wolph
Personally I think you should consider the requests library to post files.
import requests

url = 'http://jigsaw.w3.org/css-validator/validator'
# Open in binary mode; requests builds the multipart/form-data body.
files = {'file': open('style.css', 'rb')}
response = requests.post(url, files=files)
Uploading files using urllib2 is not impossible, but it is quite a complicated task: http://pymotw.com/2/urllib2/#uploading-files
Answer by real4x
Well, there are multiple ways to do it. As mentioned above, you can send the file as "multipart/form-data". However, the target service may not expect this type, in which case you can try some other approaches.
Pass the file object
urllib2 can accept a file object as data. When you pass this type, the library reads the file as a binary stream and sends it out. However, it will not set the proper Content-Type header. Moreover, if the Content-Length header is missing, it will try to call len() on the object, which file objects do not support. That said, you must provide both the Content-Type and the Content-Length headers for this method to work:
import os
import urllib2

filename = '/var/tmp/myfile.zip'
headers = {
    'Content-Type': 'application/zip',
    'Content-Length': os.stat(filename).st_size,
}
request = urllib2.Request('http://localhost', open(filename, 'rb'),
                          headers=headers)
response = urllib2.urlopen(request)
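The same idea can be sketched for Python 3, where urllib2 became urllib.request. This is an illustration under assumptions rather than part of the original answer: the helper name build_upload_request and the default content type are hypothetical, and the request is only built here, not sent.

```python
import os
import urllib.request


def build_upload_request(url, fileobj, content_type="application/octet-stream"):
    """Build a POST request whose body is read from an open binary file.

    Hypothetical helper: the caller keeps the file open until the request
    has been sent with urllib.request.urlopen(request).
    """
    # Take the size from the open file descriptor instead of calling len().
    size = os.fstat(fileobj.fileno()).st_size
    headers = {
        "Content-Type": content_type,
        "Content-Length": str(size),
    }
    return urllib.request.Request(url, data=fileobj, headers=headers,
                                  method="POST")
```

A call might look like urllib.request.urlopen(build_upload_request('http://localhost', f, 'application/zip')) inside a with open(filename, 'rb') as f: block, so the file is closed once the upload finishes.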
Wrap the file object
To avoid dealing with the length yourself, you may create a simple wrapper object. With just a little change you can adapt it to get the content from a string if you already have the file loaded in memory.
class BinaryFileObject:
    """Simple wrapper for a binary file for urllib2."""

    def __init__(self, filename):
        self.__size = int(os.stat(filename).st_size)
        self.__f = open(filename, 'rb')

    def read(self, blocksize):
        return self.__f.read(blocksize)

    def __len__(self):
        return self.__size
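The in-memory adaptation mentioned above could look roughly like the following sketch. The class name BinaryStringObject is hypothetical and not from the original answer; it exposes the same read and __len__ interface as the file wrapper, but takes its content from a byte string.

```python
import io


class BinaryStringObject:
    """Hypothetical in-memory counterpart of the binary file wrapper."""

    def __init__(self, content):
        # content is expected to be a byte string already loaded in memory.
        self._size = len(content)
        self._buffer = io.BytesIO(content)

    def read(self, blocksize=-1):
        # Return up to blocksize bytes; an empty result signals the end.
        return self._buffer.read(blocksize)

    def __len__(self):
        return self._size
```

An instance can then be passed as the data argument in place of the file wrapper, with the Content-Type header still set by hand.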
Encode the content as base64
Another way is encoding the data via base64.b64encode and providing a Content-Transfer-Type: base64 header. However, this method requires support on the server side. Depending on the implementation, the service can either accept the file and store it incorrectly, or return HTTP 400. E.g. the GitHub API won't throw an error, but the uploaded file will be corrupted.
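A minimal sketch of the encoding step, assuming the server really does decode base64 bodies. The helper name build_base64_upload and the default content type are hypothetical, and the Content-Transfer-Type header name is copied from the answer as written, not verified against any particular service.

```python
import base64


def build_base64_upload(raw_bytes, content_type="application/octet-stream"):
    """Return (body, headers) for a base64-encoded upload (hypothetical helper)."""
    body = base64.b64encode(raw_bytes)
    headers = {
        "Content-Type": content_type,
        # Header name as given in the answer; the server must understand it.
        "Content-Transfer-Type": "base64",
        # The length must describe the encoded body, not the original bytes.
        "Content-Length": str(len(body)),
    }
    return body, headers
```

Note that base64 output is about one third larger than the input, so the upload grows accordingly.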