Python: What is the best way to download a file using urllib3?

Question

Asked by running.t

I would like to download a file over the HTTP protocol using urllib3. I have managed to do this using the following code:

import urllib3

url = 'http://url_to_a_file'
filename = 'downloaded_file'  # placeholder output path
connection_pool = urllib3.PoolManager()
resp = connection_pool.request('GET', url)
f = open(filename, 'wb')
f.write(resp.data)
f.close()
resp.release_conn()

But I was wondering what the proper way of doing this is. For example, will it work well for big files, and if not, what can I do to make this code more fault-tolerant and scalable?

Note: it is important to me to use the urllib3 library rather than, for example, urllib2, because I want my code to be thread-safe.

Answer 1

Accepted answer by shazow

Your code snippet is close. Two things worth noting:

If you're using resp.data, it will consume the entire response and return the connection to the pool (you don't need to call resp.release_conn() manually). This is fine if you're okay with holding the whole file in memory.
You could pass preload_content=False and use resp.read(amt), which streams the response in chunks, but then the connection has to be returned via resp.release_conn().

This would look something like...

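import urllib3

url = 'http://url_to_a_file'   # placeholder URL, as in the question
path = 'downloaded_file'       # placeholder output path
chunk_size = 64 * 1024         # bytes per read; an arbitrary example value

http = urllib3.PoolManager()
# preload_content=False defers reading the body so it can be streamed.
r = http.request('GET', url, preload_content=False)

with open(path, 'wb') as out:
    while True:
        data = r.read(chunk_size)
        if not data:
            break
        out.write(data)

# Return the connection to the pool once the body has been read.
r.release_conn()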

The documentation might be a bit lacking on this scenario. If anyone is interested in making a pull-request to improve the urllib3 documentation, that would be greatly appreciated. :)
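The streaming loop above can also be wrapped in a small helper so that the connection is released even when writing to disk fails. This is only a minimal sketch: the helper name download_file, its signature, and the default chunk size are assumptions made for illustration, not part of urllib3. It iterates over resp.stream(), which yields the body piece by piece much like the read(amt) loop.

```python
import urllib3


# Hypothetical helper: the name, signature, and default chunk size are
# illustrative choices, not something urllib3 provides.
def download_file(pool, url, path, chunk_size=64 * 1024):
    """Stream the body of `url` into `path` and return the bytes written."""
    resp = pool.request('GET', url, preload_content=False)
    written = 0
    try:
        with open(path, 'wb') as out:
            # stream() yields the body piece by piece until it is exhausted.
            for chunk in resp.stream(chunk_size):
                out.write(chunk)
                written += len(chunk)
    finally:
        # Hand the connection back to the pool even if an error occurred.
        resp.release_conn()
    return written
```

It would be called as download_file(urllib3.PoolManager(), url, path), reusing the same PoolManager for every download.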

Answer 2

Answered by Alecz

The most correct way to do this is probably to get a file-like object that represents the HTTP response and copy it to a real file using shutil.copyfileobj as below:

import shutil
import urllib3

url = 'http://url_to_a_file'
filename = 'downloaded_file'  # placeholder output path
c = urllib3.PoolManager()

with c.request('GET', url, preload_content=False) as resp, open(filename, 'wb') as out_file:
    shutil.copyfileobj(resp, out_file)

resp.release_conn()     # not 100% sure this is required though

Answer 3

Answered by Gray

The easiest way with urllib3 is to let shutil handle the chunked copying for you.

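import urllib3
import shutil

url = 'http://url_to_a_file'      # placeholder URL, as in the question
filename = 'downloaded_file'      # placeholder output path

http = urllib3.PoolManager()
with open(filename, 'wb') as out:
    r = http.request('GET', url, preload_content=False)
    shutil.copyfileobj(r, out)   # copies the body in chunks
r.release_conn()                 # return the connection to the pool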
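Since the question also asks about thread safety and scalability, here is a rough sketch of how several downloads might share one PoolManager across threads (urllib3 describes its connection pooling as thread-safe). The helper names fetch_to_file and download_all and the worker count are assumptions made for illustration only.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor

import urllib3


# Hypothetical helpers: the names and the worker count are illustrative.
def fetch_to_file(pool, url, path):
    """Stream one URL into one file, always releasing the connection."""
    resp = pool.request('GET', url, preload_content=False)
    try:
        with open(path, 'wb') as out:
            shutil.copyfileobj(resp, out)
    finally:
        resp.release_conn()
    return path


def download_all(jobs, workers=4):
    """Download (url, path) pairs concurrently through one shared PoolManager."""
    # maxsize sets how many connections per host the pool keeps for reuse.
    pool = urllib3.PoolManager(maxsize=workers)
    with ThreadPoolExecutor(max_workers=workers) as executor:
        futures = [executor.submit(fetch_to_file, pool, url, path)
                   for url, path in jobs]
        # result() re-raises any exception raised in the worker thread.
        return [future.result() for future in futures]
```

Each job is a (url, path) tuple, and the returned list holds the written paths in the same order as the jobs.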
