Original source: http://stackoverflow.com/questions/37573483/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Progress bar while downloading a file over HTTP with Requests
Asked by Gamegoofs2
I need to download a sizable (~200MB) file. I figured out how to download and save the file using the approach linked here. It would be nice to have a progress bar to show how much has been downloaded. I found ProgressBar, but I'm not sure how to incorporate the two.
Here's the code I tried, but it didn't work.
bar = progressbar.ProgressBar(max_value=progressbar.UnknownLength)
with closing(download_file()) as r:
    for i in range(20):
        bar.update(i)
Answered by leovp
I suggest you try tqdm [1]; it's very easy to use.
Example code for downloading with the requests library [2]:
from tqdm import tqdm
import requests

url = "http://www.ovh.net/files/10Mb.dat"  # big file test

# Streaming, so we can iterate over the response.
r = requests.get(url, stream=True)

# Total size in bytes.
total_size = int(r.headers.get('content-length', 0))
block_size = 1024  # 1 kibibyte

t = tqdm(total=total_size, unit='iB', unit_scale=True)
with open('test.dat', 'wb') as f:
    for data in r.iter_content(block_size):
        t.update(len(data))
        f.write(data)
t.close()

if total_size != 0 and t.n != total_size:
    print("ERROR, something went wrong")
[1]: https://github.com/tqdm/tqdm
[2]: http://docs.python-requests.org/en/master/
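The write-and-update loop above can be tried without a network connection if it is factored into a small helper. The following is only a sketch under stated assumptions: the names copy_with_progress, chunks, out, and on_progress are hypothetical and not part of tqdm or requests, and in-memory data stands in for a real response.

```python
import io


def copy_with_progress(chunks, out, on_progress):
    """Write each chunk to `out`, report its length, and return the bytes written.

    `chunks` is any iterable of bytes objects, `out` is a writable binary
    file-like object, and `on_progress` is called with the length of each
    chunk. All three names are hypothetical, chosen for this sketch.
    """
    written = 0
    for data in chunks:
        if not data:  # ignore empty chunks
            continue
        out.write(data)
        written += len(data)
        on_progress(len(data))
    return written


# Offline demonstration: in-memory chunks instead of a real HTTP response.
fake_chunks = [b"a" * 1024, b"b" * 1024, b"c" * 500]
buffer = io.BytesIO()
reported = []
total = copy_with_progress(fake_chunks, buffer, reported.append)
print(total, reported)
```

In a real download, chunks could be r.iter_content(block_size), out the opened file, and on_progress the bar's t.update method; the returned byte count can then be compared with total_size, as in the final check of the answer's code.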
Answered by andrew
It seems that there is a disconnect between the examples on the Progress Bar Usage page and what the code actually requires.
In the following example, note the use of maxval instead of max_value. Also note the use of .start() to initialize the bar. This has been noted in an Issue.
The n_chunk parameter denotes how many 1024-byte (1 KiB) blocks to stream at once while looping through the request iterator.
import requests
import time
import numpy as np
import progressbar

url = "http://wikipedia.com/"

def download_file(url, n_chunk=1):
    r = requests.get(url, stream=True)
    # Estimate the number of bar updates from the reported file size.
    # Note: int() raises TypeError here if the server sends no Content-Length header.
    block_size = 1024
    file_size = int(r.headers.get('Content-Length', None))
    num_bars = np.ceil(file_size / (n_chunk * block_size))
    bar = progressbar.ProgressBar(maxval=num_bars).start()
    with open('test.html', 'wb') as f:
        for i, chunk in enumerate(r.iter_content(chunk_size=n_chunk * block_size)):
            f.write(chunk)
            bar.update(i + 1)
            # Add a little sleep so you can see the bar progress
            time.sleep(0.05)
    return

download_file(url)
EDIT: Addressed comment about code clarity.
EDIT2: Fixed logic so the bar reports 100% at completion. Credit to leovp's answer for using the 1024-byte block size.
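The arithmetic behind num_bars can be checked on its own, without downloading anything. This is a minimal sketch, assuming Python 3 (where / is true division); count_updates is a hypothetical helper name, and the standard library's math.ceil is swapped in for np.ceil so that the result is a plain int.

```python
import math


def count_updates(file_size, n_chunk=1, block_size=1024):
    """Return how many chunks of n_chunk * block_size bytes cover file_size bytes."""
    chunk_size = n_chunk * block_size
    # Round up so a final, partially filled chunk still counts as one update.
    return math.ceil(file_size / chunk_size)


print(count_updates(2048))               # 2: exactly two 1024-byte chunks
print(count_updates(2049))               # 3: one extra byte needs a third chunk
print(count_updates(10_000, n_chunk=4))  # 3: 4096-byte chunks, the last one partial
```

Rounding up is what lets the bar reach its maximum on the last iteration: the final short chunk is counted as a full update.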
Answered by Paul Ellsworth
It seems like you're going to need to get the remote file size (answered here) to calculate how far along you are.
You could then update your progress bar while processing each chunk. If you know the total size and the size of each chunk, you can work out when to update the progress bar.
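As a rough illustration of that idea, the sketch below turns a running byte count and a known total into a text bar using only the standard library. The name render_bar and the sizes used are hypothetical, and the loop simulates chunk processing rather than reading a real response.

```python
def render_bar(done, total, width=20):
    """Return a text bar such as '[#####-----]  50%' for done out of total bytes."""
    # Guard against an unknown (zero) total and clamp overshoot to 100%.
    fraction = min(done / total, 1.0) if total > 0 else 0.0
    filled = int(fraction * width)
    return "[" + "#" * filled + "-" * (width - filled) + "] " + f"{fraction * 100:3.0f}%"


total_size = 4096  # assumed remote file size in bytes
chunk_size = 1024  # assumed size of each processed chunk
done = 0
lines = []
while done < total_size:
    # Simulate processing one chunk; the last chunk may be shorter.
    done += min(chunk_size, total_size - done)
    lines.append(render_bar(done, total_size, width=8))
    print(lines[-1])
```

With these sizes the loop prints four lines, from 25% to 100%. In a real download, done would be increased by len(chunk) for each chunk received.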