Progress bar while downloading a file over HTTP with Requests in Python

Note: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/37573483/


Progress Bar while downloading a file over HTTP with Requests

python, python-requests

Asked by Gamegoofs2

I need to download a sizable (~200MB) file. I figured out how to download and save the file from here. It would be nice to have a progress bar to know how much has been downloaded. I found ProgressBar but I'm not sure how to incorporate the two together.


Here's the code I tried, but it didn't work.


from contextlib import closing
import progressbar

# download_file() is the helper from the linked answer (not shown here).
bar = progressbar.ProgressBar(max_value=progressbar.UnknownLength)
with closing(download_file()) as r:
    for i in range(20):
        bar.update(i)

Answered by leovp

I suggest you try tqdm[1]; it's very easy to use. Example code for downloading with the requests library[2]:


from tqdm import tqdm
import requests

url = "http://www.ovh.net/files/10Mb.dat" #big file test
# Streaming, so we can iterate over the response.
r = requests.get(url, stream=True)
# Total size in bytes.
total_size = int(r.headers.get('content-length', 0))
block_size = 1024  # 1 KiB
t = tqdm(total=total_size, unit='iB', unit_scale=True)
with open('test.dat', 'wb') as f:
    for data in r.iter_content(block_size):
        t.update(len(data))
        f.write(data)
t.close()
if total_size != 0 and t.n != total_size:
    print("ERROR, something went wrong")

[1]: https://github.com/tqdm/tqdm
[2]: http://docs.python-requests.org/en/master/
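
For reference, a minimal sketch of the same approach with the response and the bar used as context managers, so both are closed automatically; this variant is not part of the original answer, and test.dat is just an example output name:

from tqdm import tqdm
import requests

url = "http://www.ovh.net/files/10Mb.dat"  # same test file as above
with requests.get(url, stream=True) as r:
    total_size = int(r.headers.get('content-length', 0))
    with tqdm(total=total_size, unit='iB', unit_scale=True) as t:
        with open('test.dat', 'wb') as f:
            for data in r.iter_content(chunk_size=1024):
                t.update(len(data))
                f.write(data)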


Answered by andrew

It seems that there is a disconnect between the examples on the Progress Bar Usage page and what the code actually requires.


In the following example, note the use of maxval instead of max_value. Also note the use of .start() to initialize the bar. This has been noted in an Issue.


The n_chunk parameter denotes how many 1 KiB (1024-byte) blocks to stream at once while looping through the request iterator.


import requests
import time

import numpy as np

import progressbar


url = "http://wikipedia.com/"

def download_file(url, n_chunk=1):
    r = requests.get(url, stream=True)
    # Estimates the number of bar updates
    block_size = 1024
    file_size = int(r.headers.get('Content-Length', None))  # assumes the server reports Content-Length
    num_bars = np.ceil(file_size / (n_chunk * block_size))
    bar = progressbar.ProgressBar(maxval=num_bars).start()
    with open('test.html', 'wb') as f:
        for i, chunk in enumerate(r.iter_content(chunk_size=n_chunk * block_size)):
            f.write(chunk)
            bar.update(i+1)
            # Add a little sleep so you can see the bar progress
            time.sleep(0.05)
    return

download_file(url)

EDIT: Addressed comment about code clarity.
EDIT2: Fixed logic so the bar reports 100% at completion. Credit to leovp's answer for using the 1 KiB block size.

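
For reference, a minimal sketch of how the same loop might look with the newer progressbar2 package (which does accept the documented max_value keyword); this is an adaptation under that assumption, not code from the original answer:

import requests
import progressbar  # provided by the progressbar2 package

def download_file(url, n_chunk=1):
    r = requests.get(url, stream=True)
    block_size = 1024
    file_size = int(r.headers.get('Content-Length', 0))
    # Ceiling division; fall back to an unknown length if the size is not reported.
    num_bars = -(-file_size // (n_chunk * block_size)) or progressbar.UnknownLength
    bar = progressbar.ProgressBar(max_value=num_bars)
    with open('test.html', 'wb') as f:
        for i, chunk in enumerate(r.iter_content(chunk_size=n_chunk * block_size)):
            f.write(chunk)
            bar.update(i + 1)
    bar.finish()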

Answered by Paul Ellsworth

It seems like you're going to need to get the remote file size (answered here) to calculate how far along you are.


You could then update your progress bar while processing each chunk... if you know the total size and the size of the chunk, you can figure out when to update the progress bar.

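
For reference, a minimal sketch of that idea without a progress-bar library, computing a percentage from Content-Length and the bytes written so far; the URL and output filename are placeholders, and no percentage is printed if the server does not report a size:

import sys
import requests

url = "http://www.ovh.net/files/10Mb.dat"  # placeholder test URL
out_path = "out.dat"  # placeholder output file

r = requests.get(url, stream=True)
total = int(r.headers.get('Content-Length', 0))  # remote file size, if reported
downloaded = 0

with open(out_path, 'wb') as f:
    for chunk in r.iter_content(chunk_size=1024):
        f.write(chunk)
        downloaded += len(chunk)
        if total:
            sys.stdout.write("\rDownloaded %d%%" % (downloaded * 100 // total))
            sys.stdout.flush()
print()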