Progress bar while downloading a file over HTTP with Requests in Python

Note: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/37573483/


Progress Bar while downloading a file over HTTP with Requests

python, python-requests

Asked by Gamegoofs2

I need to download a sizable (~200MB) file. I figured out how to download and save the file from here. It would be nice to have a progress bar to know how much has been downloaded. I found ProgressBar but I'm not sure how to incorporate the two together.


Here's the code I tried, but it didn't work.


from contextlib import closing
import progressbar

# download_file() is the helper from the linked answer (not shown here).
bar = progressbar.ProgressBar(max_value=progressbar.UnknownLength)
with closing(download_file()) as r:
    for i in range(20):
        bar.update(i)

Answered by leovp

I suggest you try tqdm[1]; it's very easy to use. Example code for downloading with the requests library[2]:


from tqdm import tqdm
import requests

url = "http://www.ovh.net/files/10Mb.dat" #big file test
# Streaming, so we can iterate over the response.
r = requests.get(url, stream=True)
# Total size in bytes.
total_size = int(r.headers.get('content-length', 0))
block_size = 1024  # 1 KiB
t = tqdm(total=total_size, unit='iB', unit_scale=True)
with open('test.dat', 'wb') as f:
    for data in r.iter_content(block_size):
        t.update(len(data))
        f.write(data)
t.close()
if total_size != 0 and t.n != total_size:
    print("ERROR, something went wrong")

[1]: https://github.com/tqdm/tqdm
[2]: http://docs.python-requests.org/en/master/
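
For reference, a minimal sketch of the same approach with the response and the bar used as context managers, so both are closed automatically; this variant is not part of the original answer, and test.dat is just an example output name:

from tqdm import tqdm
import requests

url = "http://www.ovh.net/files/10Mb.dat"  # same test file as above
with requests.get(url, stream=True) as r:
    total_size = int(r.headers.get('content-length', 0))
    with tqdm(total=total_size, unit='iB', unit_scale=True) as t:
        with open('test.dat', 'wb') as f:
            for data in r.iter_content(chunk_size=1024):
                t.update(len(data))
                f.write(data)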


Answered by andrew

It seems that there is a disconnect between the examples on the Progress Bar Usage page and what the code actually requires.


In the following example, note the use of maxval instead of max_value. Also note the use of .start() to initialize the bar. This has been noted in an Issue.


The n_chunk parameter denotes how many 1 KiB (1024-byte) blocks to stream at once while looping through the request iterator.


import requests
import time

import numpy as np

import progressbar


url = "http://wikipedia.com/"

def download_file(url, n_chunk=1):
    r = requests.get(url, stream=True)
    # Estimates the number of bar updates
    block_size = 1024
    file_size = int(r.headers.get('Content-Length', None))  # assumes the server reports Content-Length
    num_bars = np.ceil(file_size / (n_chunk * block_size))
    bar = progressbar.ProgressBar(maxval=num_bars).start()
    with open('test.html', 'wb') as f:
        for i, chunk in enumerate(r.iter_content(chunk_size=n_chunk * block_size)):
            f.write(chunk)
            bar.update(i+1)
            # Add a little sleep so you can see the bar progress
            time.sleep(0.05)
    return

download_file(url)

EDIT: Addressed comment about code clarity.
EDIT2: Fixed logic so the bar reports 100% at completion. Credit to leovp's answer for using the 1 KiB block size.

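
For reference, a minimal sketch of how the same loop might look with the newer progressbar2 package (which does accept the documented max_value keyword); this is an adaptation under that assumption, not code from the original answer:

import requests
import progressbar  # provided by the progressbar2 package

def download_file(url, n_chunk=1):
    r = requests.get(url, stream=True)
    block_size = 1024
    file_size = int(r.headers.get('Content-Length', 0))
    # Ceiling division; fall back to an unknown length if the size is not reported.
    num_bars = -(-file_size // (n_chunk * block_size)) or progressbar.UnknownLength
    bar = progressbar.ProgressBar(max_value=num_bars)
    with open('test.html', 'wb') as f:
        for i, chunk in enumerate(r.iter_content(chunk_size=n_chunk * block_size)):
            f.write(chunk)
            bar.update(i + 1)
    bar.finish()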

Answered by Paul Ellsworth

It seems like you're going to need to get the remote file size (answered here) to calculate how far along you are.


You could then update your progress bar while processing each chunk... if you know the total size and the size of the chunk, you can figure out when to update the progress bar.

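
For reference, a minimal sketch of that idea without a progress-bar library, computing a percentage from Content-Length and the bytes written so far; the URL and output filename are placeholders, and no percentage is printed if the server does not report a size:

import sys
import requests

url = "http://www.ovh.net/files/10Mb.dat"  # placeholder test URL
out_path = "out.dat"  # placeholder output file

r = requests.get(url, stream=True)
total = int(r.headers.get('Content-Length', 0))  # remote file size, if reported
downloaded = 0

with open(out_path, 'wb') as f:
    for chunk in r.iter_content(chunk_size=1024):
        f.write(chunk)
        downloaded += len(chunk)
        if total:
            sys.stdout.write("\rDownloaded %d%%" % (downloaded * 100 // total))
            sys.stdout.flush()
print()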