Python flask make_response with large files
Disclaimer: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must likewise follow the CC BY-SA license and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/24318084/
flask make_response with large files
Asked by SheffDoinWork
So I'm real green with file I/O, memory limits, and such, and I'm having a rough time getting my web application to successfully serve large file downloads to a web browser with flask's make_response. The following code works on smaller files (<~1GB), but gives me a MemoryError exception when I get into larger files:
    raw_bytes = ""
    with open(file_path, 'rb') as r:
        for line in r:
            raw_bytes = raw_bytes + line
    response = make_response(raw_bytes)
    response.headers['Content-Type'] = "application/octet-stream"
    response.headers['Content-Disposition'] = "inline; filename=" + file_name
    return response
I'm assuming that sticking over 2 GB worth of binary data into a string is probably a big no-no, but I don't know an alternative to accomplishing these file download black magicks. If someone could get me on the right track with a chunky[?] or buffered approach for file downloads, or just point me toward some intermediate-level resources to facilitate a deeper understanding of this stuff, I would greatly appreciate it. Thanks!
Accepted answer by davidism
See the docs on Streaming Content. Basically, you write a function that yields chunks of data, and pass that generator to the response, rather than the whole thing at once. Flask and your web server do the rest.
    from flask import stream_with_context, Response

    @app.route('/stream_data')
    def stream_data():
        def generate():
            # create and return your data in small parts here
            for i in xrange(10000):
                yield str(i)
        return Response(stream_with_context(generate()))
If the file is static, you can instead take advantage of send_from_directory(). The docs advise you to use nginx or another server that supports X-SendFile, so that reading and sending the data is efficient.
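As a rough illustration of that approach, a minimal sketch of such a route; the directory, URL rule, and as_attachment choice here are assumptions, not part of the original answer:

    from flask import Flask, send_from_directory

    app = Flask(__name__)

    # hypothetical directory holding the downloadable files
    DOWNLOAD_DIR = '/srv/downloads'

    @app.route('/files/<path:filename>')
    def get_file(filename):
        # as_attachment=True sets a Content-Disposition: attachment header
        return send_from_directory(DOWNLOAD_DIR, filename, as_attachment=True)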
Answer by Jan Vlcinsky
The problem in your attempt is that you first read the complete content into "raw_bytes", so with large files you can easily exhaust all the memory you have.
There are multiple options to resolve that:
Streaming the content
As explained in davidism's answer, you can use a generator passed into the Response. This serves the large file piece by piece and does not require so much memory.
The streaming can come not only from a generator, but also from a file, as shown in this answer.
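As a minimal sketch of that file-based approach (the chunk size and the file_path argument are illustrative assumptions):

    from flask import Response

    def stream_file(file_path, chunk_size=8192):
        # read and yield fixed-size chunks so only one chunk
        # is held in memory at a time
        def generate():
            with open(file_path, 'rb') as f:
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    yield chunk
        return Response(generate(), mimetype='application/octet-stream')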
Serving static files over flask
In case your file is static, look up how to configure Flask to serve static files. These will automatically be served in a streaming manner.
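For example, a minimal sketch of pointing Flask at a directory of static files; the folder name and URL path are assumptions:

    from flask import Flask

    # serve everything in the local "files" directory under /files/<name>
    app = Flask(__name__, static_folder='files', static_url_path='/files')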
Serving static files over apache or nginx (or another web server)
Assuming the file is static, in production you should serve it via a reverse proxy in front of your Flask app. This not only offloads your app, but also works much faster.
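As one possible illustration of handing the transfer off to nginx, a sketch of the Flask side using the X-Accel-Redirect header; the route and the /protected/ internal location are assumptions, and the matching nginx configuration is not shown:

    from flask import Response

    @app.route('/download/<path:filename>')
    def download(filename):
        response = Response(mimetype='application/octet-stream')
        # nginx intercepts this header and serves the file itself;
        # this assumes an "internal" location /protected/ mapped to
        # the file directory in the nginx config (not shown here)
        response.headers['X-Accel-Redirect'] = '/protected/' + filename
        response.headers['Content-Disposition'] = 'attachment; filename=' + filename
        return response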