Note: this page is adapted from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/670442/

Time: 2020-11-03 20:35:23  Source: igfitidea

Asynchronous File Upload to Amazon S3 with Django

Tags: python, django, amazon-s3

Question

I am using this file storage engine to store files to Amazon S3 when they are uploaded:

http://code.welldev.org/django-storages/wiki/Home

It takes quite a long time to upload because the file must first be uploaded from client to web server, and then web server to Amazon S3 before a response is returned to the client.

I would like to make the process of sending the file to S3 asynchronous, so the response can be returned to the user much faster. What is the best way to do this with the file storage engine?

Thanks for your advice!

Answer by Vasil

I've taken another approach to this problem.

My models have two file fields: one uses the standard file storage backend and the other uses the S3 file storage backend. When the user uploads a file, it gets stored locally.

I have a management command in my application that uploads all the locally stored files to S3 and updates the models.

So when a request comes in for the file, I check whether the model object uses the S3 storage field. If so, I send a redirect to the correct URL on S3; if not, I send a redirect so that nginx can serve the file from disk.

This management command can of course be triggered by any event, a cron job, or whatever.
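
The routing and sync logic described above can be sketched in plain Python. This is only an illustration under stated assumptions: `StoredFile` stands in for a Django model with two file fields, both base URLs are made up, and `upload` is a placeholder for whatever actually pushes a file to S3.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical base URLs; neither appears in the answer.
S3_BASE_URL = "https://example-bucket.s3.amazonaws.com/"
LOCAL_BASE_URL = "/protected-media/"  # a location nginx serves from disk


@dataclass
class StoredFile:
    """Stand-in for a model with one local and one S3-backed file field."""
    local_name: Optional[str] = None  # set when the user uploads
    s3_name: Optional[str] = None     # set once the file is on S3


def redirect_target(obj: StoredFile) -> str:
    """Return the URL a download request should be redirected to."""
    if obj.s3_name:
        return S3_BASE_URL + obj.s3_name
    if obj.local_name:
        return LOCAL_BASE_URL + obj.local_name
    raise LookupError("no file stored for this object")


def sync_to_s3(objects, upload):
    """Core of the management command: push local files, update records."""
    moved = 0
    for obj in objects:
        if obj.local_name and not obj.s3_name:
            upload(obj.local_name)        # caller supplies the real S3 upload
            obj.s3_name = obj.local_name  # from now on, redirect to S3
            moved += 1
    return moved
```

In a real project `sync_to_s3` would live in a management command and save each model instance after updating it.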

Answer by Simon Willison

It's possible to have your users upload files directly to S3 from their browser using a special form (with a signed, base64-encoded policy document in a hidden field). They will be redirected back to your application once the upload completes.

More information here: http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1434
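
As a rough sketch of what the server has to prepare for such a form, the snippet below builds a policy document and signs it with the legacy HMAC-SHA1 scheme that browser POST uploads used at the time. The function name, its parameters, and the chosen conditions are assumptions for illustration; newer AWS setups use a different signing scheme (Signature Version 4), so treat this as a picture of the idea rather than a drop-in implementation.

```python
import base64
import hashlib
import hmac
import json
from datetime import datetime, timedelta, timezone


def build_post_fields(bucket, key_prefix, secret_key, redirect_url,
                      lifetime=timedelta(hours=1), now=None):
    """Return hidden form fields for a browser-to-S3 POST upload.

    Hypothetical helper: all argument names are illustrative.
    """
    now = now or datetime.now(timezone.utc)
    policy = {
        # The form stops being accepted after this time.
        "expiration": (now + lifetime).strftime("%Y-%m-%dT%H:%M:%S.000Z"),
        "conditions": [
            {"bucket": bucket},
            ["starts-with", "$key", key_prefix],
            {"success_action_redirect": redirect_url},
        ],
    }
    # The policy is base64-encoded JSON, then signed with the secret key.
    encoded = base64.b64encode(json.dumps(policy).encode("utf-8"))
    digest = hmac.new(secret_key.encode("utf-8"), encoded, hashlib.sha1).digest()
    return {
        "policy": encoded.decode("ascii"),
        "signature": base64.b64encode(digest).decode("ascii"),
        "success_action_redirect": redirect_url,
    }
```

The form itself would also carry the access key ID, the object key, and the file input; the secret key never leaves the server.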

Answer by thedk

There is an app for that :-)

https://github.com/jezdez/django-queued-storage

It does exactly what you need, and much more, because you can set any "local" storage and any "remote" storage. This app will store your file in fast "local" storage (for example MogileFS storage) and then, using Celery (django-celery), will attempt to upload it asynchronously to the "remote" storage.

A few remarks:

  1. The tricky thing is that you can set it up with either a copy-and-upload strategy or an upload-and-delete strategy; the latter deletes the local file once it has been uploaded.

  2. The second tricky thing: it will serve the file from "local" storage until it has been uploaded.

  3. It can also be configured to retry a number of times when an upload fails.

Installation & usage is also very simple and straightforward:

安装和使用也非常简单明了:

pip install django-queued-storage

append to INSTALLED_APPS:

INSTALLED_APPS += ('queued_storage',)

in models.py:

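from django.db import models
from queued_storage.backends import QueuedStorage

# Save to the local filesystem first, then transfer to S3 in a Celery task
# and delete the local copy once the transfer succeeds.
queued_s3storage = QueuedStorage(
    'django.core.files.storage.FileSystemStorage',
    'storages.backends.s3boto.S3BotoStorage',
    task='queued_storage.tasks.TransferAndDelete')

class MyModel(models.Model):
    my_file = models.FileField(upload_to='files', storage=queued_s3storage)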

Answer by Martin Thurau

You could decouple the process:

  • The user selects a file to upload and sends it to your server. After this, they see a page saying "Thank you for uploading foofile.txt, it is now stored in our storage backend".
  • Once the user has uploaded the file, it is stored in a temporary directory on your server and, if needed, some metadata is stored in your database.
  • A background process on your server then uploads the file to S3. This is only possible if you have full access to your server, so that you can create some kind of "daemon" to do this (or simply use a cron job).*
  • The page that is displayed polls asynchronously and shows the user some kind of progress bar (or a simple "please wait" message). This is only needed if the user should be able to "use" the file (put it in a message, or something like that) directly after uploading.

[*: If you only have shared hosting, you could possibly build a solution that uses a hidden iframe in the user's browser to start a script which then uploads the file to S3.]
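
The decoupled flow above can be sketched with a queue and a background thread. This is a toy, single-process illustration and not a production daemon: `handle_upload`, `start_uploader`, the `status` dictionary, and the `upload` callable are all hypothetical names, and a real deployment would more likely use a separate worker process or a cron job, as the answer suggests.

```python
import queue
import shutil
import tempfile
import threading
from pathlib import Path


def handle_upload(name, data, tmp_dir, pending, status):
    """Request handler part: save to a temp dir and return immediately."""
    path = Path(tmp_dir) / name
    path.write_bytes(data)
    status[name] = "pending"   # what the polling page would read
    pending.put(path)
    return "Thank you for uploading %s" % name


def start_uploader(pending, upload, status):
    """Background part: move queued files to remote storage one by one."""
    def worker():
        while True:
            path = pending.get()
            if path is None:            # sentinel value: stop the worker
                pending.task_done()
                break
            try:
                upload(path)            # placeholder for the real S3 upload
                status[path.name] = "stored"
                path.unlink()           # temp copy is no longer needed
            except Exception:
                status[path.name] = "failed"
            finally:
                pending.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The polling page only needs to look up the file's entry in `status` to decide whether to keep showing the "please wait" message.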

Answer by digitalPBK

You can directly upload media to the s3 server without using your web application server.

See the following references:

Amazon API Reference : http://docs.amazonwebservices.com/AmazonS3/latest/dev/index.html?UsingHTTPPOST.html

A Django implementation: https://github.com/sbc/django-uploadify-s3

Answer by Alon Burg

As some of the answers here suggest uploading directly to S3, here's a Django S3 Mixin using plupload: https://github.com/burgalon/plupload-s3mixin

Answer by gterzian

I encountered the same issue with uploaded images. You cannot pass along files to a Celery worker because Celery needs to be able to pickle the arguments to a task. My solution was to deconstruct the image data into a string and get all other info from the file, passing this data and info to the task, where I reconstructed the image. After that you can save it, which will send it to your storage backend (such as S3). If you want to associate the image with a model, just pass along the id of the instance to the task and retrieve it there, bind the image to the instance and save the instance.

When a file has been uploaded via a form, it is available in your view as an UploadedFile file-like object. You can get it directly out of request.FILES or, better, first bind it to your form, run is_valid, and retrieve the file-like object from form.cleaned_data. At that point at least you know it is the kind of file you want it to be. After that you can get the data using read(), and get the other info using other methods/attributes. See https://docs.djangoproject.com/en/1.4/topics/http/file-uploads/
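
The deconstruct-and-rebuild step might look roughly like this. It is a sketch using only the standard library: `to_payload` and `rebuild` are hypothetical helper names, and the pickle round trip merely simulates what happens to task arguments on their way to a worker.

```python
import io
import pickle


def to_payload(uploaded):
    """Reduce a file-like upload to plain, picklable values."""
    return {
        "name": uploaded.name,
        "content_type": getattr(uploaded, "content_type", None),
        "data": uploaded.read(),  # raw bytes of the file
    }


def rebuild(payload):
    """Inside the task: turn the payload back into a file-like object."""
    rebuilt = io.BytesIO(payload["data"])
    rebuilt.name = payload["name"]
    return rebuilt


# Simulate an uploaded image with an in-memory file.
source = io.BytesIO(b"\x89PNG fake image bytes")
source.name = "photo.png"
source.content_type = "image/png"

# The payload survives pickling, unlike an open upload handle.
payload = pickle.loads(pickle.dumps(to_payload(source)))
restored = rebuild(payload)
```

In the task, the rebuilt object can then be wrapped in a Django file and saved to the model field, which sends it to the configured storage backend.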

I actually ended up writing and distributing a little package to save an image asynchronously. Have a look at https://github.com/gterzian/django_async. Right now it's just for images, but you could fork it and add functionality for your situation. I'm using it with https://github.com/duointeractive/django-athumb and S3.
