Note: this page reproduces a popular Stack Overflow question and its answer under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/44371643/

Time: 2020-08-26 02:58:41  Source: igfitidea

Nginx PHP Failing with Large File Uploads (Over 6 GB)

Tags: php, nginx, file-upload, amazon-s3, hhvm

Asked by Devin Dixon

I am having a very weird issue uploading large files over 6GB. My process works like this:

  1. Files are uploaded via Ajax to a PHP script.
  2. The PHP upload script takes the $_FILES entry and copies it in chunks to a tmp location, as in this answer.
  3. The location of the file is stored in the db.
  4. A cron script uploads the file to S3 at a later time, again using fopen functions and buffering to keep memory usage low.
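
The chunked copy in step 2 can be sketched as follows. This is a minimal illustration in Python rather than the PHP the question uses; the function name `copy_in_chunks` and the 8 MiB chunk size are assumptions, not taken from the original script. The point is that only one chunk is held in memory at a time, so memory use does not grow with the file size.

```python
import os
import tempfile

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB per read; an assumed value, tune as needed


def copy_in_chunks(src_path, dst_path, chunk_size=CHUNK_SIZE):
    """Copy src_path to dst_path one chunk at a time and return the bytes copied."""
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # an empty bytes object means end of file
                break
            dst.write(chunk)
            copied += len(chunk)
    return copied


# Small demonstration with a throwaway 1 MiB file.
with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "upload.bin")
    target = os.path.join(workdir, "copy.bin")
    with open(source, "wb") as fh:
        fh.write(os.urandom(1024 * 1024))
    print(copy_in_chunks(source, target, chunk_size=64 * 1024))
```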

My PHP (HHVM) and NGINX configurations are both set to allow files of up to 16GB; my test file is only 8GB.

Here is the weird part: the Ajax request will ALWAYS time out. But the file is successfully uploaded: it gets copied to the tmp location, the location is stored in the db, it reaches S3, etc. Yet the Ajax request keeps running for an hour even AFTER all the execution has finished (which takes 10-15 minutes), and only ends when it times out.

What could cause the server to not send a response only for large files?

Also, the error logs on the server side are empty.

Answered by Anatoly

A large file upload is an expensive and error-prone operation. Nginx and the backend should have correct timeouts configured to handle slow disk IO if it occurs. Theoretically, it is straightforward to manage a file upload using the multipart/form-data encoding (RFC 1867).

According to developer.mozilla.org, in a multipart/form-data body the HTTP Content-Disposition general header is a header that can be used on a subpart of a multipart body to give information about the field it applies to. The subpart is delimited by the boundary defined in the Content-Type header. Used on the body itself, Content-Disposition has no effect.

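To make the subpart structure concrete, here is a small sketch that builds a one-field multipart/form-data body by hand and reads the field name, file name and payload back out. It uses Python's standard `email` parser purely for illustration; the boundary string, field name and file name are made up, and a real backend would use a streaming parser rather than loading a multi-gigabyte body into memory.

```python
from email import message_from_bytes

BOUNDARY = "example-boundary"  # hypothetical boundary string

# A hand-built body with a single file subpart.
body = (
    f"--{BOUNDARY}\r\n"
    'Content-Disposition: form-data; name="file"; filename="video.bin"\r\n'
    "Content-Type: application/octet-stream\r\n"
    "\r\n"
    "FILE-CONTENT\r\n"
    f"--{BOUNDARY}--\r\n"
).encode("ascii")


def extract_parts(content_type, raw_body):
    """Return a list of (field name, file name, payload bytes), one per subpart."""
    # The parser learns the boundary from a Content-Type header placed before the body.
    message = message_from_bytes(
        b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + raw_body
    )
    parts = []
    for part in message.get_payload():
        name = part.get_param("name", header="content-disposition")
        parts.append((name, part.get_filename(), part.get_payload(decode=True)))
    return parts


print(extract_parts(f"multipart/form-data; boundary={BOUNDARY}", body))
```

The per-part Content-Disposition and Content-Type lines are the headers the backend has to cut away before it is left with the file content itself.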

Let's see what happens while a file is being uploaded:

1) The client sends an HTTP request with the file content to the web server.
2) The web server accepts the request and starts the data transfer (or returns error 413 if the file size exceeds the limit).
3) The web server starts to fill its buffers (how this goes depends on the file size and the buffer sizes).
4) The web server sends the file content to the backend via a file or network socket.
5) The backend authenticates the initial request.
6) The backend reads the file and strips the multipart headers (Content-Disposition, Content-Type).
7) The backend writes the resulting file to disk.
8) Any follow-up procedures run, such as database changes.

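The size check in step 2 is worth spelling out, because the limit has to cover the largest file you expect. The sketch below mimics how a declared Content-Length is compared against an nginx-style `client_max_body_size` value; the helper names are hypothetical and this is not nginx's own code. It assumes the 8GB test file means 8 GiB and relies on two nginx conventions: size suffixes are powers of 1024, and a limit of 0 disables the check.

```python
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(value):
    """Convert an nginx-style size such as '7G' or '1M' to bytes."""
    suffix = value[-1].lower()
    if suffix in UNITS:
        return int(value[:-1]) * UNITS[suffix]
    return int(value)  # no suffix: the value is already in bytes


def status_for_upload(content_length, client_max_body_size):
    """Return 413 when the declared body size exceeds the limit, else None."""
    limit = parse_size(client_max_body_size)
    # A limit of 0 means the body size is not checked at all.
    if limit != 0 and content_length > limit:
        return 413
    return None


test_file = 8 * 1024 ** 3  # the 8GB test file from the question, taken as 8 GiB
print(status_for_upload(test_file, "16G"))  # None: the request proceeds
print(status_for_upload(test_file, "7G"))   # 413: the file is larger than the limit
```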

client_body_in_file_only off;

During a large file upload, several problems occur:

  • the HTTP request body is dumped to disk and then passed to the backend, which processes and copies the file
  • it is not possible to authenticate the request before the HTTP request content has been uploaded to the server
  • when large files are uploaded, the backend rarely needs the file content itself immediately

Let's start with Nginx configured with a new location, http://backend/upload, to receive the large file upload. Back-end interaction is minimised (Content-Length: 0) and the file is simply stored to disk. Using buffers, Nginx dumps the file to disk (the file is stored in the temporary directory under a random name, which cannot be changed) and then sends a POST request to the backend at http://backend/file with the file name in the X-File-Name header.

To keep extra information, you may send headers with the initial POST request. For instance, an X-Original-File-Name header on the initial request helps you match the file and store the necessary mapping information in the database.

client_body_in_file_only on;

Let's see how to make it happen:

1) Configure Nginx to dump the HTTP body content to a file and keep it stored: client_body_in_file_only on;
2) Create a new backend endpoint, http://backend/file, to handle the mapping between the temp file name and the X-File-Name header.
4) Instrument the AJAX query with the X-File-Name header that Nginx will use to send the post-upload request.

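What the new /file endpoint has to do can be sketched as below. This is an illustration in Python rather than the PHP backend from the question, and the function name, the storage directory and the returned mapping are all assumptions. The handler never reads the upload from the request; it only moves the file Nginx already wrote to disk, using the path from X-File-Name and the client-supplied name from X-Original-File-Name.

```python
import os
import shutil
import tempfile


def handle_post_upload(headers, storage_dir):
    """Move the file nginx spooled to disk into storage_dir and return the mapping."""
    temp_path = headers["X-File-Name"]  # set by nginx from $request_body_file
    original = headers.get("X-Original-File-Name", "unnamed")
    # Keep only the last path component so a crafted name cannot escape storage_dir.
    safe_name = os.path.basename(original) or "unnamed"
    final_path = os.path.join(storage_dir, safe_name)
    shutil.move(temp_path, final_path)
    # A real backend would store this mapping in its database.
    return {"original_name": original, "stored_at": final_path}


# Demonstration with a stand-in for the nginx temp file.
with tempfile.TemporaryDirectory() as workdir:
    spooled = os.path.join(workdir, "0000000001")
    with open(spooled, "wb") as fh:
        fh.write(b"payload")
    storage = os.path.join(workdir, "storage")
    os.mkdir(storage)
    result = handle_post_upload(
        {"X-File-Name": spooled, "X-Original-File-Name": "holiday.mp4"}, storage
    )
    print(result["original_name"], os.path.exists(result["stored_at"]))
```

Moving the file is cheap when the temp directory and the storage directory are on the same filesystem, which is one reason to choose client_body_temp_path with the final destination in mind.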

Configuration:

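location /upload {
  # Spool every request body to a file under /tmp/ and keep it after the request
  client_body_temp_path      /tmp/;
  client_body_in_file_only   on;
  client_body_buffer_size    1M;
  # Must be at least as large as the biggest upload you expect
  client_max_body_size       7G;

  proxy_pass_request_headers on;
  # Tell the backend where the spooled file lives
  proxy_set_header           X-File-Name $request_body_file;
  # Do not forward the body itself (proxy_set_body off; would send the literal text "off")
  proxy_pass_request_body    off;
  proxy_set_header           Content-Length "";
  proxy_redirect             off;
  proxy_pass                 http://backend/file;
}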

The Nginx configuration option client_body_in_file_only is incompatible with multipart data upload, but you can use it with AJAX, i.e. XMLHttpRequest2 (binary data).

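Because Nginx stores the body exactly as it arrives, the client has to send the raw file bytes as the request body instead of wrapping them in a multipart envelope. The sketch below shows the shape of such a request. It is written in Python rather than browser JavaScript, only builds the request without sending it, and the URL and the helper name `build_raw_upload` are assumptions.

```python
import os
import tempfile
import urllib.request


def build_raw_upload(path, url):
    """Build (but do not send) a POST whose body is the file's raw bytes."""
    headers = {
        "Content-Type": "application/octet-stream",
        # Explicit length so the body is not sent with chunked transfer encoding.
        "Content-Length": str(os.path.getsize(path)),
        # The original name travels in a header because the body carries no metadata.
        "X-Original-File-Name": os.path.basename(path),
    }
    body = open(path, "rb")  # streamed from disk; the caller closes it after sending
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Demonstration: build the request for a small throwaway file and inspect it.
with tempfile.TemporaryDirectory() as workdir:
    sample = os.path.join(workdir, "clip.bin")
    with open(sample, "wb") as fh:
        fh.write(b"\x00" * 1024)
    request = build_raw_upload(sample, "http://localhost/upload")
    print(request.get_method(), request.full_url)
    request.data.close()
```

Passing the resulting object to `urllib.request.urlopen` would send it; in a browser the equivalent is to pass the File object itself as the request body.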

If you need back-end authentication, the only way to handle it is to use auth_request, for instance:

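location = /upload {
  # Run the authentication subrequest before the request body is read
  auth_request               /upload/authenticate;
  ...
}

location = /upload/authenticate {
  internal;
  # The subrequest carries headers only; do not forward the request body
  proxy_pass_request_body    off;
  proxy_set_header           Content-Length "";
  proxy_pass                 http://backend;
}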

client_body_in_file_only on; auth_request on;

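On the backend side, the endpoint behind /upload/authenticate only has to answer with a status code: auth_request allows the original request on a 2xx response and rejects it on 401 or 403. A minimal sketch of that decision, assuming a hypothetical X-Upload-Token header and an in-memory token set (neither comes from the original answer):

```python
import hmac

VALID_TOKENS = {"s3cr3t-upload-token"}  # hypothetical token store


def authenticate(headers):
    """Return the status for the auth subrequest: 204 allows, 401 denies."""
    supplied = headers.get("X-Upload-Token", "").encode("utf-8")
    for token in VALID_TOKENS:
        # Constant-time comparison avoids leaking how many characters matched.
        if hmac.compare_digest(supplied, token.encode("utf-8")):
            return 204
    return 401


print(authenticate({"X-Upload-Token": "s3cr3t-upload-token"}))  # 204
print(authenticate({}))                                         # 401
```

Since the subrequest carries only headers, the decision is made without waiting for any of the file content.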

Pre-upload authentication logic protects against unauthenticated requests regardless of the initial POST Content-Length size.
