Original question: http://stackoverflow.com/questions/8772015/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
502 Gateway Errors under High Load (nginx/php-fpm)
Asked by Mr.Boon
I work for a rather busy internet site that often gets very large spikes of traffic. During these spikes hundreds of pages per second are requested, and this produces random 502 gateway errors.
We currently run Nginx (1.0.10) and PHP-FPM on a machine with 4x SAS 15k drives (RAID 10), a 16-core CPU and 24GB of DDR3 RAM. We also use the latest XCache version. The DB is located on another machine, but that machine's load is very low and it has no issues.
Under normal load everything runs perfectly: system load is below 1, and the PHP-FPM status report never really shows more than 10 active processes at one time. There is always about 10GB of RAM still available. Under normal load the machine handles about 100 pageviews per second.
The problem arises when huge spikes of traffic arrive and hundreds of pageviews per second are requested from the machine. I notice that FPM's status report then shows up to 50 active processes, but that is still way below the 300 max connections that we have configured. During these spikes the Nginx status page reports up to 5000 active connections instead of the normal average of 1000.
OS Info: CentOS release 5.7 (Final)
CPU: Intel(R) Xeon(R) CPU E5620 @ 2.40GHz (16 cores)
php-fpm.conf
daemonize = yes
listen = /tmp/fpm.sock
pm = static
pm.max_children = 300
pm.max_requests = 1000
I have not set rlimit_files because, as far as I know, PHP-FPM uses the system default when it is not set.
fastcgi_params (only the values added to the standard file)
fastcgi_connect_timeout 60;
fastcgi_send_timeout 180;
fastcgi_read_timeout 180;
fastcgi_buffer_size 128k;
fastcgi_buffers 4 256k;
fastcgi_busy_buffers_size 256k;
fastcgi_temp_file_write_size 256k;
fastcgi_intercept_errors on;
fastcgi_pass unix:/tmp/fpm.sock;
nginx.conf
worker_processes 8;
worker_connections 16384;
sendfile on;
tcp_nopush on;
keepalive_timeout 4;
Nginx connects to FPM via a Unix socket.
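As a rough sanity check on the nginx settings above, the sketch below works out the theoretical connection ceiling. It assumes the usual rule that nginx can hold at most worker_processes x worker_connections connections in total, and that each FastCGI request occupies two of them (one to the client, one to PHP-FPM). The variable names are illustrative only.

```shell
#!/bin/sh
# Values copied from the nginx.conf excerpt above.
worker_processes=8
worker_connections=16384

# Upper bound on simultaneous connections across all workers.
max_connections=$((worker_processes * worker_connections))

# Assumption: each FastCGI request holds one client connection
# and one upstream connection to PHP-FPM.
max_fastcgi_requests=$((max_connections / 2))

echo "max connections: $max_connections"
echo "max concurrent FastCGI requests: $max_fastcgi_requests"
```

Both figures sit far above the roughly 5000 active connections reported during spikes, so by this estimate nginx's own connection limit is unlikely to be the bottleneck.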
sysctl.conf
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 1
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.tcp_max_syn_backlog = 2048
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.default.secure_redirects = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.tcp_timestamps = 0
net.ipv4.conf.all.rp_filter=1
net.ipv4.conf.default.rp_filter=1
net.ipv4.conf.eth0.rp_filter=1
net.ipv4.conf.lo.rp_filter=1
net.ipv4.ip_conntrack_max = 100000
limits.conf
* soft nofile 65536
* hard nofile 65536
These are the results for the following commands:
ulimit -n
65536
ulimit -Sn
65536
ulimit -Hn
65536
cat /proc/sys/fs/file-max
2390143
Question: If PHP-FPM is not running out of connections, the load is still low, and there is plenty of RAM available, what bottleneck could be causing these random 502 gateway errors during high traffic?
Note: by default this machine's ulimits were 1024. Since changing them to 65536 I have not fully rebooted the machine, as it is a production machine and a reboot would mean too much downtime.
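Because limits.conf is applied when a session starts, daemons that were already running may still carry the old limit of 1024. A minimal sketch for checking what the running processes actually have, assuming a Linux kernel that exposes /proc/<pid>/limits and that the process names match nginx and php-fpm (both are assumptions about this system, and soft_nofile is a hypothetical helper):

```shell
#!/bin/sh
# Print the soft "Max open files" value from a /proc/<pid>/limits
# listing supplied on stdin.
soft_nofile() {
    awk '/^Max open files/ { print $4 }'
}

# Process names are assumptions; adjust them to match the local system.
for name in nginx php-fpm; do
    for pid in $(pgrep "$name" 2>/dev/null); do
        # Reading another user's limits file may require root.
        if [ -r "/proc/$pid/limits" ]; then
            echo "$name pid $pid soft nofile: $(soft_nofile < "/proc/$pid/limits")"
        fi
    done
done
```

If the reported value is still 1024, restarting the daemons from a session that has the new limit is usually enough; PHP-FPM's rlimit_files setting is another way to raise it for the workers.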
Accepted answer by Timothy Perez
This should fix it...
You have: fastcgi_buffers 4 256k;
Change it to: fastcgi_buffers 256 16k; // 4096k total
Also set fastcgi_max_temp_file_size 0; that will disable buffering to disk if replies start to exceed your fastcgi buffers.
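The arithmetic behind the suggested change, as a small sketch: fastcgi_buffers is read as number x size for a single connection, so the original setting gives 1024k of buffer space in four large chunks, while the suggested one gives 4096k in 16k chunks. Variable names are illustrative.

```shell
#!/bin/sh
# fastcgi_buffers <number> <size>: total space = number * size (kilobytes here).
old_total_k=$((4 * 256))     # fastcgi_buffers 4 256k
new_total_k=$((256 * 16))    # fastcgi_buffers 256 16k

echo "old: ${old_total_k}k per connection"
echo "new: ${new_total_k}k per connection"
```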
Answer by kait
A Unix socket accepts a backlog of only 128 connections by default. It is a good idea to put this line into /etc/sysctl.conf:
net.core.somaxconn = 4096
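A sketch of why this setting matters, assuming standard Linux behaviour: the backlog a program requests in listen() is silently capped at net.core.somaxconn, so a larger backlog requested by PHP-FPM has no effect until the sysctl is raised. effective_backlog is a hypothetical helper, and the requested value of 4096 is only an example.

```shell
#!/bin/sh
# The kernel uses the smaller of the requested backlog and net.core.somaxconn.
effective_backlog() {
    requested=$1
    somaxconn=$2
    if [ "$requested" -lt "$somaxconn" ]; then
        echo "$requested"
    else
        echo "$somaxconn"
    fi
}

# Read the live value when available (Linux only); otherwise fall back
# to the old default of 128 mentioned in the answer.
if [ -r /proc/sys/net/core/somaxconn ]; then
    current=$(cat /proc/sys/net/core/somaxconn)
else
    current=128
fi

echo "somaxconn: $current"
echo "effective backlog for a requested 4096: $(effective_backlog 4096 "$current")"
```

After editing /etc/sysctl.conf, running sysctl -p loads the new value; the listening process typically has to be restarted before its socket picks up the larger backlog.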
Answer by Misiek
If that does not help in some cases, use a normal TCP port bind instead of a Unix socket, because a socket at 300+ can block new requests, forcing nginx to show a 502.
Answer by Bhavin Visariya
@Mr. Boon
I have 8 cores and 14 GB of RAM, but the system gives Gateway Time-out errors very often.
Implementing the fix below also didn't solve the issue. I am still searching for better fixes.
You have: fastcgi_buffers 4 256k;
Change it to:
fastcgi_buffers 256 16k; // 4096k total
Also set fastcgi_max_temp_file_size 0; that will disable buffering to disk if replies start to exceed your fastcgi buffers.
Thanks.