Original question: http://stackoverflow.com/questions/15465333/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
php-fpm processes monitoring / profiling
Asked by shaharmor
I recently ran into a problem where the number of active php-fpm processes peaks at the maximum available, which blocks the execution of other scripts until the problematic processes finish.
In a bit more detail, my current php-fpm settings are:
pm = static
pm.max_children = 100
I am watching the php-fpm status page, which most of the time shows:
total processes: 100
idle processes: 95-99
active processes: 1-5
which is normal. However, every few minutes the active process count jumps to 100 for a few seconds and then drops back to the normal 1-5. During that time, all other scripts running on the server are simply stuck. (From the browser, you just see the page waiting.)
Now, I have checked whether it coincides with specific traffic spikes, but it does not. It can also occur with the lowest traffic count of the day.
I believe that a certain script, maybe only in specific situations, is causing PHP to use all available processes for some reason.
This issue started once we moved from PHP 5.2.X to 5.4.X.
We currently have around 60 websites, so it's kind of hard to go through each website's pages and check them.
There is nothing in the nginx logs (nothing critical anyway; there are a few notices and such).
What I'm trying to do is somehow trace/profile/monitor which script is using the php-fpm processes, so I will know where to start looking for the problem.
Is this possible? Maybe a different approach?
Update
Here is a graph of the PHP-FPM process count over 1 hour, at 1-minute intervals:


I have marked in red the jumps that I'm talking about. The memory usage at the time of the spikes stays the same.
Answered by Danack
In your php-fpm log file you should be able to see something like:
WARNING: [pool www-images] server reached pm.max_children setting (5), consider raising it.
for when the number of active processes hits the limit. You should be able to correlate that with the requests that are coming in.
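As a rough sketch of that correlation step, the following Python pulls the timestamps of the max_children warnings out of the log so they can be lined up against the nginx access log. The timestamp layout in front of the message, the function name find_max_children_hits, and the sample lines are assumptions for illustration; adjust the pattern to whatever your log actually contains.

```python
import re
from datetime import datetime

# Assumed line layout: "[18-Mar-2013 20:21:05] WARNING: [pool NAME] server reached ..."
# Only the part after "WARNING:" comes from the answer; the timestamp prefix is an assumption.
WARNING_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+WARNING:\s+\[pool (?P<pool>[^\]]+)\]\s+"
    r"server reached pm\.max_children setting \((?P<limit>\d+)\)"
)


def find_max_children_hits(lines, ts_format="%d-%b-%Y %H:%M:%S"):
    """Return (timestamp, pool, limit) for every max_children warning found."""
    hits = []
    for line in lines:
        match = WARNING_RE.match(line.strip())
        if match:
            when = datetime.strptime(match.group("ts"), ts_format)
            hits.append((when, match.group("pool"), int(match.group("limit"))))
    return hits


# Made-up sample lines; in practice, pass an open log file instead.
sample_log = [
    "[18-Mar-2013 20:17:21] NOTICE: some unrelated message",
    "[18-Mar-2013 20:21:05] WARNING: [pool www-images] server reached "
    "pm.max_children setting (5), consider raising it.",
]

for when, pool, limit in find_max_children_hits(sample_log):
    print(when.isoformat(), pool, limit)
```

With the timestamps in hand, you can look at the requests nginx logged in the few seconds before each hit.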
If that doesn't show any pattern in which requests are causing the issue, then you should add slow logging to your php-fpm config:
request_slowlog_timeout = 10
slowlog = /var/log/php-fpm/slow.$pool.log
This will log a stack trace for each request that takes longer than the request_slowlog_timeout limit.
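With 60 sites the slow log can get long, so a small tally of which scripts show up most often may help. The sketch below assumes each slow log entry contains a "script_filename = ..." line; the function name count_slow_scripts and every path and stack frame in the sample are invented for illustration.

```python
from collections import Counter


def count_slow_scripts(lines):
    """Count how often each script appears in slow log lines."""
    counts = Counter()
    for line in lines:
        key, sep, value = line.partition("=")
        # Only lines of the assumed form "script_filename = /path/to/file.php" count.
        if sep and key.strip() == "script_filename":
            counts[value.strip()] += 1
    return counts


# Invented sample entries; the exact entry layout is an assumption.
sample_slowlog = """\
[18-Mar-2013 20:21:05]  [pool www] pid 6233
script_filename = /var/www/site-a/report.php
[0x00007f0000000001] sleep() /var/www/site-a/report.php:12

[18-Mar-2013 20:21:06]  [pool www] pid 6234
script_filename = /var/www/site-a/report.php
[0x00007f0000000002] sleep() /var/www/site-a/report.php:12

[18-Mar-2013 20:21:07]  [pool www] pid 6235
script_filename = /var/www/site-b/index.php
[0x00007f0000000003] mysql_query() /var/www/site-b/index.php:40
""".splitlines()

for script, hits in count_slow_scripts(sample_slowlog).most_common():
    print(hits, script)
```

The script at the top of the tally is a reasonable place to start reading stack traces.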
If that still doesn't show anything, then your internal application logging should show where the slowdown occurs.
If that doesn't have enough detail, then you could use strace as a last resort, which will show which system calls are being made. That will produce a torrent of information. I'd recommend attaching it to only a single process with strace -p PID, where PID is the process ID of a php-fpm instance.
"it can also occur with the lowest traffic count of the day."
That should definitely show up in the php-fpm slow log. However, if that only shows you which requests are slow but doesn't help you figure out why, you can add debugging using the auto prepend and append files in your PHP-FPM config file.
php_value[auto_prepend_file]=/php_shared/prepend.php
php_value[auto_append_file]=/php_shared/postpend.php
Or, really simply, you can set up the PHP-FPM status page.
Add this to your PHP-FPM pool config:
pm.status_path = /www-status
And pass the requests through nginx to PHP-FPM:
location ~ ^/(www-status)$ {
    include %mysite.root.directory%/conf/fastcgi.conf;
    fastcgi_pass unix:%phpfpm.socket%/php-fpm-www.sock;
    # or IP address
    # fastcgi_pass 127.0.0.1:9000;
    # If your fastcgi.conf doesn't set the query string,
    # pass the query string here instead.
    # fastcgi_param QUERY_STRING $query_string;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    allow 127.0.0.1;
    allow stats_collector.localdomain;
    allow watchdog.localdomain;
    deny all;
}
Then going to yoursite.com/www-status?full will give you a big printout of every php-fpm process, like:
pool: www
process manager: dynamic
start time: 18/Mar/2013:20:17:21 +1100
start since: 243
accepted conn: 3
listen queue: 0
max listen queue: 0
listen queue len: 0
idle processes: 3
active processes: 1
total processes: 4
max active processes: 1
max children reached: 0
slow requests: 0
************************
pid: 6233
state: Idle
start time: 18/Mar/2013:20:17:21 +1100
start since: 243
requests: 1
request duration: 631
request method: GET
request URI: /www-status
content length: 0
user: -
script: /documents/projects/intahwebz/intahwebz/basereality/www-status
last request cpu: 0.00
last request memory: 262144
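To catch the spike in the act, you could poll that page every second or so and record what the non-idle processes are doing. The following is a minimal sketch that only parses the plain-text ?full output shown above; fetching the page is left out, the function names parse_full_status and busy_processes are hypothetical, and the second process record in the sample is invented to have something busy to report.

```python
def parse_full_status(text):
    """Split ?full status text into a pool summary dict and a list of process dicts."""
    sections = []
    current = {}
    for line in text.splitlines():
        line = line.strip()
        if line and set(line) == {"*"}:  # a row of asterisks starts a new process
            sections.append(current)
            current = {}
            continue
        # Split on the first colon only, so timestamps keep their own colons.
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    sections.append(current)
    return sections[0], [section for section in sections[1:] if section]


def busy_processes(processes):
    """Return the processes whose state is anything other than Idle."""
    return [proc for proc in processes if proc.get("state") != "Idle"]


# Shortened sample; the second process record is made up for illustration.
sample_status = """\
pool: www
process manager: dynamic
idle processes: 1
active processes: 1
total processes: 2
************************
pid: 6233
state: Idle
request URI: /www-status
script: -
************************
pid: 6240
state: Running
request URI: /report.php?id=7
script: /var/www/site-a/report.php
"""

summary, processes = parse_full_status(sample_status)
for proc in busy_processes(processes):
    print(proc["pid"], proc["state"], proc["request URI"], proc["script"])
```

Logging this output whenever the active count is high should show which scripts fill the pool during a spike.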
Btw, I bet it's some silly query that's locking up your database.

