Original question: http://stackoverflow.com/questions/972925/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
Persistent/keepalive HTTP with the PHP Curl library?
Asked by Frank Farmer
I'm using a simple PHP library to add documents to a SOLR index, via HTTP.
There are 3 servers involved, currently:
- The PHP box running the indexing job
- A database box holding the data being indexed
- The solr box.
At 80 documents/sec (out of 1 million docs), I'm noticing an unusually high interrupt rate on the network interfaces on the PHP and solr boxes (2000/sec; what's more, the graphs are nearly identical -- when the interrupt rate on the PHP box spikes, it also spikes on the Solr box), but much less so on the database box (300/sec). I imagine this is simply because I open and reuse a single connection to the database server, but every single Solr request is currently opening a new HTTP connection via cURL, thanks to the way the Solr client library is written.
So, my question is:
- Can cURL be made to open a keepalive session?
- What does it take to reuse a connection? -- is it as simple as reusing the cURL handle resource?
- Do I need to set any special cURL options? (e.g. force HTTP 1.1?)
- Are there any gotchas with cURL keepalive connections? This script runs for hours at a time; will I be able to use a single connection, or will I need to periodically reconnect?
Accepted answer by Piskvor left the building
cURL PHP documentation (curl_setopt) says:
CURLOPT_FORBID_REUSE - TRUE to force the connection to explicitly close when it has finished processing, and not be pooled for reuse.
So:
- Yes, actually it should re-use connections by default, as long as you re-use the cURL handle.
- by default, cURL handles persistent connections by itself; should you need some special headers, check CURLOPT_HTTPHEADER
- the server may send a keep-alive timeout (with default Apache install, it is 15 seconds or 100 requests, whichever comes first) - but cURL will just open another connection when that happens.
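As a rough illustration of the points above, the sketch below sends several requests over one handle and records cURL's reported connect time for each. The function name fetchAllWithOneHandle and its $forbidReuse parameter are made up for this example. The assumption is that a connect time at or near zero on later requests suggests the existing connection was reused; treat that as a hint rather than proof.

```php
<?php
/**
 * Hypothetical helper (not part of any library): fetch each URL in turn
 * over one shared cURL handle. Returns one row per URL with the response
 * body (false on failure) and the connect time reported by cURL.
 */
function fetchAllWithOneHandle(array $urls, $forbidReuse = false)
{
    $results = array();
    $ch = curl_init();

    // Return the body from curl_exec() instead of printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Leave this false (the default) to allow connection reuse;
    // set it to true to force a fresh connection for every request.
    curl_setopt($ch, CURLOPT_FORBID_REUSE, $forbidReuse);

    foreach ($urls as $url) {
        curl_setopt($ch, CURLOPT_URL, $url);
        $body = curl_exec($ch);
        $results[] = array(
            'url'          => $url,
            'body'         => $body,
            'connect_time' => curl_getinfo($ch, CURLINFO_CONNECT_TIME),
        );
    }

    // The handle is released when $ch goes out of scope here.
    return $results;
}
```

Calling it with a list of URLs on the same host and comparing the 'connect_time' values with $forbidReuse set to false and then true is one way to see whether keep-alive is taking effect.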
Answer by Richard Keizer
Curl sends the keep-alive header by default, but:
- create a context using curl_init() without any parameters.
- store the context in a scope where it will survive (not a local var)
- use CURLOPT_URL option to pass the url to the context
- execute the request using curl_exec()
- don't close the connection with curl_close()
very basic example:
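function get($url) {
    // Reuse the shared handle so cURL can keep the connection open between calls.
    global $context;
    curl_setopt($context, CURLOPT_URL, $url);
    return curl_exec($context);
}

// Create the handle once, outside the function, so it outlives each call.
$context = curl_init();
// Make curl_exec() return the response body instead of printing it.
curl_setopt($context, CURLOPT_RETURNTRANSFER, true);
//multiple calls to get() here
// Close the handle only after all requests are done.
curl_close($context);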
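If a global variable is undesirable, the same idea can be sketched with a small class that owns the handle. The class name ReusableHttpClient, its methods, and the handlesCreated counter are all hypothetical and exist only for illustration; this is one possible arrangement, not the way any particular Solr client does it.

```php
<?php
// Hypothetical wrapper: keeps one cURL handle alive for the life of the object.
final class ReusableHttpClient
{
    /** Shared cURL handle, created lazily on first use. */
    private $handle = null;

    /** How many handles this object has created (handy for checking reuse). */
    public $handlesCreated = 0;

    private function handle()
    {
        if ($this->handle === null) {
            $this->handle = curl_init();
            // Return the body from curl_exec() instead of printing it.
            curl_setopt($this->handle, CURLOPT_RETURNTRANSFER, true);
            $this->handlesCreated++;
        }
        return $this->handle;
    }

    /** Perform a GET on the shared handle; returns the body or false on failure. */
    public function get($url)
    {
        $ch = $this->handle();
        curl_setopt($ch, CURLOPT_URL, $url);
        return curl_exec($ch);
    }

    /** Drop the handle; a later get() would create a new one. */
    public function close()
    {
        // Dropping the last reference releases the handle.
        // Code targeting older PHP versions may prefer to call curl_close() here.
        $this->handle = null;
    }
}
```

A long-running indexing job would create one ReusableHttpClient up front, call get() for each request, and call close() once at the end.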
Answer by Oleg Barshay
On the server you are accessing, keep-alive must be enabled and the maximum number of keep-alive requests should be reasonable. In the case of Apache, refer to the Apache docs.
You have to be re-using the same cURL context.
When configuring the cURL context, enable keep-alive with timeout in the header:
curl_setopt($curlHandle, CURLOPT_HTTPHEADER, array( 'Connection: Keep-Alive', 'Keep-Alive: 300' ));
Answer by Brent
If you don't care about the response from the request, you can do them asynchronously, but you run the risk of overloading your SOLR index. I doubt it though, SOLR is pretty damn quick.
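One way to issue requests concurrently from PHP is the curl_multi API. The sketch below is only an illustration of that approach: the function name fetchConcurrently is invented here, and it collects HTTP status codes, which a true fire-and-forget job could skip.

```php
<?php
/**
 * Hypothetical helper: run all requests concurrently with curl_multi and
 * return the HTTP status code for each URL, keyed like the input array.
 */
function fetchConcurrently(array $urls)
{
    $mh = curl_multi_init();
    $handles = array();

    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        // Capture the body rather than printing it.
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0) {
            // Wait for socket activity instead of spinning.
            curl_multi_select($mh, 1.0);
        }
    } while ($running > 0 && $status === CURLM_OK);

    $codes = array();
    foreach ($handles as $key => $ch) {
        $codes[$key] = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_multi_remove_handle($mh, $ch);
    }

    // The multi handle is released when $mh goes out of scope.
    return $codes;
}
```

Sending requests in modest batches rather than all at once is one way to limit the load this puts on the Solr box.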

