Persistent/keepalive HTTP with the PHP Curl library?

Question

Asked by Frank Farmer

I'm using a simple PHP library to add documents to a SOLR index, via HTTP.

There are 3 servers involved, currently:

The PHP box running the indexing job
A database box holding the data being indexed
The solr box.

At 80 documents/sec (out of 1 million docs), I'm noticing an unusually high interrupt rate on the network interfaces on the PHP and solr boxes (2000/sec; what's more, the graphs are nearly identical -- when the interrupt rate on the PHP box spikes, it also spikes on the Solr box), but much less so on the database box (300/sec). I imagine this is simply because I open and reuse a single connection to the database server, but every single Solr request is currently opening a new HTTP connection via cURL, thanks to the way the Solr client library is written.

So, my question is:

Can cURL be made to open a keepalive session?
What does it take to reuse a connection? -- is it as simple as reusing the cURL handle resource?
Do I need to set any special cURL options? (e.g. force HTTP 1.1?)
Are there any gotchas with cURL keepalive connections? This script runs for hours at a time; will I be able to use a single connection, or will I need to periodically reconnect?

Answer 1

Accepted answer by Piskvor left the building

cURL PHP documentation (curl_setopt) says:

CURLOPT_FORBID_REUSE - TRUE to force the connection to explicitly close when it has finished processing, and not be pooled for reuse.

So:

Yes. It should reuse connections by default, as long as you reuse the cURL handle.
By default, cURL handles persistent connections by itself; should you need some special headers, check CURLOPT_HTTPHEADER.
The server may enforce a keep-alive limit (with a default Apache install, it is 15 seconds or 100 requests, whichever comes first), but cURL will just open another connection when that happens.
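
The points above can be sketched as follows. This is a minimal illustration, not the Solr client library's actual code: the function names, the `text/xml` content type, and any URL passed in are assumptions. It reuses one handle for every request and reads `CURLINFO_NUM_CONNECTS` to see whether libcurl had to open a new connection for the last transfer.

```php
<?php
// Sketch only: $url and $xml are placeholders, not values from the original post.

// Build the per-request options. Nothing here forbids reuse, so libcurl may
// keep the connection open between calls made on the same handle.
function solr_post_options(string $url, string $xml): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $xml,
        CURLOPT_HTTPHEADER     => ['Content-Type: text/xml'], // assumed content type
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_FORBID_REUSE   => false,  // the default, stated explicitly
    ];
}

// Send one document on an existing handle and report how many new
// connections libcurl opened for this transfer.
function post_document($handle, string $url, string $xml): array
{
    curl_setopt_array($handle, solr_post_options($url, $xml));
    $body = curl_exec($handle);
    return [
        'body'            => $body, // false on failure
        'new_connections' => curl_getinfo($handle, CURLINFO_NUM_CONNECTS),
    ];
}
```

To use it, create the handle once with `curl_init()` before the indexing loop and pass it to `post_document()` for every document. If the connection is being reused, `new_connections` should be 0 for most requests and only become nonzero again after the server has closed the connection.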

Answer 2

Answer by Richard Keizer

Curl sends the keep-alive header by default, but:

create a context using curl_init() without any parameters
store the context in a scope where it will survive (not a local var)
use the CURLOPT_URL option to pass the url to the context
execute the request using curl_exec()
don't close the connection with curl_close() between requests

A very basic example:

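function get($url) {
    global $context;
    curl_setopt($context, CURLOPT_URL, $url);
    return curl_exec($context); // response body, or false on failure
}

$context = curl_init();
// Return the response as a string instead of printing it.
curl_setopt($context, CURLOPT_RETURNTRANSFER, true);
// multiple calls to get() here
curl_close($context);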
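
If a global variable is undesirable, the same idea can be sketched with a small object that owns the handle. The class and method names below are made up for illustration; the only point is that the handle lives as long as the object does, so every `get()` call goes through the same handle.

```php
<?php
// Sketch only: KeepAliveClient is a hypothetical name, not part of any library.
class KeepAliveClient
{
    // One cURL handle kept for the lifetime of the object.
    private $handle;

    public function __construct()
    {
        $this->handle = curl_init();
        // Return response bodies as strings instead of printing them.
        curl_setopt($this->handle, CURLOPT_RETURNTRANSFER, true);
    }

    // Fetch a URL on the shared handle; returns the body, or false on failure.
    public function get(string $url)
    {
        curl_setopt($this->handle, CURLOPT_URL, $url);
        return curl_exec($this->handle);
    }
}
```

One thing to watch with any reused handle: options set for one request stay set for the next. If some requests need different options, set them again explicitly each time, or call `curl_reset()` (PHP 5.5 and later) and reapply the common ones.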

Answer 3

Answer by Oleg Barshay

On the server you are accessing, keep-alive must be enabled and the maximum number of keep-alive requests should be reasonable. In the case of Apache, refer to the Apache docs.
You have to be re-using the same cURL context.

When configuring the cURL context, enable keep-alive with a timeout in the header:

curl_setopt($curlHandle, CURLOPT_HTTPHEADER, array(
    'Connection: Keep-Alive',
    'Keep-Alive: 300'
));

Answer 4

Answer by Brent

If you don't care about the response from the request, you can do them asynchronously, but you run the risk of overloading your SOLR index. I doubt it though, SOLR is pretty damn quick.

Asynchronous PHP calls?
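
One way to overlap requests from PHP is the `curl_multi_*` family. The sketch below is concurrent rather than truly fire-and-forget: it still waits for every transfer in the batch to finish. The function name and the shape of `$requests` are assumptions made for illustration.

```php
<?php
// Sketch only: each entry of $requests is ['url' => ..., 'body' => ...],
// both placeholders supplied by the caller.
function post_concurrently(array $requests): array
{
    $multi = curl_multi_init();
    $handles = [];

    foreach ($requests as $key => $request) {
        $handle = curl_init($request['url']);
        curl_setopt($handle, CURLOPT_POSTFIELDS, $request['body']); // implies POST
        curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($multi, $handle);
        $handles[$key] = $handle;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running) {
            // Wait for network activity instead of spinning.
            curl_multi_select($multi, 1.0);
        }
    } while ($running && $status === CURLM_OK);

    // Collect the response bodies, keyed like the input.
    $results = [];
    foreach ($handles as $key => $handle) {
        $results[$key] = curl_multi_getcontent($handle);
        curl_multi_remove_handle($multi, $handle);
    }
    curl_multi_close($multi);

    return $results;
}
```

Keeping each batch small is one way to limit the load this puts on the Solr box.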
