PHP: Reusing the same cURL handle. Big performance increase?
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must likewise follow the CC BY-SA license and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/3787002/
Asked by benjisail
In a PHP script I am making a lot of different cURL GET requests (a hundred of them) to different URLs.
Will reusing the same cURL handle from curl_init() improve performance, or is it negligible compared to the response time of the cURL requests?
I am asking because in the current architecture it would not be easy to keep the same cURL handle.
Thanks,
Benjamin
Accepted answer by Antti Rytsölä
It depends on whether the URLs are on the same servers or not. If they are, concurrent requests to the same server will reuse the connection. See CURLOPT_FORBID_REUSE.
If the URLs are sometimes on the same server, you need to sort the URLs, as the default connection cache is limited to ten or twenty connections.
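For instance, a minimal sketch of that sorting step (assuming the URLs are collected in a hypothetical $urls array):

// Group URLs by host so requests to the same server run back to back
// and can reuse the cached connection.
usort($urls, function ($a, $b) {
    return strcmp(parse_url($a, PHP_URL_HOST), parse_url($b, PHP_URL_HOST));
});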
If they are on different servers, there is no speed advantage in using the same handle.
With curl_multi_exec you can connect to different servers at the same time (in parallel). Even then you need some queuing to avoid using thousands of simultaneous connections.
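A rough sketch of such a queue with curl_multi_exec, assuming a hypothetical $urls array and an illustrative cap of ten parallel transfers:

$maxParallel = 10; // illustrative cap, tune to your needs
$mh = curl_multi_init();
$queue = $urls;
$results = array();

// Prime the multi handle with the first batch of easy handles.
for ($i = 0; $i < $maxParallel && $queue; $i++) {
    $ch = curl_init(array_shift($queue));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
}

do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh); // wait for activity instead of busy-looping
    // Swap each finished handle for the next queued URL.
    while ($info = curl_multi_info_read($mh)) {
        $done = $info['handle'];
        $results[] = curl_multi_getcontent($done);
        curl_multi_remove_handle($mh, $done);
        curl_close($done);
        if ($queue) {
            $ch = curl_init(array_shift($queue));
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            curl_multi_add_handle($mh, $ch);
            curl_multi_exec($mh, $running); // start the new transfer immediately
        }
    }
} while ($running || $queue);

curl_multi_close($mh);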
Answer by AlliterativeAlice
Crossposted from Should I close cURL or not? because I think it's relevant here too.
I tried benchmarking cURL with using a new handle for each request versus using the same handle, with the following code:
ob_start(); //Trying to avoid setting as many curl options as possible
$start_time = microtime(true);
for ($i = 0; $i < 100; ++$i) {
    $rand = rand();
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "http://www.google.com/?rand=" . $rand);
    curl_exec($ch);
    curl_close($ch);
}
$end_time = microtime(true);
ob_end_clean();
echo 'Curl without handle reuse: ' . ($end_time - $start_time) . '<br>';

ob_start(); //Trying to avoid setting as many curl options as possible
$start_time = microtime(true);
$ch = curl_init();
for ($i = 0; $i < 100; ++$i) {
    $rand = rand();
    curl_setopt($ch, CURLOPT_URL, "http://www.google.com/?rand=" . $rand);
    curl_exec($ch);
}
curl_close($ch);
$end_time = microtime(true);
ob_end_clean();
echo 'Curl with handle reuse: ' . ($end_time - $start_time) . '<br>';
and got the following results:
Curl without handle reuse: 8.5690529346466
Curl with handle reuse: 5.3703031539917
So reusing the same handle actually provides a substantial performance increase when connecting to the same server multiple times. I tried connecting to different servers:
$url_arr = array(
    'http://www.google.com/',
    'http://www.bing.com/',
    'http://www.yahoo.com/',
    'http://www.slashdot.org/',
    'http://www.stackoverflow.com/',
    'http://github.com/',
    'http://www.harvard.edu/',
    'http://www.gamefaqs.com/',
    'http://www.mangaupdates.com/',
    'http://www.cnn.com/'
);
ob_start(); //Trying to avoid setting as many curl options as possible
$start_time = microtime(true);
foreach ($url_arr as $url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_exec($ch);
    curl_close($ch);
}
$end_time = microtime(true);
ob_end_clean();
echo 'Curl without handle reuse: ' . ($end_time - $start_time) . '<br>';

ob_start(); //Trying to avoid setting as many curl options as possible
$start_time = microtime(true);
$ch = curl_init();
foreach ($url_arr as $url) {
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_exec($ch);
}
curl_close($ch);
$end_time = microtime(true);
ob_end_clean();
echo 'Curl with handle reuse: ' . ($end_time - $start_time) . '<br>';
And got the following result:
Curl without handle reuse: 3.7672290802002
Curl with handle reuse: 3.0146431922913
Still quite a substantial performance increase.
Answer by amenthes
I have a similar scenario where I post data to a server. The data is chunked into requests of ~100 lines each, so it produces a lot of requests. In a benchmark run I compared two approaches for 12,614 lines (127 requests needed) plus authentication and another housekeeping request (129 requests total).
The requests go over a network to a server in the same country, not on-site. They are secured by TLS 1.2 (the handshake will also take its toll, but given that HTTPS is becoming more and more of a default choice, this might even make it more similar to your scenario).
With cURL reuse: one $curlHandle that is curl_init()'ed once, and then only modified with CURLOPT_URL and CURLOPT_POSTFIELDS
Run 1: ~42.92s
Run 3: ~41.52s
Run 4: ~53.17s
Run 5: ~53.93s
Run 6: ~55.51s
Run 11: ~53.59s
Run 12: ~53.76s
Avg: 50.63s / Std.Dev: 5.8s
TCP-Conversations / SSL Handshakes: 5 (Wireshark)
Without cURL reuse: one curl_init() per request
Run 2: ~57.67s
Run 7: ~62.13s
Run 8: ~71.59s
Run 9: ~70.70s
Run 10: ~59.12s
Avg: 64.24s / Std.Dev: 6.5s
TCP-Conversations / SSL Handshakes: 129 (Wireshark)
It isn't the largest of datasets, but one can say that all of the "reused" runs are faster than all of the "init" runs. The average times show a difference of almost 14 seconds.
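For reference, a minimal sketch of the reused-handle pattern described above (the endpoint URL and the $chunks array are placeholders, not the code from this benchmark):

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://api.example.com/import'); // placeholder endpoint
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);

foreach ($chunks as $chunk) { // $chunks: the ~100-line batches of data
    curl_setopt($ch, CURLOPT_POSTFIELDS, $chunk);
    $response = curl_exec($ch); // TCP connection and TLS session are reused across iterations
}

curl_close($ch);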
Answer by Adam Hopkinson
It depends on how many requests you will be making - the overhead for closing & reopening each is negligible, but when doing a thousand? It could be a few seconds or more.
I believe curl_multi_init would be the fastest method.
The whole thing depends on how many requests you need to do.
Answer by sathia
Check this out too:
try {
    // HttpRequestPool sends all of the queued requests in parallel
    $pool = new HttpRequestPool(
        new HttpRequest($q1),
        new HttpRequest($qn)
    );
    $pool->send();
    foreach ($pool as $request) {
        $out[] = $request->getResponseBody();
    }
} catch (HttpException $e) {
    echo $e;
}
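Note: HttpRequestPool comes from the pecl_http extension (version 1.x), not from PHP's core; like curl_multi_exec, it sends its requests in parallel.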