Python Requests vs PyCurl Performance

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/15461995/
Asked by Eugene
How does the Requests library compare with PyCurl performance-wise?
My understanding is that Requests is a Python wrapper for urllib, whereas PyCurl is a Python wrapper for libcurl, which is native, so PyCurl should get better performance, but I'm not sure by how much.
I can't find any benchmarks comparing them.
Accepted answer by BobMcGee
I wrote you a full benchmark, using a trivial Flask application backed by gunicorn/meinheld + nginx (for performance and HTTPS), and seeing how long it takes to complete 10,000 requests. Tests were run in AWS on a pair of unloaded c4.large instances, and the server instance was not CPU-limited.
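For context, a "trivial Flask application" for this kind of benchmark can be as small as the sketch below. This is an assumption of what such a server looks like, not the author's actual test harness; the route and payload are made up:

```python
# Minimal Flask app of the kind described above (a sketch, not the
# author's actual benchmark server; the route and payload are assumed).
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    # Return a tiny fixed body so the benchmark measures request
    # overhead rather than payload generation.
    return "Hello, World!"

if __name__ == "__main__":
    app.run()
```

In the benchmark it would be served by gunicorn/meinheld behind nginx rather than Flask's built-in development server.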
TL;DR summary: if you're doing a lot of networking, use PyCurl; otherwise use requests. PyCurl finishes small requests 2x-3x as fast as requests until you hit the bandwidth limit with large requests (around 520 MBit/s or 65 MB/s here), and uses 3x to 10x less CPU power. These figures compare cases where connection pooling behavior is the same; by default, PyCurl uses connection pooling and DNS caches, where requests does not, so a naive implementation will be 10x as slow.
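The pooling caveat matters in practice: with plain requests, a naive loop opens a new connection per call, while a requests.Session reuses connections. A minimal sketch of the difference (the URL is a placeholder, not the benchmark endpoint):

```python
import requests

URL = "https://example.com/"  # placeholder endpoint

# Naive: each call sets up (and tears down) its own connection.
for _ in range(10):
    requests.get(URL)

# Pooled: a Session keeps connections alive and reuses them, which is
# the behavior the 2x-3x comparison above assumes.
with requests.Session() as session:
    for _ in range(10):
        session.get(URL)
```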
Note that double-log plots are used for the graph below only, due to the orders of magnitude involved.
- pycurl takes about 73 CPU-microseconds to issue a request when reusing a connection (see the connection-reuse sketch after this list)
- requests takes about 526 CPU-microseconds to issue a request when reusing a connection
- pycurl takes about 165 CPU-microseconds to open a new connection and issue a request (no connection reuse), or ~92 microseconds to open
- requests takes about 1078 CPU-microseconds to open a new connection and issue a request (no connection reuse), or ~552 microseconds to open
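For reference, "reusing a connection" with pycurl just means reusing the same Curl handle across requests, as in this sketch (the URL is a placeholder):

```python
import pycurl
from io import BytesIO

URL = "https://example.com/"  # placeholder endpoint

# Reusing one Curl handle keeps the underlying connection (and DNS
# cache) alive across requests, matching the ~73 us/request case above.
handle = pycurl.Curl()
for _ in range(10):
    buffer = BytesIO()
    handle.setopt(pycurl.URL, URL)
    handle.setopt(pycurl.WRITEDATA, buffer)
    handle.perform()
handle.close()
```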
Full results are in the link, along with the benchmark methodology and system configuration.
Caveats: although I've taken pains to ensure the results are collected in a scientific way, it's only testing one system type and one operating system, and a limited subset of performance and especially HTTPS options.
Answer by Martijn Pieters
First and foremost, requests is built on top of the urllib3 library; the stdlib urllib and urllib2 libraries are not used at all.
There is little point in comparing requests with pycurl on performance. pycurl may use C code for its work, but like all network programming, your execution speed depends largely on the network that separates your machine from the target server. Moreover, the target server could be slow to respond.
In the end, requests has a far more friendly API to work with, and you'll find that you'll be more productive using that friendlier API.
Answer by paul_h
Focussing on Size -
On my MacBook Air with 8GB of RAM and a 512GB SSD, for a 100MB file coming in at 3 kilobytes per second (from the internet over wifi), pycurl, curl, and the requests library's get function (regardless of chunking or streaming) are pretty much the same.
On a smaller quad-core Intel Linux box with 4GB RAM, over localhost (from Apache on the same box), for a 1GB file, curl and pycurl are 2.5x faster than the requests library. For requests, chunking and streaming together give a 10% boost (chunk sizes above 50,000), as sketched below.
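"Chunking and streaming" with requests means passing stream=True and reading the body via iter_content with an explicit chunk size. A sketch under assumed names (the URL and output path are placeholders):

```python
import requests

URL = "http://localhost/bigfile.bin"  # placeholder; any large-file URL

# Stream the response and read it in ~50 KB chunks, the regime where
# the answer above observed the ~10% speedup.
with requests.get(URL, stream=True) as response:
    response.raise_for_status()
    with open("bigfile.bin", "wb") as out:
        for chunk in response.iter_content(chunk_size=50_000):
            out.write(chunk)
```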
I thought I was going to have to swap requests out for pycurl, but not so, as the application I'm making isn't going to have the client and server that close together.
Answer by user2692263
It seems there is a new kid on the block: a requests interface for pycurl.
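If the library meant here is the pycurl-requests package (an assumption on my part; the original link is not preserved), its appeal is that it mirrors the requests API, so switching can be a one-line import change:

```python
# Assuming the package in question is pycurl-requests, which exposes a
# Requests-compatible API backed by libcurl (pip install pycurl-requests).
import pycurl_requests as requests

r = requests.get("https://example.com/")  # placeholder URL
print(r.status_code)
```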
Thank you for the benchmark - it was nice. I like curl, and it seems to be able to do a bit more than HTTP.

