Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/24873927/

Time: 2020-08-19 05:23:44  Source: igfitidea

python requests module and connection reuse

Tags: python, python-requests, keep-alive

Asked by gmemon

I am working with python's requests module for HTTP communication, and I am wondering how to reuse already-established TCP connections? The requests module is stateless and if I repeatedly call get for the same URL, wouldn't it create a new connection each time?

Thanks!!

Accepted answer by abarnert

The requests module is stateless and if I repeatedly call get for the same URL, wouldn't it create a new connection each time?

The requests module is not stateless; it just lets you ignore the state and effectively use a global singleton state if you choose to do so.*

And it (or, rather, one of the underlying libraries, urllib3) maintains a connection pool keyed by (hostname, port) pair, so it will usually just magically reuse a connection if it can.

As the documentation says:

Excellent news — thanks to urllib3, keep-alive is 100% automatic within a session! Any requests that you make within a session will automatically reuse the appropriate connection!

Note that connections are only released back to the pool for reuse once all body data has been read; be sure to either set stream to False or read the content property of the Response object.

So, what does "if it can" mean? As the docs above imply, if you're keeping streaming response objects alive, their connections obviously can't be reused.
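
The effect of an unread streaming response can be observed with a small local experiment. The following is only a sketch under stated assumptions: CountingHandler is a hypothetical throwaway server written for this illustration (it is not part of requests), and the counter simply records how many TCP connections the server accepted.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import requests


class CountingHandler(BaseHTTPRequestHandler):
    """Hypothetical keep-alive test server that counts accepted connections."""

    protocol_version = "HTTP/1.1"  # required for persistent connections
    connections = 0

    def setup(self):
        # setup() runs once per accepted TCP connection, not once per request.
        super().setup()
        CountingHandler.connections += 1

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

session = requests.Session()

# A streamed response whose body is never read keeps its connection checked out.
pending = session.get(url, stream=True)
session.get(url)  # the first connection is busy, so a second one is opened
after_unread = CountingHandler.connections

# Reading the body hands the first connection back to the pool.
pending.content
session.get(url)  # reuses a pooled connection instead of opening a third
after_read = CountingHandler.connections

print(after_unread, after_read)

session.close()
server.shutdown()
server.server_close()
```

With this setup both counters should end up at 2: the unread streamed response forces a second connection, and once its body has been read no further connection is needed.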

Also, the connection pool is really a finite cache, not infinite, so if you spam out a ton of connections and two of them are to the same server, you won't always reuse the connection, just often. But usually, that's what you actually want.

* The particular state relevant here is the transport adapter. Each session gets a transport adapter. You can specify the adapter manually, or you can specify a global default, or you can just use the default global default, which basically just wraps up a urllib3.PoolManager for managing its HTTP connections. For more information, read the docs.
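
Specifying the adapter manually might look like the sketch below. It also shows where the finite size of the pool comes from: HTTPAdapter accepts pool_connections and pool_maxsize arguments, both of which default to 10. The value 20 is an arbitrary choice for illustration, not a recommendation.

```python
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()

# pool_connections: how many per-host connection pools the adapter caches.
# pool_maxsize: how many connections each pool keeps around for reuse.
# 20 is an arbitrary illustrative value; the library default is 10 for both.
adapter = HTTPAdapter(pool_connections=20, pool_maxsize=20)

# Route every http:// and https:// URL on this session through the adapter.
session.mount("https://", adapter)
session.mount("http://", adapter)

# The session now resolves matching URLs to the mounted adapter.
print(session.get_adapter("https://www.wikipedia.org") is adapter)
```

Mounting replaces the default adapter for that URL prefix on this session only; other sessions keep their own adapters and pools.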

Answer by Dmytro Kyrychuk

Global functions like requests.get or requests.post create a new requests.Session instance on each call. Connections made with these functions cannot be reused, because you cannot access the automatically created session and use its connection pool for subsequent requests. It's fine to use these functions if you only have to make a few requests. Otherwise you'll want to manage sessions yourself.
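
One way to manage the session yourself is to let a small wrapper object own it. This is a minimal sketch only: ApiClient and its base_url parameter are hypothetical names invented for this illustration, not part of requests.

```python
import requests


class ApiClient:
    """Hypothetical wrapper that owns one Session for its whole lifetime."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")
        self.session = requests.Session()

    def get(self, path, **kwargs):
        # Every call goes through the same session, hence the same pool.
        return self.session.get(f"{self.base_url}/{path.lstrip('/')}", **kwargs)

    def close(self):
        # Closes the session's adapters and their pooled connections.
        self.session.close()

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()
```

It could then be used as `with ApiClient("https://www.wikipedia.org") as client: client.get("/")`, so that all requests made inside the block share one connection pool and the pool is closed when the block exits.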

Here is a quick demonstration of how requests behaves when you use the global get function versus a session.

Preparation, not really relevant to the question:

>>> import logging, requests, timeit
>>> logging.basicConfig(level=logging.DEBUG, format="%(message)s")

See, a new connection is established each time you call get:

>>> _ = requests.get("https://www.wikipedia.org")
Starting new HTTPS connection (1): www.wikipedia.org
>>> _ = requests.get("https://www.wikipedia.org")
Starting new HTTPS connection (1): www.wikipedia.org

But if you use the same session for subsequent calls, the connection gets reused:

>>> session = requests.Session()
>>> _ = session.get("https://www.wikipedia.org")
Starting new HTTPS connection (1): www.wikipedia.org
>>> _ = session.get("https://www.wikipedia.org")
>>> _ = session.get("https://www.wikipedia.org")
>>> _ = session.get("https://www.wikipedia.org")

Performance:

>>> timeit.timeit('_ = requests.get("https://www.wikipedia.org")', 'import requests', number=100)
Starting new HTTPS connection (1): www.wikipedia.org
Starting new HTTPS connection (1): www.wikipedia.org
Starting new HTTPS connection (1): www.wikipedia.org
...
Starting new HTTPS connection (1): www.wikipedia.org
Starting new HTTPS connection (1): www.wikipedia.org
Starting new HTTPS connection (1): www.wikipedia.org
52.74904417991638
>>> timeit.timeit('_ = session.get("https://www.wikipedia.org")', 'import requests; session = requests.Session()', number=100)
Starting new HTTPS connection (1): www.wikipedia.org
15.770191192626953

It works much faster when you reuse the session (and thus the session's connection pool).
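
The same comparison can be repeated without touching a public site by counting connections on a local server. This is a sketch under stated assumptions: CountingHandler is a hypothetical throwaway server written for this illustration, and it counts accepted TCP connections rather than measuring time.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import requests


class CountingHandler(BaseHTTPRequestHandler):
    """Hypothetical keep-alive test server that counts accepted connections."""

    protocol_version = "HTTP/1.1"  # required for persistent connections
    connections = 0

    def setup(self):
        # setup() runs once per accepted TCP connection, not once per request.
        super().setup()
        CountingHandler.connections += 1

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

# Module-level function: a fresh session, and so a fresh connection, per call.
for _ in range(3):
    requests.get(url)
module_level = CountingHandler.connections

# Shared session: one connection serves all three requests.
CountingHandler.connections = 0
with requests.Session() as session:
    for _ in range(3):
        session.get(url)
shared_session = CountingHandler.connections

print(module_level, shared_session)

server.shutdown()
server.server_close()
```

Three calls to requests.get should show up as three connections, while three calls on the shared session should show up as one. The time saved in the measurement above comes from skipping the repeated TCP and TLS handshakes.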
