How do I properly use connection pools in redis with Python?

Warning: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/31663288/

Date: 2020-08-19 10:22:51  Source: igfitidea

How do I properly use connection pools in redis?

Tags: python, python-2.7, redis

Asked by vgoklani

It's not clear to me how connection pools work, and how to properly use them. I was hoping someone could elaborate. I've sketched out my use case below:

settings.py:

import redis

def get_redis_connection():
    return redis.StrictRedis(host='localhost', port=6379, db=0)

task1.py

import settings

connection = settings.get_redis_connection()

def do_something1():
    return connection.hgetall(...)

task2.py

import settings

connection = settings.get_redis_connection()

def do_something1():
    return connection.hgetall(...)

etc.

Basically I have a settings.py file that returns redis connections, and several different task files that get the redis connections and then run operations. So each task file has its own redis instance (which presumably is very expensive). What's the best way of optimizing this process? Is it possible to use connection pools for this example? Is there a more efficient way of setting up this pattern?

For our system, we have over a dozen task files following this same pattern, and I've noticed our requests slowing down.

Thanks

Answered by ali haider

Redis-py provides a connection pool from which you can retrieve a connection. A connection pool creates a set of connections which you can use as needed (and when done, the connection is returned to the pool for further reuse). Trying to create connections on the fly without discarding them (i.e. not using a pool, or not using the pool correctly) will leave you with far too many connections to Redis (until you hit the connection limit).

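The checkout/return mechanics described above can be sketched in pure Python. This is only an illustration of the pattern, not redis-py's actual implementation; the ToyConnection and ToyPool names are made up:

```python
import queue

class ToyConnection(object):
    """Stands in for one real TCP connection to Redis."""
    def __init__(self, conn_id):
        self.conn_id = conn_id

class ToyPool(object):
    """Minimal fixed-size pool: connections are created up front,
    checked out for a command, then returned for reuse."""
    def __init__(self, size):
        self._free = queue.Queue()
        for i in range(size):
            self._free.put(ToyConnection(i))

    def get_connection(self):
        return self._free.get()   # blocks if every connection is in use

    def release(self, conn):
        self._free.put(conn)      # hand the connection back for reuse

pool = ToyPool(size=1)
c1 = pool.get_connection()
pool.release(c1)
c2 = pool.get_connection()
pool.release(c2)
assert c1 is c2   # the same connection object is reused, not recreated
```

The point is that connections are expensive to open but cheap to borrow; the pool pays the setup cost once.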
You could choose to set up the connection pool in an init method and make the pool global (you can look at other options if you are uncomfortable with globals).

import os
import redis

redis_pool = None

def init():
    global redis_pool
    print("PID %d: initializing redis pool..." % os.getpid())
    redis_pool = redis.ConnectionPool(host='10.0.0.1', port=6379, db=0)

You can then retrieve the connection from a pool like this:

redis_conn = redis.Redis(connection_pool=redis_pool)

Also, I am assuming you are using hiredis along with redis-py, as it should improve performance in certain cases. Have you checked the number of connections open to the Redis server with your existing setup? It is most likely quite high. You can use the INFO command to get that information:

redis-cli info

Check the Clients section, in which you will see the "connected_clients" field telling you how many connections are open to the redis server at that instant.

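INFO output is plain key:value text, so the figure can also be extracted programmatically. A stdlib-only sketch (the sample output below uses illustrative values, not real server output):

```python
# A sample of the Clients section of `redis-cli info` (illustrative values)
info_text = """\
# Clients
connected_clients:47
blocked_clients:0
"""

def parse_info(text):
    """Parse Redis INFO output into a dict, skipping comment/blank lines."""
    stats = {}
    for line in text.splitlines():
        if not line or line.startswith('#'):
            continue
        key, _, value = line.partition(':')
        stats[key] = value
    return stats

stats = parse_info(info_text)
print(stats['connected_clients'])  # prints 47
```

With redis-py, the same figure is available already parsed via the client's info() method, e.g. redis_conn.info()['connected_clients'].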
Answered by saaj

Here's a quote right from the Cheese Shop page.

Behind the scenes, redis-py uses a connection pool to manage connections to a Redis server. By default, each Redis instance you create will in turn create its own connection pool. You can override this behavior and use an existing connection pool by passing an already created connection pool instance to the connection_pool argument of the Redis class. You may choose to do this in order to implement client side sharding or have finer grain control of how connections are managed.

pool = redis.ConnectionPool(host='localhost', port=6379, db=0)
r = redis.Redis(connection_pool=pool)

Moreover, instances are thread-safe:

Redis client instances can safely be shared between threads. Internally, connection instances are only retrieved from the connection pool during command execution, and returned to the pool directly after. Command execution never modifies state on the client instance.

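That per-command checkout discipline is what makes sharing one client across threads safe. A stdlib-only sketch of the idea (SharedClient is a made-up stand-in, not redis-py code):

```python
import queue
import threading

class SharedClient(object):
    """One client object shared by all threads; a 'connection' is
    borrowed from the pool only for the duration of each command."""
    def __init__(self, pool_size):
        self._pool = queue.Queue()
        for i in range(pool_size):
            self._pool.put("conn-%d" % i)

    def execute(self, command):
        conn = self._pool.get()        # checked out during execution...
        try:
            return "%s via %s" % (command, conn)
        finally:
            self._pool.put(conn)       # ...and returned directly after

client = SharedClient(pool_size=2)
results = []

def worker():
    for _ in range(100):
        results.append(client.execute("HGETALL"))

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All 400 commands completed through one shared client object
assert len(results) == 400
```

No thread ever holds client-level state between commands, which is the property the quoted passage describes.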
You say:

So each task file has its own redis instance (which presumably is very expensive). ... For our system, we have over a dozen task files following this same pattern, and I've noticed our requests slowing down.

It's quite unlikely that several dozen connections can slow down the Redis server. And because your code, behind the scenes, uses a connection pool, the problem lies somewhere other than the connections per se. Redis is an in-memory store and thus very fast in most imaginable cases, so I would rather look for the problem in the tasks themselves.

Update

From a comment by @user3813256: yes, he uses a connection pool at the task level. The normal way to utilize the built-in connection pool of the redis package is simply to share the connection. In the simplest form, your settings.py may look like this:

import redis

connection = None

def connect_to_redis():
    global connection
    connection = redis.StrictRedis(host='localhost', port=6379, db=0)

Then call connect_to_redis somewhere in your application's bootstrap code, and in task modules do import settings and access settings.connection (a from settings import connection executed before connect_to_redis runs would capture the initial None).

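The shared-connection pattern can be sketched end to end in one runnable script; here the settings module is simulated at runtime, and object() stands in for redis.StrictRedis(...):

```python
import sys
import types

# Simulate the settings module described above in a single script.
settings = types.ModuleType("settings")
settings.connection = None

def connect_to_redis():
    settings.connection = object()   # stands in for redis.StrictRedis(...)

settings.connect_to_redis = connect_to_redis
sys.modules["settings"] = settings

# A task module would do `import settings` and read the attribute lazily,
# so it always sees the value set during bootstrap:
import settings as task_view
settings.connect_to_redis()
assert task_view.connection is not None
```

Note that a from settings import connection executed before connect_to_redis() would have copied the initial None into the task module; importing the module and reading settings.connection avoids that.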
Answered by DhruvPathak

You can use a singleton (Borg pattern) wrapper written over redis-py, which will provide a common connection pool to all your files. Whenever you use an object of this wrapper class, it uses the same connection pool.

REDIS_SERVER_CONF = {
    'servers': {
        'main_server': {
            'HOST': 'X.X.X.X',
            'PORT': 6379,
            'DATABASE': 0
        }
    }
}

import redis

class RedisWrapper(object):
    shared_state = {}

    def __init__(self):
        # Borg pattern: every instance shares one attribute dict.
        self.__dict__ = self.shared_state

    def redis_connect(self, server_key):
        # Create the pool for a given server once and cache it in the
        # shared state, so all wrapper instances reuse the same pool.
        pools = self.shared_state.setdefault('pools', {})
        if server_key not in pools:
            conf = REDIS_SERVER_CONF['servers'][server_key]
            pools[server_key] = redis.ConnectionPool(
                host=conf['HOST'], port=conf['PORT'], db=conf['DATABASE'])
        return redis.StrictRedis(connection_pool=pools[server_key])

Usage:

r_server = RedisWrapper().redis_connect(server_key='main_server')
r_server.ping()
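
The Borg pattern works by pointing every instance at one shared attribute dictionary; a minimal demonstration of just that mechanism:

```python
class Borg(object):
    _shared_state = {}

    def __init__(self):
        # Every instance uses the same dict for its attributes, so state
        # set through one instance is visible through all of them.
        self.__dict__ = self._shared_state

first = Borg()
first.pool = "connection-pool"

second = Borg()
assert second.pool == "connection-pool"  # shared state
assert first is not second               # but distinct objects
```

This is why every RedisWrapper() in the answer above ends up using the same cached connection pool.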

UPDATE

In case your files run as different processes, you will have to use a Redis proxy that pools the connections for you; instead of connecting to Redis directly, you connect to the proxy. A very stable Redis (and memcached) proxy is twemproxy, created by Twitter, whose main purpose is reducing the number of open connections.

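For reference, a twemproxy server-pool definition lives in its nutcracker.yml config and looks roughly like this (all values here are illustrative, adapted from the pattern shown in the twemproxy README):

```yaml
alpha:
  listen: 127.0.0.1:22121
  hash: fnv1a_64
  distribution: ketama
  redis: true
  server_retry_timeout: 2000
  server_failure_limit: 1
  servers:
   - 10.0.0.1:6379:1
```

Clients then point at 127.0.0.1:22121 instead of the Redis server, and twemproxy maintains a small, long-lived set of connections to the backend.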