python MySQLDB query timeout

Disclaimer: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/1507091/

Date: 2020-11-03 22:26:21  Source: igfitidea

python MySQLDB query timeout

python, timeout, mysql

Asked by mluebke

I'm trying to enforce a time limit on queries in python MySQLDB. I have a situation where I have no control over the queries, but need to ensure that they do not run over a set time limit. I've tried using signal.SIGALRM to interrupt the call to execute, but this does not seem to work. The signal gets sent, but does not get caught until after the call to execute finishes.


I wrote a test case to prove this behavior:


#!/usr/local/bin/python2.6

import time
import signal

from somewhere import get_dbc

class Timeout(Exception):
    """ Time Exceeded """

def _alarm_handler(*args):
    raise Timeout

dbc = get_dbc()

signal.signal(signal.SIGALRM, _alarm_handler)
signal.alarm(1)

try:
    print "START:  ", time.time()
    dbc.execute("SELECT SLEEP(10)")
except Timeout:
    print "TIMEOUT!", time.time()

The "SELECT SLEEP(10)" is simulating a slow query, but I do see the same behavior with an actual slow query.


The Result:


START:   1254440686.69
TIMEOUT! 1254440696.69

As you can see, it sleeps for the full 10 seconds and only then do I get the Timeout exception.


Questions:


  1. Why do I not get the signal until after execute finishes?
  2. Is there another reliable way to limit query execution time?

Accepted answer by Alex Martelli

@nosklo's twisted-based solution is elegant and workable, but if you want to avoid the dependency on Twisted, the task is still doable, e.g.:


import multiprocessing

def query_with_timeout(dbc, timeout, query, *a, **k):
  conn1, conn2 = multiprocessing.Pipe(False)
  subproc = multiprocessing.Process(target=do_query,
                                    args=(dbc, query, conn2)+a, 
                                    kwargs=k)
  subproc.start()
  subproc.join(timeout)
  if conn1.poll():
    return conn1.recv()
  subproc.terminate()
  raise TimeoutError("Query %r ran for >%r" % (query, timeout))

def do_query(dbc, query, conn, *a, **k):
  cu = dbc.cursor()
  cu.execute(query, *a, **k)
  conn.send(cu.fetchall())  # send the rows back through the pipe; the parent reads them via conn1
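For reference, this pattern can be exercised without a database by substituting any slow callable for the query. A self-contained sketch along the same lines (hypothetical names, standard library only, Python 3):

```python
import multiprocessing
import time

class QueryTimeoutError(Exception):
    pass

def _run_and_send(fn, conn, args):
    # Child process: run the callable and ship its result back through the pipe.
    conn.send(fn(*args))

def run_with_timeout(fn, args=(), timeout=5.0):
    """Run fn(*args) in a child process; kill it if it outlives `timeout` seconds."""
    recv_end, send_end = multiprocessing.Pipe(False)
    proc = multiprocessing.Process(target=_run_and_send, args=(fn, send_end, args))
    proc.start()
    proc.join(timeout)            # wait at most `timeout` seconds
    if recv_end.poll():           # did the child send a result in time?
        return recv_end.recv()
    proc.terminate()              # no result arrived: SIGTERM the child
    raise QueryTimeoutError("call exceeded %.1f seconds" % timeout)

def fake_query(seconds):
    # Stand-in for a slow SQL query.
    time.sleep(seconds)
    return "done"
```

With this, `run_with_timeout(fake_query, (0,), timeout=5.0)` returns normally, while a slow call with a short timeout raises `QueryTimeoutError`. Note that `terminate()` abandons the child mid-query; the server-side query may keep running until MySQL notices the dropped connection.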

Answered by shurik

I've tried using signal.SIGALRM to interrupt the call to execute, but this does not seem to work. The signal gets sent, but does not get caught until after the call to execute finishes.


The MySQL client library handles interrupted system calls internally, so you won't see the side effects of SIGALRM until after the API call completes (short of killing the current thread or process).

You can try patching MySQL-Python to use the MYSQL_OPT_READ_TIMEOUT option (added in MySQL 5.0.25).
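With modern forks of MySQL-Python this no longer requires patching: the mysqlclient package accepts a read_timeout connection argument (in seconds) that maps to MYSQL_OPT_READ_TIMEOUT. A minimal sketch, with placeholder connection parameters and the import deferred so the snippet loads even without the package installed:

```python
def connect_with_read_timeout(host, user, passwd, db, seconds=5):
    # read_timeout (mysqlclient) maps to MYSQL_OPT_READ_TIMEOUT: if no data
    # arrives from the server within `seconds`, the call fails with an
    # OperationalError instead of blocking indefinitely.
    import MySQLdb  # deferred so this sketch can be loaded without the package
    return MySQLdb.connect(host=host, user=user, passwd=passwd, db=db,
                           read_timeout=seconds)
```

One caveat: MYSQL_OPT_READ_TIMEOUT bounds each network read, not total query time, and the client may retry internally, so treat the effective limit as approximate rather than exact.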

Answered by Vyktor

Generic notes


I've run into the same issue lately, with several conditions I had to meet:

  • the solution must be thread-safe
  • multiple connections to the database from the same machine may be active at the same time, and exactly the one offending connection/query must be killed
  • the application holds connections to many different databases, so the handler must be portable across DB hosts

We had the following class layout (unfortunately I cannot post the real sources):

class AbstractModel: pass 
class FirstDatabaseModel(AbstractModel): pass # Connection to one DB host
class SecondDatabaseModel(AbstractModel): pass # Connection to one DB host

And created several threads for each model.




Solution (Python 3.2)

In our application, one model = one database. So I created a "service connection" for each model (so we could execute KILL over a parallel connection). Therefore, if one instance of FirstDatabaseModel was created, 2 database connections were created; if 5 instances were created, only 6 connections were used:

class AbstractModel:
    _service_connection = None # Formal declaration

    def __init__(self):
        ''' Somehow load config and create connection
        '''
        self.config = ...  # load configuration somehow
        self.connection = MySQLFromConfig(self.config)
        self._init_service_connection()

        # Get connection ID (pseudocode)
        self.connection_id = self.connection.FetchOneCol('SELECT CONNECTION_ID()') 

    def _init_service_connection(self):
        ''' Initialize one singleton connection for model
        '''
        cls = type(self)
        if cls._service_connection is not None:
            return

        cls._service_connection = MySQLFromConfig(self.config)

Now we need a killer:


def _kill_connection(self):
    # Add your own mysql data escaping
    sql = 'KILL CONNECTION {}'.format(self.connection_id)

    # Do your own connection check and renewal
    type(self)._service_connection.execute(sql)

Note: connection.execute = create cursor, execute query, close cursor.
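Since CONNECTION_ID() returns an integer, the escaping that the comment asks for can be as simple as coercing with int() before formatting; anything non-numeric then fails loudly instead of reaching the server. An illustrative helper (hypothetical name):

```python
def kill_connection_sql(connection_id):
    # int() rejects anything that is not a whole number, so no string from
    # the outside can be smuggled into the KILL statement.
    return "KILL CONNECTION {}".format(int(connection_id))
```

For example, `kill_connection_sql(42)` yields `"KILL CONNECTION 42"`, while `kill_connection_sql("42; DROP TABLE t")` raises ValueError.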

And make the killer thread-safe using threading.Lock:

def _init_service_connection(self):
    ''' Initialize one singleton connection for model
    '''
    cls = type(self)
    if cls._service_connection is not None:
        return

    cls._service_connection = MySQLFromConfig(self.config)
    cls._service_connection_lock = threading.Lock()

def _kill_connection(self):
    # Add your own mysql data escaping
    sql = 'KILL CONNECTION {}'.format(self.connection_id)
    cls = type(self)

    # Do your own connection check and renewal
    try:
        cls._service_connection_lock.acquire()    
        cls._service_connection.execute(sql)
    finally:
        cls._service_connection_lock.release()

And finally, add a timed execution method using threading.Timer:

def timed_query(self, sql, timeout=5):
    kill_query_timer = threading.Timer(timeout, self._kill_connection)
    kill_query_timer.start()

    try:
        self.connection.long_query() 
    finally:
        kill_query_timer.cancel()
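The timer-plus-finally pattern above is independent of MySQL and can be tried with any callable standing in for the long query. A minimal self-contained sketch (hypothetical names):

```python
import threading
import time

def timed_call(fn, timeout, on_timeout):
    # Arrange for on_timeout() to fire on a helper thread if fn() is still
    # running after `timeout` seconds; cancel the timer once fn() returns.
    killer = threading.Timer(timeout, on_timeout)
    killer.start()
    try:
        return fn()
    finally:
        killer.cancel()
```

Here on_timeout would hold the role of _kill_connection: it cannot interrupt fn() directly, but it can act out-of-band (issue KILL over the service connection), after which the blocked call returns with an error.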

Answered by John Millikin

Why do I not get the signal until after execute finishes?


The query is executed through a C function, which blocks the Python VM from executing until it returns.

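The contrast is easy to demonstrate with a blocking call that does honor signals: time.sleep() returns early when the Python-level handler raises, which is exactly what MySQLdb's execute does not do. A small Unix-only sketch:

```python
import signal
import time

class Timeout(Exception):
    pass

def _handler(signum, frame):
    raise Timeout

signal.signal(signal.SIGALRM, _handler)
signal.setitimer(signal.ITIMER_REAL, 0.2)    # deliver SIGALRM in 0.2s

start = time.time()
try:
    time.sleep(5)        # interruptible: aborts when the handler raises
    interrupted = False
except Timeout:
    interrupted = True
finally:
    signal.setitimer(signal.ITIMER_REAL, 0)  # disarm the timer

elapsed = time.time() - start                # well under 5s here
```

With `dbc.execute("SELECT SLEEP(10)")` in place of time.sleep(5), the handler would not run until the C call returned, which is the behavior observed in the question.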

Is there another reliable way to limit query execution time?


This is (IMO) a really ugly solution, but it does work. You could run the query in a separate process (either via fork() or the multiprocessing module). Run the alarm timer in your main process, and when it fires, send a SIGINT or SIGKILL to the child process. If you use multiprocessing, you can use the Process.terminate() method.

Answered by nosklo

Use Twisted's adbapi. It allows you to make the database call asynchronously.

from twisted.internet import reactor
from twisted.enterprise import adbapi

def bogusQuery():
    return dbpool.runQuery("SELECT SLEEP(10)")

def printResult(l):
    # function that would be called if it didn't time out
    for item in l:
        print item

def handle_timeout():
    # function that will be called on timeout
    reactor.stop()

dbpool = adbapi.ConnectionPool("MySQLdb", user="me", password="myself", host="localhost", database="async")
bogusQuery().addCallback(printResult)
reactor.callLater(4, handle_timeout)
reactor.run()

Answered by Rafał Dowgird

Why do I not get the signal until after execute finishes?


The process waiting for network I/O is in an uninterruptible state (a UNIX thing, not related to Python or MySQL). It gets the signal after the system call finishes (probably as an EINTR error code, although I am not sure).

Is there another reliable way to limit query execution time?

是否有另一种可靠的方法来限制查询执行时间?

I think this is usually done by an external tool like mkill that monitors MySQL for long-running queries and kills them.