如何在多线程 Python 应用程序中共享单个 SQLite 连接

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/22739590/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-19 01:33:14  来源:igfitidea点击:

How to share single SQLite connection in multi-threaded Python application

pythonmultithreadingsqlite

提问by Larry Lustig

I am trying to write a multi-threaded Python application in which a single SQlite connection is shared among threads. I am unable to get this to work. The real application is a cherrypy web server, but the following simple code demonstrates my problem.

我正在尝试编写一个多线程 Python 应用程序,其中在线程之间共享单个 SQlite 连接。我无法让它发挥作用。真正的应用程序是一个cherrypy web 服务器,但下面的简单代码演示了我的问题。

What change or changes to I need to make to run the sample code, below, successfully?

为了成功运行下面的示例代码,我需要做哪些更改?

When I run this program with THREAD_COUNT set to 1 it works fine and my database is updated as I expect (that is, letter "X" is added to the text value in the SectorGroup column).

当我在 THREAD_COUNT 设置为 1 的情况下运行此程序时,它工作正常,并且我的数据库按预期更新(即,将字母“X”添加到 SectorGroup 列中的文本值中)。

When I run it with THREAD_COUNT set to anything higher than 1, all threads but 1 terminate prematurely with SQLite related exceptions. Different threads throw different exceptions (with no discernible pattern) including:

当我将 THREAD_COUNT 设置为大于 1 的任何值运行它时,除 1 之外的所有线程都因 SQLite 相关异常而过早终止。不同的线程会抛出不同的异常(没有明显的模式),包括:

OperationalError: cannot start a transaction within a transaction 

(occurs on the UPDATEstatement)

(出现在UPDATE语句中)

OperationalError: cannot commit - no transaction is active 

(occurs on the .commit() call)

(发生在 .commit() 调用上)

InterfaceError: Error binding parameter 0 - probably unsupported type. 

(occurs on the UPDATEand the SELECTstatements)

(发生在UPDATESELECT语句上)

IndexError: tuple index out of range

(this one has me completely puzzled, it occurs on the statement group = rows[0][0] or '', but only when multiple threads are running)

(这个让我完全困惑,它发生在语句上group = rows[0][0] or '',但仅在多个线程运行时)

Here is the code:

这是代码:

CONNECTION = sqlite3.connect('./database/mydb', detect_types=sqlite3.PARSE_DECLTYPES, check_same_thread = False)
CONNECTION.row_factory = sqlite3.Row

def commands(start_id):

    # loop over 100 records, read the SectorGroup column, and write it back with "X" appended.
    for inv_id in range(start_id, start_id + 100):

        rows = CONNECTION.execute('SELECT SectorGroup FROM Investment WHERE InvestmentID = ?;', [inv_id]).fetchall()
        if rows:
            group = rows[0][0] or ''
            msg = '{} inv {} = {}'.format(current_thread().name, inv_id, group)
            print msg
            CONNECTION.execute('UPDATE Investment SET SectorGroup = ? WHERE InvestmentID = ?;', [group + 'X', inv_id])

        CONNECTION.commit()

if __name__ == '__main__':

    THREAD_COUNT = 10

    for i in range(THREAD_COUNT):
        t = Thread(target=commands, args=(i*100,))
        t.start()

采纳答案by Martijn Pieters

It's not safe to share a connection between threads; at the very least you need to use a lock to serialize access. Do also read http://docs.python.org/2/library/sqlite3.html#multithreadingas older SQLite versions have more issues still.

在线程之间共享连接是不安全的;至少您需要使用锁来序列化访问。也请阅读http://docs.python.org/2/library/sqlite3.html#multithreading,因为较旧的 SQLite 版本仍有更多问题。

The check_same_threadoption appears deliberately under-documented in that respect, see http://bugs.python.org/issue16509.

check_same_thread选项在这方面故意记录不足,请参阅http://bugs.python.org/issue16509

You could use a connection per thread instead, or look to SQLAlchemy for a connection pool (and a very efficient statement-of-work and queuing system to boot).

您可以改为使用每个线程的连接,或者在 SQLAlchemy 中寻找连接池(以及一个非常有效的工作说明和排队系统来引导)。

回答by Jacques de Hooge

I ran into the SqLite threading problem when writing a simple WSGI server for fun and learning. WSGI is multi-threaded by nature when running under Apache. The following code seems to work for me:

我在编写一个简单的 WSGI 服务器以获取乐趣和学习时遇到了 Sqlite 线程问题。WSGI 在 Apache 下运行时本质上是多线程的。以下代码似乎对我有用:

import sqlite3
import threading

class LockableCursor:
    def __init__ (self, cursor):
        self.cursor = cursor
        self.lock = threading.Lock ()

    def execute (self, arg0, arg1 = None):
        self.lock.acquire ()

        try:
            self.cursor.execute (arg1 if arg1 else arg0)

            if arg1:
                if arg0 == 'all':
                    result = self.cursor.fetchall ()
                elif arg0 == 'one':
                    result = self.cursor.fetchone ()
        except Exception as exception:
            raise exception

        finally:
            self.lock.release ()
            if arg1:
                return result

def dictFactory (cursor, row):
    aDict = {}
    for iField, field in enumerate (cursor.description):
        aDict [field [0]] = row [iField]
    return aDict

class Db:
    def __init__ (self, app):
        self.app = app

    def connect (self):
        self.connection = sqlite3.connect (self.app.dbFileName, check_same_thread = False, isolation_level = None)  # Will create db if nonexistent
        self.connection.row_factory = dictFactory
        self.cs = LockableCursor (self.connection.cursor ())

Example of use:

使用示例:

if not ok and self.user:    # Not logged out
    # Get role data for any later use
    userIdsRoleIds = self.cs.execute ('all', 'SELECT role_id FROM users_roles WHERE user_id == {}'.format (self.user ['id']))

    for userIdRoleId in userIdsRoleIds:
        self.userRoles.append (self.cs.execute ('one', 'SELECT name FROM roles WHERE id == {}'.format (userIdRoleId ['role_id'])))

Another example:

另一个例子:

self.cs.execute ('CREATE TABLE users (id INTEGER PRIMARY KEY, email_address, password, token)')         
self.cs.execute ('INSERT INTO users (email_address, password) VALUES ("{}", "{}")'.format (self.app.defaultUserEmailAddress, self.app.defaultUserPassword))

# Create roles table and insert default role
self.cs.execute ('CREATE TABLE roles (id INTEGER PRIMARY KEY, name)')
self.cs.execute ('INSERT INTO roles (name) VALUES ("{}")'.format (self.app.defaultRoleName))

# Create users_roles table and assign default role to default user
self.cs.execute ('CREATE TABLE users_roles (id INTEGER PRIMARY KEY, user_id, role_id)') 

defaultUserId = self.cs.execute ('one', 'SELECT id FROM users WHERE email_address = "{}"'.format (self.app.defaultUserEmailAddress)) ['id']         
defaultRoleId = self.cs.execute ('one', 'SELECT id FROM roles WHERE name = "{}"'.format (self.app.defaultRoleName)) ['id']

self.cs.execute ('INSERT INTO users_roles (user_id, role_id) VALUES ({}, {})'.format (defaultUserId, defaultRoleId))

Complete program using this construction downloadable at: http://www.josmith.org/

使用此结构的完整程序可从以下网址下载:http: //www.josmith.org/

N.B. The code above is experimental, there may be (fundamental) issues when using this with (many) concurrent requests (e.g. as part of a WSGI server). Performance is not critical for my application. The simplest thing probably would have been to just use MySql, but I like to experiment a little, and the zero installation thing about SqLite appealed to me. If anyone thinks the code above is fundamentally flawed, please react, as my purpose is to learn. If not, I hope this is useful for others.

注意上面的代码是实验性的,当将它与(许多)并发请求(例如作为 WSGI 服务器的一部分)一起使用时,可能存在(基本)问题。性能对我的应用程序并不重要。最简单的方法可能是只使用 MySql,但我喜欢尝试一些,而 Sqlite 的零安装对我很有吸引力。如果有人认为上面的代码存在根本缺陷,请回复,因为我的目的是学习。如果没有,我希望这对其他人有用。

回答by Erik Aronesty

I'm guessing here, but it looks like the reason why you are doing this is a performance concern.

我在这里猜测,但看起来你这样做的原因是性能问题。

Python threads aren't performant in any meaningful way for this use case. Instead, use sqlite transactions, which are super fast.

对于这个用例,Python 线程没有任何有意义的性能。相反,使用超级快的 sqlite 事务。

If you do all your updates in a transaction, you'll find an order of magnitude speedup.

如果您在一个事务中完成所有更新,您会发现一个数量级的加速。