Python: How to make SQLAlchemy in Tornado async?
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, link back to the original, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/16491564/
How to make SQLAlchemy in Tornado to be async?
Asked by Damir
How to make SQLAlchemy in Tornado to be async?
I found an example for MongoDB (async mongo example) but I couldn't find anything like motor for SQLAlchemy. Does anyone know how to make SQLAlchemy queries execute with tornado.gen (I am using MySQL below SQLAlchemy; at the moment my handlers read from the database and return the result, and I would like to make this async).
Accepted answer by zzzeek
ORMs are poorly suited for explicit asynchronous programming, that is, where the programmer must produce explicit callbacks anytime something that uses network access occurs. A primary reason for this is that ORMs make extensive use of the lazy loading pattern, which is more or less incompatible with explicit async. Code that looks like this:
user = Session.query(User).first()
print user.addresses
will actually emit two separate queries - one when you say first() to load a row, and the next when you say user.addresses, in the case that the .addresses collection isn't already present or has been expired. Essentially, nearly every line of code that deals with ORM constructs might block on IO, so you'd be in extensive callback spaghetti within seconds - and to make matters worse, the vast majority of those code lines won't actually block on IO, so all the overhead of connecting callbacks together for what would otherwise be simple attribute access operations will make your program vastly less efficient too.
A major issue with explicit asynchronous models is that they add tremendous Python function call overhead to complex systems - not just on the user-facing side like you get with lazy loading, but on the internal side as well regarding how the system provides abstraction around the Python database API (DBAPI). For SQLAlchemy to even have basic async support would impose a severe performance penalty on the vast majority of programs that don't use async patterns, and even those async programs that are not highly concurrent. Consider SQLAlchemy, or any other ORM or abstraction layer, might have code like the following:
def execute(connection, statement):
    cursor = connection.cursor()
    cursor.execute(statement)
    results = cursor.fetchall()
    cursor.close()
    return results
The above code performs what seems to be a simple operation, executing a SQL statement on a connection. But using a fully async DBAPI like psycopg2's async extension, the above code blocks on IO at least three times. So to write the above code in explicit async style, even when there's no async engine in use and the callbacks aren't actually blocking, means the above outer function call becomes at least three function calls, instead of one, not including the overhead imposed by the explicit asynchronous system or the DBAPI calls themselves. So a simple application is automatically given a penalty of 3x the function call overhead surrounding a simple abstraction around statement execution. And in Python, function call overhead is everything.
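To make that concrete, here is a minimal sketch of what the same helper has to become in explicit-async style, written with tornado.gen as asked in the question. The coroutine-style DBAPI it calls is hypothetical; the point is only that one call turns into three separately scheduled steps:

from tornado import gen

# Sketch only: "connection" stands in for a hypothetical coroutine-style
# DBAPI wrapper, not a real driver API. Each of the three IO points above
# must now be yielded and rescheduled separately.
@gen.coroutine
def execute(connection, statement):
    cursor = yield connection.cursor()    # first potential IO wait
    yield cursor.execute(statement)       # second potential IO wait
    results = yield cursor.fetchall()     # third potential IO wait
    cursor.close()
    raise gen.Return(results)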
For these reasons, I continue to be less than excited about the hype surrounding explicit async systems, at least to the degree that some folks seem to want to go all async for everything, like delivering web pages (see node.js). I'd recommend using implicit async systems instead, most notably gevent, where you get all the non-blocking IO benefits of an asynchronous model and none of the structural verbosity/downsides of explicit callbacks. I continue to try to understand use cases for these two approaches, so I'm puzzled by the appeal of the explicit async approach as a solution to all problems, i.e. as you see with node.js - we're using scripting languages in the first place to cut down on verbosity and code complexity, and explicit async for simple things like delivering web pages seems to do nothing but add boilerplate that can just as well be automated by gevent or similar, if blocking IO is even such a problem in a case like that (plenty of high volume websites do fine with a synchronous IO model). Gevent-based systems are production proven and their popularity is growing, so if you like the code automation that ORMs provide, you might also want to embrace the async-IO-scheduling automation that a system like gevent provides.
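As a minimal sketch of the implicit approach recommended above (assuming gevent plus a pure-Python driver such as PyMySQL, and an illustrative connection URL), the ORM code stays plain blocking SQLAlchemy while gevent's monkey patching makes the underlying socket IO cooperative:

from gevent import monkey
monkey.patch_all()  # patch sockets/ssl before anything else imports them

from sqlalchemy import create_engine, text

# pure-Python driver, so its socket IO yields to other greenlets once patched
engine = create_engine("mysql+pymysql://user:password@localhost/test")

def handler():
    # ordinary blocking SQLAlchemy code - no callbacks, no yields
    with engine.connect() as conn:
        return conn.execute(text("SELECT 1")).scalar()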
Update: Nick Coghlan pointed out his great article on the subject of explicit vs. implicit async, which is also a must read here. And I've also been updated to the fact that pep-3156 now welcomes interoperability with gevent, reversing its previously stated disinterest in gevent, largely thanks to Nick's article. So in the future I would recommend a hybrid of Tornado using gevent for the database logic, once the system of integrating these approaches is available.
Answer by Ander
I had this same issue in the past and I couldn't find a reliable Async-MySQL library. However, there is a cool solution using Asyncio + Postgres. You just need to use the aiopg library, which comes with SQLAlchemy support out of the box:
import asyncio
from aiopg.sa import create_engine
import sqlalchemy as sa

metadata = sa.MetaData()

tbl = sa.Table('tbl', metadata,
               sa.Column('id', sa.Integer, primary_key=True),
               sa.Column('val', sa.String(255)))

async def create_table(engine):
    async with engine.acquire() as conn:
        await conn.execute('DROP TABLE IF EXISTS tbl')
        await conn.execute('''CREATE TABLE tbl (
                                  id serial PRIMARY KEY,
                                  val varchar(255))''')

async def go():
    async with create_engine(user='aiopg',
                             database='aiopg',
                             host='127.0.0.1',
                             password='passwd') as engine:
        async with engine.acquire() as conn:
            await conn.execute(tbl.insert().values(val='abc'))

            async for row in conn.execute(tbl.select()):
                print(row.id, row.val)

loop = asyncio.get_event_loop()
loop.run_until_complete(go())
Updated as mentioned by @cglacet
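Since Tornado 5+ runs on the asyncio event loop, the aiopg coroutines above can also be awaited directly from a native-coroutine handler. A rough sketch (the handler name is made up, and engine is the aiopg engine created once at startup):

import tornado.web

class TblHandler(tornado.web.RequestHandler):
    def initialize(self, engine):
        self.engine = engine  # aiopg.sa engine created at application startup

    async def get(self):
        async with self.engine.acquire() as conn:
            rows = [(row.id, row.val) async for row in conn.execute(tbl.select())]
        self.write({"rows": rows})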
Answer by Mykola Kharechko
I am using tornado with sqlalchemy in the following way:
from tornado_mysql import pools
from sqlalchemy.sql import table, column, select, join
from sqlalchemy.dialects import postgresql, mysql

# from models import M, M2

t = table(...)
t2 = table(...)
xxx_id = 10

j = join(t, t2, t.c.t_id == t2.c.id)
s = select([t]).select_from(j).where(t.c.xxx == xxx_id)

sql_str = s.compile(dialect=mysql.dialect(), compile_kwargs={"literal_binds": True})

pool = pools.Pool(conn_data...)
cur = yield pool.execute(sql_str)
data = cur.fetchone()
In that case we are able to use sqlalchemy models and sqlalchemy tools for constructing queries.
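A rough sketch of how the snippet above might sit inside a Tornado handler (the handler name is made up; pool is the tornado_mysql pool and s the SQLAlchemy select() constructed above):

import tornado.web
from tornado import gen

class ItemsHandler(tornado.web.RequestHandler):
    @gen.coroutine
    def get(self):
        sql_str = str(s.compile(dialect=mysql.dialect(),
                                compile_kwargs={"literal_binds": True}))
        cur = yield pool.execute(sql_str)  # non-blocking MySQL round trip
        self.write({"row": cur.fetchone()})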
Answer by Fantix King
Not tornado, but we sort of made SQLAlchemy async in asyncio in the GINO project:
import asyncio
from gino import Gino, enable_task_local
from sqlalchemy import Column, Integer, Unicode, cast

db = Gino()

class User(db.Model):
    __tablename__ = 'users'

    id = Column(Integer(), primary_key=True)
    nickname = Column(Unicode(), default='noname')

async def main():
    await db.create_pool('postgresql://localhost/gino')

    # Create object, `id` is assigned by database
    u1 = await User.create(nickname='fantix')
    print(u1.id, u1.nickname)  # 1 fantix

    # Retrieve the same row, as a different object
    u2 = await User.get(u1.id)
    print(u2.nickname)  # fantix

    # Update affects only database row and the operating object
    await u2.update(nickname='daisy')
    print(u2.nickname)  # daisy
    print(u1.nickname)  # fantix

    # Returns all user objects with "d" in their nicknames
    users = await User.query.where(User.nickname.contains('d')).gino.all()

    # Find one user object, None if not found
    user = await User.query.where(User.nickname == 'daisy').gino.first()

    # Execute complex statement and return command status
    status = await User.update.values(
        nickname='No.' + cast(User.id, Unicode),
    ).where(
        User.id > 10,
    ).gino.status()

    # Iterate over the results of a large query in a transaction as required
    async with db.transaction():
        async for u in User.query.order_by(User.id).gino.iterate():
            print(u.id, u.nickname)

loop = asyncio.get_event_loop()
enable_task_local(loop)
loop.run_until_complete(main())
It looks a bit like, but is actually quite different from, the SQLAlchemy ORM, because we used only a part of SQLAlchemy core and built a simple ORM on top of it. It uses asyncpg underneath, so it is for PostgreSQL only.
Update: GINO supports Tornado now, thanks to the contribution of Vladimir Goncharov. See docs here
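Even without the dedicated extension, Tornado 5+ shares asyncio's event loop, so the GINO coroutines above can be awaited from a native-coroutine handler. A rough sketch (the handler name is made up, and the pool is assumed to have been created at startup as in main()):

import tornado.web

class UserHandler(tornado.web.RequestHandler):
    async def get(self, uid):
        user = await User.get(int(uid))  # None if no such row
        if user is None:
            raise tornado.web.HTTPError(404)
        self.write({"id": user.id, "nickname": user.nickname})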
Answer by DachuanZhao
I am using tornado 6 with sqlalchemy in the following way:
import tornado.web
from tornado.ioloop import IOLoop

def sql_function(*args):
    # stand-in for an ordinary blocking SQLAlchemy call
    pass

class Handler(tornado.web.RequestHandler):
    async def post(self):
        args = get_front_end_args()  # placeholder for request parsing
        # run the blocking call on the default executor (a thread pool)
        result = await IOLoop.current().run_in_executor(None, sql_function, *args)
        self.write({"result": result})
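For illustration, sql_function could wrap any ordinary blocking SQLAlchemy call; a sketch with an assumed connection URL and table:

from sqlalchemy import create_engine, text

engine = create_engine("mysql+pymysql://user:password@localhost/test")

def sql_function(user_id):
    # plain synchronous SQLAlchemy; run_in_executor keeps it off the IO loop
    with engine.connect() as conn:
        row = conn.execute(text("SELECT name FROM users WHERE id = :id"),
                           {"id": user_id}).fetchone()
        return row[0] if row else None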

