Python SQLAlchemy filter in_ operator

Question

Asked by user1988705

I am trying to do a simple filter operation on a query in sqlalchemy, like this:

q = session.query(Genotypes).filter(Genotypes.rsid.in_(inall))

where inall is a list of strings and Genotypes is mapped to a table:

class Genotypes(object):
    pass

Genotypes.mapper = mapper(Genotypes, kg_table, properties={'rsid': getattr(kg_table.c, 'rs#')})

This seems pretty straightforward to me, but I get the following error when I execute the above query by doing q.first():

"sqlalchemy.exc.OperationalError: (OperationalError) too many SQL variables u'SELECT" followed by a list of the 1M items in the inall list. But they aren't supposed to be SQL variables, just a list whose membership is the filtering criterion.

Am I doing the filtering incorrectly?

(the db is sqlite)

Answer 1

Answered by Sean Vieira

If the table where you are getting your rsids from is available in the same database, I'd use a subquery to pass them into your Genotypes query rather than passing the one million entries around in your Python code.

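# RSID_Source stands for the mapped class of the table holding the wanted rsids;
# its rsid column name is an assumption. Select only that one column for the IN.
sq = session.query(RSID_Source.rsid).subquery()
q = session.query(Genotypes).filter(Genotypes.rsid.in_(sq))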
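For anyone who wants to try the subquery idea end to end, here is a minimal sketch against an in-memory SQLite database. It assumes SQLAlchemy 1.4 or newer and uses declarative models instead of the classical mapper() from the question. The RsidSource class, the table names, and the sample values are all made up for illustration.

```python
from sqlalchemy import Column, Integer, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Genotypes(Base):
    # Simplified stand-in for the question's mapped table.
    __tablename__ = "genotypes"
    id = Column(Integer, primary_key=True)
    rsid = Column(String)


class RsidSource(Base):
    # Hypothetical table that already holds the rsids to filter on.
    __tablename__ = "rsid_source"
    id = Column(Integer, primary_key=True)
    rsid = Column(String)


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Genotypes(rsid=r) for r in ("rs1", "rs2", "rs3")])
    session.add_all([RsidSource(rsid=r) for r in ("rs1", "rs3")])
    session.commit()

    # The IN list is produced by the database itself, so no Python list
    # is turned into bound parameters.
    wanted = select(RsidSource.rsid)
    q = session.query(Genotypes).filter(Genotypes.rsid.in_(wanted))
    matched = sorted(g.rsid for g in q)

print(matched)
```

Passing the select() directly to in_() avoids the coercion warning that newer SQLAlchemy versions emit when a .subquery() object is used there.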

The issue is that in order to pass that list to SQLite (or any database, really), SQLAlchemy has to pass each entry of your IN clause as a variable. The SQL translates roughly to:

-- Not valid SQLite SQL
DECLARE @Param1 TEXT;
SET @Param1 = ?;
DECLARE @Param2 TEXT;
SET @Param2 = ?;
-- snip 999,998 more

SELECT field1, field2, -- etc.
FROM Genotypes G
WHERE G.rsid IN (@Param1, @Param2, /* snip */)
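The same point can be seen without SQLAlchemy at all. The sketch below uses Python's standard sqlite3 module (not what SQLAlchemy runs internally, just an illustration of the mechanism) to show that a list-based IN needs one "?" placeholder per item, while a subquery needs none. The table names and sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE genotypes (rsid TEXT)")
conn.execute("CREATE TABLE rsid_source (rsid TEXT)")  # hypothetical source table
conn.executemany("INSERT INTO genotypes VALUES (?)", [("rs1",), ("rs2",), ("rs3",)])
conn.executemany("INSERT INTO rsid_source VALUES (?)", [("rs1",), ("rs3",)])

inall = ["rs1", "rs3"]

# List-based IN: one bound variable per list item, so a 1M-item list
# means 1M variables, which is what SQLite rejects.
placeholders = ", ".join("?" for _ in inall)
list_sql = f"SELECT rsid FROM genotypes WHERE rsid IN ({placeholders}) ORDER BY rsid"
from_list = [row[0] for row in conn.execute(list_sql, inall)]

# Subquery-based IN: no bound variables, regardless of how many rsids match.
subquery_sql = (
    "SELECT rsid FROM genotypes "
    "WHERE rsid IN (SELECT rsid FROM rsid_source) ORDER BY rsid"
)
from_subquery = [row[0] for row in conn.execute(subquery_sql)]

print(list_sql)
print(from_list, from_subquery)
```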

Answer 2

Answered by Ido S

The below workaround worked for me:

q = session.query(Genotypes).filter(Genotypes.rsid.in_(inall))
query_as_string = str(q.statement.compile(compile_kwargs={"literal_binds": True}))
session.execute(query_as_string).first()

This basically forces the query to compile as a string before execution, which bypasses the whole variables issue. Some details on this are available in SQLAlchemy's docs.
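To see what literal_binds actually changes, it can help to print the compiled SQL. This is a small sketch assuming SQLAlchemy 1.4 or newer; it uses a Core Table named genotypes as a stand-in for the mapped class, and the two sample rsids are invented.

```python
from sqlalchemy import Column, MetaData, String, Table, select

metadata = MetaData()
# Stand-in table; the real one in the question is kg_table.
genotypes = Table("genotypes", metadata, Column("rsid", String))

inall = ["rs1", "rs2"]
stmt = select(genotypes.c.rsid).where(genotypes.c.rsid.in_(inall))

# Default compilation keeps the list as bound parameters.
default_sql = str(stmt)

# With literal_binds the values are rendered inline in the SQL text,
# so no bound variables are sent to the database.
inlined_sql = str(stmt.compile(compile_kwargs={"literal_binds": True}))

print(default_sql)
print(inlined_sql)
```

Keep in mind that inlining renders the values into the SQL text itself, so it is best kept to data you trust and to simple types such as strings and integers.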

BTW, if you're not using SQLite you can make use of the ANY operator to pass the list object as a single parameter (see my answer to this question here).
