Original URL: http://stackoverflow.com/questions/14884686/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Creating a pandas DataFrame from a database query that uses bind variables
Asked by David Marx
I'm working with an Oracle database. I can get this far:
import pandas as pd
import pandas.io.sql as psql
import cx_Oracle as odb
conn = odb.connect(_user +'/'+ _pass +'@'+ _dbenv)
sqlStr = "SELECT * FROM customers"
df = psql.frame_query(sqlStr, conn)
But I don't know how to handle bind variables, like so:
sqlStr = """SELECT * FROM customers
WHERE id BETWEEN :v1 AND :v2
"""
I've tried these variations:
params = (1234, 5678)
params2 = {"v1":1234, "v2":5678}
df = psql.frame_query((sqlStr,params), conn)
df = psql.frame_query((sqlStr,params2), conn)
df = psql.frame_query(sqlStr,params, conn)
df = psql.frame_query(sqlStr,params2, conn)
The following works:
curs = conn.cursor()
curs.execute(sqlStr, params)
df = pd.DataFrame(curs.fetchall())
df.columns = [rec[0] for rec in curs.description]
But this solution is just... inelegant. If I can, I'd like to do this without creating the cursor object. Is there a way to do the whole thing using just pandas?
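If the cursor has to stay, it can at least be hidden inside one small function. The following is a minimal sketch, not a pandas feature: `query_to_frame` is a hypothetical helper name, and an in-memory SQLite database with made-up rows stands in for Oracle so that the example runs anywhere. It assumes only a DB-API 2.0 connection.

```python
import sqlite3

import pandas as pd


def query_to_frame(conn, sql, params=None):
    """Run `sql` with bind parameters and return the rows as a DataFrame.

    Hypothetical helper wrapping the cursor-based approach from the question.
    """
    curs = conn.cursor()
    try:
        if params is None:
            curs.execute(sql)
        else:
            curs.execute(sql, params)
        # Column names are the first item of each cursor.description entry.
        columns = [rec[0] for rec in curs.description]
        return pd.DataFrame(curs.fetchall(), columns=columns)
    finally:
        curs.close()


# Stand-in database (assumption): SQLite instead of Oracle, invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1000, "a"), (2000, "b"), (9000, "c")],
)

df = query_to_frame(
    conn,
    "SELECT * FROM customers WHERE id BETWEEN :v1 AND :v2",
    {"v1": 1234, "v2": 5678},
)
```

The caller then deals only with a connection, a SQL string, and the parameters; the cursor is created and closed inside the helper.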
Answered by privod
Try using pandas.io.sql.read_sql_query. I used it with pandas version 0.20.1 and it worked:
import pandas as pd
import pandas.io.sql as psql
import cx_Oracle as odb
conn = odb.connect(_user +'/'+ _pass +'@'+ _dbenv)
sqlStr = """SELECT * FROM customers
WHERE id BETWEEN :v1 AND :v2
"""
pars = {"v1":1234, "v2":5678}
df = psql.read_sql_query(sqlStr, conn, params=pars)
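For a version that runs without an Oracle server, here is a minimal sketch that uses an in-memory SQLite database as a stand-in (an assumption, as are the table contents). The sqlite3 driver accepts the same `:name` placeholder style, and `pd.read_sql_query` passes `params` on to the driver. With cx_Oracle the call should have the same shape, but the accepted placeholder style always depends on the driver.

```python
import sqlite3

import pandas as pd

# Stand-in database (assumption): SQLite instead of Oracle, invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1000, "a"), (2000, "b"), (3000, "c"), (9000, "d")],
)

# Named bind variables, supplied as a dict.
sql_named = """SELECT * FROM customers
WHERE id BETWEEN :v1 AND :v2
ORDER BY id
"""
df_named = pd.read_sql_query(sql_named, conn, params={"v1": 1234, "v2": 5678})

# Positional bind variables (sqlite3 "qmark" style), supplied as a tuple.
sql_positional = """SELECT * FROM customers
WHERE id BETWEEN ? AND ?
ORDER BY id
"""
df_positional = pd.read_sql_query(sql_positional, conn, params=(1234, 5678))
```

Both calls return the same two rows; no cursor object appears in user code.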
Answered by Paul H
As far as I can tell, pandas expects that the SQL string be completely formed prior to passing it along. With that in mind, I would (and always do) use string interpolation:
params = (1234, 5678)
sqlStr = """
SELECT * FROM customers
WHERE id BETWEEN %d AND %d
""" % params
print(sqlStr)
which gives
SELECT * FROM customers
WHERE id BETWEEN 1234 AND 5678
So that should feed into psql.frame_query just fine (it does in my experience with Postgres, MySQL, and SQL Server).
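To see how interpolation and bind parameters compare in practice, here is a small sketch using only the standard-library sqlite3 module as a stand-in database (an assumption; the table and its rows are invented). With integers formatted through `%d`, both approaches select the same rows. With a text value that contains a quote character, the interpolated string is no longer valid SQL, while the bound version still works.

```python
import sqlite3

# Stand-in database (assumption): SQLite instead of Oracle, invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1000, "Ann"), (2000, "Bob"), (3000, "O'Neil")],
)

# Integers: interpolating with %d and binding give the same result.
params = (1234, 5678)
interpolated = (
    "SELECT id FROM customers WHERE id BETWEEN %d AND %d ORDER BY id" % params
)
rows_interpolated = conn.execute(interpolated).fetchall()
rows_bound = conn.execute(
    "SELECT id FROM customers WHERE id BETWEEN ? AND ? ORDER BY id", params
).fetchall()

# Text containing a quote: the interpolated statement fails to parse.
name = "O'Neil"
try:
    conn.execute("SELECT id FROM customers WHERE name = '%s'" % name)
    interpolation_ok = True
except sqlite3.OperationalError:
    interpolation_ok = False

# The same value passed as a bind variable needs no quoting or escaping.
rows_by_name = conn.execute(
    "SELECT id FROM customers WHERE name = :n", {"n": name}
).fetchall()
```

So interpolation is workable for trusted numeric values like the ones in this answer, while values that come from users or contain text are better passed as bind variables.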

