Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/26132718/

Time: 2020-09-13 22:31:36  Source: igfitidea

Read stored procedure select results into pandas dataframe

sql-server stored-procedures pandas sqlalchemy pyodbc

Asked by joeb1415

Given:

CREATE PROCEDURE my_procedure
    @Param INT
AS
    SELECT Col1, Col2
    FROM Table
    WHERE Col2 = @Param

I would like to be able to use this as:

import pandas as pd
import pyodbc

query = 'EXEC my_procedure @Param = {0}'.format(my_param)
conn = pyodbc.connect(my_connection_string)

df = pd.read_sql(query, conn)

But this throws an error:

ValueError: Reading a table with read_sql is not supported for a DBAPI2 connection. Use an SQLAlchemy engine or specify an sql query

SQLAlchemy does not work either:

import sqlalchemy
engine = sqlalchemy.create_engine(my_connection_string)
df = pd.read_sql(query, engine)

Throws:

ValueError: Could not init table 'my_procedure'

I can, in fact, execute the statement using pyodbc directly:

cursor = conn.cursor()
cursor.execute(query)
results = cursor.fetchall()
df = pd.DataFrame.from_records(results)

Is there a way to send these procedure results directly to a DataFrame?

Answered by CRAFTY DBA

https://code.google.com/p/pyodbc/wiki/StoredProcedures

I am not a Python expert, but SQL Server sometimes returns row counts for statement executions. For instance, an update reports how many rows were updated.

Just put 'SET NOCOUNT ON;' at the front of your batch call. This suppresses the row counts for inserts, updates, and deletes.

Make sure you are using the correct native client module.

Take a look at this Stack Overflow example. It has both an ad hoc SQL example and a stored procedure call example:

Calling a stored procedure python

Good luck

Answered by steamer25

Use read_sql_query() instead.

Looks like @joris (+1) already had this in a comment directly under the question but I didn't see it because it wasn't in the answers section.

Use the SQLAlchemy engine; apart from SQLAlchemy, pandas only supports SQLite. Then use read_sql_query() instead of read_sql(). The latter tries to auto-detect whether you're passing a table name or a fully-fledged query, but it doesn't appear to cope well with the 'EXEC' keyword. Using read_sql_query() skips the auto-detection and lets you state explicitly that you're passing a query (there's also a read_sql_table()).

import pandas as pd
import sqlalchemy

query = 'EXEC my_procedure @Param = {0}'.format(my_param)
engine = sqlalchemy.create_engine(my_connection_string)
df = pd.read_sql_query(query, engine)

Answered by as - if

This worked for me after adding SET NOCOUNT ON, thanks @CRAFTY DBA:

# SET NOCOUNT ON suppresses the row-count messages ahead of the procedure's result set
sql_query = """SET NOCOUNT ON; EXEC db_name.dbo.StoreProc '{0}';""".format(input)

df = pandas.read_sql_query(sql_query , conn)
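
The same batch can probably be written with a `?` marker instead of formatting the value into the string, so the value is sent separately from the SQL text. The following is a minimal sketch under that assumption: `build_exec_batch` is a hypothetical helper (not part of pandas or pyodbc), and the parameter name `Param` is borrowed from the question rather than from this answer.

```python
import re

# Hypothetical helper for illustration; not part of pandas or pyodbc.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_exec_batch(procedure, param_names):
    """Return a T-SQL batch with SET NOCOUNT ON and one '?' marker per named parameter."""
    # Reject anything that is not a plain (optionally dotted) identifier,
    # because names cannot be sent as bound parameters.
    for name in procedure.split(".") + list(param_names):
        if not _IDENTIFIER.match(name):
            raise ValueError("unexpected identifier: {0!r}".format(name))
    assignments = ", ".join("@{0} = ?".format(name) for name in param_names)
    batch = "SET NOCOUNT ON; EXEC {0}".format(procedure)
    if assignments:
        batch += " " + assignments
    return batch + ";"


sql_query = build_exec_batch("db_name.dbo.StoreProc", ["Param"])
print(sql_query)

# With a live connection (assumed, not created here) the value travels separately:
# df = pandas.read_sql_query(sql_query, conn, params=(my_value,))
```

Only the procedure and parameter names are interpolated, and they are checked against a strict pattern first; the actual value never becomes part of the SQL string.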

Answered by Bryan

Using ODBC syntax for calling stored procedures (with parameters instead of string formatting) works for loading dataframes using pandas 0.14.1 and pyodbc 3.0.7. The following examples use the AdventureWorks2008R2 sample database.

First confirm expected results calling the stored procedure using pyodbc:

import pandas as pd
import pyodbc
connection = pyodbc.connect(driver='{SQL Server Native Client 11.0}', server='ServerInstance', database='AdventureWorks2008R2', trusted_connection='yes')
sql = "{call dbo.uspGetEmployeeManagers(?)}"
params = (3,)
cursor = connection.cursor()
rows = cursor.execute(sql, params).fetchall()
print(rows)

Should return:

[(0, 3, 'Roberto', 'Tamburello', '/1/1/', 'Terri', 'Duffy'), (1, 2, 'Terri', 'Duffy',
'/1/', 'Ken', 'Sánchez')]

Now use pandas to load the results into a dataframe:

df = pd.read_sql(sql=sql, con=connection, params=params)
print(df)

Should return:

   RecursionLevel  BusinessEntityID FirstName    LastName OrganizationNode  \
0               0                 3   Roberto  Tamburello            /1/1/
1               1                 2     Terri       Duffy              /1/

  ManagerFirstName ManagerLastName
0            Terri           Duffy
1              Ken         Sánchez
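
The `{call ...}` string used above carries one `?` marker per argument. For procedures with several parameters, a small helper can build that string; the sketch below is illustrative only, and `odbc_call` is a hypothetical name, not a pyodbc or pandas function.

```python
def odbc_call(procedure, n_params):
    """Build an ODBC call escape sequence such as '{call dbo.proc(?, ?)}'."""
    if n_params < 0:
        raise ValueError("n_params must be non-negative")
    if n_params == 0:
        # The ODBC escape syntax allows the parameter list to be omitted entirely.
        return "{{call {0}}}".format(procedure)
    # "?" * 3 is "???"; joining its characters gives "?, ?, ?".
    markers = ", ".join("?" * n_params)
    return "{{call {0}({1})}}".format(procedure, markers)


sql = odbc_call("dbo.uspGetEmployeeManagers", 1)
print(sql)
```

The resulting string is passed as `sql` together with a `params` tuple of matching length, exactly as in the `pd.read_sql` call above.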

EDIT

Since you can't update to pandas 0.14.1, load the results from pyodbc using pandas.DataFrame.from_records:

# get column names from pyodbc results
columns = [column[0] for column in cursor.description]
df = pd.DataFrame.from_records(rows, columns=columns)
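
The column-name step relies only on `cursor.description`, which DB-API 2.0 defines for every driver. Here is a minimal sketch of the same pattern, using the standard-library `sqlite3` module as a stand-in for pyodbc so that it runs without a server; the table `t` and its rows are made up for illustration.

```python
import sqlite3

# sqlite3 stands in for pyodbc here (assumption: both expose the DB-API 2.0 cursor interface).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE t (Col1 TEXT, Col2 INTEGER)")
connection.executemany(
    "INSERT INTO t VALUES (?, ?)", [("a", 1), ("b", 2), ("c", 2)]
)

cursor = connection.cursor()
cursor.execute("SELECT Col1, Col2 FROM t WHERE Col2 = ? ORDER BY Col1", (2,))
rows = cursor.fetchall()

# The first item of each description entry is the column name.
columns = [column[0] for column in cursor.description]

# Pair every row with the column names; the same rows and columns could be
# handed to pd.DataFrame.from_records(rows, columns=columns) instead.
records = [dict(zip(columns, row)) for row in rows]
print(columns)
print(records)
connection.close()
```

Without the `columns` argument the resulting frame would have integer column labels, which is why the names are read from the cursor first.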