pandas: Read stored procedure SELECT results into a pandas DataFrame

Question

Asked by joeb1415

Given:

CREATE PROCEDURE my_procedure
    @Param INT
AS
    SELECT Col1, Col2
    FROM Table
    WHERE Col2 = @Param

I would like to be able to use this as:

import pandas as pd
import pyodbc

query = 'EXEC my_procedure @Param = {0}'.format(my_param)
conn = pyodbc.connect(my_connection_string)

df = pd.read_sql(query, conn)

But this throws an error:

ValueError: Reading a table with read_sql is not supported for a DBAPI2 connection. Use an SQLAlchemy engine or specify an sql query

SQLAlchemy does not work either:

import sqlalchemy
engine = sqlalchemy.create_engine(my_connection_string)
df = pd.read_sql(query, engine)

Throws:

ValueError: Could not init table 'my_procedure'

I can in fact execute the statement using pyodbc directly:

cursor = conn.cursor()
cursor.execute(query)
results = cursor.fetchall()
df = pd.DataFrame.from_records(results)

Is there a way to send these procedure results directly to a DataFrame?

Answer 1

Answered by CRAFTY DBA

https://code.google.com/p/pyodbc/wiki/StoredProcedures

I am not a Python expert, but SQL Server sometimes returns row counts for statement executions. For instance, an UPDATE reports how many rows were updated.

Just put 'SET NOCOUNT ON;' at the front of your batch call. This suppresses the row-count messages for inserts, updates, and deletes.
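
If the batch itself cannot be changed, a different technique is to skip the row-count-only results on the client side. The sketch below is not from the answer above; it assumes a DB-API style cursor (such as the one pyodbc provides) whose `description` is `None` while it is positioned on a statement that returned no columns, and whose `nextset()` returns a true value when another result is available. The helper name is hypothetical.

```python
def skip_to_first_result_set(cursor):
    """Advance the cursor until it sits on a result set that has columns.

    Returns True if such a result set was found, False if the batch
    produced only row counts.
    """
    while cursor.description is None:
        # No column metadata: this result is just a row count, so move on.
        if not cursor.nextset():
            return False
    return True
```

After `cursor.execute(query)`, calling `skip_to_first_result_set(cursor)` before `cursor.fetchall()` would leave the cursor on the procedure's SELECT output, under the assumptions stated above.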

Make sure you are using the correct native client module.

Take a look at this Stack Overflow example.

It has both an ad hoc SQL example and a stored procedure call example.

Calling a stored procedure python

Good luck

Answer 2

Answered by steamer25

Use read_sql_query() instead.

Looks like @joris (+1) already had this in a comment directly under the question but I didn't see it because it wasn't in the answers section.

Use the SQLAlchemy engine; apart from SQLAlchemy, pandas only supports SQLite. Then use read_sql_query() instead of read_sql(). The latter tries to auto-detect whether you're passing a table name or a full query, but it doesn't appear to handle the 'EXEC' keyword well. Using read_sql_query() skips the auto-detection and lets you state explicitly that you're passing a query (there's also a read_sql_table()).

import pandas as pd
import sqlalchemy

query = 'EXEC my_procedure @Param = {0}'.format(my_param)
engine = sqlalchemy.create_engine(my_connection_string)
df = pd.read_sql_query(query, engine)

Answer 3

Answered by as - if

This worked for me after adding SET NOCOUNT ON. Thanks, @CRAFTY DBA.

sql_query = """SET NOCOUNT ON; EXEC db_name.dbo.StoreProc '{0}';""".format(input)

df = pandas.read_sql_query(sql_query, conn)
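
The same batch can also be written with parameter markers rather than string formatting, which avoids quoting problems and SQL injection. The following is a minimal sketch, not part of the original answer: `build_exec_batch` is a hypothetical helper, and it assumes a driver that uses `?` placeholders (pyodbc's parameter style). The procedure name cannot be passed as a parameter, so it must come from trusted code.

```python
def build_exec_batch(procedure, n_params):
    """Build 'SET NOCOUNT ON; EXEC <procedure> ?, ?, ...;' for n_params values.

    `procedure` is inserted verbatim, so it must be a trusted, fixed name.
    """
    placeholders = ", ".join("?" for _ in range(n_params))
    sql = "SET NOCOUNT ON; EXEC {0}".format(procedure)
    if placeholders:
        sql += " " + placeholders
    return sql + ";"


# One placeholder for the single argument used in the answer above.
sql_query = build_exec_batch("db_name.dbo.StoreProc", 1)
```

The resulting string would then be passed along with the values, for example `pandas.read_sql_query(sql_query, conn, params=[value])`, where `value` stands in for the procedure argument.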

Answer 4

Answered by Bryan

Using ODBC syntax for calling stored procedures (with parameters instead of string formatting) works for loading dataframes using pandas 0.14.1 and pyodbc 3.0.7. The following examples use the AdventureWorks2008R2 sample database.

First, confirm the expected results by calling the stored procedure using pyodbc:

import pandas as pd
import pyodbc
connection = pyodbc.connect(driver='{SQL Server Native Client 11.0}', server='ServerInstance', database='AdventureWorks2008R2', trusted_connection='yes')
sql = "{call dbo.uspGetEmployeeManagers(?)}"
params = (3,)
cursor = connection.cursor()
rows = cursor.execute(sql, params).fetchall()
print(rows)

Should return:

[(0, 3, 'Roberto', 'Tamburello', '/1/1/', 'Terri', 'Duffy'), (1, 2, 'Terri', 'Duffy',
'/1/', 'Ken', 'Sánchez')]

Now use pandas to load the results into a dataframe:

df = pd.read_sql(sql=sql, con=connection, params=params)
print(df)

Should return:

   RecursionLevel  BusinessEntityID FirstName    LastName OrganizationNode  \
0               0                 3   Roberto  Tamburello            /1/1/
1               1                 2     Terri       Duffy              /1/

  ManagerFirstName ManagerLastName
0            Terri           Duffy
1              Ken         Sánchez

EDIT

Since you can't update to pandas 0.14.1, load the results from pyodbc using pandas.DataFrame.from_records:

# get column names from pyodbc results
columns = [column[0] for column in cursor.description]
df = pd.DataFrame.from_records(rows, columns=columns)
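
The column-name extraction above can be tried without a SQL Server instance. The sketch below uses the standard-library `sqlite3` module purely as a stand-in DB-API connection (SQLite has no stored procedures, so a plain SELECT takes the place of the procedure call); the `demo` table and the `fetch_records` helper are hypothetical. Converting each row to a plain tuple is a defensive step, on the assumption that driver-specific row objects may not always be handled well by `from_records`.

```python
import sqlite3


def fetch_records(cursor):
    """Return (columns, rows) from a DB-API cursor that has been executed."""
    # The first item of each description entry is the column name.
    columns = [column[0] for column in cursor.description]
    # Plain tuples instead of driver-specific row objects.
    rows = [tuple(row) for row in cursor.fetchall()]
    return columns, rows


# Stand-in data source: an in-memory SQLite table instead of SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (Col1 INTEGER, Col2 INTEGER)")
conn.executemany("INSERT INTO demo VALUES (?, ?)", [(1, 5), (2, 7)])

cursor = conn.execute("SELECT Col1, Col2 FROM demo WHERE Col2 = ?", (5,))
columns, rows = fetch_records(cursor)
```

With pandas installed, `pd.DataFrame.from_records(rows, columns=columns)` would then build the DataFrame exactly as in the snippet above.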
