Python: unpacking a SQL select into a pandas DataFrame

Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/17156084/


unpacking a sql select into a pandas dataframe

Tags: python, sql, pandas

Asked by Chris Withers

Suppose I have a select roughly like this:


select instrument, price, date from my_prices;

How can I unpack the prices returned into a single dataframe with a series for each instrument and indexed on date?


To be clear: I'm looking for:


<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: ...
Data columns (total 2 columns):
inst_1    ...
inst_2    ...
dtypes: float64(1), object(1) 

I'm NOT looking for:


<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: ...
Data columns (total 2 columns):
instrument    ...
price         ...
dtypes: float64(1), object(1)

...which is easy ;-)


Accepted answer by Andy Hayden

Update: recent versions of pandas provide the following functions: read_sql_table and read_sql_query.


First create a db engine (a connection can also work here):


import pandas as pd
from sqlalchemy import create_engine
# see sqlalchemy docs for how to write this url for your database type:
engine = create_engine('mysql://scott:tiger@localhost/foo')

See the SQLAlchemy documentation on database URLs.


pandas.read_sql_table:

table_name = 'my_prices'
df = pd.read_sql_table(table_name, engine)

pandas.read_sql_query:

df = pd.read_sql_query("SELECT instrument, price, date FROM my_prices;", engine)
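As a minimal sketch of the same call, the snippet below runs the query against an in-memory SQLite database instead of the MySQL engine above, so it can be tried without a server. The sample rows are made up for illustration; index_col and parse_dates are optional read_sql_query parameters, used here so the frame comes back indexed on date.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table standing in for my_prices (sample rows are invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_prices (instrument TEXT, price REAL, date TEXT)")
conn.executemany(
    "INSERT INTO my_prices VALUES (?, ?, ?)",
    [
        ("inst_1", 10.0, "2013-06-17"),
        ("inst_2", 20.0, "2013-06-17"),
        ("inst_1", 11.0, "2013-06-18"),
        ("inst_2", 21.0, "2013-06-18"),
    ],
)
conn.commit()

# Read the long-format result; the date column is parsed to datetimes
# and then used as the index.
long_df = pd.read_sql_query(
    "SELECT instrument, price, date FROM my_prices",
    conn,
    index_col="date",
    parse_dates=["date"],
)
conn.close()
```

This still yields the "instrument / price" layout the question is not after; it is only the read step, and the reshaping happens afterwards.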


The old answer referenced read_frame, which has been deprecated (see the version history of this question for that answer).




It often makes sense to read first, and then perform transformations to meet your requirements (as these are usually efficient and readable in pandas). In your example, you can pivot the result:


df.reset_index().pivot(index='date', columns='instrument', values='price')

Note: You can omit the reset_index if you don't specify an index_col when reading the data (the original answer referred to read_frame here).
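To make the pivot step concrete, here is a small sketch on a hand-built frame shaped like the query result. The rows are invented for illustration, and the names long_df and wide_df are not from the answer.

```python
import pandas as pd

# Hypothetical rows, shaped like the result of the SELECT in the question.
long_df = pd.DataFrame(
    {
        "instrument": ["inst_1", "inst_2", "inst_1", "inst_2"],
        "price": [10.0, 20.0, 11.0, 21.0],
        "date": pd.to_datetime(
            ["2013-06-17", "2013-06-17", "2013-06-18", "2013-06-18"]
        ),
    }
)

# One column per instrument, one row per date.
wide_df = long_df.pivot(index="date", columns="instrument", values="price")
```

The result has a DatetimeIndex and the columns inst_1 and inst_2, which is the layout the question asks for. Note that pivot raises a ValueError when the same (date, instrument) pair occurs more than once; in that case pivot_table with an aggregation function may be a better fit.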


Answer by jdennison

You can pass the rows fetched from a cursor object to the DataFrame constructor. For postgres:


import psycopg2
from pandas import DataFrame
conn = psycopg2.connect("dbname='db' user='user' host='host' password='pass'")
cur = conn.cursor()
cur.execute("select instrument, price, date from my_prices")
df = DataFrame(cur.fetchall(), columns=['instrument', 'price', 'date'])

Then set the index like this:


df = df.set_index('date', drop=False)  # set_index returns a new frame

or directly:


df.index = df['date']
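A variation on the cursor approach, sketched here with the standard-library sqlite3 module so it runs without a Postgres server: DB-API cursors expose a description attribute after a query, so the column names can be taken from it rather than typed by hand. The table contents are made up for illustration.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table standing in for my_prices (sample rows are invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_prices (instrument TEXT, price REAL, date TEXT)")
conn.executemany(
    "INSERT INTO my_prices VALUES (?, ?, ?)",
    [("inst_1", 10.0, "2013-06-17"), ("inst_2", 20.0, "2013-06-17")],
)

cur = conn.cursor()
cur.execute("SELECT instrument, price, date FROM my_prices")

# The first item of each cursor.description entry is the column name.
columns = [col[0] for col in cur.description]
df = pd.DataFrame(cur.fetchall(), columns=columns)
conn.close()

# SQLite returns the dates as strings, so convert them before indexing.
df["date"] = pd.to_datetime(df["date"])
df = df.set_index("date")
```

With psycopg2 the same pattern should apply, since its cursors also follow the DB-API; the date conversion may be unnecessary there if the column already has a date type.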

Answer by Mani Abi Anand

This connects pandas to a remote PostgreSQL database:


# CONNECT TO POSTGRES USING PANDAS
import psycopg2 as pg
import pandas.io.sql as psql

This establishes the connection to the Postgres database:


connection = pg.connect("host=192.168.0.1 dbname=db user=postgres")

This reads the table from the Postgres database:


dataframe = psql.read_sql("SELECT * FROM DB.Table", connection)