Python: unpacking a SQL select into a pandas DataFrame
Original question: http://stackoverflow.com/questions/17156084/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
unpacking a sql select into a pandas dataframe
Asked by Chris Withers
Suppose I have a select roughly like this:
select instrument, price, date from my_prices;
How can I unpack the prices returned into a single dataframe with a series for each instrument and indexed on date?
To be clear: I'm looking for:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: ...
Data columns (total 2 columns):
inst_1 ...
inst_2 ...
dtypes: float64(1), object(1)
I'm NOT looking for:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: ...
Data columns (total 2 columns):
instrument ...
price ...
dtypes: float64(1), object(1)
...which is easy ;-)
Accepted answer by Andy Hayden
Update: recent versions of pandas provide the functions read_sql_table and read_sql_query.
First create a db engine (a connection can also work here):
import pandas as pd
from sqlalchemy import create_engine
# see sqlalchemy docs for how to write this url for your database type:
engine = create_engine('mysql://scott:tiger@localhost/foo')
See the SQLAlchemy documentation on database URLs.
pandas.read_sql_table:
table_name = 'my_prices'
df = pd.read_sql_table(table_name, engine)
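As a rough illustration of read_sql_table, here is a minimal sketch that runs without a MySQL server. The in-memory SQLite engine and the sample rows are assumptions made for the example and are not part of the original answer; swap in your own database URL. It also shows the columns argument, which restricts the columns that are read.

```python
import pandas as pd
from sqlalchemy import create_engine

# In-memory SQLite engine as a stand-in for a real database (assumption).
engine = create_engine("sqlite://")

# Hypothetical sample rows, written to a table so there is something to read.
pd.DataFrame(
    {
        "instrument": ["inst_1", "inst_2"],
        "price": [10.0, 20.0],
        "date": ["2013-06-17", "2013-06-17"],
    }
).to_sql("my_prices", engine, index=False)

# Read the whole table by name, keeping only two of its columns.
df = pd.read_sql_table("my_prices", engine, columns=["instrument", "price"])
print(df)
```

Note that read_sql_table needs SQLAlchemy, whereas read_sql_query also accepts a plain sqlite3 connection.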
pandas.read_sql_query:
df = pd.read_sql_query("SELECT instrument, price, date FROM my_prices;", engine)
The old answer referenced read_frame, which has been deprecated (see the version history of this question for that answer).
It often makes sense to read the data first, and then transform it to your requirements (such transformations are usually efficient and readable in pandas). In your example, you can pivot the result:
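# keyword arguments: newer pandas versions no longer accept these positionally
df.reset_index().pivot(index='date', columns='instrument', values='price')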
Note: you can leave out the reset_index if you don't specify an index_col when reading.
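To make the read-then-pivot approach concrete, here is a minimal end-to-end sketch. It assumes an in-memory SQLite database with a few made-up rows (neither is from the original question) so that it runs without a database server; the query and the pivot call are the same as above.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database as a stand-in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_prices (instrument TEXT, price REAL, date TEXT)")
conn.executemany(
    "INSERT INTO my_prices VALUES (?, ?, ?)",
    [
        ("inst_1", 10.0, "2013-06-17"),
        ("inst_2", 20.0, "2013-06-17"),
        ("inst_1", 11.0, "2013-06-18"),
        ("inst_2", 21.0, "2013-06-18"),
    ],
)

# Read the long-format rows, parsing the date column into datetimes.
long_df = pd.read_sql_query(
    "SELECT instrument, price, date FROM my_prices",
    conn,
    parse_dates=["date"],
)

# Reshape: one column per instrument, indexed by date.
wide_df = long_df.pivot(index="date", columns="instrument", values="price")
print(wide_df)
```

pivot raises an error if the same (date, instrument) pair occurs more than once; in that case pivot_table with an aggregation function is one alternative.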
Answer by jdennison
You can pass the rows fetched from a cursor to the DataFrame constructor. For postgres:
import psycopg2
from pandas import DataFrame
conn = psycopg2.connect("dbname='db' user='user' host='host' password='pass'")
cur = conn.cursor()
cur.execute("select instrument, price, date from my_prices")
df = DataFrame(cur.fetchall(), columns=['instrument', 'price', 'date'])
Then set the index like this:
# set_index returns a new DataFrame, so assign the result
df = df.set_index('date', drop=False)
or directly:
df.index = df['date']
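A possible variation on the cursor approach: DB-API cursors expose column metadata through cursor.description, so the column names need not be hard-coded. The sketch below uses the standard-library sqlite3 module and one made-up row as stand-ins for psycopg2 and the real table (both are assumptions for illustration); the same pattern should apply to a psycopg2 cursor.

```python
import sqlite3

import pandas as pd

# sqlite3 stands in for psycopg2 here (assumption); both follow the DB-API.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE my_prices (instrument TEXT, price REAL, date TEXT)")
cur.execute("INSERT INTO my_prices VALUES ('inst_1', 10.0, '2013-06-17')")
cur.execute("select instrument, price, date from my_prices")

# The first item of each description entry is the column name.
columns = [col[0] for col in cur.description]
df = pd.DataFrame(cur.fetchall(), columns=columns)

# Convert the date strings to datetimes and use them as the index.
df["date"] = pd.to_datetime(df["date"])
df = df.set_index("date")
print(df)
```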
Answer by Mani Abi Anand
This connects pandas to a remote PostgreSQL database:
# CONNECT TO POSTGRES USING PANDAS
import psycopg2 as pg
import pandas.io.sql as psql
This establishes the connection to the Postgres database:
connection = pg.connect("host=192.168.0.1 dbname=db user=postgres")
This reads the table from the Postgres database:
dataframe = psql.read_sql("SELECT * FROM DB.Table", connection)
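read_sql also accepts params, index_col and parse_dates, which can be handy when filtering by a value and indexing on the date. The following is a minimal sketch under the assumption of an in-memory SQLite database with made-up rows (not from the original answer). SQLite uses ? as its placeholder; psycopg2 uses %s instead.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database as a stand-in for the remote Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_prices (instrument TEXT, price REAL, date TEXT)")
conn.executemany(
    "INSERT INTO my_prices VALUES (?, ?, ?)",
    [
        ("inst_1", 10.0, "2013-06-17"),
        ("inst_2", 20.0, "2013-06-17"),
        ("inst_1", 11.0, "2013-06-18"),
    ],
)

# Pass the value separately from the SQL text instead of formatting it in.
df = pd.read_sql(
    "SELECT price, date FROM my_prices WHERE instrument = ? ORDER BY date",
    conn,
    params=("inst_1",),
    index_col="date",
    parse_dates=["date"],
)
print(df)
```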

