Original question: http://stackoverflow.com/questions/45212281/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How to connect Amazon Redshift to Python
Asked by vihaa_vrutti
This is my Python code. I want to connect to my Amazon Redshift database from Python, but it reports an error for the host.
Can anyone tell me the correct syntax? Am I passing all the parameters correctly?
con=psycopg2.connect("dbname = pg_table_def, host=redshifttest-icp.cooqucvshoum.us-west-2.redshift.amazonaws.com, port= 5439, user=me, password= secret")
This is the error:
OperationalError: could not translate host name "redshift://redshifttest-xyz.cooqucvshoum.us-west-2.redshift.amazonaws.com," to address: Unknown host
Answered by John Rotenstein
It appears that you wish to run Amazon Redshift queries from Python code.
The parameters you would want to use are:
- dbname: This is the name of the database you entered in the Database name field when the cluster was created.
- user: This is what you entered in the Master user name field when the cluster was created.
- password: This is what you entered in the Master user password field when the cluster was created.
- host: This is the Endpoint provided in the Redshift management console (without the port at the end): redshifttest-xyz.cooqucvshoum.us-west-2.redshift.amazonaws.com
- port: 5439
For example:
con=psycopg2.connect("dbname=sales host=redshifttest-xyz.cooqucvshoum.us-west-2.redshift.amazonaws.com port=5439 user=master password=secret")
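The error in the question shows a host value that still carries a "redshift://" prefix and a trailing comma, neither of which belongs in a libpq connection string. As a minimal sketch, the parameters can also be passed to psycopg2.connect() as keyword arguments, which avoids separator mistakes. The helper names clean_host, connection_params and connect are hypothetical and only illustrate the idea; the values are the placeholders from the example above.

```python
def clean_host(endpoint):
    """Reduce a pasted endpoint to the bare host name libpq expects."""
    host = endpoint.strip(" \t\n,")      # drop surrounding spaces and stray commas
    if "://" in host:                    # drop a scheme such as "redshift://"
        host = host.split("://", 1)[1]
    host = host.split("/", 1)[0]         # drop a trailing "/dbname"
    host = host.split(":", 1)[0]         # drop a trailing ":5439"
    return host


def connection_params(dbname, user, password, endpoint, port=5439):
    """Collect the keyword arguments for psycopg2.connect()."""
    return {
        "dbname": dbname,
        "user": user,
        "password": password,
        "host": clean_host(endpoint),
        "port": port,
    }


def connect(**params):
    """Open the connection. psycopg2 is imported here so the helpers above work without it."""
    import psycopg2
    return psycopg2.connect(**params)


# Hypothetical usage with the placeholder values from the answer:
# con = connect(**connection_params(
#     "sales", "master", "secret",
#     "redshift://redshifttest-xyz.cooqucvshoum.us-west-2.redshift.amazonaws.com,"))
```

Keyword arguments and the space-separated string are equivalent to psycopg2; the keyword form simply makes it harder to slip a comma into a value.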
Answered by sat
The easiest way to query AWS Redshift from Python is through this Jupyter extension: Jupyter Redshift
Not only can you query and save your results, but you can also write them back to the database from within the notebook environment.
Answered by Paulo Victor
For Redshift, the usual approach is to load data with COPY from S3, which is faster than the other ways of loading it. Here is an example of how to do it:
First, you must install some dependencies.
For Linux users:
sudo apt-get install libpq-dev
For Mac users:
brew install libpq
Then install these dependencies with pip:
pip3 install psycopg2-binary
pip3 install sqlalchemy
pip3 install sqlalchemy-redshift
import sqlalchemy as sa
from sqlalchemy.orm import sessionmaker

# >>>>>>>> MAKE CHANGES HERE <<<<<<<<
DATABASE = "dwtest"
USER = "youruser"
PASSWORD = "yourpassword"
HOST = "dwtest.awsexample.com"
PORT = "5439"
SCHEMA = "public"
S3_FULL_PATH = 's3://yourbucket/category_pipe.txt'
ARN_CREDENTIALS = 'arn:aws:iam::YOURARN:YOURROLE'
REGION = 'us-east-1'

############ CONNECTING AND CREATING SESSIONS ############
connection_string = "redshift+psycopg2://%s:%s@%s:%s/%s" % (USER, PASSWORD, HOST, str(PORT), DATABASE)
engine = sa.create_engine(connection_string)
session = sessionmaker()
session.configure(bind=engine)
s = session()
# Resolve unqualified table names against the chosen schema.
SetPath = "SET search_path TO %s" % SCHEMA
# sa.text() wraps the raw SQL; SQLAlchemy 2.0 no longer accepts plain strings in execute().
s.execute(sa.text(SetPath))

############ RUNNING COPY ############
# Load the pipe-delimited S3 file into the existing "category" table.
copy_command = '''
copy category from '%s'
credentials 'aws_iam_role=%s'
delimiter '|' region '%s';
''' % (S3_FULL_PATH, ARN_CREDENTIALS, REGION)
s.execute(sa.text(copy_command))
s.commit()

############ GETTING DATA ############
query = "SELECT * FROM category;"
rr = s.execute(sa.text(query))
all_results = rr.fetchall()

def pretty(all_results):
    # Print every column value of every row, one value per line.
    for row in all_results:
        print("row start >>>>>>>>>>>>>>>>>>>>")
        for r in row:
            print(" ---- %s" % r)
        print("row end >>>>>>>>>>>>>>>>>>>>>>")

pretty(all_results)
s.close()
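One caveat about the connection string in the script above: it is built with plain % formatting, so a password containing characters such as "@" or "/" would be misread as part of the URL. A minimal sketch of one way around this, assuming the same "redshift+psycopg2" URL scheme, is to percent-encode the user and password with the standard library first. The function name build_redshift_url is hypothetical.

```python
from urllib.parse import quote_plus


def build_redshift_url(user, password, host, port, database):
    """Build a SQLAlchemy URL string with user and password percent-encoded."""
    return "redshift+psycopg2://%s:%s@%s:%s/%s" % (
        quote_plus(user),       # escape reserved characters in the user name
        quote_plus(password),   # escape characters such as "@" and "/"
        host,
        port,
        database,
    )


# Hypothetical usage with the placeholder values from the script:
# engine = sa.create_engine(
#     build_redshift_url("youruser", "p@ss/word", "dwtest.awsexample.com", 5439, "dwtest"))
```

With simple alphanumeric credentials the result is identical to the original string, so this can be swapped in without changing the rest of the script.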