How to connect Amazon Redshift to Python

Note: this page is a Chinese-English side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/45212281/

Time: 2020-08-19 16:50:03  Source: igfitidea

Tags: python, amazon-web-services, amazon-redshift

Asked by vihaa_vrutti

This is my Python code. I want to connect to my Amazon Redshift database from Python, but I get an error about the host.

Can anyone tell me the correct syntax? Am I passing all the parameters correctly?

con=psycopg2.connect("dbname = pg_table_def, host=redshifttest-icp.cooqucvshoum.us-west-2.redshift.amazonaws.com, port= 5439, user=me, password= secret")

This is the error:

OperationalError: could not translate host name "redshift://redshifttest-xyz.cooqucvshoum.us-west-2.redshift.amazonaws.com," to address: Unknown host

Answered by John Rotenstein

It appears that you wish to run Amazon Redshift queries from Python code.

The parameters you would want to use are:

  • dbname: The name of the database you entered in the Database name field when the cluster was created.
  • user: The user name you entered in the Master user name field when the cluster was created.
  • password: The password you entered in the Master user password field when the cluster was created.
  • host: The Endpoint shown in the Redshift management console (without the port at the end): redshifttest-xyz.cooqucvshoum.us-west-2.redshift.amazonaws.com
  • port: 5439

For example:

con=psycopg2.connect("dbname=sales host=redshifttest-xyz.cooqucvshoum.us-west-2.redshift.amazonaws.com port=5439 user=master password=secret")
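
The host name in the error message from the question carries a "redshift://" prefix and a trailing comma, and the original connection string separates its parameters with commas. A libpq-style connection string expects space-separated key=value pairs, so the comma ends up as part of the host. The following is a minimal sketch of one way to guard against that. The helper names quote_value, clean_host, build_dsn and connect are hypothetical (they are not part of psycopg2), and the sketch assumes psycopg2 is installed wherever connect is actually called.

```python
def quote_value(value):
    """Quote a value for a libpq key=value connection string when needed."""
    value = str(value)
    needs_quotes = (
        value == ""
        or any(ch.isspace() for ch in value)
        or "'" in value
        or "\\" in value
    )
    if not needs_quotes:
        return value
    # libpq expects backslashes and single quotes to be backslash-escaped.
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def clean_host(raw):
    """Reduce a pasted endpoint to a bare host name (hypothetical helper)."""
    host = raw.strip().rstrip(",;")       # drop stray separators
    if "://" in host:                     # drop a scheme such as redshift://
        host = host.split("://", 1)[1]
    host = host.split("/", 1)[0]          # drop any trailing /database part
    host = host.split(":", 1)[0]          # drop a trailing :port
    return host


def build_dsn(dbname, user, password, host, port=5439):
    """Build a space-separated key=value connection string."""
    params = {
        "dbname": dbname,
        "host": clean_host(host),
        "port": port,
        "user": user,
        "password": password,
    }
    return " ".join("%s=%s" % (key, quote_value(val)) for key, val in params.items())


def connect(dbname, user, password, host, port=5439):
    """Open a connection; psycopg2 is imported lazily (assumed installed)."""
    import psycopg2
    return psycopg2.connect(build_dsn(dbname, user, password, host, port))
```

For example, build_dsn("sales", "master", "secret", "redshift://redshifttest-xyz.cooqucvshoum.us-west-2.redshift.amazonaws.com:5439,") yields the same string as the connection string in the answer above.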

Answered by sat

The easiest way to query AWS Redshift from Python is through this Jupyter extension: Jupyter Redshift

Not only can you query and save your results but also write them back to the database from within the notebook environment.

Answered by Paulo Victor

Well, with Redshift the idea is to load data with COPY from S3, which is faster than any other way. Here is an example of how to do it:

First, you must install some dependencies.

For Linux users:
sudo apt-get install libpq-dev

For Mac users:
brew install libpq

Then install these dependencies with pip:
pip3 install psycopg2-binary
pip3 install sqlalchemy
pip3 install sqlalchemy-redshift

import sqlalchemy as sa
from sqlalchemy.orm import sessionmaker


#>>>>>>>> MAKE CHANGES HERE <<<<<<<<<<<<<
DATABASE = "dwtest"
USER = "youruser"
PASSWORD = "yourpassword"
HOST = "dwtest.awsexample.com"
PORT = "5439"
SCHEMA = "public"

S3_FULL_PATH = 's3://yourbucket/category_pipe.txt'
ARN_CREDENTIALS = 'arn:aws:iam::YOURARN:YOURROLE'
REGION = 'us-east-1'

############ CONNECTING AND CREATING SESSIONS ############
connection_string = "redshift+psycopg2://%s:%s@%s:%s/%s" % (USER,PASSWORD,HOST,str(PORT),DATABASE)
engine = sa.create_engine(connection_string)
session = sessionmaker()
session.configure(bind=engine)
s = session()
# Wrap raw SQL in sa.text(); SQLAlchemy 2.0 rejects plain strings in execute().
SetPath = sa.text("SET search_path TO %s" % SCHEMA)
s.execute(SetPath)
###########################################################



############ RUNNING COPY ############
copy_command = '''
copy category from '%s'
credentials 'aws_iam_role=%s'
delimiter '|' region '%s';
''' % (S3_FULL_PATH, ARN_CREDENTIALS, REGION)
s.execute(sa.text(copy_command))
s.commit()
######################################



############ GETTING DATA ############
query = sa.text("SELECT * FROM category;")
rr = s.execute(query)
all_results = rr.fetchall()

def pretty(all_results):
    for row in all_results :
        print("row start >>>>>>>>>>>>>>>>>>>>")
        for r in row :
            print(" ---- %s" % r)
        print("row end >>>>>>>>>>>>>>>>>>>>>>")

pretty(all_results)
s.close()
######################################
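
One thing to watch in the connection string above: the user name and password are interpolated into the URL as-is, so a password containing characters such as "@" or "/" can make the URL ambiguous. A minimal sketch of one way around this, using only the standard library, is below. The function name redshift_url is hypothetical, and the sample password is invented for illustration.

```python
from urllib.parse import quote_plus


def redshift_url(user, password, host, port, database):
    """Build a redshift+psycopg2 URL with the credentials percent-encoded."""
    return "redshift+psycopg2://%s:%s@%s:%s/%s" % (
        quote_plus(user),      # encode reserved characters in the user name
        quote_plus(password),  # encode reserved characters such as @ and /
        host,
        port,
        database,
    )


# Example with a made-up password that contains reserved characters.
example_url = redshift_url("youruser", "p@ss/word", "dwtest.awsexample.com", 5439, "dwtest")
print(example_url)
```

The resulting string can be passed to sa.create_engine in place of connection_string.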