postgresql - How to pipe data from AWS Postgres RDS to S3 (then Redshift)?

Disclaimer: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/26781758/

Date: 2020-10-21 01:40:40 · Source: igfitidea

How to pipe data from AWS Postgres RDS to S3 (then Redshift)?

postgresql, amazon-web-services, amazon-redshift, amazon-data-pipeline

Asked by jenswirf

I'm using the AWS Data Pipeline service to pipe data from an RDS MySql database to S3 and then on to Redshift, which works nicely.

However, I also have data living in an RDS Postgres instance which I would like to pipe the same way, but I'm having a hard time setting up the JDBC connection. If this is unsupported, is there a work-around?

"connectionString": "jdbc:postgresql://THE_RDS_INSTANCE:5432/THE_DB”

Accepted answer by xgess

This doesn't work yet. AWS hasn't built/released the functionality to connect nicely to Postgres. You can do it in a ShellCommandActivity, though. You can write a little Ruby or Python script to do it and drop it on S3, referenced via scriptUri. You could also just write a psql command to dump the table to a CSV and then pipe that to ${OUTPUT1_STAGING_DIR} with "stage": "true" in that activity node.

Something like this:

{
  "id": "DumpCommand",
  "type": "ShellCommandActivity",
  "runsOn": { "ref": "MyEC2Resource" },
  "stage": "true",
  "output": { "ref": "S3ForRedshiftDataNode" },
  "command": "PGPASSWORD=password psql -h HOST -U USER -d DATABASE -p 5432 -t -A -F\",\" -c \"select blah_id from blahs\" > ${OUTPUT1_STAGING_DIR}/my_data.csv"
}

I didn't run this to verify because it's a pain to spin up a pipeline :( so double-check the escaping in the command.

  • Pros: super straightforward, and it requires no additional script files to upload to S3.
  • Cons: not exactly secure. Your DB password will be transmitted over the wire without encryption.

Look into the new feature AWS just launched for parameterized templating of data pipelines: http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-custom-templates.html. It looks like it will allow encryption of arbitrary parameters.
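For illustration only, here is a rough sketch of how the DumpCommand activity above might be parameterized so the password is no longer hard-coded in the definition. The parameter names are made up, and the exact syntax for declaring and masking parameters is in the linked docs; as far as I know, parameter ids start with "my" and values you want hidden use a "*my" prefix.

{
  "parameters": [
    { "id": "myRDSUsername", "type": "String", "description": "Postgres user" },
    { "id": "*myRDSPassword", "type": "String", "description": "Postgres password, kept out of the plain-text definition" }
  ],
  "objects": [
    {
      "id": "DumpCommand",
      "type": "ShellCommandActivity",
      "runsOn": { "ref": "MyEC2Resource" },
      "stage": "true",
      "output": { "ref": "S3ForRedshiftDataNode" },
      "command": "PGPASSWORD=#{*myRDSPassword} psql -h HOST -U #{myRDSUsername} -d DATABASE -p 5432 -t -A -F\",\" -c \"select blah_id from blahs\" > ${OUTPUT1_STAGING_DIR}/my_data.csv"
    }
  ]
}

The actual values for myRDSUsername and *myRDSPassword are then supplied separately (for example when the pipeline is activated) instead of living in the definition itself.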

Answer by PeterssonJesper

Nowadays you can define a copy activity to extract data from a Postgres RDS instance into S3. In the Data Pipeline interface (a sketch of the resulting definition follows the list below):

  1. Create a data node of the type SqlDataNode. Specify the table name and a select query.
  2. Set up the database connection by specifying the RDS instance ID (the instance ID is in your URL, e.g. your-instance-id.xxxxx.eu-west-1.rds.amazonaws.com) along with the username, password and database name.
  3. Create a data node of the type S3DataNode.
  4. Create a CopyActivity and set the SqlDataNode as input and the S3DataNode as output.
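A minimal sketch of what those four objects could look like in the exported pipeline definition. All ids, the table name, the select query, the instance ID, the credentials and the bucket path are placeholders, MyEC2Resource is assumed to be defined elsewhere (as in the first answer), and the console normally adds further fields such as a schedule and a data format:

{
  "objects": [
    {
      "id": "MyRdsDatabase",
      "type": "RdsDatabase",
      "rdsInstanceId": "your-instance-id",
      "databaseName": "THE_DB",
      "username": "USER",
      "*password": "PASSWORD"
    },
    {
      "id": "SourceTable",
      "type": "SqlDataNode",
      "database": { "ref": "MyRdsDatabase" },
      "table": "blahs",
      "selectQuery": "select blah_id from blahs"
    },
    {
      "id": "S3Output",
      "type": "S3DataNode",
      "directoryPath": "s3://my-bucket/rds-export/"
    },
    {
      "id": "RdsToS3Copy",
      "type": "CopyActivity",
      "runsOn": { "ref": "MyEC2Resource" },
      "input": { "ref": "SourceTable" },
      "output": { "ref": "S3Output" }
    }
  ]
}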

Answer by Manuel G

AWS now allows partners to do near-real-time RDS -> Redshift inserts.

https://aws.amazon.com/blogs/aws/fast-easy-free-sync-rds-to-redshift/
