Original URL: http://stackoverflow.com/questions/38825836/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Write spark dataframe to postgres Database
Asked by Yiliang
The spark cluster setting is as follows:
from pyspark import SparkConf

conf = {}  # assumption: `conf` is a plain dict defined elsewhere in the asker's code
conf['SparkConfiguration'] = SparkConf() \
    .setMaster('yarn-client') \
    .setAppName("test") \
    .set("spark.executor.memory", "20g") \
    .set("spark.driver.maxResultSize", "20g") \
    .set("spark.executor.instances", "20") \
    .set("spark.executor.cores", "3") \
    .set("spark.memory.fraction", "0.2") \
    .set("user", "test_user") \
    .set("spark.executor.extraClassPath", "/usr/share/java/postgresql-jdbc3.jar")
When I try to write the dataframe to the Postgres DB using the following code:
from pyspark.sql import DataFrameWriter
my_writer = DataFrameWriter(df)
url_connect = "jdbc:postgresql://198.123.43.24:1234"
table = "test_result"
mode = "overwrite"
properties = {"user":"postgres", "password":"password"}
my_writer.jdbc(url_connect, table, mode, properties)
I encounter the below error:
Py4JJavaError: An error occurred while calling o1120.jdbc.
:java.sql.SQLException: No suitable driver
at java.sql.DriverManager.getDriver(DriverManager.java:278)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun.apply(JdbcUtils.scala:50)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun.apply(JdbcUtils.scala:50)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createConnectionFactory(JdbcUtils.scala:49)
at org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:278)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
at py4j.Gateway.invoke(Gateway.java:259)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:209)
at java.lang.Thread.run(Thread.java:745)
Can anyone provide some suggestions on this? Thank you!
Answered by Abhishek Pansotra
Try write.jdbc() and pass it parameters that are defined separately beforehand. Note that the URL should include the database name, and the properties can name the driver class explicitly. Also check the port on which Postgres is listening: mine is 5432 for Postgres 9.6 and 5433 for Postgres 8.4.
mode = "overwrite"
url = "jdbc:postgresql://198.123.43.24:5432/kockpit"
properties = {"user": "postgres","password": "password","driver": "org.postgresql.Driver"}
data.write.jdbc(url=url, table="test_result", mode=mode, properties=properties)
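As a minimal sketch of the advice above, the URL and the properties can be assembled by small helpers before the write call. The helper names build_jdbc_url and build_properties are hypothetical (they are not part of PySpark), and the host, port, database, and credentials are simply the placeholder values from this answer.

```python
def build_jdbc_url(host, port, database):
    """Assemble a PostgreSQL JDBC URL: jdbc:postgresql://host:port/database."""
    return "jdbc:postgresql://{}:{}/{}".format(host, port, database)


def build_properties(user, password):
    """Connection properties that name the JDBC driver class explicitly."""
    return {"user": user, "password": password, "driver": "org.postgresql.Driver"}


# Placeholder values taken from the answer above.
url = build_jdbc_url("198.123.43.24", 5432, "kockpit")
properties = build_properties("postgres", "password")

# With a real DataFrame named `data`, the write would then be:
# data.write.jdbc(url=url, table="test_result", mode="overwrite", properties=properties)
```

Keeping the port and database name as separate arguments makes it harder to forget either one, which is easy to do when the URL is typed by hand.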
Answered by Mary
Have you downloaded the PostgreSQL JDBC Driver? Download it here: https://jdbc.postgresql.org/download.html.
For the pyspark shell you use the SPARK_CLASSPATH environment variable:
$ export SPARK_CLASSPATH=/path/to/downloaded/jar
$ pyspark
For submitting a script via spark-submit use the --driver-class-path flag:
$ spark-submit --driver-class-path /path/to/downloaded/jar script.py
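On a cluster, the executors may need the jar as well as the driver. The following is a rough sketch, not a definitive recipe: it builds a spark-submit command line that passes the jar through both --driver-class-path and --jars (a standard spark-submit option that distributes the listed jars to the cluster). The function name build_submit_command is hypothetical, and the jar path is the same placeholder used above.

```python
import shlex


def build_submit_command(jar_path, script):
    """Return the spark-submit argument list with the JDBC jar on both
    the driver class path and the list of jars shipped to the cluster."""
    return [
        "spark-submit",
        "--driver-class-path", jar_path,  # makes the driver class visible to the driver JVM
        "--jars", jar_path,               # ships the jar to the cluster
        script,
    ]


cmd = build_submit_command("/path/to/downloaded/jar", "script.py")

# Print a shell-safe version of the command for copy and paste.
print(" ".join(shlex.quote(part) for part in cmd))
```

The list form can also be handed directly to subprocess.run(cmd) if the job is launched from Python.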
Answered by Daniel de Paula
Maybe you can try passing the JDBC driver class explicitly (Note that you may need to put the driver jar in the classpath for all spark nodes):
df.write.option('driver', 'org.postgresql.Driver').jdbc(url_connect, table, mode, properties)
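The same call can be wrapped in a small function so that the driver class is always set. This is only a sketch: write_to_postgres is a hypothetical helper name, and it assumes the PostgreSQL driver jar is already on the class path of the driver and the executors.

```python
def write_to_postgres(df, url, table, user, password, mode="overwrite"):
    """Write `df` to a Postgres table over JDBC, naming the driver class explicitly."""
    properties = {"user": user, "password": password}
    # option('driver', ...) tells Spark which JDBC driver class to load.
    writer = df.write.option("driver", "org.postgresql.Driver")
    writer.jdbc(url=url, table=table, mode=mode, properties=properties)


# Usage with a real DataFrame `df` (placeholder connection values):
# write_to_postgres(df, "jdbc:postgresql://198.123.43.24:5432/kockpit",
#                   "test_result", "postgres", "password")
```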