pyspark mysql jdbc load: An error occurred while calling o23.load, No suitable driver

Question

Asked by shellbye

I use the docker image sequenceiq/spark on my Mac to study these spark examples. During the study process, I upgraded the Spark inside that image to 1.6.1 according to this answer, and the error occurred when I started the Simple Data Operations example. Here is what happened:

When I run df = sqlContext.read.format("jdbc").option("url",url).option("dbtable","people").load(), it raises an error. The full stack trace from the pyspark console is as follows:

Python 2.6.6 (r266:84292, Jul 23 2015, 15:22:56)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-11)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
16/04/12 22:45:28 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 1.6.1
      /_/

Using Python version 2.6.6 (r266:84292, Jul 23 2015 15:22:56)
SparkContext available as sc, HiveContext available as sqlContext.
>>> url = "jdbc:mysql://localhost:3306/test?user=root;password=myPassWord"
>>> df = sqlContext.read.format("jdbc").option("url",url).option("dbtable","people").load()
16/04/12 22:46:05 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/04/12 22:46:06 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/04/12 22:46:11 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/04/12 22:46:11 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
16/04/12 22:46:16 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/04/12 22:46:17 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/spark/python/pyspark/sql/readwriter.py", line 139, in load
    return self._df(self._jreader.load())
  File "/usr/local/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
  File "/usr/local/spark/python/pyspark/sql/utils.py", line 45, in deco
    return f(*a, **kw)
  File "/usr/local/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o23.load.
: java.sql.SQLException: No suitable driver
    at java.sql.DriverManager.getDriver(DriverManager.java:278)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun.apply(JdbcUtils.scala:50)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun.apply(JdbcUtils.scala:50)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createConnectionFactory(JdbcUtils.scala:49)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:120)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:91)
    at org.apache.spark.sql.execution.datasources.jdbc.DefaultSource.createRelation(DefaultSource.scala:57)
    at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
    at py4j.Gateway.invoke(Gateway.java:259)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:209)
    at java.lang.Thread.run(Thread.java:744)

>>>

Here is what I have tried so far:

Download mysql-connector-java-5.0.8-bin.jar and put it into /usr/local/spark/lib/. Still the same error.

Create t.py like this:

from pyspark import SparkContext  
from pyspark.sql import SQLContext  

sc = SparkContext(appName="PythonSQL")  
sqlContext = SQLContext(sc)  
df = sqlContext.read.format("jdbc").option("url",url).option("dbtable","people").load()  

df.printSchema()  
countsByAge = df.groupBy("age").count()  
countsByAge.show()  
countsByAge.write.format("json").save("file:///usr/local/mysql/mysql-connector-java-5.0.8/db.json")

Then I tried spark-submit --conf spark.executor.extraClassPath=mysql-connector-java-5.0.8-bin.jar --driver-class-path mysql-connector-java-5.0.8-bin.jar --jars mysql-connector-java-5.0.8-bin.jar --master local[4] t.py. The result was still the same.

Then I tried pyspark --conf spark.executor.extraClassPath=mysql-connector-java-5.0.8-bin.jar --driver-class-path mysql-connector-java-5.0.8-bin.jar --jars mysql-connector-java-5.0.8-bin.jar --master local[4] t.py, both with and without the trailing t.py. Still the same.

During all of this, MySQL was running. Here is my OS info:

# rpm --query centos-release  
centos-release-6-5.el6.centos.11.2.x86_64

And the Hadoop version is 2.6.

Now I don't know where to go next, so I hope someone can give some advice. Thanks!

Answer 1

Answered by Aristide Niyungeko

I ran into "java.sql.SQLException: No suitable driver" when I tried to have my script write to MySQL.

Here's what I did to fix that.

In script.py

df.write.jdbc(url="jdbc:mysql://localhost:3333/my_database"
                  "?user=my_user&password=my_password",
              table="my_table",
              mode="append",
              properties={"driver": 'com.mysql.jdbc.Driver'})
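The URL in the call above joins its query parameters with `&`. As a rough illustration, the helper below assembles such a URL from its parts. The name `mysql_jdbc_url` is hypothetical (it is not part of PySpark or Connector/J), and the sketch assumes the parameter values need no URL escaping. The question's URL separates `user` and `password` with `;` rather than `&`, which may be worth checking as well.

```python
def mysql_jdbc_url(host, port, database, **params):
    """Build a MySQL JDBC URL; query parameters are joined with '&'.

    Hypothetical helper for illustration only.
    """
    base = "jdbc:mysql://{0}:{1}/{2}".format(host, port, database)
    if not params:
        return base
    # Sort the keys so the result is deterministic.
    # Assumption: values contain no characters that need URL escaping.
    query = "&".join("{0}={1}".format(k, params[k]) for k in sorted(params))
    return base + "?" + query


url = mysql_jdbc_url("localhost", 3333, "my_database",
                     user="my_user", password="my_password")
print(url)
```

The resulting string can be passed as the `url` argument of `df.write.jdbc(...)` in place of the hand-written literal.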

Then I ran spark-submit this way

SPARK_HOME=/usr/local/Cellar/apache-spark/1.6.1/libexec spark-submit --packages mysql:mysql-connector-java:5.1.39 ./script.py

Note that SPARK_HOME is specific to where Spark is installed. For your environment, https://github.com/sequenceiq/docker-spark/blob/master/README.md might help.

In case all the above is confusing, try this: in t.py, replace

sqlContext.read.format("jdbc").option("url",url).option("dbtable","people").load()

with

sqlContext.read.format("jdbc").option("url",url).option("dbtable","people").option("driver", 'com.mysql.jdbc.Driver').load()

And run that with

spark-submit --packages mysql:mysql-connector-java:5.1.39 --master local[4] t.py
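To keep the three JDBC read options together, one possible arrangement is to collect them in a dictionary and hand it to the reader in a single call. This is a minimal sketch, not the answerer's code: `jdbc_read_options` is a hypothetical helper name, and the URL below is the question's URL rewritten with `&` between parameters as an assumption.

```python
def jdbc_read_options(url, table, driver="com.mysql.jdbc.Driver"):
    """Collect the options for sqlContext.read.format("jdbc").

    Hypothetical helper; naming the driver class explicitly is the
    fix this answer describes for "No suitable driver".
    """
    return {"url": url, "dbtable": table, "driver": driver}


opts = jdbc_read_options(
    "jdbc:mysql://localhost:3306/test?user=root&password=myPassWord",
    "people",
)
print(sorted(opts.items()))

# Inside t.py, launched with the connector on the classpath, e.g.
#   spark-submit --packages mysql:mysql-connector-java:5.1.39 --master local[4] t.py
# the options would be applied as:
#   df = sqlContext.read.format("jdbc").options(**opts).load()
```

The commented lines need a running Spark session and a reachable MySQL server, so they are left as comments here.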
