Why do I get the error "Unable to find encoder for type stored in a Dataset" when encoding JSON with Scala case classes?

Disclaimer: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must do so under the same license and attribute it to the original authors (not me), linking to the original: http://stackoverflow.com/questions/34715611/

Date: 2020-10-22 07:56:13  Source: igfitidea

Why is the error "Unable to find encoder for type stored in a Dataset" when encoding JSON using case classes?

scala, apache-spark, apache-spark-dataset, apache-spark-encoders

Asked by Milad Khajavi

I've written a Spark job:

import org.apache.spark.{SparkConf, SparkContext}

object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Simple Application").setMaster("local")
    val sc = new SparkContext(conf)
    val ctx = new org.apache.spark.sql.SQLContext(sc)
    import ctx.implicits._

    case class Person(age: Long, city: String, id: String, lname: String, name: String, sex: String)
    case class Person2(name: String, age: Long, city: String)

    val persons = ctx.read.json("/tmp/persons.json").as[Person]
    persons.printSchema()
  }
}

When I run the main function in the IDE, two errors occur:

Error:(15, 67) Unable to find encoder for type stored in a Dataset.  Primitive types (Int, String, etc) and Product types (case classes) are supported by importing sqlContext.implicits._  Support for serializing other types will be added in future releases.
    val persons = ctx.read.json("/tmp/persons.json").as[Person]
                                                                  ^

Error:(15, 67) not enough arguments for method as: (implicit evidence: org.apache.spark.sql.Encoder[Person])org.apache.spark.sql.Dataset[Person].
Unspecified value parameter evidence.
    val persons = ctx.read.json("/tmp/persons.json").as[Person]
                                                                  ^

But in the Spark shell I can run this job without any error. What is the problem?

Answered by Developer

The error message says that the Encoder is not able to handle the Person case class.

Error:(15, 67) Unable to find encoder for type stored in a Dataset.  Primitive types (Int, String, etc) and Product types (case classes) are supported by importing sqlContext.implicits._  Support for serializing other types will be added in future releases.

Move the declaration of the case class outside the scope of SimpleApp.

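For reference, a minimal sketch of that fix, reusing the code from the question (same class names, imports, and file path; only the case classes have moved to the top level, so the compiler can derive an Encoder[Person]):

import org.apache.spark.{SparkConf, SparkContext}

case class Person(age: Long, city: String, id: String, lname: String, name: String, sex: String)
case class Person2(name: String, age: Long, city: String)

object SimpleApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Simple Application").setMaster("local")
    val sc = new SparkContext(conf)
    val ctx = new org.apache.spark.sql.SQLContext(sc)
    import ctx.implicits._ // supplies the implicit Encoder[Person] that .as[Person] needs

    val persons = ctx.read.json("/tmp/persons.json").as[Person]
    persons.printSchema()
  }
}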

Answered by Paul Leclercq

You get the same error if you add both sqlContext.implicits._ and spark.implicits._ in SimpleApp (the order doesn't matter).

Removing one or the other is the solution:

import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .getOrCreate()

val sqlContext = spark.sqlContext
import sqlContext.implicits._ // sqlContext OR spark implicits
//import spark.implicits._    // sqlContext OR spark implicits

case class Person(age: Long, city: String)
val persons = spark.read.json("/tmp/persons.json").as[Person]

Tested with Spark 2.1.0

Interestingly, if you import the same object's implicits twice, there is no problem.
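
For illustration only (a hypothetical, self-contained snippet, not from the original answer): importing the same implicits object twice compiles, and the encoder still resolves.

import org.apache.spark.sql.SparkSession

object DoubleImportDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("DoubleImportDemo").master("local").getOrCreate()
    import spark.implicits._
    import spark.implicits._ // duplicate import of the same object: no conflict

    Seq(1, 2, 3).toDS().show() // Encoder[Int] resolves fine
  }
}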

Answered by Santhoshm

@Milad Khajavi

Define the Person case classes outside object SimpleApp. Also, add import sqlContext.implicits._ inside the main() function.
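
A combined sketch of both suggestions, written in the SparkSession style of the previous answer (hypothetical layout; the class name and file path are taken from the question):

import org.apache.spark.sql.SparkSession

case class Person(age: Long, city: String, id: String, lname: String, name: String, sex: String)

object SimpleApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("Simple Application").master("local").getOrCreate()
    val sqlContext = spark.sqlContext
    import sqlContext.implicits._ // imported inside main(), as this answer suggests

    val persons = spark.read.json("/tmp/persons.json").as[Person]
    persons.printSchema()
  }
}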