Scala/Spark App with "No TypeTag available" Error in "def main" style App
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use it, you must follow the same CC BY-SA license, link to the original address, and attribute it to the original authors (not me): StackOverflow
原文地址: http://stackoverflow.com/questions/29143756/
Scala/Spark App with "No TypeTag available" Error in "def main" style App
Asked by Fabio Fantoni
I'm new to the Scala/Spark stack and I'm trying to figure out how to test my basic skills using SparkSql to "map" RDDs into TempTables and vice versa.
I have two distinct .scala files with the same code: a simple object (with def main...) and an object extending App.
In the simple-object one I get a "No TypeTag available" error connected to my case class Log:
object counter {
  def main(args: Array[String]) {
    .
    .
    .
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.createSchemaRDD
    case class Log(visitatore: String, data: java.util.Date, pagina: String, count: Int)
    val log = triple.map(p => Log(p._1, p._2, p._3, p._4))
    log.registerTempTable("logs")
    val logSessioni = sqlContext.sql("SELECT visitatore, data, pagina, count FROM logs")
    logSessioni.foreach(println)
  }
}
The error at the line log.registerTempTable("logs") says "No TypeTag available for Log".
In the other file (object extends App) everything works fine:
object counterApp extends App {
  .
  .
  .
  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
  import sqlContext.createSchemaRDD
  case class Log(visitatore: String, data: java.util.Date, pagina: String, count: Int)
  val log = triple.map(p => Log(p._1, p._2, p._3, p._4))
  log.registerTempTable("logs")
  val logSessioni = sqlContext.sql("SELECT visitatore, data, pagina, count FROM logs")
  logSessioni.foreach(println)
}
Since I've just started, I'm missing two main points: 1) Why does the same code work fine in the second file (object extends App), while in the first one (simple object) I get the error?
2) (and most important) What should I do in my code (the simple-object file) to fix this error, so that I can deal with case classes and TypeTag (which I barely know)?
Every answer and code example will be much appreciated!
Thanks in advance
FF
Answered by Justin Pihony
TL;DR;
Just move your case class out of the method definition
The problem is that your case class Log is defined inside the method in which it is used. So, simply move your case class definition outside of the method and it will work. I would have to take a look at how this compiles down, but my guess is that this is more of a chicken-and-egg problem: the TypeTag (used for reflection) cannot be derived implicitly because the type has not been fully defined at that point. Here are two SO questions with the same problem, which show that Spark would need to use a WeakTypeTag. And here is the JIRA explaining this more officially.
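To see why the position of the definition matters, here is a minimal self-contained sketch (no Spark required, using only scala-reflect; the object name TypeTagDemo and the helper tagOf are illustrative, not part of the original code). Once a case class like Log lives at the top level, the compiler can materialize the implicit TypeTag that Spark's createSchemaRDD requires; the same implicit search fails when Log is declared inside def main:

```scala
import scala.reflect.runtime.universe._

// Defined at the top level, so the compiler can summon a TypeTag for it.
// If this declaration is moved inside def main below, the tagOf[Log] call
// fails to compile with "No TypeTag available for Log".
case class Log(visitatore: String, pagina: String, count: Int)

object TypeTagDemo {
  // Stand-in for Spark's requirement: createSchemaRDD needs an implicit
  // TypeTag for the RDD's element type to derive the table schema.
  def tagOf[A: TypeTag]: TypeTag[A] = typeTag[A]

  def main(args: Array[String]): Unit = {
    // The implicit TypeTag resolves, and we can inspect the type at runtime.
    println(tagOf[Log].tpe)
  }
}
```

The same pattern applies to the asker's program: keep the Spark logic inside main, but declare the case class at the file's top level (or inside a companion object), where the compiler can generate its TypeTag.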

