Source: http://stackoverflow.com/questions/30024052/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverflow
How to handle an exception in the Spark map() function?
Asked by user2848932
I want to ignore exceptions in the map() function, for example:
rdd.map(_.toInt)
where rdd is an RDD[String].
But if it encounters a non-numeric string, it fails.
What is the easiest way to ignore any exception and skip that line? (I do not want to use filter to handle exceptions, because there may be many other kinds of exceptions...)
Answered by gamsd
You can use a combination of Try and map/filter.
Try wraps your computation in Success if it behaves as expected, or in Failure if an exception is thrown. You can then filter for what you want: in this case the successful computations, but you could also filter for the error cases, for example for logging purposes.
The following code is a possible starting point. You can run and explore it on scastie.org to see if it fits your needs.
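import scala.util.Try

object Main extends App {
  val in = List("1", "2", "3", "abc")
  // Wrap each conversion: Success(n) for a valid number, Failure(e) otherwise
  val out1 = in.map(a => Try(a.toInt))
  // Keep only the successful conversions and unwrap them
  val results = out1.filter(_.isSuccess).map(_.get)
  println(results) // List(1, 2, 3)
}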
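The remark about filtering the error cases for logging can be sketched as follows. This is a minimal illustration on a plain List rather than an RDD; the object name TrySplitSketch and its helpers (parseAll, successes, failures) are hypothetical names introduced here, not part of the original answer.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical sketch: keep each input next to its parse attempt so that
// failures can be reported together with the offending line.
object TrySplitSketch {
  // Pair every input string with the result of trying to convert it
  def parseAll(lines: List[String]): List[(String, Try[Int])] =
    lines.map(s => (s, Try(s.toInt)))

  // Successful conversions only
  def successes(attempts: List[(String, Try[Int])]): List[Int] =
    attempts.collect { case (_, Success(n)) => n }

  // One message per failed input: the input plus the exception class name
  def failures(attempts: List[(String, Try[Int])]): List[String] =
    attempts.collect { case (s, Failure(e)) => s"$s: ${e.getClass.getSimpleName}" }
}
```

Using collect with a pattern avoids the separate isSuccess check and the call to get, since the match both selects and unwraps the value.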
Answered by banjara
I recommend using filter/map:
rdd.filter(r => NumberUtils.isNumber(r)).map(r => r.toInt)
or flatMap:
exampleRDD.flatMap(r => if (NumberUtils.isNumber(r)) Some(r.toInt) else None)
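A possible variation on the flatMap approach, for cases where a pre-check such as NumberUtils.isNumber does not match exactly what toInt accepts (decimal strings are one likely example), is to attempt the conversion and turn any failure into None. The sketch below uses a plain Seq so that it runs without Spark; FlatMapSketch and parseInts are hypothetical names. Passing the same function to flatMap on an RDD[String] is expected to behave the same way, but that is an assumption here rather than something the answer states.

```scala
import scala.util.Try

// Hypothetical sketch: convert and drop bad lines in a single flatMap pass
object FlatMapSketch {
  def parseInts(lines: Seq[String]): Seq[Int] =
    // Try(...).toOption is Some(n) on success and None on any non-fatal
    // exception; flatMap then discards the None entries
    lines.flatMap(s => Try(s.toInt).toOption)
}
```

Because the conversion itself decides what counts as valid, there is no second predicate to keep in sync with toInt.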
Otherwise, you can catch the exception inside the map function:
myRDD.map { r =>
  try {
    r.toInt
  } catch {
    // Fall back to a sentinel value when the conversion fails
    case _: RuntimeException => -1
  }
}
and then filter out the -1 values.
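One thing to watch with a sentinel is that -1 is also a valid parse result, so a genuine "-1" line is dropped together with the failures. The sketch below contrasts the sentinel approach with returning an Option from the try/catch. It runs on a plain List, and the names SentinelSketch, withSentinel and withOption are hypothetical, introduced only for this comparison.

```scala
import scala.util.control.NonFatal

// Hypothetical sketch comparing a -1 sentinel with an Option result
object SentinelSketch {
  // Sentinel approach: failed conversions become -1 and are filtered out,
  // which also removes any input that legitimately parses to -1
  def withSentinel(lines: List[String]): List[Int] =
    lines.map { s =>
      try s.toInt
      catch { case NonFatal(_) => -1 }
    }.filter(_ != -1)

  // Option approach: failed conversions become None, so no valid value
  // is sacrificed to mark a failure
  def withOption(lines: List[String]): List[Int] =
    lines.flatMap { s =>
      try Some(s.toInt)
      catch { case NonFatal(_) => None }
    }
}
```

For the input List("1", "-1", "abc"), withSentinel keeps only 1, while withOption keeps both 1 and -1.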

