Spark: Could not find CoarseGrainedScheduler

Disclaimer: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/41338617/

Asked by Adetiloye Philip Kehinde
I am not sure what is causing this exception; it shows up after my Spark job has been running for a few hours.
I am running Spark 2.0.2.
Any debugging tips?
2016-12-27 03:11:22,199 [shuffle-server-3] ERROR org.apache.spark.network.server.TransportRequestHandler - Error while invoking RpcHandler#receive() for one-way message.
org.apache.spark.SparkException: Could not find CoarseGrainedScheduler.
    at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:154)
    at org.apache.spark.rpc.netty.Dispatcher.postOneWayMessage(Dispatcher.scala:134)
    at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:571)
    at org.apache.spark.network.server.TransportRequestHandler.processOneWayMessage(TransportRequestHandler.java:180)
    at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:109)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:119)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor.run(SingleThreadEve
Accepted answer by Adetiloye Philip Kehinde
Yeah, now I know the meaning of that cryptic exception: the executor got killed because it exceeded the container memory threshold.
There are a couple of reasons this can happen, but the first things to try are to rework your job (e.g. repartition so each task holds less data) or to add more nodes/executors to your cluster.
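For reference, a minimal sketch of raising the memory-related settings when the application is created; all values and the app name are placeholders to tune for your cluster, and spark.yarn.executor.memoryOverhead is the Spark 2.x name for the off-heap overhead that typically triggers the container kill on YARN:

    import org.apache.spark.sql.SparkSession

    // All values below are hypothetical starting points, not recommendations.
    val spark = SparkSession.builder()
      .appName("my-long-running-job")                        // assumed app name
      .config("spark.executor.memory", "8g")                 // heap per executor
      .config("spark.yarn.executor.memoryOverhead", "2048")  // off-heap overhead in MB (YARN, Spark 2.x)
      .config("spark.executor.instances", "10")              // more executors to spread the load
      .getOrCreate()

The same settings can equally be passed as --conf flags to spark-submit.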
Answered by Tomer
Basically it means that there is another, underlying reason for the failure. Try to find other exceptions in your job logs.
See the "Exceptions" section here: https://medium.com/@wx.london.cun/spark-on-yarn-f74e82ab6070
Answered by Beniamino Del Pizzo
It could be a resource problem. Try to increase the number of cores and executors and to assign more RAM to the application; then you should increase the number of partitions of your RDD by calling repartition. The ideal number of partitions depends on the previous settings. Hope this helps.
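As an illustration, a small sketch of the repartition call this answer refers to; the input path and the partition count of 200 are placeholder assumptions to tune against your data size and executor count:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("repartition-example").getOrCreate()

    // Hypothetical input; 200 partitions is a placeholder to tune.
    val df = spark.read.parquet("/data/events")
    val repartitioned = df.repartition(200)  // more partitions -> less data (and memory) per task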
Answered by Sathish
Another silly reason could be that the timeout you give Spark Streaming's awaitTermination is set too low, so the job gets terminated before it completes:
ssc.awaitTermination(timeout)
@param timeout: time to wait in seconds
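As a sketch (the signature quoted above is from the Python API, where the timeout is in seconds): in the Scala API the equivalent call is awaitTerminationOrTimeout, whose argument is in milliseconds, so a too-small value cuts the job off early. The batch interval and timeout below are placeholders:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("streaming-example")  // assumed app name
    val ssc = new StreamingContext(conf, Seconds(10))           // placeholder batch interval

    // ... define input streams and transformations here ...

    ssc.start()
    // Timeout is in milliseconds in the Scala API; returns true if the
    // context stopped on its own, false if the timeout elapsed first.
    val stopped = ssc.awaitTerminationOrTimeout(6 * 60 * 60 * 1000L)  // e.g. 6 hours
    if (!stopped) ssc.stop(stopSparkContext = true, stopGracefully = true)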
Answered by Carlos Bribiescas
For me, this has happened when I specified a path that doesn't exist for a spark.read.load, or if I specified the wrong input format, i.e. parquet instead of csv.
Unfortunately, the actual error is sometimes silent and happens above the stack trace. Sometimes, though, you can find another set of stack traces along with this one that will be more meaningful.
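A hedged sketch of guarding against both failure modes (a nonexistent path and a wrong format); the path, format, and options here are illustrative assumptions:

    import org.apache.hadoop.fs.{FileSystem, Path}
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("read-check").getOrCreate()

    // Hypothetical input location and format.
    val inputPath = "/data/input.csv"
    val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
    require(fs.exists(new Path(inputPath)), s"Input path does not exist: $inputPath")

    // Be explicit about the format instead of relying on the default (parquet).
    val df = spark.read.format("csv").option("header", "true").load(inputPath)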

