Spark: Could not find CoarseGrainedScheduler

Disclaimer: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/41338617/

Asked by Adetiloye Philip Kehinde
I am not sure what is causing this exception; it shows up after my Spark job has been running for a few hours.
I am running Spark 2.0.2.
Any debugging tips?
2016-12-27 03:11:22,199 [shuffle-server-3] ERROR org.apache.spark.network.server.TransportRequestHandler - Error while invoking RpcHandler#receive() for one-way message.
org.apache.spark.SparkException: Could not find CoarseGrainedScheduler.
    at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:154)
    at org.apache.spark.rpc.netty.Dispatcher.postOneWayMessage(Dispatcher.scala:134)
    at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:571)
    at org.apache.spark.network.server.TransportRequestHandler.processOneWayMessage(TransportRequestHandler.java:180)
    at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:109)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:119)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor.run(SingleThreadEve
Accepted answer by Adetiloye Philip Kehinde
Yeah, now I know the meaning of that cryptic exception: the executor got killed because it exceeded the container memory threshold.
There are a couple of reasons this can happen, but the first things to try are to rework your job (e.g. repartition so each task holds less data) or to add more nodes/executors to your cluster.
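For reference, a minimal sketch of raising the memory-related settings when the application is created; all values and the app name are placeholders to tune for your cluster, and spark.yarn.executor.memoryOverhead is the Spark 2.x name for the off-heap overhead that typically triggers the container kill on YARN:

    import org.apache.spark.sql.SparkSession

    // All values below are hypothetical starting points, not recommendations.
    val spark = SparkSession.builder()
      .appName("my-long-running-job")                        // assumed app name
      .config("spark.executor.memory", "8g")                 // heap per executor
      .config("spark.yarn.executor.memoryOverhead", "2048")  // off-heap overhead in MB (YARN, Spark 2.x)
      .config("spark.executor.instances", "10")              // more executors to spread the load
      .getOrCreate()

The same settings can equally be passed as --conf flags to spark-submit.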
Answered by Tomer
Basically it means that there is another, underlying reason for the failure. Try to find other exceptions in your job logs.
See the "Exceptions" section here: https://medium.com/@wx.london.cun/spark-on-yarn-f74e82ab6070
Answered by Beniamino Del Pizzo
It could be a resource problem. Try to increase the number of cores and executors and to assign more RAM to the application; then you should increase the number of partitions of your RDD by calling repartition. The ideal number of partitions depends on the previous settings. Hope this helps.
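As an illustration, a small sketch of the repartition call this answer refers to; the input path and the partition count of 200 are placeholder assumptions to tune against your data size and executor count:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("repartition-example").getOrCreate()

    // Hypothetical input; 200 partitions is a placeholder to tune.
    val df = spark.read.parquet("/data/events")
    val repartitioned = df.repartition(200)  // more partitions -> less data (and memory) per task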
Answered by Sathish
Another silly reason could be that the timeout you give Spark Streaming's awaitTermination is set too low, so the job gets terminated before it completes:
ssc.awaitTermination(timeout)
@param timeout: time to wait in seconds
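As a sketch (the signature quoted above is from the Python API, where the timeout is in seconds): in the Scala API the equivalent call is awaitTerminationOrTimeout, whose argument is in milliseconds, so a too-small value cuts the job off early. The batch interval and timeout below are placeholders:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("streaming-example")  // assumed app name
    val ssc = new StreamingContext(conf, Seconds(10))           // placeholder batch interval

    // ... define input streams and transformations here ...

    ssc.start()
    // Timeout is in milliseconds in the Scala API; returns true if the
    // context stopped on its own, false if the timeout elapsed first.
    val stopped = ssc.awaitTerminationOrTimeout(6 * 60 * 60 * 1000L)  // e.g. 6 hours
    if (!stopped) ssc.stop(stopSparkContext = true, stopGracefully = true)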
Answered by Carlos Bribiescas
For me, this has happened when I specified a path that doesn't exist for a spark.read.load, or if I specified the wrong input format, i.e. parquet instead of csv.
Unfortunately, the actual error is sometimes silent and happens above the stack trace. Sometimes, though, you can find another set of stack traces along with this one that will be more meaningful.
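A hedged sketch of guarding against both failure modes (a nonexistent path and a wrong format); the path, format, and options here are illustrative assumptions:

    import org.apache.hadoop.fs.{FileSystem, Path}
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("read-check").getOrCreate()

    // Hypothetical input location and format.
    val inputPath = "/data/input.csv"
    val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
    require(fs.exists(new Path(inputPath)), s"Input path does not exist: $inputPath")

    // Be explicit about the format instead of relying on the default (parquet).
    val df = spark.read.format("csv").option("header", "true").load(inputPath)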

