Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/25203815/

spark <console>:12: error: not found: value sc

Tags: scala, apache-spark, distributed-computing

Asked by Amitesh Ranjan

I wrote the following:

val a = 1 to 10000
val b = sc.parallelize(a)

and it shows an error saying:

<console>:12: error: not found: value sc

Any help?

Answered by satish sasate

In my case I had Spark installed on a local Windows system and observed the same error, but it was caused by the issue below:

Issue: Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable.

This was caused by a permission issue. I resolved it by changing the permissions with the command below. Although the log says "on HDFS", the path is on the local Windows file system:

E:\winutils\bin\winutils.exe chmod 777 E:\tmp\hive

Answered by Shyamendra Solanki

It happens when your classpath is not correct. This is an open issue in Spark at the moment.

> spark-shell 

...
...
14/08/08 18:41:50 INFO SparkILoop: Created spark context..
Spark context available as sc.

scala> sc
res0: org.apache.spark.SparkContext = org.apache.spark.SparkContext@2c1c5c2e

scala> :cp /tmp
Added '/tmp'.  Your new classpath is:
...

scala> sc
<console>:8: error: not found: value sc

You may need to correct your classpath from outside the REPL.

Answered by gsamaras

You get this error because sc is not defined. I would try:

import org.apache.spark.{SparkConf, SparkContext}
val sc = new SparkContext(new SparkConf().setAppName("foo")) // create sc manually
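
On Spark 2.0 and later the shell also exposes a SparkSession named spark, and sc can be obtained from it. As a minimal sketch for building the context yourself outside the shell (the "foo" app name and local[*] master are placeholder assumptions, not values from the original answer):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("foo")      // placeholder application name
  .master("local[*]")  // assumption: running locally; omit when submitting to a cluster
  .getOrCreate()
val sc = spark.sparkContext  // the underlying SparkContext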


Another thing that often happens to me is not having a Kerberos ticket on the cluster, because I forgot to get one.



As for the "open issue in Spark" mentioned by Solanki, I am pretty sure this is not the case any more.

Answered by uday

First, check the log output after running the spark-shell command to see whether the SparkContext was initialized as sc. If the SparkContext was not initialized properly, you may need to set the IP address in the Spark environment.

Open the environment file conf/spark-env.sh and add the line below:

export SPARK_LOCAL_IP="127.0.0.1"

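After restarting spark-shell you can check from inside the REPL which address the driver bound to. A minimal sketch (which configuration keys are present depends on your deployment, so treat the output line as illustrative):

scala> sc.getConf.getOption("spark.driver.host")
res0: Option[String] = Some(127.0.0.1)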

Answered by Santosh Gaikwad

I faced the same problem. In my case JAVA_HOME was not set properly, which caused this issue. Surprisingly, Spark would start, but the sc context had issues creating an instance. When I fixed JAVA_HOME to point to the correct Java directory, the issue was resolved. I had to close the session and open a new one to make sure the updated path was picked up by a fresh session.

I hope this helps.

Answered by user3321437

I hit this error when trying out Spark on the Cloudera Quickstart VM. It turned out to be an HDFS file permission issue on /user/spark.

I could not switch to the user "spark"; I got a "user not available" error. Changing the file permissions with the command below solved it for me:

sudo -u hdfs hadoop fs -chmod -R 1777 /user/spark

scala> val data = 1 to 10000
data: scala.collection.immutable.Range.Inclusive = Range(1, 2, 3, 4, 5, 6, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170...
scala> val distData = sc.parallelize(data)
distData: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:14
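
As a quick check that sc is working, running an action on the RDD should return a result. A minimal sketch using the same data as above (the sum of 1 to 10000 is 50005000; the res numbering will vary):

scala> distData.reduce(_ + _)
res1: Int = 50005000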

Answered by Franc Drobni?

As stated in this thread, one solution may be to switch off permissions checking.

In Cloudera Manager, go to the HDFS configuration under Advanced and put the following in "HDFS Service Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml":

<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>

After that, it is necessary to restart the HDFS component.

It worked for me. It might not be appropriate for a production environment, however.
