Which of the many Spark/Scala kernels for Jupyter/IPython to choose?

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me) on StackOverflow. Original question: http://stackoverflow.com/questions/32858203/


scala · apache-spark · ipython · jupyter

Asked by Lunigorn

There are a lot of Scala/Spark kernels for IPython/Jupyter:


  1. IScala
  2. ISpark
  3. Jupyter Scala
  4. Apache Toree (previously Spark Kernel)

Does anybody know which of them is most compatible with IPython/Jupyter and most comfortable to use with:


  1. Scala
  2. Spark (Scala)

Accepted answer by Al M

I can't speak for all of them, but I use Spark Kernel and it works very well for using both Scala and Spark.


I found IScala and Jupyter Scala less stable and less polished. Jupyter Scala always prints every variable value after I execute a cell; I don't want to see this 99% of the time.


Spark Kernel is my favourite, both for Spark and plain old Scala.

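For illustration, a minimal notebook cell under Spark Kernel/Toree might look like the sketch below; it assumes the kernel pre-binds the SparkContext as `sc`, which is the default behaviour Spark Kernel/Toree advertise, and the names in the cell are otherwise made up.

```scala
// Plain Scala works as in any REPL cell.
case class Person(name: String, age: Int)
val people = Seq(Person("Ann", 31), Person("Bo", 17))
val adults = people.filter(_.age >= 18).map(_.name)

// Spark calls go through the SparkContext the kernel pre-binds as `sc`,
// so no manual SparkContext construction is needed in the notebook.
val counts = sc.parallelize(1 to 100)
  .map(n => (n % 10, 1))
  .reduceByKey(_ + _)
  .collect()
```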

Answered by artyomboyko

Spark Kernel has been accepted into the Apache Incubator, and all development has moved to Apache Toree.


Answered by Antoni

I have been using spark-kernel (your option #4) and am quite satisfied.


You can find a nice installation how-to (CDH 5.5 on CentOS 7) here (I have used it myself to install it on a single node in pseudo-distributed mode).


http://www.davidgreco.me/blog/2015/12/24/how-to-use-jupyter-with-spark-kernel-and-cloudera-hadoop-slash-spark/

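As a quick sanity check after an installation like the one in that post, a single cell can confirm the kernel reaches the pseudo-distributed cluster. The sketch below again assumes the kernel pre-binds `sc`; the HDFS path is only a placeholder for a file that exists on your node.

```scala
// Placeholder HDFS path: substitute a file that actually exists on the node.
val lines = sc.textFile("hdfs://localhost:8020/tmp/sample.txt")

// A tiny job that exercises both the driver and the executors.
println(s"Spark ${sc.version}, line count: ${lines.count()}")
```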