reduceByKey method not being found in Scala Spark

Disclaimer: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/23943852/

scala, apache-spark, rdd

Asked by blue-sky

Attempting to run http://spark.apache.org/docs/latest/quick-start.html#a-standalone-app-in-scala from source.

This line:

val wordCounts = textFile.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey((a, b) => a + b)

is throwing the error:

value reduceByKey is not a member of org.apache.spark.rdd.RDD[(String, Int)]
  val wordCounts = logData.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey((a, b) => a + b)

logData.flatMap(line => line.split(" ")).map(word => (word, 1)) returns a MappedRDD, but I cannot find this type in http://spark.apache.org/docs/0.9.1/api/core/index.html#org.apache.spark.rdd.RDD

I'm running this code from the Spark source, so it could be a classpath problem? But the required dependencies are on my classpath.

Answered by maasg

You should import the implicit conversions from SparkContext:

import org.apache.spark.SparkContext._

They use the 'pimp my library' pattern to add methods to RDDs of specific types. If curious, see SparkContext:1296.

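For illustration, here is a minimal sketch of the word count with that import in place. It assumes a pre-2.0 Spark (where the conversions live in SparkContext; since Spark 2.0 they come from the RDD companion object and the extra import is no longer needed), a local master, and a hypothetical input file:

import org.apache.spark.{SparkConf, SparkContext}
// Brings the implicit conversion RDD[(String, Int)] => PairRDDFunctions into scope (pre-2.0 Spark)
import org.apache.spark.SparkContext._

object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("WordCount").setMaster("local[*]")
    val sc = new SparkContext(conf)

    val textFile = sc.textFile("README.md") // hypothetical input path, adjust for your setup
    val wordCounts = textFile
      .flatMap(line => line.split(" "))
      .map(word => (word, 1))
      .reduceByKey((a, b) => a + b) // now resolves via PairRDDFunctions

    wordCounts.take(10).foreach(println)
    sc.stop()
  }
}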

Answered by Ws576

If you use Maven on Scala IDE: I just solved the problem by updating the spark-streaming dependency from version 1.2 to version 1.3.

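For anyone building with sbt rather than Maven, a roughly equivalent dependency bump would look like the build.sbt sketch below. The version numbers are illustrative only; align them with your cluster's Spark and Scala versions.

// build.sbt -- illustrative versions only
scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"      % "1.3.0" % "provided",
  // bumping spark-streaming from 1.2.x to 1.3.x is what resolved the issue above
  "org.apache.spark" %% "spark-streaming" % "1.3.0" % "provided"
)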

Answered by Hongyang

Actually, you can find it in the PairRDDFunctions class. PairRDDFunctions is a class that contains extra functions, available on RDDs of (key, value) pairs through an implicit conversion.

https://spark.apache.org/docs/2.1.0/api/scala/index.html#org.apache.spark.rdd.PairRDDFunctions
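To illustrate the mechanism with a simplified sketch (not Spark's actual code): an implicit conversion in scope wraps a collection of pairs in a class carrying the extra methods, which is why reduceByKey only resolves once the conversion has been imported.

import scala.language.implicitConversions

// Simplified sketch of the PairRDDFunctions pattern -- not Spark's real implementation
class PairOps[K, V](self: Seq[(K, V)]) {
  // An "extra" method that only exists for collections of pairs
  def reduceByKey(f: (V, V) => V): Map[K, V] =
    self.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).reduce(f) }
}

object PairOps {
  // The implicit conversion that must be in scope for reduceByKey to resolve,
  // analogous to importing org.apache.spark.SparkContext._ in pre-2.0 Spark
  implicit def seqToPairOps[K, V](xs: Seq[(K, V)]): PairOps[K, V] = new PairOps(xs)
}

// Usage:
//   import PairOps._
//   Seq(("a", 1), ("a", 2), ("b", 3)).reduceByKey(_ + _)   // Map(a -> 3, b -> 3)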
