Scala Spark: error: value split is not a member of org.apache.spark.rdd.RDD[String]

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/38138049/

Time: 2020-10-22 08:26:30  Source: igfitidea

Spark: error: value split is not a member of org.apache.spark.rdd.RDD[String]

scala apache-spark split rdd

Asked by Akash Garg

The code snippet I was trying to execute:

val textfile = sc.textFile("small_file.txt")
val arr = textfile.split(",")
for (v <- arr) {
    println(v)
}

The packages that I included:

import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext._
import org.apache.spark.rdd.RDD._
import org.apache.spark.rdd.RDD

The error that I got:

<console>:54: error: value split is not a member of org.apache.spark.rdd.RDD[String]
                val arr = textfile.split(",")
                               ^

Any lead would be appreciated!!

Answered by Shiv4nsh

The error message says it clearly: split is not a method of RDD. It is a method of String, so if you want to split each line of the text file on "," you have to apply it to every element through the RDD's map function.

textfile.map(line=>line.split(","))
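
To make the effect of map concrete, here is a minimal sketch. It assumes a spark-shell session in which `sc` is already defined, and it uses a small in-memory dataset (the sample lines are hypothetical, standing in for small_file.txt). It also shows flatMap, which is not part of the answer above but is a common alternative when you want a single flat collection of fields rather than one array per line.

```scala
import org.apache.spark.rdd.RDD

// Hypothetical sample data standing in for small_file.txt.
// `sc` is the SparkContext that spark-shell provides.
val lines: RDD[String] = sc.parallelize(Seq("a,b,c", "d,e"))

// map: split is called on each String, giving one Array[String] per line.
val fieldsPerLine: RDD[Array[String]] = lines.map(line => line.split(","))

// flatMap: same split, but the arrays are flattened into one RDD[String].
val allFields: RDD[String] = lines.flatMap(line => line.split(","))

// collect() copies the results to the driver; fine for small data only.
fieldsPerLine.collect().foreach(arr => println(arr.mkString(" | ")))
allFields.collect().foreach(println)
```

With the real file, replacing the parallelize call with sc.textFile("small_file.txt") should behave the same way, since both produce an RDD[String].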

For more information, see the word count example on the Spark examples page:

http://spark.apache.org/examples.html

Answered by nat

val textfile = sc.textFile("small_file.txt")

The variable textfile is an RDD[String], not a String. That is why you get this compile error: split is not a member of RDD[String]. If you just want to print the contents of textfile, you can use

textfile.foreach(println) (shorter version)

or

textfile.foreach(x => println(x)) (longer version)
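
One caveat worth sketching: foreach on an RDD runs on the executors, so on a cluster the println output goes to the executors' stdout rather than to the driver console. In local mode or spark-shell it usually appears as expected. A hedged way to print the split values on the driver, close to the loop in the question, is to bring a bounded number of elements back first. The sample data and the limit of 100 below are assumptions for illustration, and `sc` is again the SparkContext from spark-shell.

```scala
import org.apache.spark.rdd.RDD

// Hypothetical sample data; use sc.textFile("small_file.txt") for the real file.
val textfile: RDD[String] = sc.parallelize(Seq("1,2,3", "4,5"))

// Split every line, then fetch at most 100 values to the driver.
// take(n) avoids pulling an arbitrarily large dataset into driver memory.
val values: Array[String] = textfile.flatMap(_.split(",")).take(100)

// This loop now iterates over a plain local Array[String].
for (v <- values) {
  println(v)
}
```

For a file known to be small, collect() could be used in place of take(100).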

Thanks
