Read data from HDFS using Scala

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/41587931/

scala hdfs

Asked by Kiran

I am new to Scala. How can I read a file from HDFS using Scala (without Spark)? When I searched for it, I only found examples that write to HDFS, like this one:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.fs.Path
import java.io.PrintWriter

/**
 * @author ${user.name}
 */
object App {

  //def foo(x : Array[String]) = x.foldLeft("")((a,b) => a + b)

  def main(args: Array[String]): Unit = {
    println("Trying to write to HDFS...")
    val conf = new Configuration()
    //conf.set("fs.defaultFS", "hdfs://quickstart.cloudera:8020")
    conf.set("fs.defaultFS", "hdfs://192.168.30.147:8020")
    val fs = FileSystem.get(conf)
    // Create (or overwrite) the file and wrap the output stream in a writer
    val output = fs.create(new Path("/tmp/mySample.txt"))
    val writer = new PrintWriter(output)
    try {
      writer.write("this is a test")
      writer.write("\n")
    }
    finally {
      writer.close()
      println("Closed!")
    }
    println("Done!")
  }

}

Please help me. How can I read or load a file from HDFS using Scala?

Answered by solar

One way (in a somewhat functional style) could look like this:

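import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val hdfs = FileSystem.get(new URI("hdfs://yourUrl:port/"), new Configuration())
val path = new Path("/path/to/file/")
val stream = hdfs.open(path)
// Lazy sequence of lines; readLine returns null once the end of the file is reached
def readLines = Stream.cons(stream.readLine, Stream.continually(stream.readLine))

// This example stops at the first null and prints every existing line in order
try readLines.takeWhile(_ != null).foreach(line => println(line))
finally stream.close()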
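
The `readLine` method called on the stream above is inherited from `java.io.DataInputStream`, where it is deprecated because it does not convert bytes to characters properly. A minimal sketch of an alternative is to wrap the stream in a `BufferedReader` with an explicit charset. The helper name `readAllLines`, the object name `ReadLinesSketch`, and the choice of UTF-8 are assumptions made for illustration, not part of the original answer. The helper accepts any `java.io.InputStream`, so the `stream` returned by `hdfs.open(path)` could be passed to it directly; the demo below uses an in-memory stream so that it runs without a cluster.

```scala
import java.io.{BufferedReader, ByteArrayInputStream, InputStream, InputStreamReader}
import java.nio.charset.StandardCharsets

object ReadLinesSketch {
  // Hypothetical helper: reads every line from the stream as UTF-8 text
  // (assumed encoding) and always closes the stream, even if reading fails.
  def readAllLines(in: InputStream): List[String] = {
    val reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))
    // toList forces all lines to be read before the reader is closed
    try Iterator.continually(reader.readLine()).takeWhile(_ != null).toList
    finally reader.close()
  }

  def main(args: Array[String]): Unit = {
    // In-memory stand-in for the stream returned by hdfs.open(path)
    val demo = new ByteArrayInputStream(
      "this is a test\nsecond line\n".getBytes(StandardCharsets.UTF_8))
    readAllLines(demo).foreach(println)
  }
}
```

Because the whole file is collected into a `List`, this sketch suits small files; for large files it would probably be better to process each line inside the `try` block instead of materializing them all.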

You could also take a look at this article, or here and here. These questions look related to yours and contain working (but more Java-like) code examples, if you're interested.