Original question: http://stackoverflow.com/questions/38595893/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Read csv as Data Frame in spark 1.6
Asked by user2145299
I have Spark 1.6 and am trying to read a CSV (or TSV) file as a DataFrame. Here are the steps I take:
scala> val sqlContext= new org.apache.spark.sql.SQLContext(sc)
scala> import sqlContext.implicits._
scala> val df = sqlContext.read
scala> .format("com.databricks.spark.csv")
scala> .option("header", "true")
scala> .option("inferSchema", "true")
scala> .load("data.csv")
scala> df.show()
Error:
<console>:35: error: value show is not a member of org.apache.spark.sql.DataFrameReader
       df.show()
The last command is supposed to show the first few lines of the dataframe, but I get the error message. Any help will be much appreciated.
Answered by MrChristine
It looks like your method calls are not chained together properly, so "show()" is being called on the val df, which holds a DataFrameReader rather than a DataFrame. If I run the following, I can reproduce your error:
val df = sqlContext.read
df.show()
If you restructure the code, it would work:
val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load("data.csv")
df.show()
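If you would rather keep the chain on several lines in the shell, the statement has to stay syntactically incomplete until the final call. A minimal sketch of two ways to do that, assuming a spark-shell session where `sc` is predefined, the spark-csv package is on the classpath, and data.csv (the file name from the question) is in the working directory:

```scala
// Assumptions: run inside spark-shell (Spark 1.6) with the spark-csv package
// loaded; `sc` is the shell's predefined SparkContext; data.csv exists.
import org.apache.spark.sql.{DataFrame, SQLContext}

val sqlContext = new SQLContext(sc)

// Option 1: trailing dots. Each line ends mid-expression, so the REPL
// keeps reading instead of evaluating `sqlContext.read` on its own.
val df1: DataFrame = sqlContext.read.
  format("com.databricks.spark.csv").
  option("header", "true").
  option("inferSchema", "true").
  load("data.csv")

// Option 2: wrap the whole chain in parentheses; the open parenthesis
// keeps the expression incomplete until the closing one.
val df2: DataFrame = (sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .load("data.csv"))

df1.show()
```

The explicit DataFrame type annotation is optional, but it makes the shell report a type mismatch at the assignment if the chain gets cut short. Another option is to enter :paste mode in the shell, paste the multi-line statement, and finish with Ctrl-D.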
Answered by Rajeev Rathor
In Java, first add the dependency to your pom.xml file, then run the following code to read the CSV file.
<dependency>
<groupId>com.databricks</groupId>
<artifactId>spark-csv_2.10</artifactId>
<version>1.4.0</version>
</dependency>
Dataset<Row> df = sparkSession.read().format("com.databricks.spark.csv").option("header", true).option("inferSchema", true).load("hdfs://localhost:9000/usr/local/hadoop_data/loan_100.csv");
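Note that Dataset<Row> and SparkSession belong to the Spark 2.x API rather than 1.6. On 2.x the CSV reader ships with Spark itself, so the external spark-csv dependency should not be needed there. A rough Scala sketch of the equivalent read, assuming Spark 2.x; the app name, the local master, and the temporary sample file with made-up data are illustrative assumptions, not part of the answer:

```scala
// Assumptions: Spark 2.x on the classpath; the sample file and its contents
// are made up here only so the sketch can run on its own.
import java.nio.file.Files
import org.apache.spark.sql.{DataFrame, SparkSession}

val spark = SparkSession.builder()
  .appName("csv-example")   // hypothetical app name
  .master("local[*]")       // assumed local mode
  .getOrCreate()

// Write a tiny CSV with a header row to a temporary file.
val path = Files.createTempFile("sample", ".csv")
Files.write(path, "id,amount\n1,100\n2,250\n".getBytes("UTF-8"))

// Built-in CSV source: no format("com.databricks.spark.csv") required.
val df: DataFrame = (spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv(path.toString))

df.printSchema()
df.show()
```

On Spark 1.6 itself, the spark-csv package and the format("com.databricks.spark.csv") call from the other answers remain the way to go.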
Answered by user3521180
Use the following instead:
import org.apache.spark.sql.SQLContext
val sqlContext = new SQLContext(sc)
It should resolve your issue.

