Original question: http://stackoverflow.com/questions/29790417/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Java - Spark SQL DataFrame map function is not working
Asked by user3206330
In Spark SQL, when I try to use the map function on a DataFrame, I get the error below.
The method map(Function1, ClassTag) in the type DataFrame is not applicable for the arguments (new Function(){})
I am also following the Spark 1.3 documentation: https://spark.apache.org/docs/latest/sql-programming-guide.html#inferring-the-schema-using-reflection. Does anyone have a solution?
Here is my test code.
// SQL can be run over RDDs that have been registered as tables.
DataFrame teenagers = sqlContext.sql("SELECT name FROM people WHERE age >= 13 AND age <= 19");
List<String> teenagerNames = teenagers.map(
    new Function<Row, String>() {
        public String call(Row row) {
            return "Name: " + row.getString(0);
        }
    }).collect();
Answered by econn
Change this to:
Java 6 & 7
List<String> teenagerNames = teenagers.javaRDD().map(
    new Function<Row, String>() {
        public String call(Row row) {
            return "Name: " + row.getString(0);
        }
    }).collect();
Java 8
List<String> t2 = teenagers.javaRDD().map(
    row -> "Name: " + row.getString(0)
).collect();
Once you call javaRDD(), it works just like any other RDD map function. (DataFrame's own map is the Scala API; its Function1 and ClassTag parameters, the ones named in the compile error, cannot be supplied by a Java anonymous Function, hence the error.)
This works with Spark 1.3.0 and up.
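For completeness, here is a minimal end-to-end sketch of this approach against the Spark 1.3-era API. It is an illustration rather than code from the answer: the class name, the people.json file, and the local[*] master are assumptions, and sqlContext.jsonFile(...) is the Spark 1.3 loader (later deprecated in favor of read().json(...)).

import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SQLContext;

public class TeenagerNames {
    public static void main(String[] args) {
        // Local Spark context for the example (assumption: run locally).
        SparkConf conf = new SparkConf().setAppName("TeenagerNames").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        SQLContext sqlContext = new SQLContext(sc);

        // Load a JSON file (assumed to hold records with "name" and "age"
        // fields) and register it so SQL can be run over it.
        DataFrame people = sqlContext.jsonFile("people.json");
        people.registerTempTable("people");

        DataFrame teenagers = sqlContext.sql(
            "SELECT name FROM people WHERE age >= 13 AND age <= 19");

        // The fix: go through javaRDD() so the Java Function API applies.
        List<String> teenagerNames = teenagers.javaRDD().map(
            new Function<Row, String>() {
                public String call(Row row) {
                    return "Name: " + row.getString(0);
                }
            }).collect();

        for (String name : teenagerNames) {
            System.out.println(name);
        }

        sc.stop();
    }
}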
Answered by Vijay Anantharamu
There is no need to convert to an RDD, which delays execution; it can be done as below:
// Requires org.apache.spark.sql.{Dataset, Encoder, Encoders, Row, SparkSession}
// and org.apache.spark.api.java.function.MapFunction; sparkSession is an
// existing SparkSession (the Spark 2.x entry point).
public static void mapMethod() {
    // Read the data from a file on the classpath.
    Dataset<Row> df = sparkSession.read().json("file1.json");

    Encoder<String> encoder = Encoders.STRING();

    // Prior to Java 1.8: an anonymous MapFunction
    List<String> rowsList = df.map(new MapFunction<Row, String>() {
        private static final long serialVersionUID = 1L;

        @Override
        public String call(Row row) throws Exception {
            return "string:>" + row.getString(0) + "<";
        }
    }, encoder).collectAsList();

    // From Java 1.8 onwards: a lambda, cast to MapFunction so the call is
    // not ambiguous with the Scala Function1 overload of map
    List<String> rowsList1 = df.map(
        (MapFunction<Row, String>) row -> "string >" + row.getString(0) + "<",
        encoder).collectAsList();

    System.out.println(">>> " + rowsList);
    System.out.println(">>> " + rowsList1);
}
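Note that this answer uses the Spark 2.x Dataset API (sparkSession and df.map(..., encoder)); in 2.x a DataFrame is simply Dataset<Row>, so no RDD round-trip is needed. The cast to MapFunction in the lambda form is there because, from Java, Dataset.map is otherwise ambiguous between its MapFunction overload and the Scala Function1 one.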
Answered by urug
Do you have the correct dependency set in your pom? Set this and try:
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId>
    <version>1.3.1</version>
</dependency>
Answered by Yassine Jouini
Try this:
// SQL can be run over RDDs that have been registered as tables.
DataFrame teenagers = sqlContext.sql("SELECT name FROM people WHERE age >= 13 AND age <= 19");
List<String> teenagerNames = teenagers.toJavaRDD().map(
    new Function<Row, String>() {
        public String call(Row row) {
            return "Name: " + row.getString(0);
        }
    }).collect();
You have to transform your DataFrame to a JavaRDD.
Answered by Swaminathan S
Check that you are using the correct import for Row (import org.apache.spark.sql.Row) and remove any other imports related to Row. Otherwise your syntax is correct.
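For reference, a sketch of the import that should be in place. The commented-out line names the pre-1.3 Java API Row class as one plausible conflicting import; check your actual import list rather than treating this as exhaustive.

// Correct: the unified Row class used by DataFrame since Spark 1.3.
import org.apache.spark.sql.Row;

// A conflicting import such as the old Java API class would not match the
// Function<Row, String> used after javaRDD():
// import org.apache.spark.sql.api.java.Row;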
Answered by ankitbeohar90
Please check your input file's data against your DataFrame SQL query. I faced the same thing: when I looked back at the data, it did not match my query, so you are probably hitting the same issue. Both toJavaRDD() and javaRDD() work.
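One quick way to check this before mapping is to inspect the DataFrame itself; printSchema() and show() are standard DataFrame methods, and teenagers is the DataFrame from the question:

// Print the inferred schema and a sample of rows. An empty result or an
// unexpected column layout points to a data/query mismatch, not to map().
teenagers.printSchema();
teenagers.show();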