foreach function not working in Spark DataFrame

Disclaimer: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow.

Original URL: http://stackoverflow.com/questions/41502896/
Asked by user6325753
According to the DataFrames API, the definition is:
public void foreach(scala.Function1<Row,scala.runtime.BoxedUnit> f)
Applies a function f to all rows.
But when I try something like:
DataFrame df = sql.read()
        .format("com.databricks.spark.csv")
        .option("header", "true")
        .load("file:///home/hadoop/Desktop/examples.csv");

df.foreach(x -> {
    System.out.println(x);
});
I am getting a compile-time error. Any mistake?
Answered by Thomas Decaux
You can convert it to a Java RDD in order to use the lambda as you wish:
df.toJavaRDD().foreach(x ->
    System.out.println(x)
);
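Why this compiles: JavaRDD.foreach takes org.apache.spark.api.java.function.VoidFunction<T>, a single-method Java interface, so a Java 8 lambda satisfies it, whereas DataFrame.foreach expects a scala.Function1<Row, BoxedUnit>, which a Java lambda cannot implement directly. Below is a minimal end-to-end sketch of this approach, assuming Spark 1.x with the spark-csv package on the classpath; the class name ForeachExample and the local master setting are illustrative, and the file path is taken from the question.

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SQLContext;

public class ForeachExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("ForeachExample").setMaster("local[*]");
        JavaSparkContext jsc = new JavaSparkContext(conf);
        SQLContext sqlContext = new SQLContext(jsc);

        DataFrame df = sqlContext.read()
                .format("com.databricks.spark.csv")
                .option("header", "true")
                .load("file:///home/hadoop/Desktop/examples.csv");

        // The lambda targets VoidFunction<Row>, so it compiles.
        // It runs on the executors, so the output goes to executor logs
        // (or to the console when running in local mode).
        df.toJavaRDD().foreach((Row row) -> System.out.println(row));

        jsc.stop();
    }
}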
Answered by abaghel
First extend scala.runtime.AbstractFunction1 and implement Serializable, like below:
import java.io.Serializable;

import scala.runtime.AbstractFunction1;

public abstract class SerializableFunction1<T, R>
        extends AbstractFunction1<T, R> implements Serializable {
}
Now use this SerializableFunction1 class like below:
df.foreach(new SerializableFunction1<Row, BoxedUnit>() {
    @Override
    public BoxedUnit apply(Row row) {
        System.out.println(row.get(0));
        return BoxedUnit.UNIT;
    }
});
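This works because DataFrame.foreach really does take a scala.Function1<Row, BoxedUnit> (BoxedUnit.UNIT is Scala's boxed Unit value), and the Serializable marker matters because Spark serializes the function and ships it to the executors; AbstractFunction1 on its own is not Serializable. As an aside, in Spark 2.x the Dataset API gained a foreach(ForeachFunction<T>) overload that accepts a Java lambda directly, so this wrapper is no longer needed there.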
Answered by 54l3d
Try this code:
df.toJavaRDD().foreach(new VoidFunction<Row>() {
    public void call(Row row) {
        // your function code here
    }
});
If you just want to display the df content, this is much easier:
df.show();
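Note that show() runs on the driver and prints only the first 20 rows by default in tabular form; show(n) prints the first n rows.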