java foreach function not working in Spark DataFrame

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must keep the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/41502896/

Date: 2020-11-03 05:55:44  Source: igfitidea

foreach function not working in Spark DataFrame

java hadoop apache-spark dataframe spark-dataframe

Asked by user6325753

According to the DataFrames API, the definition is:

public void foreach(scala.Function1<Row,scala.runtime.BoxedUnit> f)

Applies a function f to all rows.

But when I try something like this:

DataFrame df = sql.read()
    .format("com.databricks.spark.csv")
    .option("header","true")
    .load("file:///home/hadoop/Desktop/examples.csv");

df.foreach(x->
{
   System.out.println(x);
});

I am getting a compile-time error. What is my mistake?

Answer by Thomas Decaux

You can convert it to a Java RDD in order to use the lambda as you wish:

df.toJavaRDD().foreach(x->
   System.out.println(x)
);
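
As a side note (not from the original answers, and assuming Spark 2.x, where the Java API exposes Dataset<Row> rather than DataFrame): Dataset.foreach has an overload that takes org.apache.spark.api.java.function.ForeachFunction, which is a Java functional interface, so a lambda can be passed directly without converting to an RDD. A minimal sketch, with the SparkSession variable name assumed and the CSV path taken from the question:

import org.apache.spark.api.java.function.ForeachFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Assumes Spark 2.x and a SparkSession named "spark".
Dataset<Row> ds = spark.read()
    .option("header", "true")
    .csv("file:///home/hadoop/Desktop/examples.csv");

// The cast selects the Java ForeachFunction overload instead of the Scala Function1 overload.
ds.foreach((ForeachFunction<Row>) row -> System.out.println(row));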

Answer by abaghel

First extend scala.runtime.AbstractFunction1 and implement Serializable, as below:

public abstract class SerializableFunction1<T,R> 
      extends AbstractFunction1<T, R> implements Serializable 
{
}

Now use this SerializableFunction1 class as below:

df.foreach(new SerializableFunction1<Row,BoxedUnit>(){
        @Override
        public BoxedUnit apply(Row row) {
            System.out.println(row.get(0));
            return BoxedUnit.UNIT;
        }
});
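
A note on this pattern: BoxedUnit.UNIT is the boxed form of Scala's Unit (the equivalent of void), which is the return value required by the Function1<Row, BoxedUnit> signature shown above, and implementing Serializable is needed because Spark serializes the function and ships it to the executors.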

Answer by 54l3d

Try this code:

df.toJavaRDD().foreach(new VoidFunction<Row>() {
    public void call(Row row) {
        // your function code here
    }
});

If you just want to show the df content, this is much easier:

df.show();
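
For reference (behaviour may vary slightly by Spark version): show() prints the first 20 rows with long values truncated, while an overload such as df.show(50, false) prints more rows without truncation.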