Methods of max() and sum() undefined in the Java Spark DataFrame API (1.4.1)

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/32450487/

Date: 2020-11-02 20:13:37 | Source: igfitidea


java, apache-spark-sql, spark-dataframe

Asked by Jingyu Zhang

I put the sample code for DataFrame.groupBy() into my code, but it shows the max() and sum() methods as undefined.

df.groupBy("department").agg(max("age"), sum("expense"));

Which Java package should I import if I want to use the max() and sum() methods?

Is the syntax of this sample code correct?

Answered by vishak

The import didn't work for me. The Eclipse IDE still showed the compilation error.

But the following method call worked:

df.groupBy("Gender").agg(org.apache.spark.sql.functions.max(df.col("Id")), org.apache.spark.sql.functions.sum(df.col("Income")));

In case the aggregation involves only one field, we can also use the following syntax:

df.groupBy("Gender").max("Income");
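That shorthand works because groupBy() returns a GroupedData, whose own max()/sum()/avg() methods accept plain column names, so no functions import is needed for this style. A minimal sketch, assuming a DataFrame df with Gender and Income columns as in the answer above (a fragment, not a complete program):

```java
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.GroupedData;

// groupBy() returns a GroupedData; its aggregate shortcuts (max, sum, avg, min, count)
// take column names directly, so org.apache.spark.sql.functions is not required here.
GroupedData grouped = df.groupBy("Gender");
DataFrame maxIncome = grouped.max("Income");
DataFrame totalIncome = grouped.sum("Income");
```

The trade-off is that each GroupedData shortcut applies a single aggregate function, while agg() lets you mix max(), sum(), and others in one pass.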

Answered by Ganesh Krishnan

import static org.apache.spark.sql.functions.*;

Try this to import all functions, including max and sum.
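With that static import in place, the snippet from the question compiles as written. A minimal sketch for Spark 1.4.1 (the input file employees.json and its department/age/expense fields are illustrative assumptions):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

// Static imports let max()/sum() be called unqualified, as in the question.
import static org.apache.spark.sql.functions.max;
import static org.apache.spark.sql.functions.sum;

public class GroupByExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("GroupByExample").setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(conf);
        SQLContext sqlContext = new SQLContext(sc);

        // Illustrative input: one JSON object per line with department/age/expense fields.
        DataFrame df = sqlContext.read().json("employees.json");

        // The question's original line now compiles: max("age") and sum("expense")
        // resolve to the statically imported org.apache.spark.sql.functions methods.
        DataFrame result = df.groupBy("department").agg(max("age"), sum("expense"));
        result.show();

        sc.stop();
    }
}
```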

Answered by Niemand

Try import org.apache.spark.sql.functions._

EDIT.

From what I've noticed, you are using Scala syntax, trying to access columns via the apply method. For Java, you have to pass columns with the .col method, like this:

df.groupBy("department").agg(max(df.col("age")), sum(df.col("expense")));

See Java example here

Answered by Aron_dc

It seems you're searching for "org.apache.spark.sql.GroupedData"

To use them in your code like you've written it, you'll need a static import.

Link to API

Always try to have a look at the API descriptions first.