Scala DataFrame error: "overloaded method value select with alternatives"

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/42184191/

Date: 2020-10-22 09:04:40  Source: igfitidea

Tags: scala, apache-spark, dataframe

Asked by Jason.Liu

I tried to create a new DataFrame by selecting Hour + Minute/60 along with other columns from an existing DataFrame, as follows:

val logon11 = logon1.select("User","PC","Year","Month","Day","Hour","Minute",$"Hour"+$"Minute"/60)

I got the error below:

<console>:38: error: overloaded method value select with alternatives:
  (col: String,cols: String*)org.apache.spark.sql.DataFrame <and>
  (cols: org.apache.spark.sql.Column*)org.apache.spark.sql.DataFrame
cannot be applied to (String, String, String, String, String, String, String, org.apache.spark.sql.Column)

...

I think the reason is that select cannot take String and Column arguments in the same call. How can I get such a DataFrame, then?

Answered by avr

A DataFrame's select method takes arguments that are either all Strings or all org.apache.spark.sql.Column values, but it does not accept a mix of both.
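
To see why the compiler rejects a mixed call, here is a minimal sketch in plain Scala that mimics the shape of the two overloads without depending on Spark. Col and MiniFrame are hypothetical stand-ins invented for this illustration, not Spark classes.

```scala
// Hypothetical stand-ins for Spark's Column and DataFrame, used only to
// reproduce the shape of the two select overloads without needing Spark.
final case class Col(name: String)

object MiniFrame {
  // Overload 1: every argument is a String.
  def select(col: String, cols: String*): String =
    "strings: " + (col +: cols).mkString(", ")

  // Overload 2: every argument is a Col.
  def select(cols: Col*): String =
    "columns: " + cols.map(_.name).mkString(", ")
}

object OverloadDemo {
  def main(args: Array[String]): Unit = {
    println(MiniFrame.select("User", "PC"))           // matches overload 1
    println(MiniFrame.select(Col("User"), Col("PC"))) // matches overload 2
    // MiniFrame.select("User", Col("PC"))            // does not compile: no overload accepts a mix
  }
}
```

Uncommenting the last call should produce an "overloaded method ... cannot be applied" error of the same kind as the one in the question, because neither parameter list matches a String followed by a Col.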

In your case, you are passing both String and Column arguments to the select method. Passing every argument as a Column fixes it:

val logon11 = logon1.select($"User", $"PC", $"Year", $"Month", $"Day", $"Hour", $"Minute", ($"Hour" + $"Minute" / 60).as("total_hours"))
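
When the list of plain columns is long, or only known at runtime, the same fix can be written by mapping the names to Columns and appending the computed expression. The following is a sketch under assumptions: it presumes Spark SQL is on the classpath and uses a local SparkSession, and the object name SelectAllColumns and the helper withTotalHours are made up for this example.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical helper object, not part of Spark.
object SelectAllColumns {
  // Turn plain column names into Columns and append the computed expression,
  // so that every argument passed to select has type Column.
  def withTotalHours(df: DataFrame, names: Seq[String]): DataFrame = {
    val totalHours: Column = (col("Hour") + col("Minute") / 60).as("total_hours")
    val allCols: Seq[Column] = names.map(n => col(n)) :+ totalHours
    df.select(allCols: _*)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is acceptable for this demonstration.
    val spark = SparkSession.builder().master("local[*]").appName("select-demo").getOrCreate()
    import spark.implicits._ // needed for toDF

    val logon1 = Seq(("User1", "PC1", 2017, 2, 12, 12, 10))
      .toDF("User", "PC", "Year", "Month", "Day", "Hour", "Minute")

    withTotalHours(logon1, logon1.columns.toSeq).show()
    spark.stop()
  }
}
```

The `allCols: _*` ascription expands the sequence into the `Column*` varargs overload, so no String ever reaches select.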

Hope it helps!

Answered by Prasad Khode

You can use withColumn to create a new column from existing columns, or based on some condition, as shown below:

val logon1 = Seq(("User1","PC1",2017,2,12,12,10)).toDF("User","PC","Year","Month","Day","Hour","Minute")
val logon11 = logon1.withColumn("new_col", $"Hour"+$"Minute"/60)
logon11.printSchema()
logon11.show

output:

root
 |-- User: string (nullable = true)
 |-- PC: string (nullable = true)
 |-- Year: integer (nullable = false)
 |-- Month: integer (nullable = false)
 |-- Day: integer (nullable = false)
 |-- Hour: integer (nullable = false)
 |-- Minute: integer (nullable = false)
 |-- new_col: double (nullable = true)


+-----+---+----+-----+---+----+------+------------------+
| User| PC|Year|Month|Day|Hour|Minute|           new_col|
+-----+---+----+-----+---+----+------+------------------+
|User1|PC1|2017|    2| 12|  12|    10|12.166666666666666|
+-----+---+----+-----+---+----+------+------------------+
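
For comparison, the withColumn call above can also be expressed with select by passing col("*") together with the new expression, which keeps all arguments of type Column. This is a sketch assuming Spark SQL is on the classpath and a local SparkSession; the object AppendColumnDemo and its two helper methods are hypothetical names used only here.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical helper object, not part of Spark.
object AppendColumnDemo {
  // Append the computed column with withColumn, as in the answer above.
  def viaWithColumn(df: DataFrame): DataFrame =
    df.withColumn("new_col", col("Hour") + col("Minute") / 60)

  // Append the same column with select: col("*") keeps every existing column.
  def viaSelect(df: DataFrame): DataFrame =
    df.select(col("*"), (col("Hour") + col("Minute") / 60).as("new_col"))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is acceptable for this demonstration.
    val spark = SparkSession.builder().master("local[*]").appName("append-demo").getOrCreate()
    import spark.implicits._ // needed for toDF

    val logon1 = Seq(("User1", "PC1", 2017, 2, 12, 12, 10))
      .toDF("User", "PC", "Year", "Month", "Day", "Hour", "Minute")

    viaWithColumn(logon1).printSchema()
    viaSelect(logon1).printSchema()
    spark.stop()
  }
}
```

The two forms differ when a column named new_col already exists: withColumn replaces it, while the select version would add a second column with the same name.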