
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must follow the same CC BY-SA license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/38587609/

Date: 2020-10-22 08:30:15  Source: igfitidea

Spark, add new Column with the same value in Scala

scala apache-spark spark-dataframe

Asked by Alessandro

I have a problem with the withColumn function in a Spark-Scala environment. I would like to add a new column to my DataFrame like this:

+---+----+---+
|  A|   B|  C|
+---+----+---+
|  4|blah|  2|
|  2|    |  3|
| 56| foo|  3|
|100|null|  5|
+---+----+---+

to become:

+---+----+---+-----+
|  A|   B|  C|  D  |
+---+----+---+-----+
|  4|blah|  2|  750|
|  2|    |  3|  750|
| 56| foo|  3|  750|
|100|null|  5|  750|
+---+----+---+-----+

where column D contains a single value repeated for every row of my DataFrame.

The code is this:

var totVehicles : Double = df_totVehicles(0).getDouble(0); // returns 750

The variable totVehicles returns the correct value; it works!
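As an aside, a minimal sketch of how such a scalar is typically extracted (assuming df_totVehicles is the Array[Row] produced by collect() on a one-row aggregate, which matches the indexing above; the sample data here is made up):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().master("local[1]").appName("scalar-sketch").getOrCreate()
import spark.implicits._

// hypothetical data: the vehicle counts sum to 750
val df = Seq(100.0, 250.0, 400.0).toDF("n_vehicles")

// collect() brings the single aggregated row to the driver as Array[Row];
// (0).getDouble(0) then reads the first column of the first row
val df_totVehicles = df.agg(sum($"n_vehicles")).collect()
val totVehicles: Double = df_totVehicles(0).getDouble(0) // 750.0
```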

The second DataFrame has to compute two fields (id_zipcode, n_vehicles) and then add a third column with the same value, 750:

var df_nVehicles =
df_carPark.filter(
      substring($"id_time",1,4) < 2013
    ).groupBy(
      $"id_zipcode"
    ).agg(
      sum($"n_vehicles") as 'n_vehicles
    ).select(
      $"id_zipcode" as 'id_zipcode,
      'n_vehicles
    ).orderBy(
      'id_zipcode,
      'n_vehicles
    );
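For context, a minimal runnable sketch of this aggregation (the df_carPark sample data here is made up for illustration):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().master("local[1]").appName("agg-sketch").getOrCreate()
import spark.implicits._

// hypothetical sample of the car-park data: (id_time, id_zipcode, n_vehicles)
val df_carPark = Seq(
  ("201201", "10100", 300L),
  ("201202", "10100", 150L),
  ("201301", "10100", 999L), // year 2013: excluded by the filter
  ("201202", "10200", 300L)
).toDF("id_time", "id_zipcode", "n_vehicles")

// same pipeline as in the question: keep years before 2013,
// then sum the vehicle counts per zipcode
val df_nVehicles = df_carPark
  .filter(substring($"id_time", 1, 4) < 2013)
  .groupBy($"id_zipcode")
  .agg(sum($"n_vehicles") as "n_vehicles")
  .orderBy($"id_zipcode")
```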

Finally, I add the new column with the withColumn function:

var df_nVehicles2 = df_nVehicles.withColumn(totVehicles, df_nVehicles("n_vehicles") + df_nVehicles("id_zipcode"))

But Spark returns this error:

 error: value withColumn is not a member of Unit
         var df_nVehicles2 = df_nVehicles.withColumn(totVehicles, df_nVehicles("n_vehicles") + df_nVehicles("id_zipcode"))

Can you help me? Thank you very much!


Answered by Rockie Yang

The lit function adds a literal value as a column:

import org.apache.spark.sql.functions._
df.withColumn("D", lit(750))
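Applied to the question, the extracted variable can be wrapped with lit; note that withColumn's first argument is the new column's name as a String, not the value itself. A sketch with made-up data:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().master("local[1]").appName("lit-sketch").getOrCreate()
import spark.implicits._

val totVehicles: Double = 750.0
val df = Seq((4, "blah", 2), (56, "foo", 3)).toDF("A", "B", "C")

// "D" names the new column; lit(totVehicles) turns the Scala Double into a Column
val df2 = df.withColumn("D", lit(totVehicles))
```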