Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license, cite the original URL, and attribute it to the original authors (not me) at StackOverflow.
Original URL: http://stackoverflow.com/questions/37335416/
spark - scala: not a member of org.apache.spark.sql.Row
Asked by Edamame
I am trying to convert a data frame to RDD, then perform some operations below to return tuples:
df.rdd.map { t =>
  (t._2 + "_" + t._3, t)
}.take(5)
Then I got the error below. Anyone have any ideas? Thanks!
<console>:37: error: value _2 is not a member of org.apache.spark.sql.Row
(t._2 + "_" + t._3 , t)
^
Answered by Daniel de Paula
When you convert a DataFrame to an RDD, you get an RDD[Row], so when you use map, your function receives a Row as its parameter. Therefore, you must use the Row methods to access its members (note that the index starts from 0):
import org.apache.spark.sql.Row

df.rdd.map { row: Row =>
  (row.getString(1) + "_" + row.getString(2), row)
}.take(5)
You can view more examples and check all the methods available on Row objects in the Spark scaladoc.
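For example, a minimal sketch of name-based access with Row.getAs (the column names col2 and col3 are assumed here, mirroring the concatenation example below):

import org.apache.spark.sql.Row

df.rdd.map { row: Row =>
  // getAs[T](fieldName) looks the field up by name instead of by position
  val key = row.getAs[String]("col2") + "_" + row.getAs[String]("col3")
  (key, row)
}.take(5)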
Edit: I don't know why you are doing this operation, but for concatenating the String columns of a DataFrame you may consider the following option:
import org.apache.spark.sql.functions._
val newDF = df.withColumn("concat", concat(df("col2"), lit("_"), df("col3")))
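As a variant, concat_ws from the same functions package takes the separator once instead of interleaving lit calls; a sketch with the same assumed column names:

// concat_ws places the "_" separator between each pair of columns
val newDF2 = df.withColumn("concat", concat_ws("_", df("col2"), df("col3")))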