
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me): StackOverflow, original question: http://stackoverflow.com/questions/38257630/


How to iterate scala wrappedArray? (Spark)

scala, apache-spark, apache-spark-sql

Asked by boY

I perform the following operations:

val tempDict = sqlContext.sql("""select words.pName_token, collect_set(words.pID) as docids
                                 from words
                                 group by words.pName_token""").toDF()

val wordDocs = tempDict.filter(tempDict("pName_token") === word)

val listDocs = wordDocs.map(t => t(1)).collect()

listDocs: Array[Any] = Array(WrappedArray(123, 234, 205876618, 456))

My question is how do I iterate over this wrapped array or convert this into a list?

The options I get for listDocs are apply, asInstanceOf, clone, isInstanceOf, length, toString, and update.

How do I proceed?

Accepted answer by Rockie Yang

Here is one way to solve this.

import org.apache.spark.sql.Row
import org.apache.spark.sql.functions._
import scala.collection.mutable.WrappedArray

val data = Seq((Seq(1, 2, 3), Seq(4, 5, 6), Seq(7, 8, 9)))
val df = sqlContext.createDataFrame(data)
val first = df.first

// use getAs to read the column with its concrete element type
val mapped = first.getAs[WrappedArray[Int]](0)

// now we can use it like a normal Scala collection
mapped.mkString("\n")

// collect all rows and pattern match to extract the three array columns
val rows = df.collect.map {
    case Row(a: Seq[Any], b: Seq[Any], c: Seq[Any]) =>
        (a, b, c)
}
rows.mkString("\n")
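
Applied back to the question, the same idea converts the collected Any values into plain Scala lists. A minimal sketch, assuming the docids column produced by the question's collect_set holds integer IDs (substitute getSeq[Long] or getSeq[String] if the real element type differs):

// select only the docids column, then read each row's array with a concrete element type
val listDocs: Array[List[Int]] =
  wordDocs.select("docids").collect.map(_.getSeq[Int](0).toList)

// each element is now an ordinary List that can be iterated directly
listDocs.foreach(ids => ids.foreach(println))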