Note: this page is a Chinese-English bilingual translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverflow post: http://stackoverflow.com/questions/44531937/

Date: 2020-10-22 09:18:23  Source: igfitidea

Convert a row to a list in spark scala

scala, apache-spark, dataframe

Asked by Mr.cysl

Is that possible to do? All the data in my dataframe (~1000 cols) are Doubles and I'm wondering whether I could turn a row of data into a list of Doubles?

Answered by Psidom

You can use the toSeq method on the Row and then convert the type from Seq[Any] to Seq[Double] (if you are sure the data types of all the columns are Double):

val df = Seq((1.0,2.0),(2.1,2.2)).toDF("A", "B")
// df: org.apache.spark.sql.DataFrame = [A: double, B: double]

df.show
+---+---+
|  A|  B|
+---+---+
|1.0|2.0|
|2.1|2.2|
+---+---+

df.first.toSeq.asInstanceOf[Seq[Double]]
// res1: Seq[Double] = WrappedArray(1.0, 2.0)
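As a side note not in the original answer: instead of the unchecked asInstanceOf cast, Row also exposes typed getters such as getDouble, which fail fast with a ClassCastException on a non-Double column. A minimal sketch, assuming the same df as above:

```scala
// Hypothetical alternative: read each column by index with the typed getter
val row = df.first
val doubles: List[Double] = (0 until row.length).map(row.getDouble).toList
// doubles: List[Double] = List(1.0, 2.0)
```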

In case you have String type columns, use toSeq and then use map with pattern matching to convert the String to Double:

val df = Seq((1.0,"2.0"),(2.1,"2.2")).toDF("A", "B")
// df: org.apache.spark.sql.DataFrame = [A: double, B: string]

df.first.toSeq.map{ 
    case x: String => x.toDouble
    case x: Double => x 
}
// res3: Seq[Double] = ArrayBuffer(1.0, 2.0)

Answered by Ramesh Maharjan

If you have a dataframe with doubles which you want to convert into a List of doubles, then just convert the dataframe into an rdd, which will give you an RDD[Row]; you can convert this to a List as

dataframe.rdd.map(_.toSeq.toList)

You will get an RDD of lists of doubles.
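To bring the whole result back to the driver as plain Scala collections, you can follow the map with collect. A hedged sketch, assuming all columns are Double and the data is small enough to fit on the driver (the variable name dataframe is carried over from the answer above):

```scala
// Cast each row's values to Double and collect everything to the driver.
// collect() returns an Array, so a final toList gives List[List[Double]].
val allRows: List[List[Double]] =
  dataframe.rdd
    .map(_.toSeq.map(_.asInstanceOf[Double]).toList)
    .collect()
    .toList
```

Note that collect materializes the entire dataset in driver memory, so this is only appropriate for small dataframes.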