Scala: How to convert a unix timestamp to a date in Spark

Disclaimer: This page is an English/Chinese translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me) and link to the original: http://stackoverflow.com/questions/31134969/

Date: 2020-10-22 07:19:05  Source: igfitidea

How to convert unix timestamp to date in Spark

scala · datetime · apache-spark · timestamp · nscala-time

Asked by youngchampion

I have a data frame with a column of unix timestamps in milliseconds (e.g. 1435655706000), and I want to convert it to dates with the format 'yyyy-MM-dd'. I've tried nscala-time but it doesn't work.

val time_col = sqlc.sql("select ts from mr").map(_(0).toString.toDateTime)
time_col.collect().foreach(println)

and I got the error: java.lang.IllegalArgumentException: Invalid format: "1435655706000" is malformed at "6000"

Answered by Yuan Zhao

Since Spark 1.5, there is a built-in function for doing this: from_unixtime.

// from_unixtime expects seconds, so divide millisecond timestamps by 1000;
// use 'yyyy' (calendar year), not 'YYYY' (week-based year, which can be wrong near New Year)
val df = sqlContext.sql("select from_unixtime(ts / 1000, 'yyyy-MM-dd') as `ts` from mr")

Please check the Spark 1.5.2 API doc for more info.
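The same built-in function is also available through the DataFrame API. A minimal sketch, assuming Spark 1.5+ and a registered table `mr` with a millisecond `ts` column (names taken from the question):

import org.apache.spark.sql.functions.from_unixtime
import sqlContext.implicits._  // enables the $"col" syntax

// from_unixtime takes seconds, so millisecond values are divided by 1000 first
val df = sqlContext.table("mr")
  .select(from_unixtime($"ts" / 1000, "yyyy-MM-dd").as("ts"))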

Answered by Marsellus Wallace

Here it is using the Scala DataFrame functions from_unixtime and to_date:

import org.apache.spark.sql.functions.{from_unixtime, to_date}
import sqlContext.implicits._  // for the $"col" syntax

// NOTE: dividing by 1000 is required if the timestamps are in milliseconds
// e.g. 1446846655609 -> 2015-11-06 21:50:55 -> 2015-11-06
mr.select(to_date(from_unixtime($"ts" / 1000)))

Answered by Hammad Haleem

You need to import the following libraries:

import org.joda.time.{DateTime, DateTimeZone}
import org.joda.time.format.DateTimeFormat

val stri = new DateTime(timeInMillisec).toString("yyyy/MM/dd")

Or, adapted to your case:

val time_col = sqlContext.sql("select ts from mr")
                         // use toLong: millisecond timestamps overflow an Int
                         .map(line => new DateTime(line(0).toString.toLong).toString("yyyy/MM/dd"))

There could be another way :


  import com.github.nscala_time.time.Imports._

  val date = (new DateTime() + ((threshold.toDouble)/1000).toInt.seconds )
             .toString("yyyy/MM/dd")

Hope this helps :)


Answered by Orar

You needn't convert to a String before applying toDateTime with nscala-time:

import com.github.nscala_time.time.Imports._

scala> 1435655706000L.toDateTime
res4: org.joda.time.DateTime = 2015-06-30T09:15:06.000Z


Answered by youngchampion

I have solved this issue using the joda-time library, by mapping over the DataFrame and converting the DateTime into a String:

import org.joda.time._
val time_col = sqlContext.sql("select ts from mr")
                         .map(line => new DateTime(line(0)).toString("yyyy-MM-dd"))

Answered by Alex Stanovsky

You can use the following syntax in Java:

input.select("timestamp")
     .withColumn("date", date_format(col("timestamp").$div(1000).cast(DataTypes.TimestampType), "yyyyMMdd").cast(DataTypes.IntegerType))
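For reference, a Scala sketch of the same approach (assuming a DataFrame `input` with a millisecond `timestamp` column, as in the Java snippet above):

import org.apache.spark.sql.functions.{col, date_format}
import org.apache.spark.sql.types.{IntegerType, TimestampType}

// divide by 1000 (seconds), cast to a timestamp, format, then cast the digits to an Int
val withDate = input.withColumn("date",
  date_format((col("timestamp") / 1000).cast(TimestampType), "yyyyMMdd").cast(IntegerType))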

Answered by Abhinav Kaushal Keshari

What you can do is:


input.withColumn("time",
  concat(
    from_unixtime(input.col("COL_WITH_UNIX_TIME") / 1000, "yyyy-MM-dd'T'HH:mm:ss"),
    typedLit("."),
    substring(input.col("COL_WITH_UNIX_TIME"), 11, 3),
    typedLit("Z")))

where time is the new column name and COL_WITH_UNIX_TIME is the name of the column you want to convert. This keeps the millisecond part, making your data more precise, e.g.: "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"