Scala: how to convert a Unix timestamp to a date in Spark
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse it, you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/31134969/
How to convert unix timestamp to date in Spark
Asked by youngchampion
I have a data frame with a column of Unix timestamps (e.g. 1435655706000), and I want to convert it to dates in the format 'yyyy-MM-dd'. I tried nscala-time, but it doesn't work:
val time_col = sqlc.sql("select ts from mr").map(_(0).toString.toDateTime)
time_col.collect().foreach(println)
and I got the error: java.lang.IllegalArgumentException: Invalid format: "1435655706000" is malformed at "6000"
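The parse fails because "1435655706000" is epoch milliseconds, not a date string, so a date parser chokes partway through the digits. A minimal sketch of the intended conversion using only java.time (no Spark), with the timestamp value from the question:

```scala
import java.time.{Instant, ZoneOffset}

// "1435655706000" is epoch milliseconds; parse it as a Long, not as a date string
val millis = "1435655706000".toLong
val date = Instant.ofEpochMilli(millis).atZone(ZoneOffset.UTC).toLocalDate
println(date) // 2015-06-30
```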
Answered by Yuan Zhao
Since Spark 1.5, there is a built-in function for doing that.
// from_unixtime expects seconds, so divide by 1000 if ts is in milliseconds.
// Use 'yyyy' (calendar year) rather than 'YYYY' (week-based year).
val df = sqlContext.sql("select from_unixtime(ts / 1000, 'yyyy-MM-dd') as `ts` from mr")
Please check the Spark 1.5.2 API doc for more info.
Answered by Marsellus Wallace
Here it is using the Scala DataFrame functions from_unixtime and to_date:
import org.apache.spark.sql.functions.{from_unixtime, to_date}

// NOTE: dividing by 1000 is required if the timestamps are in milliseconds
// e.g. 1446846655609 -> 2015-11-06 21:50:55 -> 2015-11-06
mr.select(to_date(from_unixtime($"ts" / 1000)))
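The divide-by-1000 step can be sanity-checked outside Spark with plain java.time, using the sample value from the comment above (a sketch, not Spark code):

```scala
import java.time.{Instant, ZoneOffset}

// from_unixtime works in seconds, so a millisecond epoch must be scaled down
val millis = 1446846655609L
val seconds = millis / 1000 // 1446846655
val date = Instant.ofEpochSecond(seconds).atZone(ZoneOffset.UTC).toLocalDate
println(date) // 2015-11-06
```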
Answered by Hammad Haleem
You need to import the following library:
import org.joda.time.DateTime
val stri = new DateTime(timeInMillisec).toString("yyyy/MM/dd")
Or, adjusting to your case:
val time_col = sqlContext.sql("select ts from mr")
  .map(line => new DateTime(line(0).toString.toLong).toString("yyyy/MM/dd"))
There could be another way:
import com.github.nscala_time.time.Imports._
// `threshold` here is a millisecond offset to add to the current time
val date = (new DateTime() + (threshold.toDouble / 1000).toInt.seconds)
  .toString("yyyy/MM/dd")
Hope this helps :)
Answered by Orar
You needn't convert to a String before applying toDateTime with nscala_time:
import com.github.nscala_time.time.Imports._
scala> 1435655706000L.toDateTime
res4: org.joda.time.DateTime = 2015-06-30T09:15:06.000Z
Answered by youngchampion
I have solved this issue using the joda-time library, by mapping over the DataFrame and converting the DateTime into a String:
import org.joda.time._
val time_col = sqlContext.sql("select ts from mr")
.map(line => new DateTime(line(0).toString.toLong).toString("yyyy-MM-dd"))
Answered by Alex Stanovsky
You can use the following syntax in Java:
input.select("timestamp")
    .withColumn("date", date_format(col("timestamp").$div(1000).cast(DataTypes.TimestampType), "yyyyMMdd").cast(DataTypes.IntegerType))
Answered by Abhinav Kaushal Keshari
What you can do is:
input.withColumn("time", concat(from_unixtime(input.col("COL_WITH_UNIX_TIME")/1000,
"yyyy-MM-dd'T'HH:mm:ss"), typedLit("."), substring(input.col("COL_WITH_UNIX_TIME"), 11, 3),
typedLit("Z")))
where time is the new column name and COL_WITH_UNIX_TIME is the name of the column you want to convert. This keeps the millisecond part, making your data more precise, e.g.: "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
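For comparison, outside Spark, java.time.Instant renders epoch milliseconds in the same millisecond-preserving ISO-8601 shape directly (a sketch using the sample value from an earlier answer; note that Instant.toString omits the fraction when the milliseconds happen to be zero):

```scala
import java.time.Instant

// Instant.toString yields the yyyy-MM-dd'T'HH:mm:ss.SSS'Z' form when millis are non-zero
val iso = Instant.ofEpochMilli(1446846655609L).toString
println(iso) // 2015-11-06T21:50:55.609Z
```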

