Scala: how do you convert a timestamp column to epoch seconds?
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse it, you must do so under the same CC BY-SA terms, cite the original URL, and attribute it to the original authors (not me): StackOverflow.
Original URL: http://stackoverflow.com/questions/51270784/
How to convert timestamp column to epoch seconds?
Asked by troutinator
How do you convert a timestamp column to epoch seconds?
// Assumes a Spark shell / SparkSession named spark; the implicits provide toDF and the $-column syntax.
import spark.implicits._

var df = sc.parallelize(Seq("2018-07-01T00:00:00Z")).toDF("date_string")
df = df.withColumn("timestamp", $"date_string".cast("timestamp"))
df.show(false)
DataFrame:
+--------------------+---------------------+
|date_string |timestamp |
+--------------------+---------------------+
|2018-07-01T00:00:00Z|2018-07-01 00:00:00.0|
+--------------------+---------------------+
Answered by troutinator
If you have a timestamp, you can cast it to a long to get the epoch seconds:
df = df.withColumn("epoch_seconds", $"timestamp".cast("long"))
df.show(false)
DataFrame
+--------------------+---------------------+-------------+
|date_string |timestamp |epoch_seconds|
+--------------------+---------------------+-------------+
|2018-07-01T00:00:00Z|2018-07-01 00:00:00.0|1530403200 |
+--------------------+---------------------+-------------+
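A side note not in the original answer: casting to long truncates any fractional seconds. If sub-second precision matters, casting to double is generally expected to keep it; a minimal sketch, assuming the df built above:

// Assumption: df is the DataFrame from the question, with a "timestamp" column.
// cast("long") drops fractional seconds; cast("double") should keep them.
val withPrecision = df
  .withColumn("epoch_seconds", $"timestamp".cast("long"))
  .withColumn("epoch_seconds_frac", $"timestamp".cast("double"))
withPrecision.show(false)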
Answered by Shaido - Reinstate Monica
Use unix_timestamp from org.apache.spark.sql.functions. It can take a timestamp column, or a string column for which it is possible to specify the format. From the documentation:
public static Column unix_timestamp(Column s)
Converts time string in format yyyy-MM-dd HH:mm:ss to Unix timestamp (in seconds), using the default timezone and the default locale, return null if fail.

public static Column unix_timestamp(Column s, String p)
Convert time string with given pattern (see http://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html) to Unix time stamp (in seconds), return null if fail.
Use as follows:
import org.apache.spark.sql.functions._

df.withColumn("epoch_seconds", unix_timestamp($"timestamp"))
or, if the column is a string in another format:
df.withColumn("epoch_seconds", unix_timestamp($"date_string", "yyyy-MM-dd'T'HH:mm:ss'Z'"))
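Putting this answer together into a runnable sketch (assuming the df from the question and a SparkSession named spark). Note that quoting 'Z' makes it a literal in the pattern, so the string is parsed in the session time zone; the result matches the cast-based answer only when that time zone is UTC:

import org.apache.spark.sql.functions.unix_timestamp

// Option 1: pass the timestamp column directly.
val fromTimestamp = df.withColumn("epoch_seconds", unix_timestamp($"timestamp"))

// Option 2: parse the raw string with an explicit pattern
// (set spark.sql.session.timeZone to "UTC" if the input is meant to be UTC).
val fromString = df.withColumn("epoch_seconds",
  unix_timestamp($"date_string", "yyyy-MM-dd'T'HH:mm:ss'Z'"))

fromString.show(false)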
Answered by dyatchenko
It can easily be done with the unix_timestamp function in Spark SQL, like this:
spark.sql("SELECT unix_timestamp(inv_time) AS time_as_long FROM agg_counts LIMIT 10").show()
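The query above refers to the answerer's own agg_counts table. To apply the same approach to the DataFrame from the question, one way (a sketch; the view name epoch_demo is just for illustration) is to register it as a temporary view first:

// Register the question's DataFrame as a temporary view, then query it with Spark SQL.
df.createOrReplaceTempView("epoch_demo")
spark.sql("SELECT date_string, unix_timestamp(timestamp) AS epoch_seconds FROM epoch_demo").show(false)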
Hope this helps.
Answered by Samrat
You can use the unix_timestamp function and cast the result to any data type.
Example:
// Needs: import org.apache.spark.sql.functions.unix_timestamp and org.apache.spark.sql.types.LongType
val df1 = df.select(unix_timestamp($"date_string", "yyyy-MM-dd'T'HH:mm:ss'Z'").cast(LongType).as("epoch_seconds"))
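Note that unix_timestamp already returns a bigint (LongType) column, so the cast here mainly makes the intended type explicit; the same pattern works if you want to cast the result to another numeric type instead.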

