Scala: how to convert a timestamp column to epoch seconds?

Disclaimer: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/51270784/

Date: 2020-10-22 09:35:22  Source: igfitidea

How to convert timestamp column to epoch seconds?

Tags: scala, apache-spark, timestamp, apache-spark-sql

Asked by troutinator

How do you convert a timestamp column to epoch seconds?

var df = sc.parallelize(Seq("2018-07-01T00:00:00Z")).toDF("date_string")
df = df.withColumn("timestamp", $"date_string".cast("timestamp"))
df.show(false)

DataFrame:

+--------------------+---------------------+
|date_string         |timestamp            |
+--------------------+---------------------+
|2018-07-01T00:00:00Z|2018-07-01 00:00:00.0|
+--------------------+---------------------+

Answered by troutinator

If you have a timestamp, you can cast it to a long to get the epoch seconds:

df = df.withColumn("epoch_seconds", $"timestamp".cast("long"))
df.show(false)

DataFrame:

+--------------------+---------------------+-------------+
|date_string         |timestamp            |epoch_seconds|
+--------------------+---------------------+-------------+
|2018-07-01T00:00:00Z|2018-07-01 00:00:00.0|1530403200   |
+--------------------+---------------------+-------------+
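
For completeness, here is a minimal self-contained sketch combining the question's setup with this cast. It assumes a local SparkSession named spark (not shown in the original answer), and the epoch value shown assumes a UTC session time zone:

import org.apache.spark.sql.SparkSession

// Assumed setup: a local session for running the snippet outside spark-shell.
val spark = SparkSession.builder().master("local[*]").appName("epoch-seconds").getOrCreate()
import spark.implicits._

// Recreate the question's DataFrame and cast the timestamp column to long.
val df = Seq("2018-07-01T00:00:00Z").toDF("date_string")
  .withColumn("timestamp", $"date_string".cast("timestamp"))
  .withColumn("epoch_seconds", $"timestamp".cast("long"))

df.show(false)
// epoch_seconds should be 1530403200 when the session time zone is UTC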

Answered by Shaido - Reinstate Monica

Use unix_timestamp from org.apache.spark.sql.functions. It can take a timestamp column, or a string column for which the format can be specified. From the documentation:

public static Column unix_timestamp(Column s)

Converts time string in format yyyy-MM-dd HH:mm:ss to Unix timestamp (in seconds), using the default timezone and the default locale, return null if fail.

public static Column unix_timestamp(Column s, String p)

Convert time string with given pattern (see http://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html) to Unix time stamp (in seconds), return null if fail.

Use as follows:

import org.apache.spark.sql.functions._

df.withColumn("epoch_seconds", unix_timestamp($"timestamp"))

or, if the column is a string in another format:

df.withColumn("epoch_seconds", unix_timestamp($"date_string", "yyyy-MM-dd'T'HH:mm:ss'Z'"))
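
A short sketch of the null-on-failure behaviour mentioned in the documentation quoted above; the malformed row is an assumption added purely for illustration, and spark.implicits._ is assumed to be in scope:

import org.apache.spark.sql.functions.unix_timestamp

// One well-formed and one malformed string; the malformed one parses to null.
val sample = Seq("2018-07-01T00:00:00Z", "not-a-date").toDF("date_string")
sample
  .withColumn("epoch_seconds", unix_timestamp($"date_string", "yyyy-MM-dd'T'HH:mm:ss'Z'"))
  .show(false)
// 2018-07-01T00:00:00Z -> 1530403200 (with a UTC session time zone), not-a-date -> null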

Answered by dyatchenko

It can easily be done with the unix_timestamp function in Spark SQL like this:

spark.sql("SELECT unix_timestamp(inv_time) AS time_as_long FROM agg_counts LIMIT 10").show()
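
To apply the same SQL approach to the question's DataFrame, one would first register it as a temporary view; the view name below (df_view) is an arbitrary assumption:

// Register the question's DataFrame so it can be queried with SQL.
df.createOrReplaceTempView("df_view")

spark.sql("SELECT date_string, unix_timestamp(timestamp) AS epoch_seconds FROM df_view").show(false)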

Hope this helps.

Answered by Samrat

You can use the unix_timestamp function and cast its result to any data type.

Example:

import org.apache.spark.sql.functions.unix_timestamp
import org.apache.spark.sql.types.LongType

val df1 = df.select(unix_timestamp($"date_string", "yyyy-MM-dd HH:mm:ss").cast(LongType).as("epoch_seconds"))
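
Note that the pattern must match the actual format of date_string. Adapted to the question's sample data, a sketch would look like this (the epoch value assumes a UTC session time zone):

import org.apache.spark.sql.functions.unix_timestamp
import org.apache.spark.sql.types.LongType

// Pattern adjusted to the ISO-8601 strings used in the question.
val epochDf = df.select(
  unix_timestamp($"date_string", "yyyy-MM-dd'T'HH:mm:ss'Z'")
    .cast(LongType)
    .as("epoch_seconds")
)
epochDf.show(false) // 1530403200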