Spark SQL 将字符串转换为时间戳

Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/45558499/

Date: 2020-09-01 05:17:15  Source: igfitidea

Spark SQL converting string to timestamp

sql, apache-spark, apache-spark-sql, pyspark-sql

Asked by Jessica

I'm new to Spark SQL and am trying to convert a string to a timestamp in a Spark data frame. I have a string that looks like '2017-08-01T02:26:59.000Z' in a column called time_string.

My code to convert this string to timestamp is

CAST (time_string AS Timestamp)

But this gives me a timestamp of 2017-07-31 19:26:59

Why is it changing the time? Is there a way to do this without changing the time?

Thanks for any help!

Answered by ktheitroadalo

You could use the unix_timestamp function to convert the UTC-formatted date to a timestamp:

import org.apache.spark.sql.functions._
import org.apache.spark.sql.types.TimestampType
import spark.implicits._

val df2 = Seq(("a3fac", "2017-08-01T02:26:59.000Z")).toDF("id", "eventTime")

val df3 = df2.withColumn("eventTime1",
  unix_timestamp($"eventTime", "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'").cast(TimestampType))

Output:

+-----+------------------------+---------------------+
|id   |eventTime               |eventTime1           |
+-----+------------------------+---------------------+
|a3fac|2017-08-01T02:26:59.000Z|2017-08-01 02:26:59.0|
+-----+------------------------+---------------------+
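As for why the plain CAST shifted the time: Spark renders a timestamp in the session-local timezone, so the UTC instant in the string gets displayed with a local offset. A minimal sketch of the same shift in plain Python, assuming the asker's session timezone was America/Los_Angeles (which matches the 7-hour offset shown in the question):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

# The original string denotes an instant in UTC (the trailing 'Z').
utc_dt = datetime(2017, 8, 1, 2, 26, 59, tzinfo=timezone.utc)

# Rendering that instant in a local session timezone shifts the
# wall-clock value, just as Spark's CAST did for the asker.
# America/Los_Angeles is UTC-7 (PDT) in August.
local_dt = utc_dt.astimezone(ZoneInfo("America/Los_Angeles"))
print(local_dt.strftime("%Y-%m-%d %H:%M:%S"))  # 2017-07-31 19:26:59
```

Parsing with an explicit pattern, as in the answer above, sidesteps this by treating the string as a literal wall-clock value instead of a zoned instant.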

Hope this helps!

Answered by Rzv Razvan

Solution in Java

There are some Spark SQL functions that let you work with the date format.

Conversion example: 20181224091530 -> 2018-12-24 09:15:30

Solution (Spark SQL statement):

SELECT
 ...
 to_timestamp(cast(DECIMAL_DATE as string),'yyyyMMddHHmmss') as `TIME STAMP DATE`,
 ...
FROM some_table
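Outside Spark, the same yyyyMMddHHmmss parse can be sketched with Python's standard library (illustrative only; the %-codes below are the strptime equivalents of the Spark pattern above):

```python
from datetime import datetime

# Parse the compact decimal date 20181224091530 the way Spark's
# to_timestamp pattern 'yyyyMMddHHmmss' does.
raw = "20181224091530"
ts = datetime.strptime(raw, "%Y%m%d%H%M%S")
print(ts.strftime("%Y-%m-%d %H:%M:%S"))  # 2018-12-24 09:15:30
```

As in the SQL statement, the numeric value must be turned into a string before parsing; strptime only accepts strings.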

You can run SQL statements through an instance of org.apache.spark.sql.SparkSession. For example, to execute a SQL statement, Spark provides the following:

...
// You have to create an instance of SparkSession
sparkSession.sql(sqlStatement); 
...

Notes:

  • You have to cast the decimal to a string before you can parse it to a timestamp
  • You can adjust the format pattern to get whatever output format you want