scala SparkSQL 时间戳查询失败
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/27069537/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me):
StackOverFlow
SparkSQL Timestamp query failure
提问by bobo32
I put some log files into sql tables through Spark and my schema looks like this:
我通过 Spark 将一些日志文件放入 sql 表中,我的架构如下所示:
|-- timestamp: timestamp (nullable = true)
|-- c_ip: string (nullable = true)
|-- cs_username: string (nullable = true)
|-- s_ip: string (nullable = true)
|-- s_port: string (nullable = true)
|-- cs_method: string (nullable = true)
|-- cs_uri_stem: string (nullable = true)
|-- cs_query: string (nullable = true)
|-- sc_status: integer (nullable = false)
|-- sc_bytes: integer (nullable = false)
|-- cs_bytes: integer (nullable = false)
|-- time_taken: integer (nullable = false)
|-- User_Agent: string (nullable = true)
|-- Referrer: string (nullable = true)
As you can notice, I created a timestamp field, which I read is supported by Spark (Date wouldn't work as far as I understood). I would love to use it in queries like "where timestamp>(2012-10-08 16:10:36.0)", but when I run them I keep getting errors. I tried the 2 following syntax forms. For the second one I parse a string, so I'm sure I'm actually passing it in a timestamp format. I use 2 functions: parse and date2timestamp.
正如您所注意到的，我创建了一个时间戳字段，我了解到 Spark 支持这种类型（据我所知，Date 类型不行）。我很想在查询中使用它，例如 "where timestamp>(2012-10-08 16:10:36.0)"，但运行时总是报错。我尝试了以下两种语法形式。对于第二种，我先解析一个字符串，因此可以确定我确实以时间戳格式传递了它。我使用了两个函数：parse 和 date2timestamp。
Any hint on how I should handle timestamp values?
关于我应该如何处理时间戳值的任何提示?
Thanks!
谢谢!
1) scala> sqlContext.sql("SELECT * FROM Logs as l where l.timestamp=(2012-10-08 16:10:36.0)").collect
java.lang.RuntimeException: [1.55] failure: ``)'' expected but 16 found
SELECT * FROM Logs as l where l.timestamp=(2012-10-08 16:10:36.0)
^
2) sqlContext.sql("SELECT * FROM Logs as l where l.timestamp="+date2timestamp(formatTime3.parse("2012-10-08 16:10:36.0"))).collect
java.lang.RuntimeException: [1.54] failure: ``UNION'' expected but 16 found
SELECT * FROM Logs as l where l.timestamp=2012-10-08 16:10:36.0
^
回答by bobo32
I figured out that the problem was, first of all, the precision of the timestamp, and also that the value I pass representing the timestamp has to be cast as a String
我发现问题首先出在时间戳的精度上，另外我传入的表示时间戳的值必须被转换为 String 类型
So this query works now:
所以这个查询现在有效:
sqlContext.sql("SELECT * FROM Logs as l where cast(l.timestampLog as String) <= '2012-10-08 16:10:36'")
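A side note on why the query above works even though it compares strings: the "yyyy-MM-dd HH:mm:ss" format is fixed-width and zero-padded, so lexicographic order matches chronological order. A minimal sketch (plain Scala, independent of Spark):

```scala
// Because "yyyy-MM-dd HH:mm:ss" is fixed-width and zero-padded, plain string
// comparison orders values the same way as the timestamps they encode.
val earlier = "2012-10-08 16:10:35"
val later   = "2012-10-08 16:10:36"
println(earlier < later)  // string order agrees with chronological order
```

This is also why the cast-to-String workaround silently breaks if the format ever varies (e.g. missing zero padding).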
回答by Guillaume
You forgot the quotation marks.
你忘了引号。
Try something with this syntax:
尝试使用以下语法:
L.timestamp = '2012-07-16 00:00:00'
Alternatively, try
或者,尝试
L.timestamp = CAST('2012-07-16 00:00:00' AS TIMESTAMP)
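Putting this into a full call might look roughly like this (a sketch assuming the Logs table and timestamp column from the question; the key point is that the literal is wrapped in single quotes so the SQL parser sees a string, which CAST then converts to a timestamp):

```scala
// The table/column names come from the question; only the quoting + CAST
// pattern is the point of this sketch.
val ts = "2012-07-16 00:00:00"
val query = s"SELECT * FROM Logs AS l WHERE l.timestamp <= CAST('$ts' AS TIMESTAMP)"
// sqlContext.sql(query).collect  // run against the registered Logs table
println(query)
```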
回答by Azeroth2b
Cast the string representation of the timestamp to timestamp: cast('2012-10-10 12:00:00' as timestamp). Then you can compare them as timestamps, not strings. Instead of:
将时间戳的字符串表示形式转换为时间戳：cast('2012-10-10 12:00:00' as timestamp)。然后就可以按时间戳而不是字符串进行比较。不要用：
sqlContext.sql("SELECT * FROM Logs as l where cast(l.timestampLog as String) <= '2012-10-08 16:10:36'")
try
尝试
sqlContext.sql("SELECT * FROM Logs as l where l.timestampLog <= cast('2012-10-08 16:10:36' as timestamp)")
回答by Ivan Stoyanov
Sadly this didn't work for me. I am using Apache Spark 1.4.1. The following code is my solution:
可悲的是,这对我不起作用。我正在使用 Apache Spark 1.4.1。以下代码是我的解决方案:
import java.util.Date;

Date date = new Date();
String query = "SELECT * FROM Logs as l where l.timestampLog <= CAST('" + new java.sql.Timestamp(date.getTime()) + "' as TIMESTAMP)";
sqlContext.sql(query);
Casting the timestampLog as string did not throw any errors but returned no data.
将 timestampLog 转换为字符串不会引发任何错误,但没有返回任何数据。
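The same approach could be sketched in Scala, the language used elsewhere in the thread (java.sql.Timestamp.toString yields the "yyyy-MM-dd HH:mm:ss.fffffffff" form that CAST(... AS TIMESTAMP) accepts; table and column names are the ones from the thread):

```scala
import java.sql.Timestamp

// Build the filter from the current time; Timestamp.toString produces a
// literal that CAST(... AS TIMESTAMP) can parse back.
val now = new Timestamp(System.currentTimeMillis())
val query = s"SELECT * FROM Logs AS l WHERE l.timestampLog <= CAST('$now' AS TIMESTAMP)"
// sqlContext.sql(query)  // run against the registered Logs table
println(query)
```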

