SparkSQL timestamp query fails



I put some log files into a SQL table through Spark, and my schema looks like this:

|-- timestamp: timestamp (nullable = true) 
|-- c_ip: string (nullable = true) 
|-- cs_username: string (nullable = true) 
|-- s_ip: string (nullable = true) 
|-- s_port: string (nullable = true) 
|-- cs_method: string (nullable = true) 
|-- cs_uri_stem: string (nullable = true) 
|-- cs_query: string (nullable = true) 
|-- sc_status: integer (nullable = false) 
|-- sc_bytes: integer (nullable = false) 
|-- cs_bytes: integer (nullable = false) 
|-- time_taken: integer (nullable = false) 
|-- User_Agent: string (nullable = true) 
|-- Referrer: string (nullable = true) 

As you can see, I created a timestamp field, which I read that Spark supports (as far as I understand, Date doesn't work). I'd love to run queries like "where timestamp > (2012-10-08 16:10:36.0)", but when I run them I keep getting errors. I tried the following 2 syntax forms; for the second one I parsed a string first, so I'm sure I'm actually passing it in timestamp format. I use 2 functions: parse and date2timestamp.

Any hints on how I should handle the timestamp values?

Thanks!

1) scala> sqlContext.sql("SELECT * FROM Logs as l where l.timestamp=(2012-10-08 16:10:36.0)").collect

java.lang.RuntimeException: [1.55] failure: ``)'' expected but 16 found 
SELECT * FROM Logs as l where l.timestamp=(2012-10-08 16:10:36.0) 
                                                  ^ 

2) sqlContext.sql("SELECT * FROM Logs as l where l.timestamp="+date2timestamp(formatTime3.parse("2012-10-08 16:10:36.0"))).collect

java.lang.RuntimeException: [1.54] failure: ``UNION'' expected but 16 found 
SELECT * FROM Logs as l where l.timestamp=2012-10-08 16:10:36.0 
                                                 ^ 
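Both failures have the same cause: once the query string is assembled, the timestamp literal is not quoted, so the SQL parser splits it at the space before "16". A minimal plain-Java sketch of what the concatenation in attempt 2 produces (the question's `formatTime3` and `date2timestamp` are not shown, so the stand-ins below are assumptions):

```java
import java.text.SimpleDateFormat;
import java.util.Date;

public class UnquotedLiteral {
    // Hypothetical stand-in for the question's date2timestamp helper.
    static java.sql.Timestamp date2timestamp(Date d) {
        return new java.sql.Timestamp(d.getTime());
    }

    public static String buildQuery() throws Exception {
        // Hypothetical stand-in for the question's formatTime3.
        SimpleDateFormat formatTime3 = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.S");
        java.sql.Timestamp ts = date2timestamp(formatTime3.parse("2012-10-08 16:10:36.0"));
        // Concatenation calls Timestamp.toString(), which adds no quotes,
        // so the parser sees two tokens split at the space before "16".
        return "SELECT * FROM Logs as l where l.timestamp=" + ts;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildQuery());
    }
}
```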
I think the problem was, first, the precision of the timestamp, and also that the string I pass to represent the timestamp has to be cast as a String.

So this query works now:

sqlContext.sql("SELECT * FROM Logs as l where cast(l.timestampLog as String) <= '2012-10-08 16:10:36'")
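Note that comparing the casted strings only works because the zero-padded `yyyy-MM-dd HH:mm:ss` format sorts lexicographically in the same order as chronologically. A minimal illustration in plain Java (hypothetical values):

```java
public class LexicographicOrder {
    public static void main(String[] args) {
        // Zero-padded timestamp strings sort the same way the instants do:
        String earlier = "2012-10-08 16:10:36";
        String later   = "2012-10-09 00:00:00";
        System.out.println(earlier.compareTo(later) < 0); // string order == time order
        // This breaks if formats mix, e.g. "2012-1-8" vs "2012-10-08".
    }
}
```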

You forgot the quotes.

Try this syntax:

L.timestamp = '2012-07-16 00:00:00'

Alternatively, try

L.timestamp = CAST('2012-07-16 00:00:00' AS TIMESTAMP)

Cast the string representation of the timestamp to a timestamp: cast('2012-10-10 12:00:00' as timestamp). Then you can do the comparison as timestamps, not as strings. Instead of:

sqlContext.sql("SELECT * FROM Logs as l where cast(l.timestamp as String) <= '2012-10-08 16:10:36'")

try

sqlContext.sql("SELECT * FROM Logs as l where l.timestamp <= cast('2012-10-08 16:10:36' as timestamp)")

Sadly, this didn't work for me. I'm using Apache Spark 1.4.1. The following code was my solution:

Date date = new Date();
String query = "SELECT * FROM Logs as l where l.timestampLog <= CAST('" + new java.sql.Timestamp(date.getTime()) + "' as TIMESTAMP)";
sqlContext.sql(query);
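This works because `java.sql.Timestamp.toString()` emits the JDBC escape format (`yyyy-mm-dd hh:mm:ss.f`), and the single quotes added around the concatenated value keep the literal a single SQL token for `CAST(... AS TIMESTAMP)`. A sketch of the string being produced, with a fixed date so the output is predictable (table and column names taken from the answer above):

```java
public class TimestampLiteral {
    public static String buildQuery(java.util.Date date) {
        // Timestamp.toString() yields e.g. "2012-10-08 16:10:36.0" (JDBC escape format);
        // the surrounding single quotes make it one SQL token.
        return "SELECT * FROM Logs as l where l.timestampLog <= CAST('"
                + new java.sql.Timestamp(date.getTime()) + "' as TIMESTAMP)";
    }

    public static void main(String[] args) throws Exception {
        java.util.Date fixed = new java.text.SimpleDateFormat("yyyy-MM-dd HH:mm:ss")
                .parse("2012-10-08 16:10:36");
        System.out.println(buildQuery(fixed));
    }
}
```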

Casting timestampLog to String didn't raise any errors, but no data was returned.
