Does the Spark SQL timestamp data type actually store a time zone?

I'm using Databricks 6.5 (Apache Spark 2.4.5, Scala 2.11).

%sql
select 
current_timestamp C1,
from_utc_timestamp(current_timestamp,"Australia/Adelaide") C2,
date_format(from_utc_timestamp(current_timestamp,"Australia/Adelaide"),"Z") C3

This gives the following result:

C1                              C2                              C3
=====================================================================
2020-07-02T07:06:57.716+0000    2020-07-02T16:36:57.716+0000    +0000

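There is no sign of the Adelaide time zone anywhere in the output: all three columns report +0000.

All the datetime functions I can find require you to specify a time zone. It seems to me that the time zone part should not be displayed at all, because it does not actually exist in the data.

This question says no, it is not stored, but can anyone confirm?

Getting the correct time zone offset with current_timestamp in Apache Spark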

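Edit

Does anyone else think the +0000 is misleading here? To me it implies that the value has a time zone of UTC and that it could potentially store a different time zone. I come from the SQL Server world, where a datetime that does not store a time zone has no time zone indicator.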

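In Spark, all date and time operations and functions are time zone aware, but Spark never stores a time zone internally. Dates are stored as an int and timestamps as a long.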
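To illustrate the idea outside Spark, here is a minimal Python sketch using only the standard library (it is not Spark code). The helper names `to_micros` and `render` are made up for this example, and the fixed +09:30 offset is an assumption standing in for Australia/Adelaide in July. The stored value is only a count of microseconds; a zone enters the picture when the value is formatted.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Assumption: a fixed +09:30 offset stands in for Australia/Adelaide in July.
ADELAIDE_WINTER = timezone(timedelta(hours=9, minutes=30))


def to_micros(dt):
    """Hypothetical helper: reduce an aware datetime to microseconds since the Unix epoch."""
    return (dt - EPOCH) // timedelta(microseconds=1)


def render(micros, tz):
    """Hypothetical helper: format the stored count in a zone chosen at render time."""
    instant = EPOCH + timedelta(microseconds=micros)
    return instant.astimezone(tz).isoformat(timespec="milliseconds")


# The only thing "stored" is this integer; no zone travels with it.
stored = to_micros(datetime(2020, 7, 2, 7, 6, 57, 716000, tzinfo=timezone.utc))

print(render(stored, timezone.utc))     # same integer, shown in UTC
print(render(stored, ADELAIDE_WINTER))  # same integer, shown at +09:30
```

Both lines come from the same integer, which is why a column of this kind has no per-value zone to report.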

From the Spark docs:

* Helper functions for converting between internal and external date and time representations.
* Dates are exposed externally as java.sql.Date and are represented internally as the number of
* dates since the Unix epoch (1970-01-01). Timestamps are exposed externally as java.sql.Timestamp
* and are stored internally as longs, which are capable of storing timestamps with microsecond
* precision.

Reference: Spark git repository
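As for the output in the question, my reading (an assumption on my part, not something stated in the quoted comment) is that from_utc_timestamp shifts the stored value by the zone's offset and returns another zoneless timestamp, which the notebook then renders in the session time zone, UTC here. A rough Python model of that reading, with hypothetical helpers `shift_from_utc` and `display` and a fixed +09:30 offset assumed for Adelaide in July:

```python
from datetime import datetime, timedelta, timezone

# Assumption: Adelaide's offset in July, hard-coded for the sketch.
ADELAIDE_OFFSET = timedelta(hours=9, minutes=30)


def shift_from_utc(instant, offset):
    """Hypothetical model: move the value forward by the offset; no zone is attached."""
    return instant + offset


def display(instant):
    """Hypothetical model of the notebook output: always rendered in UTC."""
    millis = instant.microsecond // 1000
    return instant.strftime("%Y-%m-%dT%H:%M:%S") + f".{millis:03d}" + instant.strftime("%z")


c1 = datetime(2020, 7, 2, 7, 6, 57, 716000, tzinfo=timezone.utc)
c2 = shift_from_utc(c1, ADELAIDE_OFFSET)
c3 = c2.strftime("%z")  # offset of the display zone, not of Adelaide

print(display(c1))  # 2020-07-02T07:06:57.716+0000
print(display(c2))  # 2020-07-02T16:36:57.716+0000
print(c3)           # +0000
```

Under that reading, the +0000 comes from the zone used for display, not from anything stored with the value, which would explain why C3 never shows +0930.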
