Hive: How to truncate yyyy-mm-ddThh:mm:SS:sssZ timestamps to the hour



I have the following timestamps:

2020-03-09T07:34:06:825Z
2020-03-09T07:54:12:220Z
2020-03-09T03:54:11:041Z
2020-03-09T09:22:10:220Z
2020-03-09T11:13:36:217Z
2020-03-09T11:23:26:040Z
2020-03-09T11:43:35:721Z

I want to truncate them to the hour, for example:

2020-03-09T07:00:00
2020-03-09T07:00:00
2020-03-09T03:00:00
2020-03-09T09:00:00
2020-03-09T11:00:00
2020-03-09T11:00:00
2020-03-09T11:00:00

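Is this possible? Any help would be greatly appreciated. Stack Overflow has been a lifesaver. The result can be either a datetime or a string. Thank you all!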

Use the unix_timestamp and from_unixtime functions to parse the timestamp and format it as required.

select from_unixtime(unix_timestamp(string("2020-03-09T07:34:06:825Z"),"yyyy-MM-dd'T'HH:mm:ss:SSS'Z'"),"yyyy-MM-dd'T'HH:00:00") as new_ts;
+-------------------+
|new_ts             |
+-------------------+
|2020-03-09T07:00:00|
+-------------------+

Explanation:

unix_timestamp(
string("2020-03-09T07:34:06:825Z"), --sample data
"yyyy-MM-dd'T'HH:mm:ss:SSS'Z'") --pattern matching the input; HH is the 24-hour field

from_unixtime('unix_timestamp...etc',"yyyy-MM-dd'T'HH:00:00") --format as required, with minutes and seconds fixed to 00
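
One detail worth checking in these patterns is the hour field: HH is the 24-hour clock, while hh is the 12-hour clock. The question's samples are all morning times, so the difference would not show up there. Below is a minimal sketch using a hypothetical afternoon value (15:34, not from the question's data), and assuming both functions work in the same session time zone:

```sql
-- Hypothetical afternoon value, not taken from the question's data.
select
  from_unixtime(
    unix_timestamp('2020-03-09T15:34:06:825Z', "yyyy-MM-dd'T'HH:mm:ss:SSS'Z'"),
    "yyyy-MM-dd'T'HH:00:00") as hour_24,  -- HH: 24-hour clock, the hour stays 15
  from_unixtime(
    unix_timestamp('2020-03-09T15:34:06:825Z', "yyyy-MM-dd'T'HH:mm:ss:SSS'Z'"),
    "yyyy-MM-dd'T'hh:00:00") as hour_12;  -- hh: 12-hour clock, 15 is rendered as 03
```

If afternoon timestamps can occur in the data, the HH form is the one that preserves the original hour.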

Using regexp_replace:

with your_data as (
select stack(7, --first argument of stack() is the number of rows to generate
'2020-03-09T07:34:06:825Z',
'2020-03-09T07:54:12:220Z',
'2020-03-09T03:54:11:041Z',
'2020-03-09T09:22:10:220Z',
'2020-03-09T11:13:36:217Z',
'2020-03-09T11:23:26:040Z',
'2020-03-09T11:43:35:721Z'
) as str
)
--in Hive string literals the backslash is an escape character, so \d is written as \\d
select regexp_replace(str,'(\\d{4}-\\d{2}-\\d{2})T(\\d{2}).*','$1T$2:00:00')
from your_data;

Result:

2020-03-09T07:00:00
2020-03-09T07:00:00
2020-03-09T03:00:00
2020-03-09T09:00:00
2020-03-09T11:00:00
2020-03-09T11:00:00
2020-03-09T11:00:00

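Explanation:

The regular expression defines two groups:

$1 is the date part, (\d{4}-\d{2}-\d{2})

$2 is the hour part after the T, (\d{2}); everything else up to the end of the string is matched by .* and discarded.

The replacement string '$1T$2:00:00' puts the two groups back together with the minutes and seconds set to zero.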
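
To see what each group captures, the two groups can be pulled out separately with regexp_extract. The sketch below also shows a fixed-width alternative using substr and concat, which assumes every value starts with exactly yyyy-MM-ddTHH (13 characters). The column aliases are illustrative, and the backslashes are doubled on the assumption that Hive string-literal escaping applies:

```sql
-- Sample value copied from the question; the aliases are illustrative only.
select
  regexp_extract(str, '(\\d{4}-\\d{2}-\\d{2})T(\\d{2}).*', 1) as date_part,  -- group 1: the date
  regexp_extract(str, '(\\d{4}-\\d{2}-\\d{2})T(\\d{2}).*', 2) as hour_part,  -- group 2: the hour
  concat(substr(str, 1, 13), ':00:00') as truncated  -- first 13 characters: date, T, hour
from (select '2020-03-09T07:34:06:825Z' as str) t;
```

The substr variant avoids the regex entirely but does no validation, so it is only appropriate when the input format is known to be uniform.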
