How do I convert a timestamp string to a date in Databricks SQL?



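I know this question has been asked several times, and I have searched and read the existing questions and answers. I have also read the Databricks documentation and tried several approaches, but I still cannot get the result I want.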

Given:

+----------------------------+
|        data_type           |
+----------------------------+
|        timestamp           |
+----------------------------+

and

+------------------------------+
|        data_value            |
+------------------------------+
| 2017-11-22T00:00:00.000+0000 |
+------------------------------+

Desired result:

+----------------------------+
|        data_value          |
+----------------------------+
|        22.11.2017          |
+----------------------------+

What I have tried so far, without success:

  date_format(date_value, 'dd.mm.yyyy') AS MFGDate,
  to_date(date_value) AS MFGDate,
  date(date_value) AS MFGDate

Result:

+------------+------------+------------+
|   MFGDate  |   MFGDate  |   MFGDate  |
+------------+------------+------------+
| 22.00.2017 | 2017-11-22 | 2017-11-22 |
+------------+------------+------------+

Full query:

SELECT
  '01 FUV' AS Stage,
  d1.ps_name AS FUV,
  d1.ps_name AS LOT,
  d2.date_value AS MFGDate
FROM
  table d1
  INNER JOIN table d2 ON d1.ag_id = d2.ag_id
  AND d1.ag_path = d2.ag_path
  AND d1.ps_name = d2.ps_name
WHERE
  d1.AG_PATH LIKE 'sourcepath'

Result:

+--------+--------+--------+------------------------------+
| Stage  | FUV    | LOT    | MFGDate                      |
+--------+--------+--------+------------------------------+
| 01 FUV | A1U079 | A1U079 | 2019-03-27T00:00:00.000+0000 |
| 01 FUV | A1U255 | A1U255 | 2019-06-22T00:00:00.000+0000 |
| 01 FUV | A1U255 | A1U255 | 2019-11-10T00:00:00.000+0000 |
+--------+--------+--------+------------------------------+

How can I get the values of the MFGDate column in the format 22.11.2017?

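You can use the built-in date_format function; what you are missing is the correct use of the pattern symbols. The Spark datetime pattern reference explains how each symbol is used.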

Typical usage:

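from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# On Databricks a SparkSession already exists; getOrCreate() reuses it.
spark = SparkSession.builder.getOrCreate()

input_list = [
  (1,"2019-11-07 05:30:00")
  ,(2,"2019-07-09 15:30:00")
  ,(3,"2019-12-09 10:30:00")
  ,(4,"2019-02-11 14:30:00")
]

sparkDF = spark.createDataFrame(input_list,['id','date'])
# Parse the string column into a timestamp (lowercase mm is correct here: minutes)
sparkDF = sparkDF.withColumn('date',F.to_timestamp(F.col('date'), 'yyyy-MM-dd HH:mm:ss'))
# Format as day.month.year (uppercase MM is the month)
sparkDF = sparkDF.withColumn('date_formated',F.date_format(F.col('date'), 'dd.MM.yyyy'))
sparkDF.show()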
+---+-------------------+-------------+
| id|               date|date_formated|
+---+-------------------+-------------+
|  1|2019-11-07 05:30:00|   07.11.2019|
|  2|2019-07-09 15:30:00|   09.07.2019|
|  3|2019-12-09 10:30:00|   09.12.2019|
|  4|2019-02-11 14:30:00|   11.02.2019|
+---+-------------------+-------------+

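You were close. The built-in date_format function is the right tool, but you get "00" in the month position because your format pattern is wrong. You specified "mm", which returns the minute of the hour; you should specify "MM", which returns the month of the year. Since your timestamps are all at 00:00:00, the minute is always 00. So the correct code is: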

date_format(date_value, 'dd.MM.yyyy') AS MFGDate

The documentation is here: https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html
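If it helps to see the mix-up outside of Spark, here is a small sketch that reproduces it with Python's standard datetime module. This is only an analogy, not Spark code: the parse pattern is my own assumption about how to read the sample value from the question, and note that Python's codes are the other way round (%m is the month, %M is the minute).

```python
from datetime import datetime

# Sample value from the question; the parse pattern is an assumption
# made for this illustration and is not part of the original query.
raw = "2017-11-22T00:00:00.000+0000"
ts = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S.%f%z")

# In Python, %M is the minute and %m is the month
# (the reverse of Spark's mm / MM).
wrong = ts.strftime("%d.%M.%Y")  # minute used where the month was intended
right = ts.strftime("%d.%m.%Y")  # month, as intended

print(wrong)
print(right)
```

Because the time part is midnight, the minute field is 00, which is exactly where the "00" in 22.00.2017 came from.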
