PySpark string to timestamp conversion



How can I convert a timestamp string to the format "yyyy-mm-ddThh:mm:ss.sssZ" using PySpark?

Input timestamp (string), df:

| col_string            |
| :-------------------- |
| 5/15/2022 2:11:06 AM  |

Expected output (timestamp), df:

| col_timestamp           |
| :---------------------- |
| 2022-05-15T2:11:06.000Z |

`to_timestamp` accepts an optional `format` argument, so you can pass the pattern that matches your input string.

    from pyspark.sql import functions as F
    df = spark.createDataFrame([("5/15/2022 2:11:06 AM",)], ["col_string"])
    df = df.select(F.to_timestamp("col_string", "M/dd/yyyy h:mm:ss a").alias("col_ts"))
    df.show()
    # +-------------------+
    # |             col_ts|
    # +-------------------+
    # |2022-05-15 02:11:06|
    # +-------------------+
    df.printSchema()
    # root
    #  |-- col_ts: timestamp (nullable = true)
