How do I convert a timestamp to "yyyy-MM-ddTHH:mm:ss.SSSZ" using PySpark?
Input timestamp, df:
| col_string |
| :-------------------- |
| 5/15/2022 2:11:06 AM |
Expected output (timestamp), df:
| col_timestamp |
| :---------------------- |
| 2022-05-15T2:11:06.000Z |
`to_timestamp` accepts an optional `format` argument. Pass it the pattern that matches the input string:
from pyspark.sql import functions as F
df = spark.createDataFrame([("5/15/2022 2:11:06 AM",)], ["col_string"])
df = df.select(F.to_timestamp("col_string", "M/dd/yyyy h:mm:ss a").alias("col_ts"))
df.show()
# +-------------------+
# | col_ts|
# +-------------------+
# |2022-05-15 02:11:06|
# +-------------------+
df.printSchema()
# root
# |-- col_ts: timestamp (nullable = true)