Hi, I am trying to get the weekday from a timestamp that includes an AM/PM marker. I have this dataframe:
dataframe = spark.createDataFrame(
data = [ ("1","9/8/2019 10:01:28 PM")],
schema=["id","input_timestamp"])
dataframe.show()
dataframe.printSchema()
If I do the following, I get null as the output. I have to use to_date:
from pyspark.sql.functions import to_date
dataframe.withColumn("timestamp", to_date("input_timestamp", "MM/dd/yyyy HH:mm:ss am")) \
    .show(truncate=False)
+---+--------------------+---------+
|id |input_timestamp     |timestamp|
+---+--------------------+---------+
|1  |9/8/2019 10:01:28 PM|null     |
+---+--------------------+---------+
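Use M/d/yyyy hh:mm:ss a as the format. In Spark's datetime patterns, HH is the 24-hour hour of day and hh is the 12-hour clock hour, and the AM/PM marker is the pattern letter a, not the literal text am. Also note that to_date keeps only the date part, so use to_timestamp if you need the time as well.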
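from pyspark.sql import functions as func

# Fall back to the legacy (pre-Spark 3) datetime parser
spark.conf.set('spark.sql.legacy.timeParserPolicy', 'LEGACY')
df = (spark.sparkContext.parallelize([("9/8/2019 10:01:28 PM",)]).toDF(['ts_str'])
      .withColumn('ts', func.to_timestamp('ts_str', 'M/d/yyyy hh:mm:ss a')))
df.show(truncate=False)
df.printSchema()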
# +--------------------+-------------------+
# |ts_str              |ts                 |
# +--------------------+-------------------+
# |9/8/2019 10:01:28 PM|2019-09-08 22:01:28|
# +--------------------+-------------------+
# root
#  |-- ts_str: string (nullable = true)
#  |-- ts: timestamp (nullable = true)
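The question also asks for the weekday, which the snippet above does not compute. As a rough cross-check outside Spark, the same string can be parsed with plain Python's datetime module. This is only a sketch for verifying the expected values, not part of the Spark solution; the variable names raw and parsed are made up for illustration, and %I / %p are the strptime counterparts of the hh / a pattern letters.

```python
from datetime import datetime

raw = "9/8/2019 10:01:28 PM"

# %I is the 12-hour clock hour and %p the AM/PM marker.
# Using %H (24-hour) together with a literal "am" would not match this input.
parsed = datetime.strptime(raw, "%m/%d/%Y %I:%M:%S %p")

print(parsed)                 # 2019-09-08 22:01:28
print(parsed.isoweekday())    # 7 (Monday=1 ... Sunday=7)
print(parsed.strftime("%A"))  # weekday name, "Sunday" in an English locale
```

In Spark itself, once ts is a proper timestamp column, something like func.date_format('ts', 'EEEE') should return the weekday name for each row; treat that as a suggestion to try rather than a tested part of this answer.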