01/05/2002
08/03/2002
我有一个表:
使用您提供的示例,stack
可以做到:
from pyspark.sql import functions as F
df = spark.createDataFrame(
[('Finance', 'John Doe', '01/01/2022', '01/05/2002'),
('Marketing', 'Mark Smith', '05/02/2022', '08/03/2002')],
['Department', 'Name', 'Start', 'End'])
df = df.select('Department', 'Name', F.expr("stack(2, 'Start', Start, 'End', End) as (Event, Date)"))
df.show()
# +----------+----------+-----+----------+
# |Department| Name|Event| Date|
# +----------+----------+-----+----------+
# | Finance| John Doe|Start|01/01/2022|
# | Finance| John Doe| End|01/05/2002|
# | Marketing|Mark Smith|Start|05/02/2022|
# | Marketing|Mark Smith| End|08/03/2002|
# +----------+----------+-----+----------+