查询以获取Hive或Pyspark中每个星期日和星期六的日期



我想要所有星期日的日期&从指定日期起的星期六。例如,如果给定的日期是2020-10-01,则需要返回两行sunday_dates、saturday_datees,其值为all sunday&在该日期之后的星期六";2020-10-01";。

我试过这样的东西,但似乎对我不起作用。

spark.sql("select date_sub('2020-10-01', cast(date_format(current_date(),'u')%7 as int)) as sunday_dates").show(10,False)
+------------+
|sunday_dates|
+------------+
|2020-09-29  |
+------------+

有什么方法可以在蜂巢或pyspark中实现这一点吗。

谢谢!

您需要使用date_trunc()才能到达周开始日期张贴date_sub()date_sub()以便获得周六和周日

在此处创建数据帧

df = spark.createDataFrame([("2020-11-02",1),("2020-11-03",2),("2020-11-04",3)],["event_dt","word"])
df.show()
df = df.withColumn("week_start", F.date_trunc('WEEK', F.col("event_dt")))
#`In case you want to get backward weekdays`
df = df.selectExpr('*', 'date_sub(week_start, 2) as backward_Saturday')
df = df.selectExpr('*', 'date_sub(week_start, 1) as backward_Sunday')
# In case you want forward weekends
df = df.selectExpr('*', 'date_add(week_start, 5) as forward_Saturday')
df = df.selectExpr('*', 'date_add(week_start, 6) as forward_Sunday')
df.show()

输入

+----------+----+
|  event_dt|word|
+----------+----+
|2020-11-02|   1|
|2020-11-03|   2|
|2020-11-04|   3|
+----------+----+

输出

+----------+----+-------------------+-----------------+---------------+----------------+--------------+
|  event_dt|word|         week_start|backward_Saturday|backward_Sunday|forward_Saturday|forward_Sunday|
+----------+----+-------------------+-----------------+---------------+----------------+--------------+
|2020-11-02|   1|2020-11-02 00:00:00|       2020-10-31|     2020-11-01|      2020-11-07|    2020-11-08|
|2020-11-03|   2|2020-11-02 00:00:00|       2020-10-31|     2020-11-01|      2020-11-07|    2020-11-08|
|2020-11-04|   3|2020-11-02 00:00:00|       2020-10-31|     2020-11-01|      2020-11-07|    2020-11-08|
+----------+----+-------------------+-----------------+---------------+----------------+--------------+ 

相关内容