Split a df into 3Y + 1M windows over a date range in Python



I have a df with a date index, as follows:

import pandas as pd

ind = pd.date_range(start="2015-12-31", end="2022-04-26", freq="D")
df = pd.DataFrame(
    {
        "col1": range(len(ind))
    },
    index=ind
)

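What I need is to slice the df into windows of 3 years plus 1 month, with one window starting at each month end from 2017-08-31 onward, so I have the following code block: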

from datetime import timedelta
from dateutil.relativedelta import relativedelta

n = timedelta(365 * 3) + relativedelta(months=1)
fechas_ = pd.date_range("2017-08-31", ind.max() - n, freq="M")
# loop to check the beginning and the end of each window
for i in fechas_:
    print(f"start: {i}")
    print(f"end: {i + n}")
    print("\n")

My problem is that I need the last day of the month as the end of each window, for example:

# first window
start: 2017-08-31 00:00:00
end: 2020-09-30 00:00:00
# second window
start: 2017-09-30 00:00:00
end: 2020-10-31 00:00:00
# so on

But I get:

# first window
start: 2017-08-31 00:00:00
end: 2020-09-29 00:00:00
# second window
start: 2017-09-30 00:00:00
end: 2020-10-29 00:00:00
# 3
2017-10-31 00:00:00
2020-11-29 00:00:00
# 4
2017-11-30 00:00:00
2020-12-29 00:00:00
# 5
2017-12-31 00:00:00
2021-01-30 00:00:00
# 6
2018-01-31 00:00:00
2021-02-27 00:00:00
# 7
2018-02-28 00:00:00
2021-03-27 00:00:00
# 8
2018-03-31 00:00:00
2021-04-29 00:00:00
# 9
2018-04-30 00:00:00
2021-05-29 00:00:00
# 10
2018-05-31 00:00:00
2021-06-29 00:00:00
# 11
2018-06-30 00:00:00
2021-07-29 00:00:00
# 12
2018-07-31 00:00:00
2021-08-30 00:00:00
# 13
2018-08-31 00:00:00
2021-09-29 00:00:00
# 14
2018-09-30 00:00:00
2021-10-29 00:00:00
# 15
2018-10-31 00:00:00
2021-11-29 00:00:00
# 16
2018-11-30 00:00:00
2021-12-29 00:00:00
# 17
2018-12-31 00:00:00
2022-01-30 00:00:00
# 18
2019-01-31 00:00:00
2022-02-27 00:00:00
# 19
2019-02-28 00:00:00
2022-03-27 00:00:00

Does anyone know how I can solve this?

Thanks a lot

In your code, try replacing

n = timedelta(365 * 3) + relativedelta(months=1)

with

n = relativedelta(years=3, months=1, day=31)
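Some background on why this helps: `timedelta(365 * 3)` is a fixed 1095 days, so it ignores the leap day in 2020, and `relativedelta(months=1)` on its own clips to the shorter month without moving back to month end. The singular `day=31` argument is an absolute setting, which dateutil clips to the last valid day of the resulting month. Below is a minimal sketch of the whole thing using the df from the question. The names `starts` and `windows`, and the check that skips windows running past the end of the data, are my own assumptions and not part of the original code.

```python
import pandas as pd
from dateutil.relativedelta import relativedelta

ind = pd.date_range(start="2015-12-31", end="2022-04-26", freq="D")
df = pd.DataFrame({"col1": range(len(ind))}, index=ind)

# years/months are relative shifts; day=31 is absolute and is clipped to the
# last valid day of the target month, so the result is always a month end.
n = relativedelta(years=3, months=1, day=31)

# Month-end start dates. An offset object is passed instead of the "M" alias,
# which newer pandas versions deprecate.
starts = pd.date_range("2017-08-31", ind.max(), freq=pd.offsets.MonthEnd())

windows = {}  # hypothetical container: window start -> slice of df
for start in starts:
    end = start + n
    if end > ind.max():
        # Assumption: drop windows that would run past the available data.
        break
    windows[start] = df.loc[start:end]  # label slicing includes both ends

first_start = starts[0]
print(f"start: {first_start}")
print(f"end: {windows[first_start].index[-1]}")
```

Computing the cutoff explicitly in the loop, rather than with `ind.max() - n`, is a deliberate choice here: once `n` carries `day=31`, subtracting it from `ind.max()` also snaps to a month end, which could let in one window that ends after the last date in the df.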
