I have the following code:
from datetime import datetime
import pandas as pd

last_day_of_current_year = datetime.now().date().replace(month=12, day=31)

with open(("mtn_mtx.txt").lower(), "r") as rfile:
    next(rfile)  # skip the header row
    for line in rfile:
        line = line.rstrip('\n')
        line = line.upper()
        line = line.split('\t')  # the file is tab-delimited
        firstcoupondate = (line[5])
        month_list = [i.strftime("%Y-%m-%d") for i in pd.date_range(start=firstcoupondate, end=last_day_of_current_year, freq='MS')]
        print(month_list)
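This lets me read a date (firstcoupondate) from a file and populate the future months from the first coupon date through the last day of the current year. However, I want to adjust this code so that it generates all 12 months of the current year only, rather than starting from the year of the first coupon date.
For example, if I input the following firstcoupondate from my file: "2020-02-05", my code above produces the following list: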
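['2020-03-01', '2020-04-01', '2020-05-01', '2020-06-01', '2020-07-01', '2020-08-01', '2020-09-01', '2020-10-01', '2020-11-01', '2020-12-01', '2021-01-01', '2021-02-01', '2021-03-01', '2021-04-01', '2021-05-01', '2021-06-01', '2021-07-01', '2021-08-01', '2021-09-01', '2021-10-01', '2021-11-01', '2021-12-01']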
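As you can see, every date falls on the first day of the month, which is incorrect, and I am also missing the first two dates for January and February ("2020-01-05", "2020-02-05"). So whatever firstcoupondate I enter, the earlier months of the current year are never populated. The output I want when I input "2020-02-05", instead of the list above, is: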
['2021-01-05', '2021-02-05', '2021-03-05', '2021-04-05', '2021-05-05', '2021-06-05', '2021-07-05', '2021-08-05', '2021-09-05', '2021-10-05', '2021-11-05', '2021-12-05']
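In addition, some of the firstcoupondate values I enter fall in a future year. In that case I want to populate the same dates as above (all 12 months), but for the year of the firstcoupondate itself. For example, with firstcoupondate = '2025-04-12', I want to generate the following list of months: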
['2025-01-12', '2025-02-12', '2025-03-12', '2025-04-12', '2025-05-12', '2025-06-12', '2025-07-12', '2025-08-12', '2025-09-12', '2025-10-12', '2025-11-12', '2025-12-12']
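Here is a reproducible example. I generated data that I believe is similar to what you have in your tab-delimited file.
Key parts
- define a function
- use it in the file-reading context
- use it in the pandas context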
import random
import datetime as dt
from pathlib import Path

import numpy as np
import pandas as pd

# build s rows of sample data: five letter columns plus a random coupon date
s = 5
coupondates = [dt.date(2019 + random.randint(0,4), random.randint(1,12), random.randint(1,28)) for _ in range(s)]
cd = pd.DataFrame({
    "A":np.random.choice(["A","B","C"], s),"B":np.random.choice(["A","B","C"], s),
    "C":np.random.choice(["A","B","C"], s),"D":np.random.choice(["A","B","C"], s),
    "E":np.random.choice(["A","B","C"], s),
    "coupondate": coupondates
})
p = Path.cwd().joinpath("mtn_mtx.txt")
cd.to_csv(p, sep="\t", index=False)  # tab-delimited, header row, no index column

def drng(d):
    # past coupon dates use the current year, future ones use their own year
    n = pd.Timestamp.now().normalize()
    s = n.replace(month=1, day=1) if d < n else d.replace(month=1, day=1)
    e = n.replace(month=12, day=31) if d < n else d.replace(month=12, day=31)
    # month starts for that year, shifted forward to the coupon's day of month
    return list((pd.date_range(s, e, freq="MS") + pd.DateOffset(days=d.day-1)).strftime("%Y-%m-%d"))

# file solution
with open(p) as f:
    next(f)  # skip the header row
    for line in f:
        line = line.rstrip("\n").upper().split("\t")
        print(line[5], drng(pd.to_datetime(line[5])))

# pandas solution
df = pd.read_csv(p, sep="\t")
df.coupondate = pd.to_datetime(df.coupondate)
df = df.assign(allcd=df.coupondate.apply(drng))
print(df.to_markdown())  # to_markdown needs the optional tabulate package
Output from the file loop
2019-01-21 ['2021-03-21', '2021-04-21', '2021-05-21', '2021-06-21', '2021-07-21', '2021-08-21', '2021-09-21', '2021-10-21', '2021-11-21', '2021-12-21']
2020-10-22 ['2021-03-22', '2021-04-22', '2021-05-22', '2021-06-22', '2021-07-22', '2021-08-22', '2021-09-22', '2021-10-22', '2021-11-22', '2021-12-22']
2020-11-19 ['2021-03-19', '2021-04-19', '2021-05-19', '2021-06-19', '2021-07-19', '2021-08-19', '2021-09-19', '2021-10-19', '2021-11-19', '2021-12-19']
2023-12-22 ['2023-01-22', '2023-02-22', '2023-03-22', '2023-04-22', '2023-05-22', '2023-06-22', '2023-07-22', '2023-08-22', '2023-09-22', '2023-10-22', '2023-11-22', '2023-12-22']
2020-10-06 ['2021-03-06', '2021-04-06', '2021-05-06', '2021-06-06', '2021-07-06', '2021-08-06', '2021-09-06', '2021-10-06', '2021-11-06', '2021-12-06']
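The file loop picks the date out by position (line[5]), which silently breaks if the column order ever changes. As a hedged alternative, the sketch below uses the standard library's csv.DictReader to look the field up by its header name. The sample text and the helper name read_coupon_dates are made up for illustration; only the column name coupondate comes from the generated file.

```python
import csv
import io

# Hypothetical sample mirroring the generated file: five letter columns
# followed by a "coupondate" column, tab separated, with a header row.
sample = (
    "A\tB\tC\tD\tE\tcoupondate\n"
    "C\tB\tA\tA\tB\t2020-10-06\n"
    "A\tC\tB\tB\tA\t2023-12-22\n"
)

def read_coupon_dates(handle):
    """Yield the coupondate field of every data row in a tab-delimited file."""
    reader = csv.DictReader(handle, delimiter="\t")  # header row supplies the keys
    for row in reader:
        yield row["coupondate"]

coupon_dates = list(read_coupon_dates(io.StringIO(sample)))
print(coupon_dates)  # ['2020-10-06', '2023-12-22']
```

With a real file, passing the handle from open(p, newline="") instead of the io.StringIO object should behave the same way.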
pandas output
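(columns A to E hold random letters and are not shown)
| | coupondate | allcd |
|---|---|---|
| 0 | 2019-01-21 00:00:00 | ['2021-03-21', '2021-04-21', '2021-05-21', '2021-06-21', '2021-07-21', '2021-08-21', '2021-09-21', '2021-10-21', '2021-11-21', '2021-12-21'] |
| 1 | 2020-10-22 00:00:00 | ['2021-03-22', '2021-04-22', '2021-05-22', '2021-06-22', '2021-07-22', '2021-08-22', '2021-09-22', '2021-10-22', '2021-11-22', '2021-12-22'] |
| 2 | 2020-11-19 00:00:00 | ['2021-03-19', '2021-04-19', '2021-05-19', '2021-06-19', '2021-07-19', '2021-08-19', '2021-09-19', '2021-10-19', '2021-11-19', '2021-12-19'] |
| 3 | 2023-12-22 00:00:00 | ['2023-01-22', '2023-02-22', '2023-03-22', '2023-04-22', '2023-05-22', '2023-06-22', '2023-07-22', '2023-08-22', '2023-09-22', '2023-10-22', '2023-11-22', '2023-12-22'] |
| 4 | 2020-10-06 00:00:00 | ['2021-03-06', '2021-04-06', '2021-05-06', '2021-06-06', '2021-07-06', '2021-08-06', '2021-09-06', '2021-10-06', '2021-11-06', '2021-12-06'] |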
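One caveat: the sample data only draws days 1 to 28. As far as I can tell, shifting each month start by a fixed number of days would spill into the next month for coupon days 29 to 31 (for example, 1 February plus 30 days lands in March). The sketch below is one possible variant using only the standard library that clamps the day to the last day of each month. The function name coupon_dates_for_year and the today parameter are assumptions added for illustration, and clamping is a design choice of this sketch, not something the question specifies.

```python
import calendar
import datetime as dt

def coupon_dates_for_year(first_coupon, today=None):
    """Return 12 monthly dates (as strings) on first_coupon's day of month.

    Past coupon dates use today's year, future ones use their own year.
    Days 29 to 31 are clamped to the last day of shorter months.
    """
    today = today or dt.date.today()
    year = today.year if first_coupon < today else first_coupon.year
    dates = []
    for month in range(1, 13):
        last_day = calendar.monthrange(year, month)[1]  # number of days in this month
        day = min(first_coupon.day, last_day)
        dates.append(dt.date(year, month, day).strftime("%Y-%m-%d"))
    return dates

# today is pinned here only so the example is reproducible
example = coupon_dates_for_year(dt.date(2020, 1, 31), today=dt.date(2021, 6, 15))
print(example[:3])  # ['2021-01-31', '2021-02-28', '2021-03-31']
```

Leaving today unset falls back to the real current date, which matches how the function above picks the year.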