我有一个像这样的DataFrame:
data = {'ID':["AB2343","CD4565"],
'StartDate':["01/01/2020","01/01/2021"],
'EndDate':["31/12/2021","31/12/2022"],
'AgreedMonthlyBilling':[1500,4500],
'ActualMonthlyBilling':[1200,3000]}
df = pd.DataFrame(data)
ID、StartDate、EndDate和AgreedMonthlyBilling在我插入各自的值后不会更改(即AgreedMonthlyBilling中的1500和4500将永远保留(最后一个值(在excel表中手动修改的ActualMonthlyBolling(每月都会更改(即1月有1200和3000,2月有0和1500,3月有3000和1000(。我想做的是在每个新的月份添加一行,直到结束日期(这意味着,我们在2022年3月1日,我从2022年起只有3个月的时间添加(,并且只有在新的月份到来时,而不是之前(这是我陷入困境的地方(。我有以下内容:
repeats = (pd.to_datetime(df['EndDate']) - pd.to_datetime(df['StartDate'])) // np.timedelta64(1, 'M') + 1
periods = np.concatenate([pd.period_range(start=pd.to_datetime(d), periods=r, freq='M')
for d, r in zip(df['StartDate'], repeats)])
new_df = (pd.DataFrame(
np.repeat(df.values, repeats, 0),
columns=df.columns,
index=periods)
.assign(Month = [x.month for x in periods],
Year = [x.year for x in periods]))
我得到的是一个新的DataFrame,在StartDate和EndDate之间有完整的月份,但每次我更改ActualMonthlyBilling时,我都会覆盖ActualMonthlyBilling中过去的数据,并且我不会保留我想要跟踪的旧记录。
这就是我得到的
ID | 月 | 年 | 预期计费>>实际计费 | |
---|---|---|---|---|
AB2343 | 01 | 2020 | 1500 | 1350覆盖 |
AB2343 | 02 | 2020 | 1500 | 1350覆盖 |
AB2343 | 03 | 2020 | 1500 | 1350覆盖 |
更新假设您有以下当前数据帧:
ID Month Year ExpectedBilling RealBilling
0 AB2343 1 2022 1500 1290
1 AB2343 2 2022 1500 1350
2 CD4565 1 2022 4500 3000
3 CD4565 2 2022 4500 3250
您想根据ID:为2022年3月添加一个新行
new_df = df.groupby('ID', as_index=False)
.agg({'ID': 'first', 'ExpectedBilling': 'last'})
.assign(Month=today.month, Year=today.year)
out = pd.concat([df, new_df]).sort_values(['ID', 'Month', 'Year'], ignore_index=True)
print(out)
# Output
ID Month Year ExpectedBilling RealBilling
0 AB2343 1 2022 1500 1290.0
1 AB2343 2 2022 1500 1350.0
2 AB2343 3 2022 1500 NaN
3 CD4565 1 2022 4500 3000.0
4 CD4565 2 2022 4500 3250.0
5 CD4565 3 2022 4500 NaN
问题是如何为RealBilling
列插入新值?
旧答案
today = pd.Timestamp.today()
period = df.apply(lambda x: pd.period_range(x['StartDate'], today, freq='M'), axis=1)
new_df = df.assign(Period=period).explode('Period').set_index('Period')
.assign(Month=lambda x: x.index.month, Year=lambda x: x.index.year)
输出:
ID StartDate EndDate AgreedMonthlyPrice ActualMonthlyPrice Month Year
Period
2020-01 AB2343 01/01/2020 31/12/2022 1500 1290 1 2020
2020-02 AB2343 01/01/2020 31/12/2022 1500 1290 2 2020
2020-03 AB2343 01/01/2020 31/12/2022 1500 1290 3 2020
2020-04 AB2343 01/01/2020 31/12/2022 1500 1290 4 2020
2020-05 AB2343 01/01/2020 31/12/2022 1500 1290 5 2020
2020-06 AB2343 01/01/2020 31/12/2022 1500 1290 6 2020
2020-07 AB2343 01/01/2020 31/12/2022 1500 1290 7 2020
2020-08 AB2343 01/01/2020 31/12/2022 1500 1290 8 2020
2020-09 AB2343 01/01/2020 31/12/2022 1500 1290 9 2020
2020-10 AB2343 01/01/2020 31/12/2022 1500 1290 10 2020
2020-11 AB2343 01/01/2020 31/12/2022 1500 1290 11 2020
2020-12 AB2343 01/01/2020 31/12/2022 1500 1290 12 2020
2021-01 AB2343 01/01/2020 31/12/2022 1500 1290 1 2021
2021-02 AB2343 01/01/2020 31/12/2022 1500 1290 2 2021
2021-03 AB2343 01/01/2020 31/12/2022 1500 1290 3 2021
2021-04 AB2343 01/01/2020 31/12/2022 1500 1290 4 2021
2021-05 AB2343 01/01/2020 31/12/2022 1500 1290 5 2021
2021-06 AB2343 01/01/2020 31/12/2022 1500 1290 6 2021
2021-07 AB2343 01/01/2020 31/12/2022 1500 1290 7 2021
2021-08 AB2343 01/01/2020 31/12/2022 1500 1290 8 2021
2021-09 AB2343 01/01/2020 31/12/2022 1500 1290 9 2021
2021-10 AB2343 01/01/2020 31/12/2022 1500 1290 10 2021
2021-11 AB2343 01/01/2020 31/12/2022 1500 1290 11 2021
2021-12 AB2343 01/01/2020 31/12/2022 1500 1290 12 2021
2022-01 AB2343 01/01/2020 31/12/2022 1500 1290 1 2022
2022-02 AB2343 01/01/2020 31/12/2022 1500 1290 2 2022
2022-03 AB2343 01/01/2020 31/12/2022 1500 1290 3 2022
2021-01 CD4565 01/01/2021 31/12/2025 4500 3000 1 2021
2021-02 CD4565 01/01/2021 31/12/2025 4500 3000 2 2021
2021-03 CD4565 01/01/2021 31/12/2025 4500 3000 3 2021
2021-04 CD4565 01/01/2021 31/12/2025 4500 3000 4 2021
2021-05 CD4565 01/01/2021 31/12/2025 4500 3000 5 2021
2021-06 CD4565 01/01/2021 31/12/2025 4500 3000 6 2021
2021-07 CD4565 01/01/2021 31/12/2025 4500 3000 7 2021
2021-08 CD4565 01/01/2021 31/12/2025 4500 3000 8 2021
2021-09 CD4565 01/01/2021 31/12/2025 4500 3000 9 2021
2021-10 CD4565 01/01/2021 31/12/2025 4500 3000 10 2021
2021-11 CD4565 01/01/2021 31/12/2025 4500 3000 11 2021
2021-12 CD4565 01/01/2021 31/12/2025 4500 3000 12 2021
2022-01 CD4565 01/01/2021 31/12/2025 4500 3000 1 2022
2022-02 CD4565 01/01/2021 31/12/2025 4500 3000 2 2022
2022-03 CD4565 01/01/2021 31/12/2025 4500 3000 3 2022