我有一个包含贷款列表的DataFrame,包含起始日期、到期日和金额。我想创建一个DataFrame与曝光的演变如下:
来自:
到期日 | 1 | 100 | 31/12/2022 | 31/12/2042 | 2
---|---|---|---|
90 | 31/12/2023 | 31/12/2052 | |
3 | 50 | 31/12/2024 | 31/12/2035 |
首先通过to_datetime
转换为日期时间,然后通过Index.repeat
减去年份并通过DataFrame.loc
重复行,通过GroupBy.cumcount
创建计数器列并添加年份,最后通过DataFrame.pivot_table
和aggfunc='sum'
进行枢轴转换,如果重复值:
df['inception'] = pd.to_datetime(df['inception'])
df['maturity'] = pd.to_datetime(df['maturity'])
df = (df.loc[df.index.repeat(df['maturity'].dt.year.sub(df['inception'].dt.year) + 1)]
.assign(y = lambda x: x.groupby(level=0).cumcount().add(x['inception'].dt.year))
.pivot_table(index='y',
columns='contract',
values='amount',
fill_value=0,
aggfunc='sum')
.rename(lambda x: f'31/12/{x}'))
print (df)
contract 1 2 3
y
31/12/2022 100 0 0
31/12/2023 100 90 0
31/12/2024 100 90 50
31/12/2025 100 90 50
31/12/2026 100 90 50
31/12/2027 100 90 50
31/12/2028 100 90 50
31/12/2029 100 90 50
31/12/2030 100 90 50
31/12/2031 100 90 50
31/12/2032 100 90 50
31/12/2033 100 90 50
31/12/2034 100 90 50
31/12/2035 100 90 50
31/12/2036 100 90 0
31/12/2037 100 90 0
31/12/2038 100 90 0
31/12/2039 100 90 0
31/12/2040 100 90 0
31/12/2041 100 90 0
31/12/2042 100 90 0
31/12/2043 0 90 0
31/12/2044 0 90 0
31/12/2045 0 90 0
31/12/2046 0 90 0
31/12/2047 0 90 0
31/12/2048 0 90 0
31/12/2049 0 90 0
31/12/2050 0 90 0
31/12/2051 0 90 0
31/12/2052 0 90 0
编辑:如果True
在is_instalment
中,可以通过减去列乘以5
来减去每个amount
的5个值:
print (df)
contract amount inception maturity is_instalment
0 1 100 31/12/2022 31/12/2042 True
1 2 90 31/12/2023 31/12/2052 False
2 3 50 31/12/2024 31/12/2035 False
df['inception'] = pd.to_datetime(df['inception'])
df['maturity'] = pd.to_datetime(df['maturity'])
df['y'] = df['inception'].dt.year
df1 = df.loc[df.index.repeat(df['maturity'].dt.year.sub(df['y']) + 1)].copy()
df1['counter'] = df1.groupby(level=0).cumcount()
df1['Date'] = df1['y'].add(df1['counter'])
df1['amount'] = df1['amount'].mask(df1['is_instalment'],
df1['amount'].sub(df1['counter'].mul(5)))
df2 = df1.pivot_table(index='Date',
columns='contract',
values='amount',
fill_value=0,
aggfunc='sum')
print (df2)
contract 1 2 3
Date
2022 100 0 0
2023 95 90 0
2024 90 90 50
2025 85 90 50
2026 80 90 50
2027 75 90 50
2028 70 90 50
2029 65 90 50
2030 60 90 50
2031 55 90 50
2032 50 90 50
2033 45 90 50
2034 40 90 50
2035 35 90 50
2036 30 90 0
2037 25 90 0
2038 20 90 0
2039 15 90 0
2040 10 90 0
2041 5 90 0
2042 0 90 0
2043 0 90 0
2044 0 90 0
2045 0 90 0
2046 0 90 0
2047 0 90 0
2048 0 90 0
2049 0 90 0
2050 0 90 0
2051 0 90 0
2052 0 90 0