给定如下df
:
df = [{'date': '1980-01-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 190.3},
{'date': '1980-02-29 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 174.9},
{'date': '1980-03-31 00:00:00',
'yoy_1': 0.001470155,
'yoy_2': 0.002116268,
'value': 163.2},
{'date': '1980-04-30 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 168.4},
{'date': '1980-05-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 168.6},
{'date': '1980-06-30 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 168.2},
{'date': '1980-07-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 163.5},
{'date': '1980-08-31 00:00:00',
'yoy_1': 0.019327965,
'yoy_2': 0.002116268,
'value': 161.6},
{'date': '1980-09-30 00:00:00',
'yoy_1': 0.001203869,
'yoy_2': 0.002116268,
'value': 172.9},
{'date': '1980-10-31 00:00:00',
'yoy_1': 0.101000481,
'yoy_2': 0.222560596,
'value': 166.5},
{'date': '1980-11-30 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 175.2},
{'date': '1980-12-31 00:00:00',
'yoy_1': 0.102218761,
'yoy_2': 0.002116268,
'value': 197.7},
{'date': '1981-01-31 00:00:00',
'yoy_1': 0.001521527,
'yoy_2': 0.002116268,
'value': 212.1},
{'date': '1981-02-28 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.234524059,
'value': 177.9},
{'date': '1981-03-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 182.9},
{'date': '1981-04-30 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 184.2},
{'date': '1981-05-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 184.0},
{'date': '1981-06-30 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 182.4},
{'date': '1981-07-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 175.6},
{'date': '1981-08-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 172.0},
{'date': '1981-09-30 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 184.9},
{'date': '1981-10-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 184.7},
{'date': '1981-11-30 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 195.1},
{'date': '1981-12-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.086862869,
'value': 224.8},
{'date': '1982-01-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.226102276,
'value': 233.6},
{'date': '1982-02-28 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 182.0},
{'date': '1982-03-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 206.6},
{'date': '1982-04-30 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 202.2}]
输出:
date yoy_1 yoy_2 value
0 1980-01-31 00:00:00 0.001522 0.002116 190.3
1 1980-02-29 00:00:00 0.001522 0.002116 174.9
2 1980-03-31 00:00:00 0.001470 0.002116 163.2
3 1980-04-30 00:00:00 0.001522 0.002116 168.4
4 1980-05-31 00:00:00 0.001522 0.002116 168.6
5 1980-06-30 00:00:00 0.001522 0.002116 168.2
6 1980-07-31 00:00:00 0.001522 0.002116 163.5
7 1980-08-31 00:00:00 0.019328 0.002116 161.6
8 1980-09-30 00:00:00 0.001204 0.002116 172.9
9 1980-10-31 00:00:00 0.101000 0.222561 166.5
10 1980-11-30 00:00:00 0.001522 0.002116 175.2
11 1980-12-31 00:00:00 0.102219 0.002116 197.7
12 1981-01-31 00:00:00 0.001522 0.002116 212.1
13 1981-02-28 00:00:00 0.001522 0.234524 177.9
14 1981-03-31 00:00:00 0.001522 0.002116 182.9
15 1981-04-30 00:00:00 0.001522 0.002116 184.2
16 1981-05-31 00:00:00 0.001522 0.002116 184.0
17 1981-06-30 00:00:00 0.001522 0.002116 182.4
18 1981-07-31 00:00:00 0.001522 0.002116 175.6
19 1981-08-31 00:00:00 0.001522 0.002116 172.0
20 1981-09-30 00:00:00 0.001522 0.002116 184.9
21 1981-10-31 00:00:00 0.001522 0.002116 184.7
22 1981-11-30 00:00:00 0.001522 0.002116 195.1
23 1981-12-31 00:00:00 0.001522 0.086863 224.8
24 1982-01-31 00:00:00 0.001522 0.226102 233.6
25 1982-02-28 00:00:00 0.001522 0.002116 182.0
26 1982-03-31 00:00:00 0.001522 0.002116 206.6
27 1982-04-30 00:00:00 0.001522 0.002116 202.2
我希望基于从yoy
和value
开始的列来计算实际值,下面的代码我可以做到,但并不简洁。
即。,对于1981-01-31 00:00:00
,yoy_1_value
将由(1 + yoy_1) * value from 1980-01-31 00:00:00
计算;则CCD_ 7将由CCD_。
df['yoy_1_value'] = (1 + df['yoy_1']).mul(df['value'].shift(12))
df['yoy_2_value'] = (1 + df['yoy_2']).mul(df['value'].shift(12))
我该如何改进它应用它多个列名从yoy_
开始?谢谢
def cal_current_values(x):
return (1 + x).mul(df['value'].shift(12))
过滤yoy
列:
yoy_cols = [col for col in df if col.startswith('yoy')]
yoy_cols
预期结果:
date yoy_1 yoy_2 value yoy_1_value yoy_2_value
0 1980-01-31 00:00:00 0.001522 0.002116 190.3 NaN NaN
1 1980-02-29 00:00:00 0.001522 0.002116 174.9 NaN NaN
2 1980-03-31 00:00:00 0.001470 0.002116 163.2 NaN NaN
3 1980-04-30 00:00:00 0.001522 0.002116 168.4 NaN NaN
4 1980-05-31 00:00:00 0.001522 0.002116 168.6 NaN NaN
5 1980-06-30 00:00:00 0.001522 0.002116 168.2 NaN NaN
6 1980-07-31 00:00:00 0.001522 0.002116 163.5 NaN NaN
7 1980-08-31 00:00:00 0.019328 0.002116 161.6 NaN NaN
8 1980-09-30 00:00:00 0.001204 0.002116 172.9 NaN NaN
9 1980-10-31 00:00:00 0.101000 0.222561 166.5 NaN NaN
10 1980-11-30 00:00:00 0.001522 0.002116 175.2 NaN NaN
11 1980-12-31 00:00:00 0.102219 0.002116 197.7 NaN NaN
12 1981-01-31 00:00:00 0.001522 0.002116 212.1 190.589547 190.702726
13 1981-02-28 00:00:00 0.001522 0.234524 177.9 175.166115 215.918258
14 1981-03-31 00:00:00 0.001522 0.002116 182.9 163.448313 163.545375
15 1981-04-30 00:00:00 0.001522 0.002116 184.2 168.656225 168.756380
16 1981-05-31 00:00:00 0.001522 0.002116 184.0 168.856529 168.956803
17 1981-06-30 00:00:00 0.001522 0.002116 182.4 168.455921 168.555956
18 1981-07-31 00:00:00 0.001522 0.002116 175.6 163.748769 163.846010
19 1981-08-31 00:00:00 0.001522 0.002116 172.0 161.845878 161.941989
20 1981-09-30 00:00:00 0.001522 0.002116 184.9 173.163072 173.265903
21 1981-10-31 00:00:00 0.001522 0.002116 184.7 166.753334 166.852359
22 1981-11-30 00:00:00 0.001522 0.002116 195.1 175.466571 175.570770
23 1981-12-31 00:00:00 0.001522 0.086863 224.8 198.000805 214.872789
24 1982-01-31 00:00:00 0.001522 0.226102 233.6 212.422715 260.056293
25 1982-02-28 00:00:00 0.001522 0.002116 182.0 178.170679 178.276484
26 1982-03-31 00:00:00 0.001522 0.002116 206.6 183.178287 183.287065
27 1982-04-30 00:00:00 0.001522 0.002116 202.2 184.480265 184.589817
使用DataFrame.filter
和^
作为字符串的开头,通过列value
:添加了axis=0
用于正确的倍数
df = df.filter(regex='^yoy_').add(1).mul(df['value'].shift(12), axis=0)
print (df)
yoy_1 yoy_2
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
5 NaN NaN
6 NaN NaN
7 NaN NaN
8 NaN NaN
9 NaN NaN
10 NaN NaN
11 NaN NaN
12 190.589547 190.702726
13 175.166115 215.918258
14 163.448313 163.545375
15 168.656225 168.756380
16 168.856529 168.956803
17 168.455921 168.555956
18 163.748769 163.846010
19 161.845878 161.941989
20 173.163072 173.265903
21 166.753334 166.852359
22 175.466571 175.570770
23 198.000805 214.872789
24 212.422715 260.056293
25 178.170679 178.276484
26 183.178287 183.287065
27 184.480265 184.589817
添加到原始:
df = df.join(df.filter(regex='^yoy_').add(1).mul(df['value'].shift(12), axis=0).add_suffix('_values'))