根据熊猫中上一年的同比变化和值计算多个列的当前值



给定如下df

df = [{'date': '1980-01-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 190.3},
{'date': '1980-02-29 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 174.9},
{'date': '1980-03-31 00:00:00',
'yoy_1': 0.001470155,
'yoy_2': 0.002116268,
'value': 163.2},
{'date': '1980-04-30 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 168.4},
{'date': '1980-05-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 168.6},
{'date': '1980-06-30 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 168.2},
{'date': '1980-07-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 163.5},
{'date': '1980-08-31 00:00:00',
'yoy_1': 0.019327965,
'yoy_2': 0.002116268,
'value': 161.6},
{'date': '1980-09-30 00:00:00',
'yoy_1': 0.001203869,
'yoy_2': 0.002116268,
'value': 172.9},
{'date': '1980-10-31 00:00:00',
'yoy_1': 0.101000481,
'yoy_2': 0.222560596,
'value': 166.5},
{'date': '1980-11-30 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 175.2},
{'date': '1980-12-31 00:00:00',
'yoy_1': 0.102218761,
'yoy_2': 0.002116268,
'value': 197.7},
{'date': '1981-01-31 00:00:00',
'yoy_1': 0.001521527,
'yoy_2': 0.002116268,
'value': 212.1},
{'date': '1981-02-28 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.234524059,
'value': 177.9},
{'date': '1981-03-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 182.9},
{'date': '1981-04-30 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 184.2},
{'date': '1981-05-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 184.0},
{'date': '1981-06-30 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 182.4},
{'date': '1981-07-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 175.6},
{'date': '1981-08-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 172.0},
{'date': '1981-09-30 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 184.9},
{'date': '1981-10-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 184.7},
{'date': '1981-11-30 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 195.1},
{'date': '1981-12-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.086862869,
'value': 224.8},
{'date': '1982-01-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.226102276,
'value': 233.6},
{'date': '1982-02-28 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 182.0},
{'date': '1982-03-31 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 206.6},
{'date': '1982-04-30 00:00:00',
'yoy_1': 0.001521525,
'yoy_2': 0.002116268,
'value': 202.2}]

输出:

date     yoy_1     yoy_2  value
0   1980-01-31 00:00:00  0.001522  0.002116  190.3
1   1980-02-29 00:00:00  0.001522  0.002116  174.9
2   1980-03-31 00:00:00  0.001470  0.002116  163.2
3   1980-04-30 00:00:00  0.001522  0.002116  168.4
4   1980-05-31 00:00:00  0.001522  0.002116  168.6
5   1980-06-30 00:00:00  0.001522  0.002116  168.2
6   1980-07-31 00:00:00  0.001522  0.002116  163.5
7   1980-08-31 00:00:00  0.019328  0.002116  161.6
8   1980-09-30 00:00:00  0.001204  0.002116  172.9
9   1980-10-31 00:00:00  0.101000  0.222561  166.5
10  1980-11-30 00:00:00  0.001522  0.002116  175.2
11  1980-12-31 00:00:00  0.102219  0.002116  197.7
12  1981-01-31 00:00:00  0.001522  0.002116  212.1
13  1981-02-28 00:00:00  0.001522  0.234524  177.9
14  1981-03-31 00:00:00  0.001522  0.002116  182.9
15  1981-04-30 00:00:00  0.001522  0.002116  184.2
16  1981-05-31 00:00:00  0.001522  0.002116  184.0
17  1981-06-30 00:00:00  0.001522  0.002116  182.4
18  1981-07-31 00:00:00  0.001522  0.002116  175.6
19  1981-08-31 00:00:00  0.001522  0.002116  172.0
20  1981-09-30 00:00:00  0.001522  0.002116  184.9
21  1981-10-31 00:00:00  0.001522  0.002116  184.7
22  1981-11-30 00:00:00  0.001522  0.002116  195.1
23  1981-12-31 00:00:00  0.001522  0.086863  224.8
24  1982-01-31 00:00:00  0.001522  0.226102  233.6
25  1982-02-28 00:00:00  0.001522  0.002116  182.0
26  1982-03-31 00:00:00  0.001522  0.002116  206.6
27  1982-04-30 00:00:00  0.001522  0.002116  202.2

我希望基于从yoyvalue开始的列来计算实际值,下面的代码我可以做到,但并不简洁。

即。,对于1981-01-31 00:00:00yoy_1_value将由(1 + yoy_1) * value from 1980-01-31 00:00:00计算;则CCD_ 7将由CCD_。

df['yoy_1_value'] = (1 + df['yoy_1']).mul(df['value'].shift(12))
df['yoy_2_value'] = (1 + df['yoy_2']).mul(df['value'].shift(12))

我该如何改进它应用它多个列名从yoy_开始?谢谢

def cal_current_values(x):
return (1 + x).mul(df['value'].shift(12))

过滤yoy列:

yoy_cols = [col for col in df if col.startswith('yoy')]
yoy_cols

预期结果:

date     yoy_1     yoy_2  value  yoy_1_value  yoy_2_value
0   1980-01-31 00:00:00  0.001522  0.002116  190.3          NaN          NaN
1   1980-02-29 00:00:00  0.001522  0.002116  174.9          NaN          NaN
2   1980-03-31 00:00:00  0.001470  0.002116  163.2          NaN          NaN
3   1980-04-30 00:00:00  0.001522  0.002116  168.4          NaN          NaN
4   1980-05-31 00:00:00  0.001522  0.002116  168.6          NaN          NaN
5   1980-06-30 00:00:00  0.001522  0.002116  168.2          NaN          NaN
6   1980-07-31 00:00:00  0.001522  0.002116  163.5          NaN          NaN
7   1980-08-31 00:00:00  0.019328  0.002116  161.6          NaN          NaN
8   1980-09-30 00:00:00  0.001204  0.002116  172.9          NaN          NaN
9   1980-10-31 00:00:00  0.101000  0.222561  166.5          NaN          NaN
10  1980-11-30 00:00:00  0.001522  0.002116  175.2          NaN          NaN
11  1980-12-31 00:00:00  0.102219  0.002116  197.7          NaN          NaN
12  1981-01-31 00:00:00  0.001522  0.002116  212.1   190.589547   190.702726
13  1981-02-28 00:00:00  0.001522  0.234524  177.9   175.166115   215.918258
14  1981-03-31 00:00:00  0.001522  0.002116  182.9   163.448313   163.545375
15  1981-04-30 00:00:00  0.001522  0.002116  184.2   168.656225   168.756380
16  1981-05-31 00:00:00  0.001522  0.002116  184.0   168.856529   168.956803
17  1981-06-30 00:00:00  0.001522  0.002116  182.4   168.455921   168.555956
18  1981-07-31 00:00:00  0.001522  0.002116  175.6   163.748769   163.846010
19  1981-08-31 00:00:00  0.001522  0.002116  172.0   161.845878   161.941989
20  1981-09-30 00:00:00  0.001522  0.002116  184.9   173.163072   173.265903
21  1981-10-31 00:00:00  0.001522  0.002116  184.7   166.753334   166.852359
22  1981-11-30 00:00:00  0.001522  0.002116  195.1   175.466571   175.570770
23  1981-12-31 00:00:00  0.001522  0.086863  224.8   198.000805   214.872789
24  1982-01-31 00:00:00  0.001522  0.226102  233.6   212.422715   260.056293
25  1982-02-28 00:00:00  0.001522  0.002116  182.0   178.170679   178.276484
26  1982-03-31 00:00:00  0.001522  0.002116  206.6   183.178287   183.287065
27  1982-04-30 00:00:00  0.001522  0.002116  202.2   184.480265   184.589817

使用DataFrame.filter^作为字符串的开头,通过列value:添加了axis=0用于正确的倍数

df = df.filter(regex='^yoy_').add(1).mul(df['value'].shift(12), axis=0)
print (df)
yoy_1       yoy_2
0          NaN         NaN
1          NaN         NaN
2          NaN         NaN
3          NaN         NaN
4          NaN         NaN
5          NaN         NaN
6          NaN         NaN
7          NaN         NaN
8          NaN         NaN
9          NaN         NaN
10         NaN         NaN
11         NaN         NaN
12  190.589547  190.702726
13  175.166115  215.918258
14  163.448313  163.545375
15  168.656225  168.756380
16  168.856529  168.956803
17  168.455921  168.555956
18  163.748769  163.846010
19  161.845878  161.941989
20  173.163072  173.265903
21  166.753334  166.852359
22  175.466571  175.570770
23  198.000805  214.872789
24  212.422715  260.056293
25  178.170679  178.276484
26  183.178287  183.287065
27  184.480265  184.589817

添加到原始:

df = df.join(df.filter(regex='^yoy_').add(1).mul(df['value'].shift(12), axis=0).add_suffix('_values'))

最新更新