我正在尝试在面对外部现金流的情况下计算时间序列的累计价格。
这是样本数据集:
reportdate fund mtd_return cashflow Desired Output
30/11/2018 Fund X -0.00860 15687713 15552798.98
31/12/2018 Fund X -0.00900 15412823.78
31/01/2019 Fund X 0.00920 15554621.76
28/02/2019 Fund X 0.00630 15652615.88
31/03/2019 Fund X 0.00700 15762184.19
30/04/2019 Fund X 0.01220 15954482.84
31/05/2019 Fund X 0.00060 1000000 16964655.53
30/06/2019 Fund X 0.00570 1200000 18268194.07
31/07/2019 Fund X 0.00450 18350400.94
31/08/2019 Fund X 0.00210 18388936.78
30/09/2019 Fund X 0.00530 18486398.15
31/10/2019 Fund X 0.00200 18523370.94
30/11/2019 Fund X 0.00430 18603021.44
31/12/2019 Fund X 0.00660 18725801.38
31/01/2020 Fund X 0.01070 18926167.45
29/02/2020 Fund X -0.00510 18829644.00
31/03/2020 Fund X -0.10700 16814872.09
30/04/2020 Fund X 0.02740 3400000 20768759.59
31/05/2020 Fund X 0.02180 2000000 23265118.55
30/06/2020 Fund X 0.02270 23793236.74
31/07/2020 Fund X 0.01120 24059720.99
31/08/2020 Fund X 0.01260 24362873.47
30/09/2020 Fund X 0.00750 24545595.02
31/10/2020 Fund X 0.00410 -8110576 16502402.68
30/11/2020 Fund X 0.02790 16962819.72
31/12/2020 Fund X 0.01230 17171462.40
在上面,期望输出列的计算方法是采用前一行的期望输出,加上当期的任何现金流,乘以1 + mtd_return。实际上,我正在寻找一种在面对外部现金流时计算复合回报的好方法。
多谢!
迈克
在python中实现此功能的任何帮助将不胜感激。
import pandas as pd
df = pd.read_csv('df7.txt', sep=',', header=0)
df['reportdate'] = pd.to_datetime(df['reportdate'])
df = df.fillna(0)
qqq = []
def func_data(x):
a = 0
ind = x.index[0] - 1
if x.index[0] > 0:
a = (qqq[ind] + x['cashflow']) * (1 + x['mtd_return'])
qqq.append(a.values[0])
else:
qqq.append(df.loc[0, 'cashflow'] * (1 + df.loc[0, 'mtd_return']))
return a
df.groupby(['reportdate']).apply(func_data)
df['new'] = qqq
print(df)
输出reportdate fund mtd_return cashflow Desired Output new
0 2018-11-30 Fund X -0.0086 15687713.0 15552798.98 1.555280e+07
1 2018-12-31 Fund X -0.0090 0.0 15412823.78 1.541282e+07
2 2019-01-31 Fund X 0.0092 0.0 15554621.76 1.555462e+07
3 2019-02-28 Fund X 0.0063 0.0 15652615.88 1.565262e+07
4 2019-03-31 Fund X 0.0070 0.0 15762184.19 1.576218e+07
5 2019-04-30 Fund X 0.0122 0.0 15954482.84 1.595448e+07
6 2019-05-31 Fund X 0.0006 1000000.0 16964655.53 1.696466e+07
7 2019-06-30 Fund X 0.0057 1200000.0 18268194.07 1.826819e+07
8 2019-07-31 Fund X 0.0045 0.0 18350400.94 1.835040e+07
9 2019-08-31 Fund X 0.0021 0.0 18388936.78 1.838894e+07
10 2019-09-30 Fund X 0.0053 0.0 18486398.15 1.848640e+07
11 2019-10-31 Fund X 0.0020 0.0 18523370.94 1.852337e+07
12 2019-11-30 Fund X 0.0043 0.0 18603021.44 1.860302e+07
13 2019-12-31 Fund X 0.0066 0.0 18725801.38 1.872580e+07
14 2020-01-31 Fund X 0.0107 0.0 18926167.45 1.892617e+07
15 2020-02-29 Fund X -0.0051 0.0 18829644.00 1.882964e+07
16 2020-03-31 Fund X -0.1070 0.0 16814872.09 1.681487e+07
17 2020-04-30 Fund X 0.0274 3400000.0 20768759.59 2.076876e+07
18 2020-05-31 Fund X 0.0218 2000000.0 23265118.55 2.326512e+07
19 2020-06-30 Fund X 0.0227 0.0 23793236.74 2.379324e+07
20 2020-07-31 Fund X 0.0112 0.0 24059720.99 2.405972e+07
21 2020-08-31 Fund X 0.0126 0.0 24362873.47 2.436287e+07
22 2020-09-30 Fund X 0.0075 0.0 24545595.02 2.454559e+07
23 2020-10-31 Fund X 0.0041 -8110576.0 16502402.68 1.650240e+07
24 2020-11-30 Fund X 0.0279 0.0 16962819.72 1.696282e+07
25 2020-12-31 Fund X 0.0123 0.0 17171462.40 1.717146e+07
使你的文件中所有的值用逗号分隔,空也(即逗号之间为空)。在pandas中读取一个文件并创建一个数据框架。Header =0表示第一行用作列标题。接下来,将'reportdate '列转换为日期时间格式,并将空值替换为零。数据按日期分组。为调用创建func_data函数。如果这是第一次调用,则计算else中的代码,其余代码在If中。计算结果被写入qqq数组,然后填充'new'列。