Forward fill up to specific datetime index values, where those values appear in another list

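I am trying to overwrite pre-computed product weights in a df, keyed by a datetime index. My challenge is to overwrite only up to a certain date (listed in another df), and then start overwriting again from that date. Sample data: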

import pandas as pd

data = {'Product 1 Weight':['0', '.15', '.19', '.2','.21','.25','.252','.255'],
        'Product 2 Weight':['0', '0', '0', '0','0','0','0','0'],
        'Product 3 Weight':['0', '0', '0', '0','0','.5','.551','.561']}
df = pd.DataFrame(data, index =['2020-04-01',
                                '2020-04-02',
                                '2020-04-03',
                                '2020-04-06',
                                '2020-04-07',
                                '2020-04-08',
                                '2020-04-09',
                                '2020-04-10'])
rebalances= pd.DataFrame({'Rebalance':['2020-04-02',
                                       '2020-04-08',
                                       '2020-04-10']})

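In this example, I want to overwrite the values of all products from 2020-04-02 through 2020-04-07 with the values from 2020-04-02. Then I want to overwrite the values of all products on 2020-04-09 (the only row before the next rebalance date, 2020-04-10) with the values from 2020-04-08, and so on. The rebalances df gives me the dates at which one overwrite stops and the next one starts. So my desired final output looks like this: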

data = {'Product 1 Weight':['0', '.15', '.15', '.15','.15','.25','.25','.255'],
        'Product 2 Weight':['0', '0', '0', '0','0','0','0','0'],
        'Product 3 Weight':['0', '0', '0', '0','0','.5','.5','.561']}
df = pd.DataFrame(data, index =['2020-04-01',
                                '2020-04-02',
                                '2020-04-03',
                                '2020-04-06',
                                '2020-04-07',
                                '2020-04-08',
                                '2020-04-09',
                                '2020-04-10'])

This may look completely arbitrary, but it would be very useful for my current project.

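We can mask the values in the Product columns wherever the corresponding index label is not present in the Rebalance column, then ffill to forward fill over the masked values. The final fillna(df) restores the original values for any rows before the first rebalance date, since those rows have nothing to fill from.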

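# True for rows whose index label is a rebalance date
m = df.index.to_series().isin(rebalances['Rebalance'])
# Blank out non-rebalance rows, carry the last rebalance row forward, restore leading rows
out = df.mask(~m).ffill().fillna(df)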

>>> out
           Product 1 Weight Product 2 Weight Product 3 Weight
2020-04-01                0                0                0
2020-04-02              .15                0                0
2020-04-03              .15                0                0
2020-04-06              .15                0                0
2020-04-07              .15                0                0
2020-04-08              .25                0               .5
2020-04-09              .25                0               .5
2020-04-10             .255                0             .561
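
The snippet above works because the index labels and the Rebalance values are the same type (both are strings in the sample data). As a minimal sketch, assuming you would rather work with real datetimes and float weights (a conversion the original does not make), the same mask-and-forward-fill idea can be wrapped in a small helper. The function name hold_weights_between_rebalances is hypothetical, and this variant selects the rebalance rows and reindexes instead of calling mask, which should give the same result for a unique index.

```python
import pandas as pd

# Sample data from the question, with weights as floats and dates as datetimes
# (an assumption; the original keeps both as strings).
df = pd.DataFrame(
    {
        "Product 1 Weight": [0, 0.15, 0.19, 0.2, 0.21, 0.25, 0.252, 0.255],
        "Product 2 Weight": [0.0] * 8,
        "Product 3 Weight": [0, 0, 0, 0, 0, 0.5, 0.551, 0.561],
    },
    index=pd.to_datetime(
        ["2020-04-01", "2020-04-02", "2020-04-03", "2020-04-06",
         "2020-04-07", "2020-04-08", "2020-04-09", "2020-04-10"]
    ),
)
rebalances = pd.DataFrame(
    {"Rebalance": pd.to_datetime(["2020-04-02", "2020-04-08", "2020-04-10"])}
)


def hold_weights_between_rebalances(weights, rebalance_dates):
    """Hypothetical helper: carry each rebalance row forward to the next one."""
    # Boolean array: True where the index label is a rebalance date.
    is_rebalance = weights.index.isin(rebalance_dates)
    # Keep only rebalance rows, then restore the full index with NaN gaps.
    held = weights.loc[is_rebalance].reindex(weights.index)
    # Fill the gaps forward; rows before the first rebalance keep their original values.
    return held.ffill().fillna(weights)


out = hold_weights_between_rebalances(df, rebalances["Rebalance"])
print(out)
```

If the index and the Rebalance column end up with different types (for example, strings on one side and Timestamps on the other), isin will generally find no matches and the result will simply equal the input, so it is worth checking the dtypes first.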
