Suppose I have the following data.
import pandas as pd

dates = ['2020-12-01', '2020-12-04', '2020-12-05', '2020-12-01', '2020-12-04', '2020-12-05']
symbols = ['ABC', 'ABC', 'ABC', 'DEF', 'DEF', 'DEF']
v = [1, 3, 5, 7, 9, 10]
df = pd.DataFrame({'date': dates, 'g': symbols, 'v': v})
         date    g   v
0  2020-12-01  ABC   1
1  2020-12-04  ABC   3
2  2020-12-05  ABC   5
3  2020-12-01  DEF   7
4  2020-12-04  DEF   9
5  2020-12-05  DEF  10
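I want to fill in the missing dates with the previous value, grouped by the field "g". For example, for the data above I would like the following entries to be added: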
2020-12-02 ABC 1
2020-12-03 ABC 1
2020-12-02 DEF 7
2020-12-03 DEF 7
How can I do this?
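This answer is mostly borrowed from the answer linked below, except that it fills the gaps with a negative sentinel value, then replaces that sentinel with null and forward-fills.
The original answer is here.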
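import numpy as np
import pandas as pd

dates = ['2020-12-01', '2020-12-04', '2020-12-05', '2020-12-01', '2020-12-04', '2020-12-05']
symbols = ['ABC', 'ABC', 'ABC', 'DEF', 'DEF', 'DEF']
v = [1, 3, 5, 7, 9, 10]
df = pd.DataFrame({'date': dates, 'g': symbols, 'v': v})
df['date'] = pd.to_datetime(df['date'])
# Pivot each group into its own column, reindex to daily frequency, and mark
# the newly created slots with a sentinel so that stack() does not drop them.
df = df.set_index(
    ['date', 'g']
).unstack(
    fill_value=-999
).asfreq(
    'D', fill_value=-999
).stack().sort_index(level=1).reset_index()
# Turn the sentinel back into NaN and forward-fill from the previous row.
# Rows are sorted by g and then date, so this picks up the previous date's
# value as long as every group has a value on the first date.
df = df.replace(-999, np.nan).ffill()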
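If you would rather not depend on a sentinel value such as -999 (which could collide with real data), here is a minimal sketch of an alternative. It reindexes against every (g, date) combination and forward-fills within each group. The names full_dates, full_index and filled are only illustrative, and the sketch assumes each (g, date) pair occurs at most once. Like the code above, it uses one overall date range for all groups rather than a per-group range.

```python
import pandas as pd

dates = ['2020-12-01', '2020-12-04', '2020-12-05', '2020-12-01', '2020-12-04', '2020-12-05']
symbols = ['ABC', 'ABC', 'ABC', 'DEF', 'DEF', 'DEF']
v = [1, 3, 5, 7, 9, 10]
df = pd.DataFrame({'date': dates, 'g': symbols, 'v': v})
df['date'] = pd.to_datetime(df['date'])

# Every calendar day between the earliest and latest date in the data.
full_dates = pd.date_range(df['date'].min(), df['date'].max(), freq='D')

# One row per (group, day) combination.
full_index = pd.MultiIndex.from_product(
    [df['g'].unique(), full_dates], names=['g', 'date']
)

filled = (
    df.set_index(['g', 'date'])
      .reindex(full_index)   # missing days appear as NaN rows
      .groupby(level='g')
      .ffill()               # forward-fill within each group only
      .reset_index()
)
print(filled)
```

Because the forward-fill runs per group, a group whose first date is missing keeps NaN there instead of inheriting a value from the preceding group. Note that v comes back as float, since NaN is introduced during the reindex.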