I have a DataFrame:
                    stock  price
symbol DATE
ABC    2014-01-02  000001      6
       2014-01-03  000001      7
       2014-01-06  000001      8
XYZ    2015-07-02  000002      9
       2015-07-04  000002     10
       2015-07-06  000002     11
I want to get a new DataFrame in which every missing calendar day is filled in, like this:
                    stock  price
symbol DATE
ABC    2014-01-02  000001      6
       2014-01-03  000001      7
       2014-01-04  000001      8
       2014-01-05  000001      8
       2014-01-06  000001      8
XYZ    2015-07-02  000002      9
       2015-07-03  000002     10
       2015-07-04  000002     10
       2015-07-05  000002     11
       2015-07-06  000002     11
How can I do this?
Use a custom function in GroupBy.apply together with Series.reindex:
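import pandas as pd

# For each group, reindex to every calendar day between its first and last date,
# then back-fill the newly created rows from the next available price.
f = lambda x: x.reindex(pd.date_range(x.index.min(), x.index.max(), name='DATE')).bfill()
df = df.reset_index(level=0).groupby(['symbol','stock'])['price'].apply(f).reset_index()
print(df)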
  symbol   stock       DATE  price
0    ABC  000001 2014-01-02      6
1    ABC  000001 2014-01-03      7
2    ABC  000001 2014-01-04      8
3    ABC  000001 2014-01-05      8
4    ABC  000001 2014-01-06      8
5    XYZ  000002 2015-07-02      9
6    XYZ  000002 2015-07-03     10
7    XYZ  000002 2015-07-04     10
8    XYZ  000002 2015-07-05     11
9    XYZ  000002 2015-07-06     11
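For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same GroupBy.apply plus Series.reindex idea. The way the sample frame is built is my assumption (the question only shows the printed frame), and `fill_daily` and `out` are names I made up for illustration. I kept `stock` as a string so the leading zeros survive.

```python
import pandas as pd

# Rebuild the sample frame from the question (construction is assumed;
# stock is stored as a string to keep the leading zeros).
df = pd.DataFrame(
    {
        "symbol": ["ABC"] * 3 + ["XYZ"] * 3,
        "DATE": pd.to_datetime(
            ["2014-01-02", "2014-01-03", "2014-01-06",
             "2015-07-02", "2015-07-04", "2015-07-06"]
        ),
        "stock": ["000001"] * 3 + ["000002"] * 3,
        "price": [6, 7, 8, 9, 10, 11],
    }
).set_index(["symbol", "DATE"])


def fill_daily(s):
    """Reindex one group's prices to every calendar day and back-fill the gaps."""
    full_range = pd.date_range(s.index.min(), s.index.max(), name="DATE")
    return s.reindex(full_range).bfill()


out = (
    df.reset_index(level=0)            # move symbol to a column, keep DATE as the index
      .groupby(["symbol", "stock"])["price"]
      .apply(fill_daily)               # one daily, back-filled Series per group
      .reset_index()
)
print(out)
```

Note that reindexing introduces NaN before the back-fill, so `price` will likely come back as float; if integers are needed, something like `out["price"].astype(int)` afterwards should do it.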
You can use pandas resample and asfreq within a groupby; the assumption here is that the DATE column has a datetime dtype:
(df
 .reset_index('symbol')
 .groupby('symbol')
 .resample('1D')
 .asfreq()
 .drop(columns='symbol')
 .bfill()
)
                   stock  price
symbol DATE
ABC    2014-01-02    1.0    6.0
       2014-01-03    1.0    7.0
       2014-01-04    1.0    8.0
       2014-01-05    1.0    8.0
       2014-01-06    1.0    8.0
XYZ    2015-07-02    2.0    9.0
       2015-07-03    2.0   10.0
       2015-07-04    2.0   10.0
       2015-07-05    2.0   11.0
       2015-07-06    2.0   11.0
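As a hedged variant of the same idea, the sketch below calls DataFrame.asfreq on each group separately instead of going through groupby(...).resample(...). This is not the code from the answer above: it is a swapped-in per-group loop that keeps the back-fill strictly inside each symbol and does not depend on whether the grouping column shows up in the resample output. The frame construction and the names `pieces` and `out` are assumptions for illustration.

```python
import pandas as pd

# Sample frame rebuilt from the question (construction is assumed).
df = pd.DataFrame(
    {
        "symbol": ["ABC"] * 3 + ["XYZ"] * 3,
        "DATE": pd.to_datetime(
            ["2014-01-02", "2014-01-03", "2014-01-06",
             "2015-07-02", "2015-07-04", "2015-07-06"]
        ),
        "stock": ["000001"] * 3 + ["000002"] * 3,
        "price": [6, 7, 8, 9, 10, 11],
    }
).set_index(["symbol", "DATE"])

# For each symbol: drop the symbol level so only the DatetimeIndex remains,
# upsample to daily frequency (new rows are NaN), then back-fill within the group.
pieces = {
    symbol: group.droplevel("symbol").asfreq("D").bfill()
    for symbol, group in df.groupby(level="symbol")
}

# Stitch the groups back together; the dict keys become the outer index level.
out = pd.concat(pieces, names=["symbol"])
print(out)
```

The result keeps the (symbol, DATE) MultiIndex from the question, and `stock` stays a string here because the sample frame was built that way.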