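Get the latest value based on the index in a pandas DataFrame

At each meeting date, new forecasts of GDP growth are made for the next 3 to 4 years. If the GDP growth forecast for a forecast_year is the same as at the previous meeting_date, the table has no new entry for it.

Is there a simple way to add these missing forecast_year entries for every meeting_date, using the most recent gdp_growth (%) value available as of that meeting_date?

To clarify, here is the input table df_in: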

meeting_date	forecast_year	gdp_growth (%)
2007-11-20	2007	2.45
2007-11-20	2008	2.15
2007-11-20	2009	2.50
2007-11-20	2010	2.55
2008-02-20	2009	2.40
2008-02-20	2010	2.75
2008-05-21	2010	2.85
2008-07-16	2010	2.75

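Another way, via pivot:

import pandas as pd
k = df1.pivot(index="meeting_date", columns="forecast_year", values="gdp_growth (%)").ffill().stack().dropna().reset_index(name="GDP Growth (%)")
# keep only rows whose forecast year is not earlier than the meeting year
df = k[~(pd.to_datetime(k["meeting_date"]).dt.year.gt(k["forecast_year"]))]

Output:

meeting_date  forecast_year  GDP Growth (%)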
0    2007-11-20           2007            2.45
1    2007-11-20           2008            2.15
2    2007-11-20           2009            2.50
3    2007-11-20           2010            2.55
5    2008-02-20           2008            1.65
6    2008-02-20           2009            2.40
7    2008-02-20           2010            2.75
9    2008-05-21           2008            0.75
10   2008-05-21           2009            2.40
11   2008-05-21           2010            2.85
13   2008-07-16           2008            1.30
14   2008-07-16           2009            2.40
15   2008-07-16           2010            2.75
17   2008-11-19           2008            0.15
18   2008-11-19           2009            0.45
19   2008-11-19           2010            2.75
20   2008-11-19           2011            3.20
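
To see why the final filter is needed, it can help to look at the intermediate wide table. The sketch below uses a small made-up frame (the dates and values are invented for illustration and are not the question's data) and assumes the three column names used in this thread.

```python
import pandas as pd

# Hypothetical input, invented for illustration only.
df1 = pd.DataFrame(
    {
        "meeting_date": ["2020-01-15", "2020-01-15", "2020-01-15", "2021-01-15"],
        "forecast_year": [2020, 2021, 2022, 2021],
        "gdp_growth (%)": [1.0, 2.0, 3.0, 1.5],
    }
)

# One row per meeting, one column per forecast year; gaps show up as NaN.
wide = df1.pivot(index="meeting_date", columns="forecast_year", values="gdp_growth (%)")

# Forward-fill down the rows: each meeting inherits the previous meeting's value.
filled = wide.ffill()

print(wide)
print(filled)
```

Here the 2021-01-15 row picks up 3.0 for 2022 from the earlier meeting, which is the fill we want. It also picks up 1.0 for 2020, a year that is already over by that meeting, and that second kind of entry is what the year comparison removes.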

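Try:

import pandas as pd

x = (
    df.set_index(["meeting_date", "forecast_year"])
    .unstack(level=1)  # wide form: one column per forecast_year
    .ffill()           # carry the latest forecast forward to later meetings
    .stack()           # back to long form
    .dropna()          # no-op if stack() has already dropped the NaN rows
    .reset_index()
)
# remove rows where the meeting year is later than forecast_year
x = x[~(pd.to_datetime(x["meeting_date"]).dt.year > x["forecast_year"])]
print(x)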

Prints:

meeting_date  forecast_year  gdp_growth (%)
0    2007-11-20           2007            2.45
1    2007-11-20           2008            2.15
2    2007-11-20           2009            2.50
3    2007-11-20           2010            2.55
5    2008-02-20           2008            1.65
6    2008-02-20           2009            2.40
7    2008-02-20           2010            2.75
9    2008-05-21           2008            0.75
10   2008-05-21           2009            2.40
11   2008-05-21           2010            2.85
13   2008-07-16           2008            1.30
14   2008-07-16           2009            2.40
15   2008-07-16           2010            2.75
17   2008-11-19           2008            0.15
18   2008-11-19           2009            0.45
19   2008-11-19           2010            2.75
20   2008-11-19           2011            3.20

EDIT: Removed MultiIndex.from_product, it is not needed.
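
If this is needed in more than one place, the steps could be wrapped in a small helper. This is only a sketch: the function name `fill_missing_forecasts` and its parameters are hypothetical, the sample frame is invented, and it assumes the meeting dates are ISO-formatted strings or datetimes so that sorting them is chronological.

```python
import pandas as pd


def fill_missing_forecasts(df, date_col="meeting_date", year_col="forecast_year"):
    """Carry each forecast year's latest value forward to later meetings.

    Hypothetical helper: expects one value column besides date_col and year_col.
    """
    filled = (
        df.set_index([date_col, year_col])
        .unstack(level=1)  # wide form, one column per forecast year
        .ffill()           # inherit the previous meeting's value
        .stack()           # back to long form
        .dropna()          # drop years that have no forecast yet
        .reset_index()
    )
    # A forecast year earlier than the meeting year is stale, so drop it.
    stale = pd.to_datetime(filled[date_col]).dt.year > filled[year_col]
    return filled[~stale].sort_values([date_col, year_col]).reset_index(drop=True)


# Invented sample data, not the question's table.
sample = pd.DataFrame(
    {
        "meeting_date": ["2020-01-15", "2020-01-15", "2020-01-15", "2021-01-15"],
        "forecast_year": [2020, 2021, 2022, 2021],
        "gdp_growth (%)": [1.0, 2.0, 3.0, 1.5],
    }
)
out = fill_missing_forecasts(sample)
print(out)
```

The trailing `reset_index(drop=True)` renumbers the rows from 0, unlike the printed output above, where the original positions (with gaps) are kept.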
