如何根据日期列的1年滞后创建新的度量列?



我想创建一个引用日期列- 1年并显示相应值的新列:

import pandas as pd
Input DF
df = pd.DataFrame({'consumption': [0,1,3,5], 'date':[pd.to_datetime('2017-04-01'),
pd.to_datetime('2017-04-02'),
pd.to_datetime('2018-04-01'),
pd.to_datetime('2018-04-02')]})    
>>> df
consumption       date
0            2017-04-01
1            2017-04-02
3            2018-04-01
5            2018-04-02

Expected DF
df = pd.DataFrame({'consumption': [0,1,3,5], 
'prev_year_consumption': [np.NAN,np.NAN,0,1],
'date':[pd.to_datetime('2017-04-01'),
pd.to_datetime('2017-04-02'),
pd.to_datetime('2018-04-01'),
pd.to_datetime('2018-04-02')]})  
>>> df
consumption  prev_year_consumption       date
0                      NAN     2017-04-01
1                      NAN     2017-04-02
3                      0       2018-04-01
5                      1       2018-04-02

所以prev_year_consumption是消费列中的简单值,其中1年从日期中动态减去。

SQL中的我可能会这样写

SELECT df_past.consumption as prev_year_consumption, df_current.consumption
FROM df as df_current 
LEFT JOIN ON df df_past ON year(df_current.date) = year(df_past.date) - 1

感谢您的提示

pandas中的表示法类似。我们仍然在做自我merge,但是我们需要指定right_on(或left_on)的DateOffset为1年:

new_df = df.merge(
df,
left_on='date',
right_on=df['date'] + pd.offsets.DateOffset(years=1),
how='left'
)

new_df:

date  consumption_x     date_x  consumption_y     date_y
0 2017-04-01              0 2017-04-01            NaN        NaT
1 2017-04-02              1 2017-04-02            NaN        NaT
2 2018-04-01              3 2018-04-01            0.0 2017-04-01
3 2018-04-02              5 2018-04-02            1.0 2017-04-02

我们可以进一步droprename列来得到精确的输出:

new_df = df.merge(
df,
left_on='date',
right_on=df['date'] + pd.offsets.DateOffset(years=1),
how='left'
).drop(columns=['date_x', 'date_y']).rename(columns={
'consumption_y': 'prev_year_consumption'
})

new_df:

date  consumption_x  prev_year_consumption
0 2017-04-01              0                    NaN
1 2017-04-02              1                    NaN
2 2018-04-01              3                    0.0
3 2018-04-02              5                    1.0

最新更新