Filter data by month start and month end



Given a dataframe with a date column, in the following format:

Date      Group 
2020-05-18     1
2020-06-22     1
2019-07-11     1
2018-03-01     1
2021-01-21     2
2021-05-05     2
2021-09-11     2

and two strings:

Start = 2020-05 (indicating month start)
End = 2021-09   (indicating month end)

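I want to filter the data so that only the rows whose dates fall between the start of the Start month and the end of the End month remain in the dataframe.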

Expected output:

Date      Group 
2020-05-18     1
2020-06-22     1
2021-01-21     2
2021-05-05     2
2021-09-11     2
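
import numpy as np
import pandas as pd

# Create dummy data
d = {'dt': ['2020-05-18', '2020-06-22', '2019-07-11', '2018-03-01',
            '2021-01-21', '2021-05-05', '2021-09-11'],
     'group': [1, 1, 1, 1, 2, 2, 2]}
dt_df = pd.DataFrame(data=d)

# Convert the string column to datetime64 so it can be compared with timestamps
dt_df['dt'] = pd.to_datetime(dt_df['dt'])
dt_df['dt']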

Initial input (the dt column after conversion):

0   2020-05-18
1   2020-06-22
2   2019-07-11
3   2018-03-01
4   2021-01-21
5   2021-05-05
6   2021-09-11
Name: dt, dtype: datetime64[ns]

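Start = '2020-05'
End = '2021-09'
# '2020-05' parses to the first day of that month (2020-05-01)
Start = pd.to_datetime(Start)
# Add one calendar month so End becomes the first day of the following month
# (2021-10-01), which serves as an exclusive upper bound
End = pd.to_datetime(End) + pd.DateOffset(months=1)

Use loc to select only the dates from Start (inclusive) up to End (exclusive).

dt_df.loc[(dt_df['dt'] >= Start) & (dt_df['dt'] < End)]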

Output:

dt  group
0 2020-05-18      1
1 2020-06-22      1
4 2021-01-21      2
5 2021-05-05      2
6 2021-09-11      2
