在计算出每日最高值后,每月对数据帧进行分组



我有一个数据帧,它有两列,一列是由日期组成的日期时间数据帧,另一列由数量组成。看起来像这样,

Date   Quantity
0   2019-01-05  10
1   2019-01-10  15
2   2019-01-22  14
3   2019-02-03  12
4   2019-05-11  25
5   2019-05-21  4
6   2019-07-08  1
7   2019-07-30  15
8   2019-09-05  31
9   2019-09-10  44
10  2019-09-25  8
11  2019-12-09  10
12  2020-04-11  111
13  2020-04-17  5
14  2020-06-05  17
15  2020-06-16  12
16  2020-06-22  14

我想制作另一个数据帧。它应该由两列组成,一列是月/年,另一列是直到最高。我基本上想计算当月之前的最高数量值,并使用月/年对其进行分组。我想要的正是

Month/Year  Till Highest
0   Jan/2019    15
1   Feb/2019    15
2   May/2019    25
3   Jul/2019    25
4   Sep/2019    44
5   Dec/2019    44
6   Apr/2020    111
7   Jun/2020    111

在我的情况下,数据集是巨大的,我在指定的时间线内几乎每个月和每年的每一天都有读数。在这里,我制作了一个虚拟数据集来展示我想要的东西。

请帮我做这个。提前感谢:(

请参阅注释代码:

(df
# convert date to monthly period (2019-01)
.assign(Date=pd.to_datetime(df['Date']).dt.to_period('M'))
# period and max quantity per month
.groupby('Date')
.agg(**{'Month/Year': ('Date', 'first'),
'Till highest': ('Quantity', 'max')})
# format periods as Jan/2019 and get cumulated max quantity
.assign(**{
'Month/Year': lambda d: d['Month/Year'].dt.strftime('%b/%Y'),
'Till highest': lambda d: d['Till highest'].cummax()
})
# drop the groupby index
.reset_index(drop=True)
)

输出:

Month/Year  Till highest
0   Jan/2019            15
1   Feb/2019            15
2   May/2019            25
3   Jul/2019            25
4   Sep/2019            44
5   Dec/2019            44
6   Apr/2020           111
7   Jun/2020           111

在R中,您可以使用cummax:

df=data.frame(Date=c("2019-01-05","2019-01-10","2019-01-22","2019-02-03","2019-05-11","2019-05-21","2019-07-08","2019-07-30","2019-09-05","2019-09-10","2019-09-25","2019-12-09","2020-04-11","2020-04-17","2020-06-05","2020-06-16","2020-06-22"),Quantity=c(10,15,14,12,25,4,1,15,31,44,8,10,111,5,17,12,14))
data.frame(`Month/Year`=unique(format(as.Date(df$Date),"%b/%Y")),
`Till Highest`=cummax(tapply(df$Quantity,sub("-..$","",df$Date),max)),
check.names=F,row.names=NULL)
Month/Year Till Highest
1   Jan/2019           15
2   Feb/2019           15
3   May/2019           25
4   Jul/2019           25
5   Sep/2019           44
6   Dec/2019           44
7   Apr/2020          111
8   Jun/2020          111

最新更新