R: using a date sequence, divide a number by the number of days in its quarter



A data frame and some variables:

library(tidyverse)
library(lubridate)
budget_2020_q4 <- 1000000
budget_2021_q1 <- 2000000
budget_2021_q2 <- 3000000
budget_2021_q3 <- 3000000
budget_2021_q4 <- 2000000

calendar <- data.frame(
  cohort = seq('2020-10-01' %>% ymd, '2021-12-31' %>% ymd, by = '1 days')) %>% 
  mutate(Quarter = quarter(cohort, with_year = T))

I now have a data frame showing dates and the quarter each date falls in:

calendar %>% head
      cohort Quarter
1 2020-10-01  2020.4
2 2020-10-02  2020.4
3 2020-10-03  2020.4
4 2020-10-04  2020.4
5 2020-10-05  2020.4
6 2020-10-06  2020.4

I also know how many days fall in each quarter:

calendar$Quarter %>% table
.
2020.4 2021.1 2021.2 2021.3 2021.4 
    92     90     91     92     92

I would like to mutate a new column, "daily_budget", that divides each quarter's budget by the number of days in that quarter.

For example, the budget for Q4 2020 is 1000000 and that quarter has 92 days, so the daily budget is 1000000 / 92 = 10869.57.
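As a quick sanity check of that arithmetic, here is a minimal base R sketch. The variable names `q4_start`, `q4_end`, and `days_in_q4` are illustrative only and do not appear in the original code.

```r
# Illustrative names; the quarter boundaries are the Q4 2020 calendar dates
q4_start <- as.Date("2020-10-01")
q4_end   <- as.Date("2020-12-31")

# Inclusive day count: difference in days plus one
days_in_q4 <- as.integer(q4_end - q4_start) + 1L

budget_2020_q4 <- 1000000
daily_budget_q4 <- budget_2020_q4 / days_in_q4

days_in_q4
round(daily_budget_q4, 2)
```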

After mutate(Quarter = quarter(cohort, with_year = T)), can I somehow integrate this calculation into my dplyr pipeline?

First, let's put the budgets in a data frame:

budgets <- c(budget_2020_q4 = 1000000,
             budget_2021_q1 = 2000000,
             budget_2021_q2 = 3000000,
             budget_2021_q3 = 3000000,
             budget_2021_q4 = 2000000) %>% 
  enframe(name = "Quarter", value = "budget") %>% 
  mutate(Quarter = as.numeric(str_replace(str_remove(Quarter, "budget_"), "_q", ".")))

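Then it is a matter of counting the rows in each Quarter (add_count, the tidyverse alternative to table), joining the budgets, and dividing the budget by the count: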

calendar %>% 
  add_count(Quarter) %>% 
  left_join(budgets, by = "Quarter") %>% 
  mutate(budget_by_day = budget / n)

This gives:

        cohort Quarter  n budget budget_by_day
1   2020-10-01  2020.4 92  1e+06      10869.57
2   2020-10-02  2020.4 92  1e+06      10869.57
3   2020-10-03  2020.4 92  1e+06      10869.57
4   2020-10-04  2020.4 92  1e+06      10869.57
5   2020-10-05  2020.4 92  1e+06      10869.57
6   2020-10-06  2020.4 92  1e+06      10869.57
7   2020-10-07  2020.4 92  1e+06      10869.57
8   2020-10-08  2020.4 92  1e+06      10869.57
9   2020-10-09  2020.4 92  1e+06      10869.57
10  2020-10-10  2020.4 92  1e+06      10869.57
...
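A possible variation, sketched here under assumptions rather than taken from the answer above: the same result can be computed in a single pipeline with a named lookup vector and a grouped `n()`, which avoids the separate budgets data frame and the join on a numeric quarter value. The names `budget_lookup`, `quarter_key`, and `daily` are hypothetical, and the key is built as a character string so matching does not rely on floating-point equality.

```r
library(dplyr)
library(lubridate)

# Hypothetical lookup vector: names are "year.quarter" labels, values are budgets
budget_lookup <- c(
  "2020.4" = 1000000,
  "2021.1" = 2000000,
  "2021.2" = 3000000,
  "2021.3" = 3000000,
  "2021.4" = 2000000
)

daily <- data.frame(
  cohort = seq(ymd("2020-10-01"), ymd("2021-12-31"), by = "1 day")
) %>%
  mutate(
    # Character key such as "2020.4", built from the year and the quarter number
    quarter_key = paste0(year(cohort), ".", quarter(cohort)),
    # Look up each row's quarterly budget by name
    budget = unname(budget_lookup[quarter_key])
  ) %>%
  group_by(quarter_key) %>%
  # Within each group, n() is the number of days in that quarter
  mutate(daily_budget = budget / n()) %>%
  ungroup()

head(daily, 3)
```

Because every day in a quarter receives an equal share, summing `daily_budget` over all rows should recover the total of the five quarterly budgets, which makes for an easy check.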
