R - Apply a recursive function over groups and rows without an explicit for loop



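Apologies for the title; I wasn't sure how to describe this problem, so I'll give an example and hopefully a usable reprex.

Suppose I have the following data frame: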

tmp <- data.frame(Campaign = rep(c("Toy", "Clothing"), each = 10), 
                  CPC = rep(c(6.40, 10.30), each = 10), 
                  CompoundGrowth = rep(c(0.005, 0.0123), each = 10), 
                  # ten monthly dates per campaign; base R only, so lubridate's months() is not needed
                  Month = rep(seq(as.Date("2013-01-01"), 
                                  by = "month", length.out = 10), 2)
                 )
OUT:
-----------------------------------------------------------
    Campaign  CPC CompoundGrowth      Month
1       Toy  6.4         0.0050 2013-01-01
2       Toy  6.4         0.0050 2013-02-01
3       Toy  6.4         0.0050 2013-03-01
4       Toy  6.4         0.0050 2013-04-01
5       Toy  6.4         0.0050 2013-05-01
6       Toy  6.4         0.0050 2013-06-01
7       Toy  6.4         0.0050 2013-07-01
8       Toy  6.4         0.0050 2013-08-01
9       Toy  6.4         0.0050 2013-09-01
10      Toy  6.4         0.0050 2013-10-01
11 Clothing 10.3         0.0123 2013-01-01
12 Clothing 10.3         0.0123 2013-02-01
13 Clothing 10.3         0.0123 2013-03-01
14 Clothing 10.3         0.0123 2013-04-01
15 Clothing 10.3         0.0123 2013-05-01
16 Clothing 10.3         0.0123 2013-06-01
17 Clothing 10.3         0.0123 2013-07-01
18 Clothing 10.3         0.0123 2013-08-01
19 Clothing 10.3         0.0123 2013-09-01
20 Clothing 10.3         0.0123 2013-10-01

I want to add a column ProjCPC, where each row is computed from the previous row's value:

ProjCPC = ProjCPC + ProjCPC * CompoundGrowth 

where the first row of each campaign would be

CPC + CPC * CompoundGrowth. 

In other words,

f(n) = f(n-1)*c + f(n-1), with f(0) = CPC.
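To make the recurrence concrete, here is a small worked example of the first two steps for the Toy campaign. The variable names (cpc, growth, step1, step2) are only illustrative; the numbers come from the data frame above.

```r
# Worked arithmetic for the first two Toy rows (illustrative variable names)
cpc    <- 6.40
growth <- 0.005

step1 <- cpc + cpc * growth        # first row: starts from CPC
step2 <- step1 + step1 * growth    # second row: starts from the previous result

print(c(step1, step2))             # approximately 6.432 and 6.46416
```

Each step feeds on the previous result rather than on the original CPC, which is why a plain column-wise expression is not enough.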

I'd really like to get something like this to work:

projection_func <- function(current_projection, cpc_growth){
    current_projection = current_projection * cpc_growth + current_projection
    return(current_projection)
}
tmp %>% dplyr::group_by(Campaign, Month) %>% 
    dplyr::rowwise() %>%
    mutate(CPC = ifelse(is.na(lag(CPC)), projection_func(CPC, CompoundGrowth),
                                      projection_func(lag(CPC), CompoundGrowth)
    )
)

But it only performs the first calculation:

   Campaign   CPC CompoundGrowth Month     
   <fct>    <dbl>          <dbl> <date>    
 1 Toy       6.43         0.005  2013-01-01
 2 Toy       6.43         0.005  2013-02-01
 3 Toy       6.43         0.005  2013-03-01
 4 Toy       6.43         0.005  2013-04-01
 5 Toy       6.43         0.005  2013-05-01
 6 Toy       6.43         0.005  2013-06-01
 7 Toy       6.43         0.005  2013-07-01
 8 Toy       6.43         0.005  2013-08-01
 9 Toy       6.43         0.005  2013-09-01
10 Toy       6.43         0.005  2013-10-01
11 Clothing 10.4          0.0123 2013-01-01
12 Clothing 10.4          0.0123 2013-02-01
13 Clothing 10.4          0.0123 2013-03-01
14 Clothing 10.4          0.0123 2013-04-01
15 Clothing 10.4          0.0123 2013-05-01
16 Clothing 10.4          0.0123 2013-06-01
17 Clothing 10.4          0.0123 2013-07-01
18 Clothing 10.4          0.0123 2013-08-01
19 Clothing 10.4          0.0123 2013-09-01
20 Clothing 10.4          0.0123 2013-10-01
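
A likely reason for the constant values, offered as a sketch rather than a full diagnosis: rowwise() evaluates each row on its own, so lag(CPC) only ever sees a one-element vector and returns NA, and the ifelse always takes the first branch. Even without rowwise(), lag(CPC) would read the original CPC column, not the value just computed for the previous row. The data frame demo below is made up purely for illustration.

```r
library(dplyr)

# Hypothetical three-row example, not the data from the question
demo <- data.frame(g = c("a", "a", "a"), x = c(1, 2, 3))

# Under rowwise(), each row is its own group, so lag() has no previous element
rowwise_lag <- demo %>%
  rowwise() %>%
  mutate(prev = lag(x)) %>%
  ungroup()

# Grouped by g only, lag() sees earlier rows, but still the original x values
grouped_lag <- demo %>%
  group_by(g) %>%
  mutate(prev = lag(x)) %>%
  ungroup()

rowwise_lag$prev   # all NA
grouped_lag$prev   # NA 1 2
```

So the running value has to be carried forward explicitly; lag() alone cannot do that.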

My expected output:

   Campaign  expected CompoundGrowth      Month
1       Toy  6.432000         0.0050 2013-01-01
2       Toy  6.464160         0.0050 2013-02-01
3       Toy  6.496481         0.0050 2013-03-01
4       Toy  6.528963         0.0050 2013-04-01
5       Toy  6.561608         0.0050 2013-05-01
6       Toy  6.594416         0.0050 2013-06-01
7       Toy  6.627388         0.0050 2013-07-01
8       Toy  6.660525         0.0050 2013-08-01
9       Toy  6.693828         0.0050 2013-09-01
10      Toy  6.727297         0.0050 2013-10-01
11 Clothing 10.426690         0.0123 2013-01-01
12 Clothing 10.554938         0.0123 2013-02-01
13 Clothing 10.684764         0.0123 2013-03-01
14 Clothing 10.816187         0.0123 2013-04-01
15 Clothing 10.949226         0.0123 2013-05-01
16 Clothing 11.083901         0.0123 2013-06-01
17 Clothing 11.220233         0.0123 2013-07-01
18 Clothing 11.358242         0.0123 2013-08-01
19 Clothing 11.497948         0.0123 2013-09-01
20 Clothing 11.639373         0.0123 2013-10-01

Code used to generate the above (this is the kind of solution I'm trying to avoid):

for(i in seq_along(tmp$Campaign)){
    if(i%%10 %in% c(1,10)){
        starter = projection_func(tmp$CPC[i], tmp$CompoundGrowth[i])
    }
    else{
        starter = projection_func(starter, tmp$CompoundGrowth[i])
    }
    tmp$expected[i] <- starter
}

Since each calculation depends on the result of the previous one, you can use accumulate from purrr:

library(dplyr)
tmp %>%
  group_by(Campaign) %>%
  mutate(ProjCPC = purrr::accumulate(CompoundGrowth, projection_func, 
                  .init = first(CPC))[-1]) 
#   Campaign  CPC CompoundGrowth      Month   ProjCPC
#1       Toy  6.4         0.0050 2013-01-01  6.432000
#2       Toy  6.4         0.0050 2013-02-01  6.464160
#3       Toy  6.4         0.0050 2013-03-01  6.496481
#4       Toy  6.4         0.0050 2013-04-01  6.528963
#5       Toy  6.4         0.0050 2013-05-01  6.561608
#6       Toy  6.4         0.0050 2013-06-01  6.594416
#7       Toy  6.4         0.0050 2013-07-01  6.627388
#8       Toy  6.4         0.0050 2013-08-01  6.660525
#9       Toy  6.4         0.0050 2013-09-01  6.693828
#10      Toy  6.4         0.0050 2013-10-01  6.727297
#11 Clothing 10.3         0.0123 2013-01-01 10.426690
#12 Clothing 10.3         0.0123 2013-02-01 10.554938
#13 Clothing 10.3         0.0123 2013-03-01 10.684764
#14 Clothing 10.3         0.0123 2013-04-01 10.816187
#15 Clothing 10.3         0.0123 2013-05-01 10.949226
#16 Clothing 10.3         0.0123 2013-06-01 11.083901
#17 Clothing 10.3         0.0123 2013-07-01 11.220233
#18 Clothing 10.3         0.0123 2013-08-01 11.358242
#19 Clothing 10.3         0.0123 2013-09-01 11.497948
#20 Clothing 10.3         0.0123 2013-10-01 11.639373
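
As a cross-check, and as a possible alternative: every step multiplies the running value by (1 + CompoundGrowth), so the same column can also be sketched with base R's cumprod. This assumes CPC is constant within each campaign, as it is in the example data. The data frame toy and the column name ProjCPC_cumprod are illustrative only.

```r
library(dplyr)

projection_func <- function(current_projection, cpc_growth) {
  current_projection * cpc_growth + current_projection
}

# Small stand-in for one campaign (values taken from the Toy rows in the question)
toy <- data.frame(Campaign = "Toy", CPC = 6.40, CompoundGrowth = rep(0.005, 3))

result <- toy %>%
  group_by(Campaign) %>%
  mutate(
    # accumulate() passes each result back in as the first argument of the next call;
    # [-1] drops the .init value so the length matches the group
    ProjCPC = purrr::accumulate(CompoundGrowth, projection_func,
                                .init = first(CPC))[-1],
    # Closed-form equivalent: a running product of (1 + growth) scaled by the starting CPC
    ProjCPC_cumprod = first(CPC) * cumprod(1 + CompoundGrowth)
  ) %>%
  ungroup()

# Compare the two columns up to floating-point tolerance
all.equal(result$ProjCPC, result$ProjCPC_cumprod)
```

The cumprod form only works because the update is a pure multiplication; accumulate remains the more general tool when the step function is arbitrary.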
