I have a data frame of the following type. This is only a small sample of the dataset; the real one has a very large number of rows.
df <- data.frame(team = c("A", "A", "A", "A", "A", "B"),
                 Skill = c("TSICS", "TSICS", "TSICS", "TSICS", "COHORT", "COHORT"),
                 Date = c("1/1/2020", "7/1/2020", "14/1/2020", "21/1/2020", "7/1/2020", "21/2/2020"),
                 Value = c(10, 20, 30, 5, 6, 23)
)
> df
team  Skill   Date       Value
A     TSICS   1/1/2020   10
A     TSICS   7/1/2020   20
A     TSICS   14/1/2020  30
A     TSICS   21/1/2020  5
A     COHORT  7/1/2020   6
B     COHORT  21/2/2020  23
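I want to create a new column Result, where Result = Value[team = A, Skill = TSICS] + Value[team = A, Skill = COHORT], taken for the same team and for the date one week before the row's Date.
Note: the value for a given date should be taken from the previous date. For example, for team A with Skill = TSICS on 21/1/2020, the value is 30, i.e. the value from the previous date. The resulting table should look like this: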
team  Skill   Date       Value  Result
A     TSICS   1/1/2020   10     0   (no previous date values for TSICS(A) & COHORT(A))
A     TSICS   7/1/2020   20     10  (only previous date value for TSICS(A))
A     TSICS   14/1/2020  30     26  (sum of previous date values for TSICS(A) & COHORT(A))
A     TSICS   21/1/2020  5      30  (previous date value of TSICS(A))
A     COHORT  7/1/2020   6      10  (previous date value of TSICS(A))
B     COHORT  21/2/2020  23     0   (no previous date value)
Can we use a lag function to get the previous date's values? The date format is dd-mm-yyyy.
You could try:
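library(tidyverse)
library(lubridate)

# Parse the day-first date strings into Date objects
df <- df %>%
  mutate(Date = dmy(Date))

# Within each team, for every row, sum the Value of all rows dated
# from 8 days before up to (but not including) that row's Date
df %>%
  group_split(team) %>%
  map(., ~ .x %>%
        rowwise() %>%
        mutate(Result = sum(ifelse(.$Date >= (Date - 8) & .$Date < Date, .$Value, 0)))) %>%
  bind_rows()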
# A tibble: 6 x 5
# Rowwise:
  team  Skill  Date       Value Result
  <fct> <fct>  <date>     <dbl>  <dbl>
1 A     TSICS  2020-01-01    10      0
2 A     TSICS  2020-01-07    20     10
3 A     TSICS  2020-01-14    30     26
4 A     TSICS  2020-01-21     5     30
5 A     COHORT 2020-01-07     6     10
6 B     COHORT 2020-02-21    23      0
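On the lag question: a lag function such as `dplyr::lag()` shifts values by row position, so on its own it would not pick up rows from a different Skill or respect a date window, which is why the code above filters by date instead. For comparison, here is a minimal base R sketch of the same window logic. The helper name `prev_window_sum` and its `window_days` argument are made up for illustration, and the 8-day window is simply copied from the `Date - 8` condition above, so adjust it if "one week" should mean something stricter.

```r
# Rebuild the sample data, parsing the day-first strings with base R
df <- data.frame(
  team  = c("A", "A", "A", "A", "A", "B"),
  Skill = c("TSICS", "TSICS", "TSICS", "TSICS", "COHORT", "COHORT"),
  Date  = as.Date(c("1/1/2020", "7/1/2020", "14/1/2020",
                    "21/1/2020", "7/1/2020", "21/2/2020"),
                  format = "%d/%m/%Y"),
  Value = c(10, 20, 30, 5, 6, 23)
)

# Hypothetical helper: for each row, sum Value over rows of the same team
# whose Date falls in [Date - window_days, Date)
prev_window_sum <- function(data, window_days = 8) {
  sapply(seq_len(nrow(data)), function(i) {
    in_window <- data$team == data$team[i] &
      data$Date >= data$Date[i] - window_days &
      data$Date < data$Date[i]
    sum(data$Value[in_window])  # sum of an empty selection is 0
  })
}

df$Result <- prev_window_sum(df)
df$Result
# 0 10 26 30 10 0
```

This compares every row against every other row, so it may be slow on a very large data frame; it is meant to make the window condition explicit rather than to be the fastest option.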