I have data like this:
structure(list(id = c(1, 1, 2, 2, 2), time = c(1834, 4809, 18,
333, 387), nh_source = c(0, 0, 1, 0, 0), admi_source = c(19,
19, 85, 19, 88), disdest = c(85, 29, 56, 85, 39)), class = "data.frame", row.names = c(NA,
-5L))
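I want to group by id and, within each group, check whether the previous row's value in disdest is 56 or 85 and the current row's value in admi_source (the row right after it) is 19. If both hold, add 1 to nh_source on that row. I want the df to look like this: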
structure(list(id2 = c(1, 1, 2, 2, 2), time = c(1834, 4809, 18,
333, 387), nh_source2 = c(0, 1, 1, 1, 0), admi_source = c(19,
19, 85, 19, 88), disdest = c(85, 29, 56, 85, 39)), class = "data.frame", row.names = c(NA,
-5L))
After grouping by 'id', create the logical condition with lag and add it to 'nh_source' (TRUE -> 1 and FALSE -> 0):
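library(dplyr)
# df1 is the data frame from the question
df1 %>%
  group_by(id) %>%
  # lag(disdest) is the previous row's disdest within each id (NA on the first row)
  mutate(nh_source = nh_source +
           (admi_source == 19 & lag(disdest) %in% c(56, 85))) %>%
  ungroup()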
Output:
# A tibble: 5 x 5
     id  time nh_source admi_source disdest
1     1  1834         0          19      85
2     1  4809         1          19      29
3     2    18         1          85      56
4     2   333         1          19      85
5     2   387         0          88      39
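
If dplyr is not available, the same idea can probably be sketched in base R. This is not part of the answer above, just an assumed equivalent: ave() stands in for group_by() plus lag(), and prev_disdest and flag are illustrative names. The data frame is the one from the question, assigned to df1.

```r
# Data from the question, assigned to df1 (the name used in the answer)
df1 <- data.frame(
  id = c(1, 1, 2, 2, 2),
  time = c(1834, 4809, 18, 333, 387),
  nh_source = c(0, 0, 1, 0, 0),
  admi_source = c(19, 19, 85, 19, 88),
  disdest = c(85, 29, 56, 85, 39)
)

# Previous disdest within each id; the first row of each id gets NA
prev_disdest <- ave(df1$disdest, df1$id,
                    FUN = function(x) c(NA, head(x, -1)))

# NA %in% c(56, 85) is FALSE, so the first row of a group is never flagged
flag <- df1$admi_source == 19 & prev_disdest %in% c(56, 85)

# Adding a logical to a numeric coerces TRUE to 1 and FALSE to 0
df1$nh_source <- df1$nh_source + flag
df1$nh_source
# 0 1 1 1 0
```

Using %in% rather than == for the disdest check matters here: a plain comparison against NA would return NA on the first row of each id and turn nh_source into NA there.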