I want to modify the following code:
library(dplyr)
df1<-df7 %>%
group_by(SAMPN,PERNO) %>%
mutate(loop = lag(cumsum(TPURP== "(2) All other home activities" ), default = 1))
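I want loop to change on (2) All other home activities, (1) Working at home (for pay), or (24) Loop trip. Only (24) Loop trip is a little different: it has its own index, so on every row where TPURP is (24) Loop trip the loop changes, and it changes again on the next row.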
dput(Nontest[2893:2913,1:4])
structure(list(SAMPN = c(1626, 1626, 1626, 1626, 1626, 1626,
1626, 1626, 1639, 1639, 1639, 1639, 1639, 1639, 1639, 1640, 1640,
1640, 1643, 1643, 1643), PERNO = c(1, 1, 2, 2, 2, 3, 3, 4, 1,
1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1), PLANO = c(5, 6, 3, 4, 5,
2, 3, 2, 2, 3, 4, 2, 3, 4, 5, 2, 3, 4, 2, 3, 4), TPURP = structure(c(22L,
9L, 22L, 13L, 1L, 5L, 19L, 5L, 3L, 2L, 22L, 22L, 3L, 13L, 2L,
20L, 15L, 2L, 13L, 13L, 2L), .Label = c("(1) Working at home (for pay)",
"(2) All other home activities", "(3) Work/Job", "(4) All other activities at work",
"(5) Attending class", "(6) All other activities at school",
"(7) Change type of transportation/transfer", "(8) Dropped off passenger",
"(9) Picked up passenger", "(10) Other, specify - transportation",
"(11) Work/Business related", "(12) Service Private Vehicle",
"(13) Routine Shopping", "(14) Shopping for major purchases",
"(15) Household errands", "(16) Personal Business", "(17) Eat meal outside of home",
"(18) Health care", "(19) Civic/Religious activities", "(20) Recreation/Entertainment",
"(21) Visit friends/relative", "(24) Loop trip", "(97) Other, specify"
), class = "factor")), row.names = c(9914L, 9915L, 9916L, 9917L,
9918L, 9919L, 9920L, 9922L, 9974L, 9975L, 9976L, 9977L, 9978L,
9979L, 9980L, 9981L, 9982L, 9983L, 9992L, 9993L, 9994L), class = "data.frame")
Output
SAMPN PERNO PLANO TPURP loop
9914 1626 1 5 (24) Loop trip 1
9915 1626 1 6 (9) Picked up passenger 2
9916 1626 2 3 (24) Loop trip 1
9917 1626 2 4 (13) Routine Shopping 2
9918 1626 2 5 (1) Working at home (for pay) 2
9919 1626 3 2 (5) Attending class 1
9920 1626 3 3 (19) Civic/Religious activities 1
9922 1626 4 2 (5) Attending class 1
9974 1639 1 2 (3) Work/Job 1
9975 1639 1 3 (2) All other home activities 1
9976 1639 1 4 (24) Loop trip 2
9977 1639 2 2 (24) Loop trip 1
9978 1639 2 3 (3) Work/Job 2
9979 1639 2 4 (13) Routine Shopping 2
9980 1639 2 5 (2) All other home activities 2
9981 1640 1 2 (20) Recreation/Entertainment 1
9982 1640 1 3 (15) Household errands 1
9983 1640 1 4 (2) All other home activities 1
9992 1643 1 2 (13) Routine Shopping 1
9993 1643 1 3 (13) Routine Shopping 1
9994 1643 1 4 (2) All other home activities 1
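I have only shown the result I want. It would be great if you could fix my code rather than use a new approach, but if it cannot be fixed, please suggest another method.
More data to explain (24) Loop trip: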
structure(list(SAMPN = c(1626, 1626, 1626, 1626, NA, 1626), PERNO = c(1,
1, 1, 1, NA, 2), PLANO = c(4, 5, 6, 7, NA, 2), TPURP = structure(c(22L,
22L, 9L, 2L, NA, 22L), .Label = c("(1) Working at home (for pay)",
"(2) All other home activities", "(3) Work/Job", "(4) All other activities at work",
"(5) Attending class", "(6) All other activities at school",
"(7) Change type of transportation/transfer", "(8) Dropped off passenger",
"(9) Picked up passenger", "(10) Other, specify - transportation",
"(11) Work/Business related", "(12) Service Private Vehicle",
"(13) Routine Shopping", "(14) Shopping for major purchases",
"(15) Household errands", "(16) Personal Business", "(17) Eat meal outside of home",
"(18) Health care", "(19) Civic/Religious activities", "(20) Recreation/Entertainment",
"(21) Visit friends/relative", "(24) Loop trip", "(97) Other, specify"
), class = "factor")), class = c("grouped_df", "tbl_df", "tbl",
"data.frame"), row.names = c(NA, -6L), groups = structure(list(
SAMPN = c(1626, 1626, NA), PERNO = c(1, 2, NA), .rows = list(
1:4, 6L, 5L)), row.names = c(NA, -3L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE))
Output
SAMPN PERNO PLANO TPURP loop
<dbl> <dbl> <dbl> <fct>
1 1626 1 4 (24) Loop trip 1
2 1626 1 5 (24) Loop trip 2
3 1626 1 6 (9) Picked up passenger 3
4 1626 1 7 (2) All other home activities 3
5 NA NA NA NA NA
6 1626 2 2 (24) Loop trip 1
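Each loop trip has its own index.
We create a vector (vals) containing all the activities that should increment the count, group_by SAMPN and PERNO, and use lag and cummax to create a loop count for each group.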
vals <- c("(2) All other home activities", "(24) Loop trip",
"(1) Working at home (for pay)")
library(dplyr)
df7 %>%
group_by(SAMPN,PERNO) %>%
mutate(loop = cummax(lag(1 + (TPURP %in% vals), default = 1)))
# SAMPN PERNO PLANO TPURP loop
# <dbl> <dbl> <dbl> <fct> <dbl>
# 1 1626 1 5 (24) Loop trip 1
# 2 1626 1 6 (9) Picked up passenger 2
# 3 1626 2 3 (24) Loop trip 1
# 4 1626 2 4 (13) Routine Shopping 2
# 5 1626 2 5 (1) Working at home (for pay) 2
# 6 1626 3 2 (5) Attending class 1
# 7 1626 3 3 (19) Civic/Religious activities 1
# 8 1626 4 2 (5) Attending class 1
# 9 1639 1 2 (3) Work/Job 1
#10 1639 1 3 (2) All other home activities 1
# … with 11 more rows
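One thing to check against the second example in the question: because 1 + (TPURP %in% vals) is only ever 1 or 2, the cummax version cannot go above 2, while the desired output for consecutive loop trips reaches 3. A minimal sketch of a variant that stays close to the original lag(cumsum(...)) code is below. The toy data frame and the column names loop_cummax and loop_cumsum are made up for illustration (the NA row from the dput is left out), and this is only one possible way to get the desired numbering, not a tested fix for the full df7.

```r
library(dplyr)

# Hypothetical toy data modelled on the second example in the question
toy <- data.frame(
  SAMPN = c(1626, 1626, 1626, 1626, 1626),
  PERNO = c(1, 1, 1, 1, 2),
  PLANO = c(4, 5, 6, 7, 2),
  TPURP = c("(24) Loop trip", "(24) Loop trip", "(9) Picked up passenger",
            "(2) All other home activities", "(24) Loop trip")
)

# Activities after which the loop index should increase
vals <- c("(2) All other home activities", "(24) Loop trip",
          "(1) Working at home (for pay)")

res <- toy %>%
  group_by(SAMPN, PERNO) %>%
  mutate(
    # Version from the answer above: values are 1 or 2 only
    loop_cummax = cummax(lag(1 + (TPURP %in% vals), default = 1)),
    # cumsum counts every trigger row seen so far; lag moves the
    # change to the following row; + 1 makes the index start at 1
    loop_cumsum = lag(cumsum(TPURP %in% vals), default = 0L) + 1
  ) %>%
  ungroup()

res$loop_cummax
# 1 2 2 2 1
res$loop_cumsum
# 1 2 3 3 1
```

On the first example both versions give the same numbers, since no group there has more than one trigger activity before its last row.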