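I have a data frame that summarizes the deployment history of different devices, identified by serial number. Over time, a device (serno) can be used by different projects, but at any given time it can be deployed on only one project. I generated a new column called used.elsewhere that flags whether the serno in that row is duplicated elsewhere in the df. For example: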
project = c("a","b","c","c","c","d","e")
serno = c(1,2,2,2,3,3,3)
deployed = c(T,T,F,F,F,F,T)
used.elsewhere = c(F,T,T,T,T,T,T)
df <- data.frame(project, serno, deployed, used.elsewhere)
df
  project serno deployed used.elsewhere
1       a     1     TRUE          FALSE
2       b     2     TRUE           TRUE
3       c     2    FALSE           TRUE
4       c     2    FALSE           TRUE
5       c     3    FALSE           TRUE
6       d     3    FALSE           TRUE
7       e     3     TRUE           TRUE
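The question does not show how used.elsewhere was computed. As a minimal sketch, assuming "used elsewhere" simply means the serno appears in more than one row, base R's duplicated() scanned in both directions reproduces the column above:

```r
# Rebuild the example columns from the question.
project  <- c("a", "b", "c", "c", "c", "d", "e")
serno    <- c(1, 2, 2, 2, 3, 3, 3)
deployed <- c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE)

# Assumption: a serno counts as "used elsewhere" if it occurs more than once.
# duplicated() only flags the second and later occurrences, so scanning from
# both ends is needed to flag the first occurrence as well.
used.elsewhere <- duplicated(serno) | duplicated(serno, fromLast = TRUE)

df <- data.frame(project, serno, deployed, used.elsewhere)
```

Note that this flags repeated serno values even when the repeats belong to the same project (rows 3 and 4 here), which matches the example but may not be what every data set needs.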
I want to generate a new column that, when a row's serno is not deployed but is used elsewhere, gives the project on which that serno is deployed:
project = c("a","b","c","c","c","d","e")
serno = c(1,2,2,2,3,3,3)
deployed = c(T,T,F,F,F,F,T)
used.elsewhere = c(F,T,T,T,T,T,T)
other.project = c(NA, NA, "b", "b", "e", "e", NA)
  project serno deployed used.elsewhere other.project
1       a     1     TRUE          FALSE          <NA>
2       b     2     TRUE           TRUE          <NA>
3       c     2    FALSE           TRUE             b
4       c     2    FALSE           TRUE             b
5       c     3    FALSE           TRUE             e
6       d     3    FALSE           TRUE             e
7       e     3     TRUE           TRUE          <NA>
I assume I could use an ifelse statement like the one below, but I'm not sure how to complete it.
df %>%
  mutate(other.project = ifelse(deployed == F & used.elsewhere == T, ...
Thanks in advance!
If you group_by(serno) first, you can pick out the project within the same group where deployed is TRUE.
library(dplyr)
df %>%
  group_by(serno) %>%
  mutate(other.project = ifelse(
    deployed == FALSE & used.elsewhere == TRUE,
    project[deployed],
    NA
  ))
  project serno deployed used.elsewhere other.project
  <chr>   <dbl> <lgl>    <lgl>          <chr>
1 a           1 TRUE     FALSE          NA
2 b           2 TRUE     TRUE           NA
3 c           2 FALSE    TRUE           b
4 c           2 FALSE    TRUE           b
5 c           3 FALSE    TRUE           e
6 d           3 FALSE    TRUE           e
7 e           3 TRUE     TRUE           NA
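The grouped approach above relies on each serno having exactly one deployed row. As a hedged variant (not part of the original answer), the sketch below uses dplyr's stricter if_else() and takes only the first deployed project per group, so a serno with no deployed row yields NA. The name result is introduced here purely for illustration.

```r
library(dplyr)

# Example data from the question; stringsAsFactors = FALSE keeps project
# as character on R versions before 4.0 as well.
df <- data.frame(
  project        = c("a", "b", "c", "c", "c", "d", "e"),
  serno          = c(1, 2, 2, 2, 3, 3, 3),
  deployed       = c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE),
  used.elsewhere = c(FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(serno) %>%
  mutate(other.project = if_else(
    !deployed & used.elsewhere,
    project[deployed][1],  # first deployed project in the group, NA if none
    NA_character_          # typed NA so both branches are character
  )) %>%
  ungroup()                # drop the grouping so later steps are not per-serno
```

Taking the first element is an assumption about how to resolve ties; if the data really guarantees one deployment per serno at a time, it never matters.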
I'd suggest building a deployment map and then joining it to your df:
library(dplyr)
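# One row per deployed serno, with its project renamed to other.project
deployments <- df %>%
  filter(deployed == TRUE) %>%
  select(serno, other.project = project)
# Attach the deployed project to every row that shares the serno
df %>%
  left_join(deployments, by = "serno")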
Then you just need to make sure the other.project column has NAs where you want them (i.e., by recoding the rows where other.project == project).
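The recoding step is only described, not shown. One possible way to finish it, assuming each serno has at most one deployed row (otherwise the join would duplicate rows), is sketched below; joined is a name introduced here for illustration.

```r
library(dplyr)

# Example data from the question.
df <- data.frame(
  project        = c("a", "b", "c", "c", "c", "d", "e"),
  serno          = c(1, 2, 2, 2, 3, 3, 3),
  deployed       = c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE),
  used.elsewhere = c(FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Deployment map: which project each serno is currently deployed on.
deployments <- df %>%
  filter(deployed) %>%
  select(serno, other.project = project)

joined <- df %>%
  left_join(deployments, by = "serno") %>%
  # Blank out rows where the "other" project is the row's own project.
  mutate(other.project = if_else(other.project == project,
                                 NA_character_,
                                 other.project))
```

Recoding on other.project == project follows the answer's suggestion; recoding on the deployed flag instead would be an equally plausible choice for this data.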