I have a data.frame() in R with 3 columns:
id <- c(12312, 12312, 12312, 48373, 345632, 223452)
id2 <- c(1928277, 17665363, 8282922, 82827722, 1231233, 12312333)
description <- c("Positive", "Negative", "Indetermined", "Positive", "Negative", "Positive")
I want to remove the rows that are duplicated by id and whose description value is Indetermined.
This looks like a job for filter(), so:
library(dplyr)

# df is assumed to be a data frame with the columns id, id2 and description
df %>%
  mutate(count = 1) %>%                                   # helper column: 1 per row
  group_by(id) %>%
  mutate(count = sum(count), Duplicate = count > 1) %>%   # count how often each id occurs and mark duplicates
  ungroup() %>%
  filter(!(Duplicate & description == "Indetermined"))    # drop rows that are both duplicated and "Indetermined"
It is not the most elegant way to do it, but it should work.
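A more compact variant of the same idea, as a sketch only: inside a grouped filter(), n() returns the size of the current group, so the helper columns can be skipped. The name df and the way it is built here are assumptions based on the vectors in the question.

```r
library(dplyr)

# Assumed reconstruction of the question's data
df <- data.frame(
  id = c(12312, 12312, 12312, 48373, 345632, 223452),
  id2 = c(1928277, 17665363, 8282922, 82827722, 1231233, 12312333),
  description = c("Positive", "Negative", "Indetermined",
                  "Positive", "Negative", "Positive")
)

result <- df %>%
  group_by(id) %>%
  # n() > 1 is TRUE for ids that occur more than once;
  # drop only the "Indetermined" rows of those ids
  filter(!(n() > 1 & description == "Indetermined")) %>%
  ungroup()

result
```

Rows with a unique id are kept even if their description is "Indetermined", because the condition requires both parts to be true.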
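library(tibble)  # tibble() is also available once dplyr is loaded
(d <- tibble(id, id2, description))
# Note: this drops every row whose id has at least one "Indetermined" row
d[!d$id %in% d$id[d$description == "Indetermined"], ]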
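The one-liner above removes all rows of an id as soon as one of them is "Indetermined". If the goal is instead to keep the other rows of that id, a base R sketch could look like the following. The variable names dup_id and kept are hypothetical, and the data frame is rebuilt from the question's vectors as an assumption.

```r
# Assumed reconstruction of the question's data
d <- data.frame(
  id = c(12312, 12312, 12312, 48373, 345632, 223452),
  id2 = c(1928277, 17665363, 8282922, 82827722, 1231233, 12312333),
  description = c("Positive", "Negative", "Indetermined",
                  "Positive", "Negative", "Positive")
)

# TRUE for every row whose id occurs more than once
# (duplicated() alone would miss the first occurrence)
dup_id <- duplicated(d$id) | duplicated(d$id, fromLast = TRUE)

# Keep everything except rows that are both duplicated and "Indetermined"
kept <- d[!(dup_id & d$description == "Indetermined"), ]

kept
```

With the sample data this keeps the "Positive" and "Negative" rows of id 12312 and drops only its "Indetermined" row.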