How to count distinct non-missing rows in R using dplyr


library(dplyr)
mydat <- data.frame(id = c(123, 111, 234, "none", 123, 384, "none"),
id2 = c(1, 1, 1, 2, 2, 3, 4))
> mydat
    id id2
1  123   1
2  111   1
3  234   1
4 none   2
5  123   2
6  384   3
7 none   4

I want to count the number of unique id values for each id2 in mydat. However, I do not want to count rows where id is "none".

> mydat %>% group_by(id2) %>% summarise(count = n_distinct(id))
# A tibble: 4 × 2
    id2 count
  <dbl> <int>
1     1     3
2     2     2
3     3     1
4     4     1

This incorrectly counts "none" as an id. The desired output (the counts I would like this call to return) is:

> mydat %>% group_by(id2) %>% summarise(count = n_distinct(id))
# A tibble: 4 × 2
    id2 count
  <dbl> <int>
1     1     3
2     2     1
3     3     1
4     4     0
Subset id inside n_distinct() so that "none" is dropped before the distinct values are counted:

mydat %>% group_by(id2) %>%
  summarise(
    count = n_distinct(id),
    wanted = n_distinct(id[id != "none"])
  )
# # A tibble: 4 × 3
#     id2 count wanted
#   <dbl> <int>  <int>
# 1     1     3      3
# 2     2     2      1
# 3     3     1      1
# 4     4     1      0
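
Since the question is framed in terms of non-missing values, another possible approach is to recode the "none" placeholder as a real NA and let n_distinct() skip it. The following is only a sketch under a few assumptions: id is a character column (stringsAsFactors = FALSE is set explicitly here, which is not in the original data definition), and the name result is made up for illustration.

```r
library(dplyr)

# Same data as in the question; stringsAsFactors = FALSE is an added
# assumption so that id is character on older R versions as well.
mydat <- data.frame(
  id  = c(123, 111, 234, "none", 123, 384, "none"),
  id2 = c(1, 1, 1, 2, 2, 3, 4),
  stringsAsFactors = FALSE
)

result <- mydat %>%
  mutate(id = na_if(id, "none")) %>%                  # recode "none" as NA
  group_by(id2) %>%
  summarise(wanted = n_distinct(id, na.rm = TRUE))    # NA is not counted

result
```

Either way, the exclusion happens inside summarise() rather than through an earlier filter(id != "none"). Filtering first would remove the id2 == 4 group entirely, because it has no other rows, instead of reporting a count of 0 for it.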
