R: filter data to values shared by all groups



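I may be having a brain fart, but all I want to do is filter down to the products that are common to all class groups. I should note that with my real data I can't hard-code the filter; that is, filter(product %in% c("apple", "lemon")) won't work because I have thousands of products.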

Sample data

Here apple and lemon are common to all classes.

df <- data.frame(
  class = c("A", "A", "A", "A", "A",
            "B", "B", "B", "B", "B",
            "C", "C", "C", "C", "C"),
  product = c("apple", "lemon", "banana", "orange", "papaya",
              "apple", "lemon", "lime", "blueberry", "raspberry",
              "apple", "lemon", "mango", "strawberry", "pear")
)
df
class    product
1      A      apple
2      A      lemon
3      A     banana
4      A     orange
5      A     papaya
6      B      apple
7      B      lemon
8      B       lime
9      B  blueberry
10     B  raspberry
11     C      apple
12     C      lemon
13     C      mango
14     C strawberry
15     C       pear

Desired output

class    product
1      A      apple
2      A      lemon
3      B      apple
4      B      lemon
5      C      apple
6      C      lemon

If there are no duplicates within a group, see whether the code below works for you:

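library(dplyr)
# cnt: rows per product; g: integer id of each class, so max(g) is the number of classes
df %>% group_by(product) %>% mutate(cnt = n()) %>% group_by(class) %>%
  mutate(g = cur_group_id()) %>% ungroup() %>%  # cur_group_id() replaces the deprecated no-argument group_indices()
  filter(cnt >= max(g)) %>% select(-cnt)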
# A tibble: 6 x 3
class product     g
<chr> <chr>   <int>
1 A     apple       1
2 A     lemon       1
3 B     apple       2
4 B     lemon       2
5 C     apple       3
6 C     lemon       3
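
The approach above counts rows per product, so it relies on each product appearing at most once per class. If duplicates are possible, one alternative is to count distinct classes per product instead. The following is only a sketch under that assumption, not the answerer's method; the names n_classes, common and common_products are made up for illustration, and the base R line at the end is just a cross-check of which products are shared.

```r
library(dplyr)

# Same sample data as in the question.
df <- data.frame(
  class = c("A", "A", "A", "A", "A",
            "B", "B", "B", "B", "B",
            "C", "C", "C", "C", "C"),
  product = c("apple", "lemon", "banana", "orange", "papaya",
              "apple", "lemon", "lime", "blueberry", "raspberry",
              "apple", "lemon", "mango", "strawberry", "pear")
)

# Total number of distinct classes in the data (hypothetical helper name).
n_classes <- n_distinct(df$class)

common <- df %>%
  group_by(product) %>%
  # Keep a product only if it occurs in every class;
  # repeated rows within a class do not inflate this count.
  filter(n_distinct(class) == n_classes) %>%
  ungroup()

# Base R cross-check: intersect the product sets of all classes.
common_products <- Reduce(intersect, split(df$product, df$class))
```

Because the filter compares against the number of distinct classes rather than a row count, nothing is hard-coded and the result has no helper columns to drop afterwards.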
